Dear Buddies,
I need your help once again.
I have a flat file with around 20 million lines (it is a huge file). Many of the lines are of no use, so I want to remove them. Each line starts with a code, and the lines to delete can be identified on the basis of that code. However, it is not as easy as it looks.
Here is an example.
consumer_list.temp
Name: Anushree Aggarwal
Tel1: 022-42158473
Tel2: 9965821475
Add1: Blah blah blah blah
Add2: Blah blah
Add3: Blah blah blah
Gndr: Female
Name: Rucha Chheda
Tel1: 022-42158499
Tel2: 8325698501
Add1: Blah blah
Add2: Blah blah blah
Add3: Blah blah blah blah
Gndr: Female
Name: Priyanka Rathi
Tel1: 022-42158482
Tel2: 9658231492
Tel3: 021-23654125
Add1: Blah blah blah blah
Add2: Blah blah
Add3: Blah blah blah
Add4: Blah blah
Gndr: Female
In the above example, multiple telephone numbers and addresses are given for each consumer. In the output I want to keep only one telephone number and one address, however many are provided.
The output should be as follows.
consumer_sorted_list.temp
Name: Anushree Aggarwal
Tel1: 022-42158473
Add1: Blah blah blah blah
Gndr: Female
Name: Rucha Chheda
Tel1: 022-42158499
Add1: Blah blah
Gndr: Female
Name: Priyanka Rathi
Tel1: 022-42158482
Add1: Blah blah blah blah
Gndr: Female
In short, between "Name" and "Gndr", each kind of information should appear only once. I am unable to think of the logic for this.
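One possible logic, as a minimal sketch in Python rather than a definitive solution: read the file line by line, treat the letters before the trailing digit as the kind of field ("Tel", "Add", ...), and keep only the first line of each kind until the next "Name" line starts a new record. The function names first_of_each_kind and filter_file are made up for illustration. The sketch assumes that every record starts with a "Name:" line and that lines without a leading code should pass through unchanged.

```python
import re
from typing import Iterable, Iterator

# Matches a leading code such as "Tel1:" or "Name:" and captures only the
# letters before any trailing digits ("Tel", "Name").
FIELD_RE = re.compile(r"^([A-Za-z]+)\d*:")


def first_of_each_kind(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the first line of each field kind within a record."""
    seen = set()  # field kinds already emitted for the current record
    for line in lines:
        match = FIELD_RE.match(line)
        if match is None:
            # Assumption: lines without a code are kept as they are.
            yield line
            continue
        kind = match.group(1)
        if kind == "Name":
            # Assumption: a "Name" line always begins a new record.
            seen.clear()
        if kind not in seen:
            seen.add(kind)
            yield line


def filter_file(src_path: str, dst_path: str) -> None:
    """Stream src_path into dst_path without loading the whole file."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(first_of_each_kind(src))
```

Calling filter_file("consumer_list.temp", "consumer_sorted_list.temp") would then write the reduced file. Because the lines are processed one at a time, memory use stays small even for 20 million lines.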
Need your help.
Thanks
Anu.