Extract data based on specific search criteria

I have a huge file (about 2 million records) containing data separated by "," (comma). As part of the requirement, I can't change the format. The objective is to remove some of the records based on the following condition: if the 23rd field on a line starts with 302, I need to remove that line from the original file. A simple grep command like "grep -v ^302" would do if 302 were at the start of the line, but here it is the 23rd comma-separated field. Please see the sample input and expected output. Your immediate help is really appreciated.

Data,4l4680,71130,2010,277,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771130,0,4l4680,Call,302619988771130,99988771130,1,
Data,4l4680,1132,2010,176,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771132,0,14680,Call,302619988771132,99988771132,1,
Data,4l3689,1133,2010,1574,,1,1,1,,,2,0,,,,,0,0,,0,,302619988871133,0,12689,_Call,302619988871133,99988871133,1,
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,And�,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,

Output:

Data,4l4680,71130,2010,277,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771130,0,4l4680,Call,302619988771130,99988771130,1,
Data,4l4680,1132,2010,176,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771132,0,14680,Call,302619988771132,99988771132,1,
Data,4l3689,1133,2010,1574,,1,1,1,,,2,0,,,,,0,0,,0,,302619988871133,0,12689,_Call,302619988871133,99988871133,1,

Try this

awk -F, '$23!~/^302/' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,And�,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,
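
Note that the awk command above only prints the surviving records; it does not change the original file. A minimal sketch of writing the result back, assuming a temporary file next to the original is acceptable. The names records.csv and records.csv.tmp are placeholders, not from the original post, and the two sample records are copied from the question so the snippet runs on its own.

```shell
# Placeholder file name for this sketch; substitute the real path.
file=records.csv

# Two records from the sample input: field 23 starts with 302 on the
# first line only.
cat > "$file" <<'EOF'
Data,4l4680,71130,2010,277,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771130,0,4l4680,Call,302619988771130,99988771130,1,
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
EOF

# Write the records to keep into a temporary file, then replace the
# original only if awk exited successfully.
awk -F, '$23 !~ /^302/' "$file" > "$file.tmp" && mv "$file.tmp" "$file"

cat "$file"
```

Keeping a backup copy of the original before the mv may be worthwhile on a file of this size.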

Or Perl -

$ perl -F, -lane 'print if substr($F[22],0,3) ne "302"' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,And,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,

$ perl -F, -lane 'print if not $F[22] =~ /^302/' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,And,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,

$ perl -F, -lane 'print if $F[22] !~ /^302/' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,And,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,

tyler_durden

ruby -F"," -ane  'print if $F[22]!~/^302/' file
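
The same filter can also be written with grep, which is what the question first tried. A sketch, assuming no field contains an embedded (quoted) comma: the pattern skips 22 comma-terminated fields from the start of the line and then requires 302. The names records.csv and filtered.csv are placeholders, and the two sample records are copied from the question.

```shell
# Placeholder file names for this sketch.
file=records.csv

# Two records from the sample input: field 23 starts with 302 on the
# first line only.
cat > "$file" <<'EOF'
Data,4l4680,71130,2010,277,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771130,0,4l4680,Call,302619988771130,99988771130,1,
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
EOF

# ^([^,]*,){22} consumes exactly the first 22 fields, so "302" must be
# the start of field 23; -v drops the lines that match.
grep -Ev '^([^,]*,){22}302' "$file" > filtered.csv

cat filtered.csv
```

The awk and Perl versions state the field number directly, so they are easier to adjust if the field position changes.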

Greatly appreciate your quick response. Keep up your good work guys.