Extract file records based on some field conditions

mehimadri · November 29, 2010, 11:44pm

Hello Friends,

Now for this file, I have to get those records which fulfil the following condition:
If "ColB" is NOT NULL and "ColD" has values one of the following ('A','B','C') the show the records

If the aforementioned condition is satisfied then I need to have those records only in a separate file.

Please help me in this! Thanks in Advance!!

with warm regards

kevintse · November 30, 2010, 12:03am

ABC ↩︎

mehimadri · November 30, 2010, 1:14am

Hello kevintse,

Sorry I failed to mentioned that the values I have to check are not 'A','B','C' but string values like 'Asia','Casablanca','Brando'

So the condition should be:
If "ColB" is NOT NULL and "ColD" has values one of the following ('Asia','Casablanca','Brando') the show the records

Sorry for the goof-up.
In this case how will teh awk command be..

Thanks in Advance!!

Chubler_XL · November 30, 2010, 1:34am

awk -F"|" 'BEGIN { B["Asia"]=B["Casablanca"]=B["Brando"]=1; }  $2 && ($4 in B) ' InputFile.csv > outfile

If you have many more than 3 criteria (our you may change them) you should consider loading the criteria from another file:

awk -F"|" 'NR==FNR { B[$1]=1; Next }  $2 && ($4 in B) ' CritFile.txt InputFile.csv > outfile

mehimadri · November 30, 2010, 2:16am

Hi,

When I tried the command I received the following error:

$ cat InputFile.csv
ColA|ColB|ColC|ColD|ColE|ColF
1|A|B|Asia|E|F
1||C|Asia|E|F
2|B|C|Casablanca|E|F
3|B|C|Brando|E|F

$ awk -F"|" 'BEGIN { B["Asia"]=B["Casablanca"]=B["Brando"]=1; } $2 && ($4 in B) ' InputFile.csv > outfile
awk: syntax error near line 1
awk: bailing out near line 1

Please help me in this. Thanks in Advance!!

Scrutinizer · November 30, 2010, 2:46am

nawk -F\| '$2&&$4~/^(Brando|Casablanca|Asia)$/' InputFile.csv > outfile

ygemici · November 30, 2010, 5:27am

x=($(sed '=' infile | sed -n 'N;s/\n/ /;p'|sed -n 's/[^ ]|*[^|]*|*[^ ]|*\(Asia\|Casablanca\|Brando\).*/\1/p'|sed 's/[^0-9]\|//g;/^$/d') )
for i in ${x[@]} ; do sed -n "$i p" infile ; done
1|A|B|Asia|E|F
2|B|C|Casablanca|E|F
3|B|C|Brando|E|F

rdcwayx · November 30, 2010, 5:42am

I knew you try to short the code, but there is a bug in it, if $2=0, it is not null

$ nawk -F\| '$2&&$4~/^(Brando|Casablanca|Asia)$/' infile

Scrutinizer · November 30, 2010, 5:44am

Well spotted.. Thanks

nawk -F\| '$2""&&$4~/^(Brando|Casablanca|Asia)$/' InputFile.csv > outfile

Chubler_XL · November 30, 2010, 12:22pm

The original requirement was for 2nd field is non-null, lol