Parsing through Awk

tostay2003 · March 3, 2011, 9:58pm

Hi All,

I have an input file something like this:

Line1
Line2
....
LineN

Identifier
( Field1a, Field1b;
  Field2a, Field1b;
  Field3a, Field1b;
  .....
)
LineN+1
LineN+2
etc..

I basically need Field1a, Field2a, Field3a, ... (the first field of each entry) from the above file.

I am trying the command below:

awk '/^Identifier/ { flg=1 } /\)/ { flg=0 } flg == 1 { print $0 }' FileName

The above command fetches:

Identifier
( Field1a, Field1b;
  Field2a, Field1b;
  Field3a, Field1b;
  .....

I then need to parse this output again using grep and cut.

Is there any better approach for this one?

Thanks

yinyuemi · March 3, 2011, 10:08pm

awk '/Identifier/{p=1;next} /\)/{p=0} p==1{sub(/^[( \t]+/,""); sub(/,.*/,""); print}' file

or:

sed -nr '/Identifier/,/\)/{s/.*(Field..), (Field..;)*/\1/p}' file

justlooks · March 3, 2011, 10:17pm

try this

grep -o "Field[0-9]\+a" ufile

Chubler_XL · March 3, 2011, 10:43pm

My assumption here is that the fields in this file aren't actually named Field1a, Field2a, etc., but have real names.

awk -F'[;,]' '/^\)/{p=0}p==1{sub(/^[( ]*/,"",$1); print $1}/^Identifier/{p=1}' infile

sed -nr '/^Identifier/,/^\)/{s/^[ (]*([^,;]*)[,;]( *[^,;]*[;,])*$/\1/p}' infile
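As a quick sanity check of that assumption, here is a minimal sketch that runs the awk version against a made-up sample. The file name sample.txt, the wrapper function first_fields, and the field names are all hypothetical, invented only for illustration.

```shell
# Hypothetical sample input; the field names are invented for illustration.
cat > sample.txt <<'EOF'
Line1
Identifier
( cust_id, integer;
  cust_name, varchar;
  created, date;
)
Line2
EOF

# Same approach as the awk one-liner above: clear the flag at ")",
# print the first field while the flag is set, set the flag at "Identifier".
first_fields() {
  awk -F'[;,]' '/^\)/{p=0} p==1{sub(/^[( ]*/,"",$1); print $1} /^Identifier/{p=1}' "$1"
}

first_fields sample.txt
```

Because the rule that sets the flag comes last, the Identifier line itself is never printed, and because the rule that clears it comes first, neither is the closing parenthesis.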

kurumi · March 5, 2011, 9:49am

$ ruby -0777 -ne 'puts $_.scan(/(?:Identifier.*\()(.[^)]*)\)/m)[0].join("\n").scan(/(.*),/)' file
 Field1a
  Field2a
  Field3a

Scrutinizer · March 6, 2011, 4:09am

awk -F'[ \t,]*' '/\)/{p=0}p{print $2}/^Identifier/{p=1}' file

Field1a
Field2a
Field3a
.....

Chubler_XL · March 6, 2011, 6:21pm

Here is a little sed script that's a lot more robust; it can even deal with input like this:

Identifier ( Field1a,Field1b, Field1c; Field2a, Field2b;
             Field3a; ..... )

sed -nr '/Identifier /!b
:j
H;n;/\)/!bj;H
:a
x;s/.*\(//;s/\).*$//;s/ *[\n]//g
:p
s/ *([^,;]*),*[^;]*; */\1\
/;tp;p;q' infile
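If the hold-space juggling is hard to follow, the same idea can be sketched in plain awk: join everything from the Identifier line to the closing parenthesis into one string, strip what lies outside the parentheses, then take the first comma-separated field of each ";"-terminated entry. This is only a rough sketch, not a tested replacement for the sed script; the names robust.txt, first_fields_robust, grab and buf are hypothetical.

```shell
# Hypothetical sample in the "awkward" layout shown above.
cat > robust.txt <<'EOF'
Line1
Identifier ( Field1a,Field1b, Field1c; Field2a, Field2b;
             Field3a; ..... )
Line2
EOF

first_fields_robust() {
  awk '
    /Identifier/ { grab = 1 }            # start collecting at the Identifier line
    grab         { buf = buf " " $0 }    # join the block into one string
    grab && /\)/ { exit }                # stop at the closing parenthesis (END still runs)
    END {
      sub(/^[^(]*\(/, "", buf)           # drop everything up to and including "("
      sub(/\).*$/, "", buf)              # drop ")" and anything after it
      n = split(buf, rec, ";")           # one element per ";"-terminated entry
      for (i = 1; i <= n; i++) {
        split(rec[i], f, ",")            # f[1] is the first field of the entry
        gsub(/^[ \t]+|[ \t]+$/, "", f[1])
        if (f[1] != "") print f[1]
      }
    }
  ' "$1"
}

first_fields_robust robust.txt
```

Like the sed script, this keeps the trailing "....." placeholder, since it is just text left over after the last semicolon.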