Print values within groups of lines with awk

Ophiuchus · October 23, 2015, 3:16pm

Hello to all,

I'm trying to print the values corresponding to the words A, B, C, D, E. Each of these words may or may not appear in a given group of lines. Each group of lines begins with "ZYX".

My issue with the current code is that it should print values for 3 groups but prints only 2 (group 1 and group 3), and I'm not sure why.

What is missing in my current code to fix this?

The current output is:

123|3|22|56|881
711||988||444

and the desired output is:

123|3|22|56|881
332|453|11||
711||988||444

My input file is:

ZYX
A = 123
B = 3
C = 22
D = 56
E = 881
ZYX
A = 332
B = 453
C = 11
ZYX
A = 711
C = 988
E = 444

My current code is:

awk '/ZYX/{a="";b="";c="";d="";e=""} 
         /A/ {a=$3}
         /B/ {b=$3}
         /C/ {c=$3} 
         /D/ {d=$3}
         /E/ {e=$3; print a"|"b"|"c"|"d"|"e}' file

Thanks for any help.

Regards

MadeInGermany · October 23, 2015, 5:12pm

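It prints only when E is met. If a group has no E, nothing is printed for that group.
Instead it must print at the end of each block, that is, either when the next ZYX is met or at END, and only if it's not line 1 (on line 1 there is no previous block to print yet).
Consider a function like: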

function prt() {if (NR>1) print a"|"b"|"c"|"d"|"e}
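
To see why the "not line 1" check matters, here is a minimal sketch. The two-group sample and the shell variable names (sample, no_guard, guarded) are made up for illustration, and only A is tracked to keep it short. Without the guard, the header on line 1 flushes a block that does not exist yet, so an empty record appears first.

```shell
# Hypothetical two-group sample, not the input file from the thread.
sample=$(printf 'ZYX\nA = 1\nZYX\nA = 2\n')

# No guard: the ZYX on line 1 prints an empty record.
no_guard=$(printf '%s\n' "$sample" | awk '
  function prt() { print a "|" b "|" c "|" d "|" e; a = "" }
  /ZYX/ { prt() }
  /A/   { a = $3 }
  END   { prt() }')

# Guarded: NR > 1 skips the flush on the first line only.
guarded=$(printf '%s\n' "$sample" | awk '
  function prt() { if (NR > 1) { print a "|" b "|" c "|" d "|" e; a = "" } }
  /ZYX/ { prt() }
  /A/   { a = $3 }
  END   { prt() }')

printf 'without guard:\n%s\nwith guard:\n%s\n' "$no_guard" "$guarded"
```

The unguarded run prints "||||" before the two real records; the guarded run prints only "1||||" and "2||||". In END, NR still holds the number of the last line read, so the final block is printed as well.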

Ophiuchus · October 23, 2015, 6:42pm

Thanks MadeInGermany, you're right.

I've added the function as you suggested. It now prints when each new block begins and at END, and it works.

awk '
 function prt(){if (NR>1) {print a"|"b"|"c"|"d"|"e; a="";b="";c="";d="";e=""}}
 /ZYX/{prt()}
 /A/ {a=$3}
 /B/ {b=$3}
 /C/ {c=$3}
 /D/ {d=$3}
 /E/ {e=$3}
 END{prt()}' file.txt

Thanks again.

Best regards

Don_Cragun · October 24, 2015, 1:02am

Thank you for sharing your results. It will help other people reading this thread understand how you solved your problem.

For future reference, note that as long as all of the variables in a list are being set to the same value in awk, you can simplify the construct:

a="";b="";c="";d="";e=""

to just:

a=b=c=d=e=""
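
The chained form works because assignment in awk is an expression that yields the assigned value and groups from right to left. A small sketch (the starting values "x", "y", "z" and the shell variable chained are only for illustration):

```shell
chained=$(awk 'BEGIN {
  a = "x"; b = "y"; c = "z"
  a = b = c = ""                 # parsed as a = (b = (c = ""))
  printf "[%s][%s][%s]\n", a, b, c
}')
echo "$chained"
```

All three variables end up empty, so this prints "[][][]".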

looney · October 24, 2015, 2:53am

Hi, could you please explain how your code works, especially

a="";b="";c="";d="";e=""

Thanks,

Don_Cragun · October 24, 2015, 3:56am

When a header is found, all of the data fields are cleared (i.e., set to empty strings) by the above statement.
The individual data fields are filled in as they are found by the statements on the following lines in the script. All of the fields have to be cleared so data from an earlier header isn't printed as data from a later header that did not contain entries for some fields.
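
As an illustration of what goes wrong without the clearing step, the sketch below runs the script from this thread on the original input with the reset left out of prt(). The shell variable names input and stale are just for this example.

```shell
# Input data as posted at the top of the thread.
input=$(printf 'ZYX\nA = 123\nB = 3\nC = 22\nD = 56\nE = 881\nZYX\nA = 332\nB = 453\nC = 11\nZYX\nA = 711\nC = 988\nE = 444\n')

# Same logic as the working script, but the variables are never cleared.
stale=$(printf '%s\n' "$input" | awk '
  function prt() { if (NR > 1) print a "|" b "|" c "|" d "|" e }
  /ZYX/ { prt() }
  /A/   { a = $3 }
  /B/   { b = $3 }
  /C/   { c = $3 }
  /D/   { d = $3 }
  /E/   { e = $3 }
  END   { prt() }')

printf '%s\n' "$stale"
```

This prints:

123|3|22|56|881
332|453|11|56|881
711|453|988|56|444

The second record wrongly carries D and E over from the first group, and the third record carries B and D over from earlier groups.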

Ophiuchus · October 25, 2015, 9:32pm

Hi Don,

Excellent advice. Thanks so much for commenting and adding extra information even though the question was already solved.

With your advice the code looks like this:

awk '
 function prt(){if (NR>1) {print a"|"b"|"c"|"d"|"e; a=b=c=d=e=""}}
 /ZYX/{prt()}
 /A/ {a=$3}
 /B/ {b=$3}
 /C/ {c=$3}
 /D/ {d=$3}
 /E/ {e=$3}
 END{prt()}' file.txt

Best regards
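
A possible hardening, not something posted in the thread: the script above relies on two assumptions, namely that ZYX is on line 1 (because of the NR>1 test) and that the letters A to E never occur inside a value (because /B/ matches anywhere on the line). The sketch below compares whole fields and uses a flag instead. The array v, the flag seen, and the sample input are hypothetical names and data chosen for this example.

```shell
# Hypothetical input: a stray first line, and a value ("ABC") containing key letters.
tricky=$(printf 'note\nZYX\nA = ABC\nD = 5\nZYX\nB = 7\n')

hardened=$(printf '%s\n' "$tricky" | awk '
  function prt() {
    # Print only after at least one header has been seen.
    if (seen) print v["A"] "|" v["B"] "|" v["C"] "|" v["D"] "|" v["E"]
    split("", v)                  # portable way to empty the array
  }
  $1 == "ZYX" { prt(); seen = 1; next }   # exact match on the header line
  $2 == "="   { v[$1] = $3 }              # store the value under its key
  END         { prt() }')

printf '%s\n' "$hardened"
```

On this sample it prints "ABC|||5|" and "|7|||". With the regex patterns, the line "A = ABC" would also have set b and c to "ABC".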