Getting lines between patterns and perform operations

Hi, I'm currently dev'ing using awk and I'm currently stuck.

Here's the file, with comments on "<--- ":

Record            <--- First Pattern
Amount 1          <--- Amount on first transaction
TotalSales 0      <--- Total Sales Prior from previous transactions
Time 1:00:00      <--- Time
Orange
Record            <--- Second Pattern, First Pattern for next Record
Amount 10         <--- Amount on next transaction
TotalSales 1      <--- Total Sales from previous transactions
Time 1:00:00
Orange
Grapes
Record            <--- Second Pattern, First Pattern for next Record
Amount 100        
TotalSales 11   <--- Total Sales from previous transactions
Time 1:00:00
Grapes
Record            <--- Second Pattern, First Pattern for next Record
Amount 10       <--- Missing total sales, ignored set
Time 1:00:00
Grapes
Record            <--- Second Pattern, First Pattern for next Record
TotalSales 111   <--- Missing amount, ignored set
Time 1:00:00
Grapes
Record            <--- Second Pattern, First Pattern for next Record

Basically, I want to extract the lines in between the pattern match, then perform operation such as addition and validation for the lines between the pattern. If the line does not have Amount or TotalSales, it will ignore the pattern and move to the next occurrence.

Output pattern:

Amount, TotalSales, Time

Here's a sample output:

1,0,1:00:00 
10,1,1:00:00 
100,11,1:00:00 
2,111,1:00:00

So here's my current code, however it is printing to many duplicate lines:

awk '/Record/ { start = 1; next }
/Record/ { start = 0 ; exit }
(/Amount/) && start == 1 { amount=$8 };
(/TotalSales/) && start == 1 { totalSales=$8 };
(/Time/) && start == 1 { time=$8 };
{ print amount,totalSales,time}' record.csv

Thanks!

awk '{if($1=="Record"){print a;a="Record"} else {a=a" "$0}}' record.csv | awk 'BEGIN{print "Amount, TotalSales, Time";OFS=","} /Amount/&&/TotalSales/&&/Time/ {print $3,$5,$7}'
1 Like

Another awk

awk '/Record/ {getline;if ($1!~/Amount/) next;a=$2",";getline;if ($1!~/Total/) next;a=a$2",";getline;print a$2}' file
1,0,1:00:00
10,1,1:00:00
100,11,1:00:00
1 Like

gawk, mawk:

awk '/Amount/ && /TotalSales/{print $2,$4,$6}' RS=Record OFS=, file

--
other awks:

awk '/Amount/{a=$2} /TotalSales/{t=$2} /Time/{if(a!="" && t!="") print a,t,$2; a=t=""}' OFS=, file
1 Like

Holy crap! Good stuff guys.

Thanks!

@Scrutinizer
I do need to remember that changing the Record separator i a very powerful way to do it. Tanks :slight_smile:

You can even save some more:

awk '/Amount.+TotalSales/{print $2,$4,$6}' RS=Record OFS=, file

Just to show other what is going on:

awk '{$1=$1;print}' RS=Record OFS=,  file

Amount,1,TotalSales,0,Time,1:00:00,Orange
Amount,10,TotalSales,1,Time,1:00:00,Orange,Grapes
Amount,100,TotalSales,11,Time,1:00:00,Grapes
Amount,10,Time,1:00:00,Grapes
TotalSales,111,Time,1:00:00,Grapes
1 Like

@Jotne :slight_smile: Yes, but only with gawk and mawk . Regular awk can only use a single character as a record separator (or two consecutive new lines if left blank)...

1 Like

This part of awk:

/Amount/ && /TotalSales/

Does it mean that if one of the pattern is absent, then it will ignore the entire line? I've never heard of using the && for awk. Powerful stuff!

Yes correct.
/Amount/ && /TotalSales/
Means line that have both Amount and TotalSales in it.
Logical and . Look at output of data in my post #6
Logical or can be used this way /Amount/ || /TotalSales/ or /Amount|TotalSales/

@Scrutinizer
Here we could have set RS=R , since capital R is only used in the Record

awk '/Amount.+TotalSales/{print $3,$5,$7}' RS=R OFS=, 
1 Like

@Jotne, yes we could. Usually that would be too thin a solution, but in this case I guess it could sort of work... If they decided to sell Raspberries that would be regarded as a new record, but since it would not get printed the output would still be OK with this specific input...