awk - Pick last value from set of rows

Input data

 
COL_1,COL_2,COL_3,COL_4,COL_5,COL_6,COL_7,COL_8,COL_9,COL_10,COL_11,COL_12,COL_13
C,ABC,ABCD,3,ZZ,WLOA,2015-12-01,2016-12-01,975.73,ZZZ,P,111111.00,Y1   **GROUP1**
C,ABC,ABCD,3,ZZ,WLOA,2015-12-01,2016-12-01,975.73,ZZZ,P,222222.00,Y1   **GROUP1**
C,ABC,ABCD,3,ZZ,WLOA,2015-12-01,2016-12-01,975.73,ZZZ,P,333333.00,Y1   **GROUP1**
C,ABC,ABCD,3,ZZ,WLOA,2016-12-01,2017-12-01,898.86,ZZZ,P,444444.00,Y2   **GROUP2** 
C,ABC,ABCD,3,ZZ,WLOA,2016-12-01,2017-12-01,898.86,ZZZ,P,555555.00,Y2   **GROUP2**
C,ABC,ABCD,3,ZZ,WLOA,2016-12-01,2017-12-01,898.86,ZZZ,P,666666.00,Y2   **GROUP2**
C,ABC,ABCD,3,ZZ,WLOA,2017-12-01,2018-12-01,788.81,ZZZ,P,777777.10,Y3   **GROUP3**
C,ABC,ABCD,3,ZZ,WLOA,2019-12-01,2020-12-01,788.81,ZZZ,P,888888.10,Y5   **GROUP4**
C,ABC,ABCD,3,ZZ,WLOA,2020-12-31,2021-10-29,788.81,ZZZ,P,999999.10,Y5+  **GROUP5**
C,ABC,ABCD,4,ZZ,WLOA,2015-12-01,2016-12-01,788.81,ZZZ,P,101010.10,Y1   **GROUP6**
C,ABC,ABCD,4,ZZ,WLOA,2015-12-01,2016-12-01,788.81,ZZZ,P,123123.10,Y1   **GROUP6**
 

I am using the following command to print out one record where COL_4, COL_7, COL_8 and COL_13 match - It then sums up all the values from COL_9 and prints into the one record

Code and output;

 
awk 'BEGIN{FS=OFS=","}
NR==1{print;next}
{a[$4";"$7";"$8";"$13]+=$9}
END{for(i in a)print $1, $2, $3, substr(i,0,1), $5, $6, substr(i,3,10), substr(i,14,10), a, $10, $11, $12}' BUCKETED.csv > COL_TOTAL.csv
  
 
COL_1,COL_2,COL_3,COL_4,COL_5,COL_6,COL_7,COL_8,COL_9,COL_10,COL_11,COL_12,COL_13
C,ABC,ABCD,3,ZZ,WLOA,2019-12-01,2020-12-01,788.81,ZZZ,P,123123.10,Y4     **GROUP4 Record**
C,ABC,ABCD,3,ZZ,WLOA,2017-12-01,2018-12-01,788.81,ZZZ,P,123123.10,Y3     **GROUP3 Record**
C,ABC,ABCD,3,ZZ,WLOA,2016-12-01,2017-12-01,2696.58,ZZZ,P,123123.10,Y2    **GROUP2 Record**
C,ABC,ABCD,3,ZZ,WLOA,2015-12-01,2016-12-01,2927.19,ZZZ,P,123123.10,Y1    **GROUP1 Record**
C,ABC,ABCD,3,ZZ,WLOA,2020-12-31,2021-10-29,788.81,ZZZ,P,123123.10,Y5     **GROUP5 Record**
C,ABC,ABCD,4,ZZ,WLOA,2015-12-01,2016-12-01,1577.62,ZZZ,P,123123.10,Y6    **GROUP6 Record**
 

The issue I have is with COL_12 and bringing back the same value everytime (123123.10) - The desired output would be to bring back only the final record from each group, so for example we would bring back the following records from the groups

 
GROUP1=333333.00
GROUP2=666666.00
GROUP3=777777.10
GROUP4=888888.10
GROUP5=999999.10
GROUP6=123123.10
 

The correct output therefore would show as;

 
C,ABC,ABCD,3,ZZ,WLOA,2019-12-01,2020-12-01,788.81,ZZZ,P,888888.10,,Y4
C,ABC,ABCD,3,ZZ,WLOA,2017-12-01,2018-12-01,788.81,ZZZ,P,777777.10,Y3
C,ABC,ABCD,3,ZZ,WLOA,2016-12-01,2017-12-01,2696.58,ZZZ,P,666666.00,Y2
C,ABC,ABCD,3,ZZ,WLOA,2015-12-01,2016-12-01,2927.19,ZZZ,P,333333.00,Y1
C,ABC,ABCD,3,ZZ,WLOA,2020-12-31,2021-10-29,788.81,ZZZ,P,999999.10,Y5
C,ABC,ABCD,4,ZZ,WLOA,2015-12-01,2016-12-01,1577.62,ZZZ,P,123123.10,Y6
 

Is there any way of bringing back just this record to be printed in the awk command?

Thanks for your help.

How about

    I    /tmp/bash-fc-763387004                                                                                                          Row 1    Col 1    3:16  Ctrl-K H for help
awk '
BEGIN   {FS = OFS = ","
         SUBSEP = ";"
        }
NR==1   {print
         next
        }
        {a[$4, $7, $8, $13] += $9
         b[$4, $7, $8, $13] =  $12
        }
END     {for (i in a) print $1, $2, $3, substr(i,0,1), $5, $6, substr(i,3,10), substr(i,14,10), a, $10, $11, b
        }
' file
COL_1,COL_2,COL_3,COL_4,COL_5,COL_6,COL_7,COL_8,COL_9,COL_10,COL_11,COL_12,COL_13
C,ABC,ABCD,,ZZ,WLOA,2015-12-01,2016-12-01,1577.62,ZZZ,P,123123.10
C,ABC,ABCD,,ZZ,WLOA,2019-12-01,2020-12-01,788.81,ZZZ,P,888888.10
C,ABC,ABCD,,ZZ,WLOA,2016-12-01,2017-12-01,2696.58,ZZZ,P,666666.00
C,ABC,ABCD,,ZZ,WLOA,2017-12-01,2018-12-01,788.81,ZZZ,P,777777.10
C,ABC,ABCD,,ZZ,WLOA,2020-12-31,2021-10-29,788.81,ZZZ,P,999999.10
C,ABC,ABCD,,ZZ,WLOA,2015-12-01,2016-12-01,2927.19,ZZZ,P,333333.00

Please not that field 4 is empty as substr indices start at 1, not 0.

1 Like

Works perfect - thank you for your help!