Get the total of a field in all the lines of a group

Hi
I have a fixed-format data file where I need to get the total of the field at a certain position for a group of lines.

In this data file I need the total of the field at positions 30:39 over all lines starting with 6, for each group starting with 5. In other words, for every line starting with 5 I need the total of that field over the 6 lines that follow it. So I would get 2 totals in this case, as I have two lines starting with 5.

With this script I am getting the field for all lines starting with 6, but I need it grouped by the 5 lines.
How do I do it? Please help.

echo $(awk '{ if (substr($0,1,1)=="6") { entry_account=substr($0,30,10); s=sprintf("%s",entry_account);print s;}}' 101408)

101 12110825030430021170810060810A094101BANK OF AMERICA
5200AMERICAN EXP 144892 3043002117CCDOCT06 2008
62201100123409-1960 000013768460000526 AMERICAN
62201100123409-1960 000085263060000526 AMERICAN
62201100123409-1960 000004188960000526 AMERICAN
62201100123409-1960 000025960260000526 AMERICAN
62201100123409-1960 000009390060000526 AMERICAN
62201100123409-1960 000028661060000526 AMERICAN
62201100123409-1960 000006656060000526 AMERICAN
62201100123409-1960 000004047360000526 AMERICAN
62201100123409-1960 000002500060000526 AMERICAN
5200AMERICAN EXP 144892 3043002117CCDOCT07 2008
62201100123409-1960 000023768460000526 AMERICAN
62201100123409-1960 000025263060000526 AMERICAN
62201100123409-1960 000024188960000526 AMERICAN
705RMR*11*379477306771000*0000133379
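
A minimal sketch of the grouping itself, assuming a 5 line opens a group and every 6 line up to the next 5 line belongs to it. The function name `group_totals` and the variables `pos` and `len` are made up for illustration. `pos=21` matches the sample as pasted here; if the real file holds the amount at columns 30-39, as the question says, `pos=30` would be the value to try.

```shell
# Sketch: print one zero-padded total per group of 6 lines.
# Assumption: pos/len describe the amount field (21/10 fits the pasted sample).
group_totals() {
    awk -v pos=21 -v len=10 '
        /^5/ { if (seen) printf "%010d\n", total    # close the previous group
               total = 0; seen = 1; next }
        /^6/ { total += substr($0, pos, len) }      # add up the amount field
        END  { if (seen) printf "%010d\n", total }  # close the last group
    ' "$@"
}
```

Called as `group_totals 101408` on the sample above, this should print 0001804348 and 0000732203, one per 5 line. The `END` action matters: without it the last group is never reported.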

Based on your sample data, what should the expected output be?

Hi
The output should be like this. The totals should go where the text is bold.

101 12110825030430021170810060810A094101BANK OF AMERICA
5200AMERICAN EXP 144892 3043002117CCDOCT06 2008
62201100123409-1960 000013768460000526 AMERICAN
5200AMERICAN EXP 144892 3043002117CCDOCT07 2008
62201100123409-1960 000023768460000526 AMERICAN
705RMR*11*379477306771000*0000133379

This code will give you the required output.

awk '{if(substr($0,1,1)==a) next;a=substr($0,1,1)}1' file

Hi
I want the total of the field at positions 30-39 over all the 6 lines to be displayed, so every 5 line will have only one 6 line carrying the total of all its 6 lines.

Thanks
Pls help
So the sum of all the values in quotes " " should be displayed in the first 6 line, and the remaining 6 lines deleted. Do the same for the other 5 line.

101 12110825030430021170810060810A094101BANK OF AMERICA
5200AMERICAN EXP 144892 3043002117CCDOCT06 2008
62201100123409-1960 "0000137684"60000526 AMERICAN
62201100123409-1960 "0000852630"60000526 AMERICAN
62201100123409-1960 "0000041889"60000526 AMERICAN
62201100123409-1960 "0000259602"60000526 AMERICAN
62201100123409-1960 "0000093900"60000526 AMERICAN
62201100123409-1960 "0000286610"60000526 AMERICAN
62201100123409-1960 "0000066560"60000526 AMERICAN
62201100123409-1960 "0000040473"60000526 AMERICAN
62201100123409-1960 "0000025000"60000526 AMERICAN
5200AMERICAN EXP 144892 3043002117CCDOCT07 2008
62201100123409-1960 000023768460000526 AMERICAN
62201100123409-1960 000025263060000526 AMERICAN
62201100123409-1960 000024188960000526 AMERICAN
705RMR*11*379477306771000*0000133379

Your request is not very clear, please show me the required output.

The required output is below. Basically I want only one line starting with 6 to appear, but the field at position 30:39 of that line should be the sum of the fields at this position on all the lines starting with 6. This should be grouped by the lines starting with 5.

101 12110825030430021170810060810A094101BANK OF AMERICA
5200AMERICAN EXP 144892 3043002117CCDOCT06 2008
62201100123409-1960 "0001804348"60000526 AMERICAN
5200AMERICAN EXP 144892 3043002117CCDOCT07 2008
62201100123409-1960 "0000732203"60000526 AMERICAN
705RMR*11*379477306771000*0000133379

where "0001804348" is the total of all the quoted fields on the lines starting with 6 in the first group (first 5 line):
"0000137684"
"0000852630"
"0000041889"
"0000259602"
"0000093900"
"0000286610"
"0000066560"
"0000040473"
"0000025000"
"0001804348"

Similarly, 0000732203 is the total over all the lines starting with 6 in the second group (second 5 line):
"0000237684"
"0000252630"
"0000241889"

awk '{
  if (substr($0,1,1) == "6")
    amt += substr($0,30,10)      # add up the amount field of each 6 line
  if (substr($0,1,1) == "5") {
    printf("%s\n", amt)          # total collected before this 5 line
    print $0
    amt = 0
  }
}' yourfilename

Hi
It displays the total of all the 6 lines for the first 5 line only. It doesn't display the total for the 2nd 5 line. Also, it prints only the 5 lines and the total, like this:

5200AMEX 144892 4304553002117CCDOCT06 2008
1.12369e+06
5200AMEX 144892 504555002117CCDOCT07 2008
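
The `1.12369e+06` looks like what a `%.6g`-style conversion produces when a number is printed as a string, which some older awks seem to apply even to whole numbers (that this is what happens on the poster's system is an assumption). A small sketch of the difference, using a made-up helper name `show_formats`:

```shell
# Sketch: the same number under a %.6g conversion and under an explicit
# fixed-width integer format.
show_formats() {
    awk -v n="$1" 'BEGIN {
        printf "%.6g\n", n     # six significant digits, may switch to e-notation
        printf "%010d\n", n    # ten digits, zero padded, as the record needs
    }'
}
```

`show_formats 1804348` prints `1.80435e+06` and then `0001804348`, so formatting the total with `%010d` keeps both the precision and the field width.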

Try this:

awk '
(/^5/ || /^7/) && t {printf("%s%010d%s\n",substr(s,1,20),t,substr(s,31));t=0}
/^6/ {t+=substr($0,21,10);s=$0;next}
{print}' file

Hi
It didn't display the 6 lines at all. also there is no total of all the 6 lines for each 5 line.

Thanks
Please help

What is your OS?
Use nawk, or /usr/xpg4/bin/awk on Solaris.

Hi
It is HP-UX. I can see the other lines, such as the 5 and 7 lines, but not the 6 lines.

This is my output:

$ cat file
101 12110825030430021170810060810A094101BANK OF AMERICA
5200AMERICAN EXP 144892 3043002117CCDOCT06 2008
62201100123409-1960 000013768460000526 AMERICAN
62201100123409-1960 000085263060000526 AMERICAN
62201100123409-1960 000004188960000526 AMERICAN
62201100123409-1960 000025960260000526 AMERICAN
62201100123409-1960 000009390060000526 AMERICAN
62201100123409-1960 000028661060000526 AMERICAN
62201100123409-1960 000006656060000526 AMERICAN
62201100123409-1960 000004047360000526 AMERICAN
62201100123409-1960 000002500060000526 AMERICAN
5200AMERICAN EXP 144892 3043002117CCDOCT07 2008
62201100123409-1960 000023768460000526 AMERICAN
62201100123409-1960 000025263060000526 AMERICAN
62201100123409-1960 000024188960000526 AMERICAN
705RMR*11*379477306771000*0000133379
$
$ awk '
> (/^5/ || /^7/) && t {printf("%s%010d%s\n",substr(s,1,20),t,substr(s,31));t=0}
> /^6/ {t+=substr($0,21,10);s=$0;next}
> {print}' file
101 12110825030430021170810060810A094101BANK OF AMERICA
5200AMERICAN EXP 144892 3043002117CCDOCT06 2008
62201100123409-1960 000180434860000526 AMERICAN
5200AMERICAN EXP 144892 3043002117CCDOCT07 2008
62201100123409-1960 000073220360000526 AMERICAN
705RMR*11*379477306771000*0000133379
$

Hi
Can you please let me know why I can't get the 6 lines with the same code?

Have you tried nawk? Anyhow try to change the second line like this:

awk '
{if((/^5/ || /^7/) && t) {printf("%s%010d%s\n",substr(s,1,20),t,substr(s,31));t=0}}
/^6/ {t+=substr($0,21,10);s=$0;next}
{print}' file

Hi
nawk doesn't work for me. I still don't get the 6 lines.
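
One possible explanation, which the thread never confirms: the script only prints the summed 6 line when `t` is non-zero, and it reads the amount from column 21, while the original question places it at 30-39. If the real file is laid out differently from the sample as pasted, `substr($0,21,10)` would not be numeric, `t` would stay 0, and no 6 line would appear. A quick probe (the helper name `probe_field` is made up) shows what sits at both offsets:

```shell
# Sketch: show the record length and the text found at columns 21-30 and
# 30-39 for the first few 6 lines, to see where the amount really is.
probe_field() {
    awk '/^6/ {
        printf "len=%d  21-30=[%s]  30-39=[%s]\n", length($0), substr($0, 21, 10), substr($0, 30, 10)
        if (++n == 3) exit    # three lines are enough to see the layout
    }' "$@"
}
```

Run it as `probe_field file` against the real data. Whichever bracket shows a clean ten-digit amount gives the offset to use in the `substr` calls.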

Hi
The script works only if the data file is in that format. My data file is sometimes in this format too. How do I handle this? Please help. Thanks

101 3234344443433230430021170810060810A094101CITIZEN AMERICAN
5200AMERICAN 144892 4333332117CCDOCT06 2008
62201100123409-1960 000013768460000526 AMERICAN
62201100123409-1960 000085263060000526 AMERICAN
62201100123409-1960 000004188960000526 AMERICAN
62201100123409-1960 000025960260000526 AMERICAN
62201100123409-1960 000009390060000526 AMERICAN
62201100123409-1960 000028661060000526 AMERICAN
705RMR*11*3794262747620040000852630\
705RMR*11*378531408781001
0000259602\
705RMR*11*3785380617610060000093900\
705RMR*11*379424325743008
0000286610\
62201100123409-1960 000013337960000526 AMERICAN EXP
5200AMERICAND 144892 3232302117CCDOCT07 2008
62201100123409-1960 000023768460000526 AMERICAN
62201100123409-1960 000025263060000526 AMERICAN
62201100123409-1960 000024188960000526 AMERICAN
62201100123409-1960 000025960260000526 AMERICAN
62201100123409-1960 000029390060000526 AMERICAN
62201100123409-1960 000028661060000526 AMERICAN
705RMR*11*379477306771000*0000133379\
820000018601023114390000000000000000139365873043002117
9000001000020000001860102311439000000000000000013936587
9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999

I'm getting this output and that seems to be correct:

$ cat file
101 3234344443433230430021170810060810A094101CITIZEN AMERICAN
5200AMERICAN 144892 4333332117CCDOCT06 2008
62201100123409-1960 000013768460000526 AMERICAN
62201100123409-1960 000085263060000526 AMERICAN
62201100123409-1960 000004188960000526 AMERICAN
62201100123409-1960 000025960260000526 AMERICAN
62201100123409-1960 000009390060000526 AMERICAN
62201100123409-1960 000028661060000526 AMERICAN
705RMR*11*379426274762004*0000852630\
705RMR*11*378531408781001*0000259602\
705RMR*11*378538061761006*0000093900\
705RMR*11*379424325743008*0000286610\
62201100123409-1960 000013337960000526 AMERICAN EXP
5200AMERICAND 144892 3232302117CCDOCT07 2008
62201100123409-1960 000023768460000526 AMERICAN
62201100123409-1960 000025263060000526 AMERICAN
62201100123409-1960 000024188960000526 AMERICAN
62201100123409-1960 000025960260000526 AMERICAN
62201100123409-1960 000029390060000526 AMERICAN
62201100123409-1960 000028661060000526 AMERICAN
705RMR*11*379477306771000*0000133379\
820000018601023114390000000000000000139365873043002117
9000001000020000001860102311439000000000000000013936587
9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999
$
$
$ awk '
> (/^5/ || /^7/) && t {printf("%s%010d%s\n",substr(s,1,20),t,substr(s,31));t=0}
> /^6/ {t+=substr($0,21,10);s=$0;next}
> {print}' file
101 3234344443433230430021170810060810A094101CITIZEN AMERICAN
5200AMERICAN 144892 4333332117CCDOCT06 2008
62201100123409-1960 000167231560000526 AMERICAN
705RMR*11*379426274762004*0000852630\
705RMR*11*378531408781001*0000259602\
705RMR*11*378538061761006*0000093900\
705RMR*11*379424325743008*0000286610\
62201100123409-1960 000013337960000526 AMERICAN EXP
5200AMERICAND 144892 3232302117CCDOCT07 2008
62201100123409-1960 000157231560000526 AMERICAN
705RMR*11*379477306771000*0000133379\
820000018601023114390000000000000000139365873043002117
9000001000020000001860102311439000000000000000013936587
9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999$
$

Regards

Hi
Thanks very much for your help. The line in red (the extra 6 line after the 705 lines) should not be there, as there should be only one 6 line for every 5 line, so its amount should be included in that group's total too. Please help.

101 3234344443433230430021170810060810A094101CITIZEN AMERICAN
5200AMERICAN 144892 4333332117CCDOCT06 2008
62201100123409-1960 000167231560000526 AMERICAN
705RMR*11*3794262747620040000852630\
705RMR*11*378531408781001
0000259602\
705RMR*11*3785380617610060000093900\
705RMR*11*379424325743008
0000286610\
62201100123409-1960 000013337960000526 AMERICAN EXP
5200AMERICAND 144892 3232302117CCDOCT07 2008
62201100123409-1960 000157231560000526 AMERICAN
705RMR*11*379477306771000*0000133379\
820000018601023114390000000000000000139365873043002117
9000001000020000001860102311439000000000000000013936587
9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999$
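
A sketch of one way to fold every 6 line of a group into a single line, including a 6 line that comes after the 705 lines. It holds back the other lines of the group until the group ends. Assumptions, none of them confirmed in the thread: a group ends at the next 5 line, at a line starting with 8 or 9, or at end of file; the first 6 line of the group serves as the template; the amount is ten digits wide starting at `pos`. The name `fold_groups` is made up.

```shell
# Sketch: one summed 6 line per 5 group, printed right after the 5 line.
# Assumption: pos=21 fits the pasted sample; use pos=30 if the real file differs.
fold_groups() {
    awk -v pos=21 '
        function flush(   i) {
            if (ingroup && tmpl != "")       # emit the single summed 6 line
                printf "%s%010d%s\n", substr(tmpl, 1, pos - 1), total, substr(tmpl, pos + 10)
            for (i = 1; i <= nbuf; i++)      # then the held-back lines, in order
                print buf[i]
            total = 0; tmpl = ""; nbuf = 0
        }
        /^5/    { flush(); ingroup = 1; print; next }   # a new group starts
        /^[89]/ { flush(); ingroup = 0; print; next }   # control record ends it
        /^6/ && ingroup { total += substr($0, pos, 10)
                          if (tmpl == "") tmpl = $0
                          next }
        ingroup { buf[++nbuf] = $0; next }   # 705 lines and the like wait here
        { print }                            # anything outside a group
        END { flush() }
    ' "$@"
}
```

On the sample above, the first group's 6 line would then carry 0001805694, which is 0001672315 plus the 0000133379 from the late 6 line.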