Sum of columns and format the output

Input file:

011100020100 0.00 1 20000 30000 20000 
011110000025 0.00 1 000 240000 10000 
011100020100 0.00 1 200000 2324000 403500 
032200030025 0.00 1 2077500 3077500 250000 
032200030025 0.00 1 2565000 25536400 320000 
022220000005 0.00 1 10000 300000 300000 
022220000005 0.00 1 200050 400000 300000

I Need to sum multiple columns and format the output as shown below.
Note: File is fixed length
step 1:
I want the sum of column 3 (position:65-73) , column 4(position:82-95), column 5(position:98-112),column 6(position:114-127)
based on column 1(position:2-13)
2)I want the output to be look like this:
Output:

011100020100 0.00 2 $2,200.00 $2,354.000 $4,235.00 
011110000025 0.00 1 $0.00 $240.000 $100.00 
022220000005 0.00 2 $2,100.50 $700.000 $6,000.00 
032200030025 0.00 2 $46,425.00 $28,613.900 $6,000.00

Can anyone please help?

Is there any way i can attach the input and output file. it looks like the input and ouput file postion got changed while posting.

not able to attach the inupt/output files

something to start with:

awk '{
   idx=$1 OFS $2
   nf=NF
   a[idx]
   for(i=3;i<=NF;i++) 
      v[idx,i]+=$i
}
END {
   for(i in a){
    printf("%s",i)
    for(j=3; j<=nf;j++) 
      printf("%s%d%s", OFS, v[i,j], (j==nf)?ORS:"")
   }
}' myFile

Another solution:

awk '
BEGIN {
   split("1 13", k)
   t=split("65 73 82 95 98 112 114 127", rng)
}
{
   key=substr($0,k[1],k[2]-k[1]+1)
   keys[key]
   for(i=1;i<t;i+=2) tot[key,i]+=substr($0,rng,rng[i+1]-rng+1)
}
END {
  for(key in keys) {
   printf "%s %d", key, tot[key,1]
   tot[key,5]/=10
   for(i=3;i<t;i+=2) printf " $%'\''.2f", tot[key,i]/100
   printf "\n"
  }
}' infile

Input:

011100020100                            0.00                            1                 20000            30000          20000 
011110000025                            0.00                            1                   000           240000          10000 
011100020100                            0.00                            1                200000          2324000         403500 
032200030025                            0.00                            1               2077500          3077500         250000 
032200030025                            0.00                            1               2565000         25536400         320000 
022220000005                            0.00                            1                 10000           300000         300000 
022220000005                            0.00                            1                200050           400000         300000

Output:

032200030025  2 $46,425.00 $28,613.90 $5,700.00
011100020100  2 $2,200.00 $2,354.00 $4,235.00
022220000005  2 $2,100.50 $700.00 $6,000.00
011110000025  1 $0.00 $240.00 $100.00

Of course, your sample input is absolutely different from your description of fixed length records and I don't understand how you would expect to get $6,000.00 from adding 250000 cents to 320000 cents, but the following seems to come closer to what you requested (with two of your input columns being cents, one of your columns being mils and keeping the 2nd column intact, and sorting the output on the 1st column):

awk  '
{	f4[$1] += $4
	f5[$1] += $5
	f6[$1] += $6
	c[$1]++
}
END {	for(i in f4)
		printf("%s 0.00 %d %s %s %s\n", i, c, df(2, f4),
			df(3, f5),df(2, f6))
}
function df(dp, a,	c, s) {
	c = a % (dp == 2 ? 100 : 1000)
	a /= (dp == 2 ? 100 : 1000)
	s = ""
	while(a > 999) {
		s = sprintf(",%03d%s", a % 1000, s)
		a /= 1000
	}
	return sprintf("$%d%s.%0*d", a, s, dp, c)
}' file | sort

which produces the output:

011100020100 0.00 2 $2,200.00 $2,354.000 $4,235.00
011110000025 0.00 1 $0.00 $240.000 $100.00
022220000005 0.00 2 $2,100.50 $700.000 $6,000.00
032200030025 0.00 2 $46,425.00 $28,613.900 $5,700.00

from your sample input.

Perhaps fields 2 and 3 should also be summed rather than assuming they are all 0.00 and 1.00 respectively:

awk '
{
   t2[$1] += $2
   t3[$1] += $3
   t4[$1] += $4 / 100;
   t5[$1] += $5 / 1000;
   t6[$1] += $6 / 100;
}
END {
  f=" $%'\''.2f"
  for(i in t2)
   printf "%s %.2f %d" f f f "\n", i, t2, t3, t4, t5, t6
}' infile | sort
2 Likes

Good idea, but this still only prints 2 decimal places in the field 5 output. Perhaps one more tweak (fixing the number of decimal places printed for field and reducing the number of divisions from three per input line to three total):

awk -v sq="'" '
{
	t2[$1] += $2
	t3[$1] += $3
	t4[$1] += $4
	t5[$1] += $5
	t6[$1] += $6
}
END {
	for(i in t2)
		printf("%s %.2f %d $%"sq".2f $%"sq".3f $%"sq".2f\n", i, t2,
			t3, t4 / 100, t5 / 1000, t6 / 100)
}' file | sort

producing:

011100020100 0.00 2 $2,200.00 $2,354.000 $4,235.00
011110000025 0.00 1 $0.00 $240.000 $100.00
022220000005 0.00 2 $2,100.50 $700.000 $6,000.00
032200030025 0.00 2 $46,425.00 $28,613.900 $5,700.00
1 Like

Don/Chubler_XL ,

Thanks a lot for your help . Both code is working fine .

Don,

You are right , the total is $5,700.00
032200030025 0.00 2 $46,425.00 $28,613.900 $5,700.00

In Chubler's code ,fileld 5 ouput was printing only 2 decimal places . Thanks for fixing the code.