vinus
June 25, 2014, 4:36pm
1
Input file:
011100020100 0.00 1 20000 30000 20000
011110000025 0.00 1 000 240000 10000
011100020100 0.00 1 200000 2324000 403500
032200030025 0.00 1 2077500 3077500 250000
032200030025 0.00 1 2565000 25536400 320000
022220000005 0.00 1 10000 300000 300000
022220000005 0.00 1 200050 400000 300000
I Need to sum multiple columns and format the output as shown below.
Note: File is fixed length
step 1:
I want the sum of column 3 (position:65-73) , column 4(position:82-95), column 5(position:98-112),column 6(position:114-127)
based on column 1(position:2-13)
2)I want the output to be look like this:
Output:
011100020100 0.00 2 $2,200.00 $2,354.000 $4,235.00
011110000025 0.00 1 $0.00 $240.000 $100.00
022220000005 0.00 2 $2,100.50 $700.000 $6,000.00
032200030025 0.00 2 $46,425.00 $28,613.900 $6,000.00
Can anyone please help?
Is there any way i can attach the input and output file. it looks like the input and ouput file postion got changed while posting.
vinus
June 25, 2014, 4:45pm
2
not able to attach the inupt/output files
something to start with:
awk '{
idx=$1 OFS $2
nf=NF
a[idx]
for(i=3;i<=NF;i++)
v[idx,i]+=$i
}
END {
for(i in a){
printf("%s",i)
for(j=3; j<=nf;j++)
printf("%s%d%s", OFS, v[i,j], (j==nf)?ORS:"")
}
}' myFile
Another solution:
awk '
BEGIN {
split("1 13", k)
t=split("65 73 82 95 98 112 114 127", rng)
}
{
key=substr($0,k[1],k[2]-k[1]+1)
keys[key]
for(i=1;i<t;i+=2) tot[key,i]+=substr($0,rng,rng[i+1]-rng+1)
}
END {
for(key in keys) {
printf "%s %d", key, tot[key,1]
tot[key,5]/=10
for(i=3;i<t;i+=2) printf " $%'\''.2f", tot[key,i]/100
printf "\n"
}
}' infile
Input:
011100020100 0.00 1 20000 30000 20000
011110000025 0.00 1 000 240000 10000
011100020100 0.00 1 200000 2324000 403500
032200030025 0.00 1 2077500 3077500 250000
032200030025 0.00 1 2565000 25536400 320000
022220000005 0.00 1 10000 300000 300000
022220000005 0.00 1 200050 400000 300000
Output:
032200030025 2 $46,425.00 $28,613.90 $5,700.00
011100020100 2 $2,200.00 $2,354.00 $4,235.00
022220000005 2 $2,100.50 $700.00 $6,000.00
011110000025 1 $0.00 $240.00 $100.00
vinus:
Input file:
011100020100 0.00 1 20000 30000 20000
011110000025 0.00 1 000 240000 10000
011100020100 0.00 1 200000 2324000 403500
032200030025 0.00 1 2077500 3077500 250000
032200030025 0.00 1 2565000 25536400 320000
022220000005 0.00 1 10000 300000 300000
022220000005 0.00 1 200050 400000 300000
I Need to sum multiple columns and format the output as shown below.
Note: File is fixed length
step 1:
I want the sum of column 3 (position:65-73) , column 4(position:82-95), column 5(position:98-112),column 6(position:114-127)
based on column 1(position:2-13)
2)I want the output to be look like this:
Output:
011100020100 0.00 2 $2,200.00 $2,354.000 $4,235.00
011110000025 0.00 1 $0.00 $240.000 $100.00
022220000005 0.00 2 $2,100.50 $700.000 $6,000.00
032200030025 0.00 2 $46,425.00 $28,613.900 $6,000.00
Can anyone please help?
Is there any way i can attach the input and output file. it looks like the input and ouput file postion got changed while posting.
Of course, your sample input is absolutely different from your description of fixed length records and I don't understand how you would expect to get $6,000.00 from adding 250000 cents to 320000 cents, but the following seems to come closer to what you requested (with two of your input columns being cents, one of your columns being mils and keeping the 2nd column intact, and sorting the output on the 1st column):
awk '
{ f4[$1] += $4
f5[$1] += $5
f6[$1] += $6
c[$1]++
}
END { for(i in f4)
printf("%s 0.00 %d %s %s %s\n", i, c, df(2, f4),
df(3, f5),df(2, f6))
}
function df(dp, a, c, s) {
c = a % (dp == 2 ? 100 : 1000)
a /= (dp == 2 ? 100 : 1000)
s = ""
while(a > 999) {
s = sprintf(",%03d%s", a % 1000, s)
a /= 1000
}
return sprintf("$%d%s.%0*d", a, s, dp, c)
}' file | sort
which produces the output:
011100020100 0.00 2 $2,200.00 $2,354.000 $4,235.00
011110000025 0.00 1 $0.00 $240.000 $100.00
022220000005 0.00 2 $2,100.50 $700.000 $6,000.00
032200030025 0.00 2 $46,425.00 $28,613.900 $5,700.00
from your sample input.
Perhaps fields 2 and 3 should also be summed rather than assuming they are all 0.00 and 1.00 respectively:
awk '
{
t2[$1] += $2
t3[$1] += $3
t4[$1] += $4 / 100;
t5[$1] += $5 / 1000;
t6[$1] += $6 / 100;
}
END {
f=" $%'\''.2f"
for(i in t2)
printf "%s %.2f %d" f f f "\n", i, t2, t3, t4, t5, t6
}' infile | sort
2 Likes
chubler_xl:
Perhaps fields 2 and 3 should also be summed rather than assuming they are all 0.00 and 1.00 respectively:
awk '
{
t2[$1] += $2
t3[$1] += $3
t4[$1] += $4 / 100;
t5[$1] += $5 / 1000;
t6[$1] += $6 / 100;
}
END {
f=" $%'\''.2f"
for(i in t2)
printf "%s %.2f %d" f f f "\n", i, t2, t3, t4, t5, t6
}' infile | sort
Good idea, but this still only prints 2 decimal places in the field 5 output. Perhaps one more tweak (fixing the number of decimal places printed for field and reducing the number of divisions from three per input line to three total):
awk -v sq="'" '
{
t2[$1] += $2
t3[$1] += $3
t4[$1] += $4
t5[$1] += $5
t6[$1] += $6
}
END {
for(i in t2)
printf("%s %.2f %d $%"sq".2f $%"sq".3f $%"sq".2f\n", i, t2,
t3, t4 / 100, t5 / 1000, t6 / 100)
}' file | sort
producing:
011100020100 0.00 2 $2,200.00 $2,354.000 $4,235.00
011110000025 0.00 1 $0.00 $240.000 $100.00
022220000005 0.00 2 $2,100.50 $700.000 $6,000.00
032200030025 0.00 2 $46,425.00 $28,613.900 $5,700.00
1 Like
vinus
June 26, 2014, 1:08pm
8
Don/Chubler_XL ,
Thanks a lot for your help . Both code is working fine .
Don,
You are right , the total is $5,700.00
032200030025 0.00 2 $46,425.00 $28,613.900 $5,700.00
In Chubler's code ,fileld 5 ouput was printing only 2 decimal places . Thanks for fixing the code.