I have input file like Input.dat with below content
RRD 0Z91YUn000000Lk 9000100001 103020151117 STMT151117155527001 0000 2 000000 000004
RRD 0Z91YUn00000ysj 9000100001 103020151117 STMT151117155527001 0000 3 000000 000003
RRD 0Z91YUn00001vGh 9000100002 103020151117 STMT151117155527002 0000 1 000000 000007
RRD 0Z91YUn00002tng 9000100002 103020151117 STMT151117155527002 0000 0 000000 000006
RRD 0Z91YUn00003qBe 9000100003 103020151117 STMT151117155527003 0000 1 000000 000007
RRD 0Z91YUn00004oid 9000100003 103020151117 STMT151117155527003 0000 0 000000 000006
RRD 0Z91YUn00005l6b 9000100004 103020151117 STMT151117155527004 0000 1 000000 000007
RRD 0Z91YUn00006jda 9000100004 103020151117 STMT151117155527004 0000 0 000000 000006
1 2 3 4 5 6 7 8 9 <----- column number (This line is not part of file)
Here 3 column is the key column and 7th and 9th are the value columns. I want to do some arithematic operations on 7th & 9th column and update that
value to 8th column on matching of 3rd key column.
example : here first two row has same key column(3rd column) (9000100001) so I want to do
((((7th column value + 9th column value)of row 1)+((7th column value + 9th column value)of row 2))/2)
ie (((2+4)+(3+3))/2)=000006 (leading zeros for complete 6 digit number)
so i have to update the first 2 rows(key is 9000100001)of origal Input.dat files 8th column with 000006 value
So final output of above example is as
RRD 0Z91YUn000000Lk 9000100001 103020151117 STMT151117155527001 0000 2 000006 000004
RRD 0Z91YUn00000ysj 9000100001 103020151117 STMT151117155527001 0000 3 000006 000003
RRD 0Z91YUn00001vGh 9000100002 103020151117 STMT151117155527002 0000 1 000007 000007
RRD 0Z91YUn00002tng 9000100002 103020151117 STMT151117155527002 0000 0 000007 000006
RRD 0Z91YUn00003qBe 9000100003 103020151117 STMT151117155527003 0000 1 000007 000007
RRD 0Z91YUn00004oid 9000100003 103020151117 STMT151117155527003 0000 0 000007 000006
RRD 0Z91YUn00005l6b 9000100004 103020151117 STMT151117155527004 0000 1 000007 000007
RRD 0Z91YUn00006jda 9000100004 103020151117 STMT151117155527004 0000 0 000007 000006
I have done script for this requirement as below : Final_Count.sh
#!/bin/sh
JOBFOLDER=/opt/hpx/exstream/StatementAppA/inputs/POC/testSort
input_file=$JOBFOLDER/Input.dat
count_output_file=$JOBFOLDER/Count_output.dat
final_updated_file=$JOBFOLDER/Final_updated.dat
awk ' BEGIN {FS=" "}
function round(A) {
return int( A + 0.5 )
}
{a[$3]+=$7; b[$3]+=$9} END{for(x in a) print x,round((a[x]+b[x])/2)
}' $input_file >>$count_output_file
echo "--- `basename $count_output_file `---"
cat $count_output_file
echo "---------------- ` basename $final_updated_file ` --------------"
awk ' BEGIN {FS=" "} FNR==NR{a[$1]=$2;next}
{
if(a[$3] >= 0 && a[$3] <= 9)
{
print $1" ",$2,$3,$4" ",$5" ",$6,$7,a[$3]?"00000"a[$3]:"NA",$9,$10
}
else if(a[$3] >= 10 && a[$3] <= 99)
{
print $1" ",$2,$3,$4" ",$5" ",$6,$7,a[$3]?"0000"a[$3]:"NA",$9,$10
}
else if(a[$3] >= 100 && a[$3] <= 999)
{
print $1" ",$2,$3,$4" ",$5" ",$6,$7,a[$3]?"000"a[$3]:"NA",$9,$10
}
else if(a[$3] >= 1000 && a[$3] <= 9999)
{
print $1" ",$2,$3,$4" ",$5" ",$6,$7,a[$3]?"00"a[$3]:"NA",$9,$10
}
else if(a[$3] >= 10000 && a[$3] <= 99999)
{
print $1" ",$2,$3,$4" ",$5" ",$6,$7,a[$3]?"0"a[$3]:"NA",$9,$10
}
else if(a[$3] >= 100000 && a[$3] <= 999999)
{
print $1" ",$2,$3,$4" ",$5" ",$6,$7,a[$3]?a[$3]:"NA",$9,$10
}
}' $count_output_file $input_file >> $final_updated_file
echo "----------------------------------------------------"
cat $final_updated_file
rm $final_updated_file $count_output_file
and output is :
--- Count_output.dat
---
9000100002 7
9000100003 7
9000100004 7
9000100001 6
---------------- Final_updated.dat
--------------
----------------------------------------------------
RRD 0Z91YUn000000Lk 9000100001 103020151117 STMT151117155527001 0000 2 000006 000004
RRD 0Z91YUn00000ysj 9000100001 103020151117 STMT151117155527001 0000 3 000006 000003
RRD 0Z91YUn00001vGh 9000100002 103020151117 STMT151117155527002 0000 1 000007 000007
RRD 0Z91YUn00002tng 9000100002 103020151117 STMT151117155527002 0000 0 000007 000006
RRD 0Z91YUn00003qBe 9000100003 103020151117 STMT151117155527003 0000 1 000007 000007
RRD 0Z91YUn00004oid 9000100003 103020151117 STMT151117155527003 0000 0 000007 000006
RRD 0Z91YUn00005l6b 9000100004 103020151117 STMT151117155527004 0000 1 000007 000007
RRD 0Z91YUn00006jda 9000100004 103020151117 STMT151117155527004 0000 0 000007 000006
-------------------------------------------------------------------------------------------------------------------
Here in this script for final updated file I am using static printing like
print $1" ",$2,$3,$4" ",$5" ",$6,$7,a[$3]?"00"a[$3]:"NA",$9,$10
so here I am printing spaces after column value manually. I want this as dynamic.
In short print all column with as it is(with space) and just update 8th column.
Please can you help me with this ?