Hi,
I have a similar input format:
A_1 2
B_0 4
A_1 1
B_2 5
A_4 1
and I am looking to print it in the output format below, with headers. Can you suggest a solution in awk? I ask for awk because I am already using it to do some pattern matching against a parent file and print column 1 of my input. Thanks!
letter number_of_letters Total Split
A 3 4 2+1+1
B 2 10 4+5
That would require something like this:
awk -F'[_ ]+' '
{
A[$1]++
n=$2*$3
if(n>B[$1]) B[$1]=n
C[$1]=C[$1] (C[$1]==""?x:"+") $3
}
END{
for(i in A) print i, A[i], B[i], C[i]
}
' OFS='\t' file
If not, please specify in more detail what it is that you need. Also, next time please show your attempts at a solution...
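For anyone following along, here is a sketch of how that idea could be run against the first sample with the requested header line added. It assumes, as the script above does, that Total is the largest value of $2*$3 per letter. The file name sample.txt and the shell variable report are made up for illustration, the header is printed from the shell rather than from awk, and the separator is written as [_ ]+ so that it never matches an empty string. The rows go through sort because the order of for (i in A) is unspecified.

```shell
# Hypothetical sample file holding the first input from this thread.
printf '%s\n' 'A_1 2' 'B_0 4' 'A_1 1' 'B_2 5' 'A_4 1' > sample.txt

report=$(
  # Header line; the column names are taken from the question.
  printf 'letter\tnumber_of_letters\tTotal\tSplit\n'
  awk -F'[_ ]+' '
  {
    A[$1]++                                     # occurrences per letter
    n = $2 * $3                                 # index times value
    if (n > B[$1]) B[$1] = n                    # keep the largest product
    C[$1] = C[$1] (C[$1] == "" ? "" : "+") $3   # values joined with "+"
  }
  END {
    for (i in A) print i, A[i], B[i], C[i]
  }
  ' OFS='\t' sample.txt | sort
)
printf '%s\n' "$report"
```

With the sample above this should print the header followed by the rows for A (3, 4, 2+1+1) and B (2, 10, 4+5), tab-separated.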
Thank you. Can you please tell me how to stop treating _ as a field separator and count the duplicates in $1?
Say, for the input
A_1 2
B_0 4
A_1 1
B_0 5
A_1 1
and output should be
A_1 3 4 2+1+1
B_0 2 10 4+5
How do you arrive at 10 for B_0?
Sorry, typo.
It should read:
A 3 4 2+1+1
B 2 9 4+5
Oops.. this is the correct format required:
A_1 3 4 2+1+1
B_0 2 9 4+5
Try:
awk '
{
A[$1]++
B[$1]+=$2
C[$1]=C[$1] (C[$1]==""?x:"+") $2
}
END{
for(i in A) print i, A[i], B[i], C[i]
}
' OFS='\t' file
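A quick way to check this against the second sample, as a sketch only: the file name sample2.txt and the shell variable result are made up for illustration, and the output is piped through sort because awk does not guarantee any particular order for for (i in A).

```shell
# Hypothetical sample file holding the second input from this thread.
printf '%s\n' 'A_1 2' 'B_0 4' 'A_1 1' 'B_0 5' 'A_1 1' > sample2.txt

result=$(
  awk '
  {
    A[$1]++                                     # how often each key occurs
    B[$1] += $2                                 # running total of column 2
    C[$1] = C[$1] (C[$1] == "" ? "" : "+") $2   # values joined with "+"
  }
  END {
    for (i in A) print i, A[i], B[i], C[i]
  }
  ' OFS='\t' sample2.txt | sort
)
printf '%s\n' "$result"
```

Because the default field separator is used, the key stays A_1 or B_0 as a whole, and the two rows come out tab-separated as A_1 3 4 2+1+1 and B_0 2 9 4+5.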