Count occurrences in first column

input

amex-11	10	abc
amex-11	20	bcn
amed-12	1	abc

I tried something like this.

awk '{h[$1]++}; END { for(k in h) print k, h[k] }' rm1

desired output

amex-11 1 10 abc
amex-11 1 20 bcn
amed-12 2 1 abc

Note: The second column represents the occurrences. amex-11 is the first one and amed-12 is the second one. ....
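
For comparison, here is a small sketch of what the attempted command actually computes on the sample input. It tallies how many lines share each first-column value and prints one line per key at END, so it cannot reproduce the per-line output wanted above. The wrapper name count_keys is hypothetical, and the sort is only there because awk's for-in order is unspecified.

```shell
# Sketch: the attempted command, wrapped in a hypothetical helper.
# It counts lines per first-column value; sort makes the order deterministic.
count_keys() {
    awk '{ h[$1]++ } END { for (k in h) print k, h[k] }' "$@" | sort
}

# Sample input from the question, rebuilt as tab-separated lines.
printf '%s\t%s\t%s\n' amex-11 10 abc amex-11 20 bcn amed-12 1 abc | count_keys
```

On this input it reports amed-12 with a count of 1 and amex-11 with a count of 2.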

Not quite sure what you mean by "The second column represents the occurrences".

Perhaps this:

awk '{print $1,++c[$3],$2,$3}' infile

Yes, almost. Here is a more detailed example for clarification.

input

amex-11	10	abc
amex-11	20	bcn
amed-12	1	abc
amex-12	10	abc
amex-12	20	bcn
amed-13	1	abc

output should be

amex-11	1 10	abc
amex-11	1 20	bcn
amed-12	2 1	abc
amex-12	3 10	abc
amex-12	3 20	bcn
amed-13	4 1	abc
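
To see why the ++c[$3] suggestion is only almost right, here is a sketch that runs it on this larger input. It numbers lines by how often the third column has been seen, not by the first column, so it diverges on the fifth line. The wrapper name number_by_col3 is hypothetical.

```shell
# Sketch: the earlier suggestion, wrapped in a hypothetical helper.
# c[$3] counts repeats of the THIRD column, which is not what is wanted.
number_by_col3() {
    awk '{ print $1, ++c[$3], $2, $3 }' "$@"
}

# Larger sample input from the question, rebuilt as tab-separated lines.
printf '%s\t%s\t%s\n' \
    amex-11 10 abc amex-11 20 bcn amed-12 1 abc \
    amex-12 10 abc amex-12 20 bcn amed-13 1 abc | number_by_col3
```

The fifth line comes out as "amex-12 2 20 bcn" because bcn has been seen twice by then, whereas the wanted value is 3.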

How about one of these two:

awk '{c[$1];$1=$1 OFS length(c)}1' infile

awk '
!($1 in c){c[$1]=++U}
$1=$1 OFS c[$1]' infile

Note: using length() to determine the number of elements in an array is a non-standard extension that only works in GNU awk (and may work in BSD awk as an undocumented feature)...
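
A minimal sketch that avoids length() on an array by keeping its own counter, and that also keeps tabs between the columns. It assumes the input is tab-separated, as it appears in the example. The names rank_first_col, seen and n are hypothetical.

```shell
# Sketch: number each distinct first-column value in order of first appearance.
# Assumes tab-separated input; rank_first_col is a hypothetical helper name.
rank_first_col() {
    awk 'BEGIN { FS = OFS = "\t" }
         !($1 in seen) { seen[$1] = ++n }   # new key: give it the next number
         { $1 = $1 OFS seen[$1]; print }    # put the number right after column 1
    ' "$@"
}

# Larger sample input from the question, rebuilt as tab-separated lines.
printf '%s\t%s\t%s\n' \
    amex-11 10 abc amex-11 20 bcn amed-12 1 abc \
    amex-12 10 abc amex-12 20 bcn amed-13 1 abc | rank_first_col
```

Setting OFS to a tab matters here because assigning to $1 makes awk rebuild the whole line with OFS, which would otherwise turn every separator into a single space.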

This one should work.

awk 'BEGIN{h[" "]=0;max=0}{ind=-1;for(k in h)if(k==$1){ind=h[k];break}if(ind==-1){ind=++max;h[$1]=ind}print $1,ind,$2,$3}' infile