Count occurrences in first column

quincyjones · May 19, 2015, 4:54pm

input

amex-11	10	abc
amex-11	20	bcn
amed-12	1	abc

I tried something like this.

awk '{h[$1]++}; END { for(k in h) print k, h[k] }' rm1

output

amex-11 1 10 abc
amex-11 1 20 bcn
amed-12 2 1 abc

Note: The second column represents the occurrences. amex-11 is the first one and amed-12 is the second one. ....

Chubler_XL · May 19, 2015, 5:10pm

Not quite sure what you mean by "The second column represents the occurrences"

Perhaps this:

awk '{print $1,++c[$3],$2,$3}' infile

quincyjones · May 19, 2015, 5:15pm

Yes, almost. Here is a more detailed example for clarification.

input

amex-11	10	abc
amex-11	20	bcn
amed-12	1	abc
amex-12	10	abc
amex-12	20	bcn
amed-13	1	abc

output should be

amex-11	1 10	abc
amex-11	1 20	bcn
amed-12	2 1	abc
amex-12	3 10	abc
amex-12	3 20	bcn
amed-13	4 1	abc

Chubler_XL · May 19, 2015, 5:25pm

How about one of these two:

awk '{c[$1];$1=$1 OFS length(c)}1' infile

awk '
!($1 in c){c[$1]=++U}
$1=$1 OFS c[$1]' infile
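
For anyone reading along, here is a commented sketch of the second command, wrapped in a shell function so it can be tried on the sample data. The function name number_groups and the counter name n are made up for illustration and are not from the thread. It assumes the default output separator (a single space) is acceptable.

```shell
# number_groups is a hypothetical wrapper name, not from the thread.
number_groups() {
  awk '
    # First time this key is seen: give it the next group number.
    !($1 in c) { c[$1] = ++n }
    # Assigning to $1 makes awk rebuild $0 with OFS (a space by default),
    # so the group number ends up as a new second field.
    { $1 = $1 OFS c[$1]; print }
  ' "$@"
}

# Sample data from the thread (tab-separated input).
printf 'amex-11\t10\tabc\namex-11\t20\tbcn\named-12\t1\tabc\namex-12\t10\tabc\n' | number_groups
```

Because the number is stored per key, a key that shows up again later in the file keeps the number it got the first time.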

Scrutinizer · May 19, 2015, 11:37pm

Note: using length() to determine the number of elements in an array is a non-standard extension that only works in GNU awk (and may work in BSD awk as an undocumented feature)...
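
If length() on an array is not available, one possible workaround is to count the elements with a for-in loop. The sketch below is only an illustration: nelems and number_by_count are hypothetical names, and the extra parameters k and n are just the usual awk idiom for local variables.

```shell
# number_by_count is a hypothetical wrapper name, not from the thread.
number_by_count() {
  awk '
    # nelems: hypothetical helper that counts array elements
    # without relying on length(array).
    function nelems(arr,    k, n) {
      n = 0
      for (k in arr) n++
      return n
    }
    {
      c[$1]                     # referencing the element creates it if absent
      $1 = $1 OFS nelems(c)     # number of distinct keys seen so far
      print
    }
  ' "$@"
}

printf 'amex-11\t10\tabc\namex-11\t20\tbcn\named-12\t1\tabc\n' | number_by_count
```

This walks the whole array for every input line, so on large files the counter-based second command above is probably the cheaper choice. Like the length(c) version, it prints the number of distinct keys seen so far, which matches the wanted output only while equal keys are adjacent.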

sheemam · May 20, 2015, 1:52am

This one should work.

awk 'BEGIN{h[" "]=0;max=0}{ind=-1;for(k in h)if(k==$1){ind=h[k];break}if(ind==-1){ind=++max;h[$1]=ind}print $1,ind,$2,$3}' infile