Sequential counting

Lucky_Ali · January 7, 2013, 11:29am

Hi, I have a file in the following format:

a1
a1
a2
a2
a4
a4
a4
a3
a3
a5
a6
a6
a6
a6
..

(hundreds of lines)

The file is sorted and I would like to get the count of each value.
The output I need:

a1 2
a2 2
a4 3
a3 2
a5 1
a6 4

Let me know the best way to do it in awk.

radoulov · January 7, 2013, 11:34am

Considering that the file is already ordered, you could use:

uniq -c infile

Otherwise:

sort infile | uniq -c
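
Note that uniq -c prints the count before the value, padded with leading spaces. To get the value-then-count layout asked for above, one option is to swap the two columns with awk. This is a minimal sketch, assuming the values contain no whitespace; the temporary file and the variable names are only there for illustration:

```shell
# Sample data from the question, written to a temporary file (illustrative only).
infile=$(mktemp)
printf '%s\n' a1 a1 a2 a2 a4 a4 a4 a3 a3 a5 a6 a6 a6 a6 > "$infile"

# uniq -c emits "count value"; awk swaps the two fields.
result=$(uniq -c "$infile" | awk '{ print $2, $1 }')
printf '%s\n' "$result"

rm -f "$infile"
```

Because uniq only merges adjacent duplicates, this keeps the groups in the order they appear in the file.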

With awk it would be:

awk '{ cnt[$0]++ }
END { 
  for (e in cnt)
    print cnt[e], e
    }' infile

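The awk script won't preserve the original order, because for (e in cnt) visits the keys in an unspecified order. It also prints the count before the value.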
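
Since equal lines are already adjacent in the file, another possibility is to count each run as it streams past, which keeps the original order and needs no array. This is a sketch rather than a definitive solution; the variable names (prev, n, counts) and the temporary file are assumptions made for the example:

```shell
# Sample data from the question, written to a temporary file (illustrative only).
infile=$(mktemp)
printf '%s\n' a1 a1 a2 a2 a4 a4 a4 a3 a3 a5 a6 a6 a6 a6 > "$infile"

counts=$(awk '
  NR > 1 && $0 != prev { print prev, n; n = 0 }  # a run just ended: report it
  { prev = $0; n++ }                             # extend the current run
  END { if (NR) print prev, n }                  # report the final run
' "$infile")
printf '%s\n' "$counts"

rm -f "$infile"
```

Like uniq -c, this counts adjacent runs only, so a value that reappears later in the file would be reported a second time.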

rdrtx1 · January 7, 2013, 11:55am

For awk, try the following (note that the for-in loop does not guarantee any particular output order):

awk '{a[$0]++} END {for (i in a) print i, a[i]}' infile

Ordered as listed:

awk '!a[$0]++ {o[n++]=$0} END {for (i=0; i<n; i++) print o[i], a[o[i]]}' infile