Column value and its count

Hi, I have data like:
a
b
a
a
b
c
d
...

I need to output each distinct value that appears in the column along with its count:
a-3,b-2,c-1,d-1

Is there a single-line command using awk or sed to accomplish this?

Thanks,
-srinivas yelamanchili

You can try this:

awk '{c[$1]++;} END { s=""; for( x in c ){ printf( "%s%s-%d", s, x, c[x]); s=","; } printf( "\n" );  }'  input-file
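One thing to keep in mind: awk's `for (x in c)` visits keys in an unspecified order, so the fields can come out shuffled. A sketch that feeds the question's sample data on stdin and pipes the result through `sort` for a stable order (using `-` as the separator the question asks for):

```shell
# Sample data from the question, fed straight on stdin instead of a file.
printf 'a\nb\na\na\nb\nc\nd\n' |
awk '{c[$1]++} END { s=""; for (x in c) { printf("%s%s-%d", s, x, c[x]); s="," } printf("\n") }' |
tr ',' '\n' | sort | paste -sd, -
# a-3,b-2,c-1,d-1
```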

agama,
that works fantastically!
Thanks a lot!

sort < infile | uniq -c
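Worth noting that `uniq -c` prints the count first, padded with leading spaces whose width varies by implementation, so on its own it doesn't match the requested `value-count` layout. A small awk step reorders the fields; a sketch using the question's sample data:

```shell
# sort groups duplicates together, uniq -c counts each run,
# and awk swaps the count/value columns into value-count form.
printf 'a\nb\na\na\nb\nc\nd\n' | sort | uniq -c | awk '{print $2 "-" $1}'
# a-3
# b-2
# c-1
# d-1
```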

Associative arrays are real life savers for needs like this:

awk '{arr[$1]++} END {for (i in arr) {print i "-" arr[i]}}' uniq.dat

@Jayan_Jay
So simple. Great, thanks a ton!

Thanks Jayan,
I used your simple, straightforward code as below:
$ cat a.txt
a
a
b
c
a
b
d
a
b
$

echo `cat a.txt | sort | uniq -c | awk '{print $2"-"$1}' | tr '\n' ','` | sed 's/[,]*$//g'

a-4,b-3,c-1,d-1
$

I like jayan_jay's idea; I would have written a small sort in awk -- nice approach.

However, the echo and cat are not necessary. Sort can read directly from a file, so there's no need to pipe it in. If you bundle the output into a command-line parameter via backquotes, you run the risk of exceeding the system's maximum argument-list length and triggering an unnecessary error. Better to write your command this way:

sort a.txt | uniq -c | awk '{print $2"-"$1}' | tr '\n' ',' | sed 's/[,]*$//g'

Further, you can eliminate two processes if you do all the work in the first awk:

sort a.txt | uniq -c | awk '{printf( "%s%s-%s", NR > 1 ? "," : "", $2, $1 ); } END {printf( "\n" );}'
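The ternary `NR > 1 ? "," : ""` prints a comma before every field except the first, which sidesteps both the trailing comma and the extra `tr`/`sed` passes. A sketch run against the a.txt sample shown earlier in the thread (the printf just recreates that file's contents):

```shell
# Recreate the sample a.txt data from the earlier post.
printf 'a\na\nb\nc\na\nb\nd\na\nb\n' > a.txt
sort a.txt | uniq -c | awk '{printf( "%s%s-%s", NR > 1 ? "," : "", $2, $1 ); } END {printf( "\n" );}'
# a-4,b-3,c-1,d-1
```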