awk to count duplicated lines

We have an input file as follows:

2010-09-15-12.41.15
2010-09-15-12.41.15
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.25
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.41

We have this loop, which works fine to count and print how often each line recurs:

for i in $(uniq infile)
do
        num=$(grep -c "$i" infile)
        echo "$i $num"
done

However, we would like to use awk to perform similar logic. Please assist if possible, and thank you in advance.

awk '{arr[$0]++} END {for (i in arr) {if (arr[i]>1) {print arr[i], "    ", i}}}' inputfile | sort -n

This produces a list of lines that occur more than once, with a count of the number of times they occur.
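
As a quick check, here is a minimal sketch of the same idea run against the sample data from the question. The temporary file and the variable names `infile` and `dups` are only illustrative, and the final `sort -k2` is an assumption made to get a stable order by timestamp, since awk does not guarantee any order for `for (i in arr)`.

```shell
# Recreate the sample input from the question in a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
2010-09-15-12.41.15
2010-09-15-12.41.15
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.24
2010-09-15-12.41.25
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.26
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.28
2010-09-15-12.41.41
EOF

# Count every distinct line, then keep only lines seen more than once.
# Output format: "<count> <line>", sorted by the line itself.
dups=$(awk '{ arr[$0]++ }
            END { for (i in arr) if (arr[i] > 1) print arr[i], i }' "$infile" | sort -k2)

printf '%s\n' "$dups"
# 2 2010-09-15-12.41.15
# 4 2010-09-15-12.41.24
# 5 2010-09-15-12.41.26
# 4 2010-09-15-12.41.28

rm -f "$infile"
```

The lines ending in .25 and .41 occur only once, so they are filtered out.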

Or

sort file | uniq -c
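
If the input is already sorted, as the timestamps in the question appear to be, the `sort` step can probably be skipped, because `uniq -c` only needs identical lines to be adjacent. The sketch below also swaps the columns so the output matches the "line count" layout of the original loop. The function name `count_runs` is made up for illustration, and it assumes the lines contain no whitespace.

```shell
# count_runs: read lines on stdin, print "<line> <count>" for each run of
# identical adjacent lines. Assumes no whitespace inside a line, which
# holds for the timestamps in the question.
count_runs() {
    uniq -c | awk '{ print $2, $1 }'
}

# Usage with three lines taken from the sample input:
printf '%s\n' 2010-09-15-12.41.15 2010-09-15-12.41.15 2010-09-15-12.41.25 | count_runs
# 2010-09-15-12.41.15 2
# 2010-09-15-12.41.25 1
```

For unsorted input, put `sort` back in front of `uniq -c`.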

Should be something like:

awk '{a[$0]++}END{for(i in a){print i, a[i]}}' file

Or, if you only need the counts for lines that are duplicated:

awk '{a[$0]++}END{for(i in a){if(a[i]-1)print i,a[i]}}' file
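
One caveat with the `for (i in a)` loop: awk visits array keys in an unspecified order, so the output may not follow the order of the input. If that matters, one possible approach is to record each line the first time it is seen and loop over that record instead. This is only a sketch; the function name `count_in_order` and the helper array `order` are invented for illustration.

```shell
# count_in_order: read lines on stdin, print "<line> <count>" for every
# line seen more than once, in order of first appearance.
count_in_order() {
    awk '
        !($0 in a) { order[++n] = $0 }   # first time this line is seen
        { a[$0]++ }                      # count every occurrence
        END {
            for (k = 1; k <= n; k++)
                # count - 1 is 0 (false) for lines that occur only once
                if (a[order[k]] - 1) print order[k], a[order[k]]
        }
    '
}

# Usage with deliberately unsorted input:
printf '%s\n' b b a c c c | count_in_order
# b 2
# c 3
```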