Printing most frequent string in column

I am trying to put together a script that will output the most frequent string in a column. This is what I have:

 awk '{count[$1]++} END {for ( i in count ) print i, count[i] }' 
 

Of course, my script outputs every distinct string along with its count. However, I only need the most frequent one (there will always be exactly one).
I would appreciate any help.

Perhaps something like:

 awk '{if(count[$1]++ >= max) max = count[$1]} END {for ( i in count ) if(max == count[i]) print i, count[i] }'

Thanks a TON! I got it

How could I modify the above script so that it prints all strings that account for more than 30% of all entries in the column?

Individual strings with percentage > 30 or strings that collectively have a percentage > 30?

Don,
Let's say I have the following file:

 a
 a
 a
 a
 a
 b
 b
 b
 c
 c
 

The desired output should be:

 a 5
 b 3
 

or

 a 50%
 b 30%
 

c should be excluded since it only accounts for 20% of the total count.
Thanks!
PS: I tried modifying the variable max, but I could not get it to print the desired output.

How about

awk -v PCT=".3" '
        {if(++count[$1] > count[max]) max = $1
         tot++
        }
END     {# print max, count[max]                      # solution for the former problem
         for (c in count) if (count[c]/tot >= PCT) print c, count[c]
        }
' file
a 5
b 3

I got it! Thanks a BUNCH! Never thought about introducing PCT as a variable
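
For the percentage form of the desired output (a 50%, b 30%), a minimal sketch along the same lines might look like the following. The function name pct_report and its argument order are made up for illustration, and the threshold here is assumed to be a whole-number percentage (30) rather than a fraction like the PCT=".3" used above.

```shell
# Hypothetical helper (not from the thread): print every first-column value
# whose share of all lines is at least PCT percent, together with that share.
# Usage: pct_report FILE [PCT]    (PCT defaults to 30)
pct_report() {
    awk -v PCT="${2:-30}" '
        { count[$1]++; tot++ }           # tally each value and the line total
        END {
            if (tot == 0) exit           # empty input: nothing to report
            for (c in count)
                # compare without dividing, so 3 of 10 is exactly 30 percent
                if (100 * count[c] >= PCT * tot)
                    # %d truncates fractional percentages (33.3 prints as 33)
                    printf "%s %d%%\n", c, 100 * count[c] / tot
        }
    ' "$1" | sort                        # for-in order is unspecified in awk
}
```

With the ten-line sample file from earlier in the thread, `pct_report file 30` should print `a 50%` and `b 30%`. The comparison is done with multiplication instead of `count[c]/tot` to sidestep floating-point rounding at the boundary, and the output goes through `sort` because awk does not guarantee any order for `for (c in count)`.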