I am trying to put together a script that will output the most frequent string in a column. This is what I have:
awk '{count[$1]++} END {for ( i in count ) print i, count[i] }'
Of course, my script outputs every distinct string along with its count. However, I only need the most frequent one (there will always be exactly one).
I would appreciate any help.
c should be excluded since it only accounts for 20% of the total count
Thanks!
PS: I tried modifying the variable max, but I could not get it to print the desired output.
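awk -v PCT=".3" '
    {
        # Track the most frequent value seen so far and the total line count
        if (++count[$1] > count[max]) max = $1
        tot++
    }
    END {
        # print max, count[max]    # solution for the former problem
        # Print every value whose share of the total is at least PCT
        for (c in count) if (count[c]/tot >= PCT) print c, count[c]
    }
' file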
a 5
b 3
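For anyone who wants to try both variants without a data file, here is a minimal sketch that feeds the same awk logic an inline sample. The ten input lines (5 a, 3 b, 2 c) are an assumption chosen to match the counts above, and the function names sample, most_frequent and above_pct are made up for illustration; they are not from the thread.

```shell
#!/bin/sh
# Hypothetical sample: 5 x "a", 3 x "b", 2 x "c" (assumed, not from the thread).
sample() { printf '%s\n' a b a c a b a b a c; }

# Most frequent value in column 1, followed by its count.
# On a tie, the value that reached the top count first is kept.
most_frequent() {
  awk '{ if (++count[$1] > count[max]) max = $1 }
       END { if (NR) print max, count[max] }'
}

# Every value whose share of the total is at least PCT (default 0.3).
# The output is piped through sort because awk's "for (c in count)"
# does not guarantee any particular order.
above_pct() {
  awk -v PCT="${1:-0.3}" '
    { count[$1]++; tot++ }
    END { for (c in count) if (count[c] / tot >= PCT) print c, count[c] }
  ' | sort
}

sample | most_frequent   # prints: a 5
sample | above_pct 0.3   # prints: a 5, then b 3
```

With this sample, c is dropped by above_pct because 2 out of 10 is 20%, which is below the 30% threshold.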