Hi!
The following awk script counts the words in an input file, then sorts them in decreasing order of occurrences and also in alphabetical order. It then prints every word together with its number of occurrences. For example:
and 7
for 4
make 4
you 4
awk 1
....
The problem is that if the text file contains thousands of words, the output is also very long. I'm only interested in the 10 most frequent words, which means that I'd like to print only the first 10 rows. I have tried to change the printf command so that it prints only the first 10 sorted rows, but I have had no success :( Is it even possible to achieve this by changing only the printf command? Should I try something else?
script:
{
    $0 = tolower($0)
    gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
    for (i = 1; i <= NF; i++)
        freq[$i]++
}
END {
    sort = "sort -k 2nr"
    for (word in freq)
        printf "%s\t%d\n", word, freq[word] | sort
    close(sort)
}
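One idea I have not verified on my real data: since printf emits the rows in arbitrary order and the sorting only happens in the piped command, the limit could perhaps go into that command rather than into printf. A minimal sketch, assuming head is available on the system. The top_words wrapper function and the n variable are names made up here for illustration, and the extra -k1,1 key is only an assumption about how ties should be ordered:

```shell
# top_words FILE [N]: print the N most frequent words in FILE (N defaults to 10).
# Hypothetical wrapper around the script above; only the sort command changes.
top_words() {
    awk -v n="${2:-10}" '
    {
        $0 = tolower($0)
        gsub(/[^[:alnum:]_[:blank:]]/, "", $0)
        for (i = 1; i <= NF; i++)
            freq[$i]++
    }
    END {
        # Sort by count (descending), then by word; head keeps the first n rows.
        sort = "sort -k2,2nr -k1,1 | head -n " n
        for (word in freq)
            printf "%s\t%d\n", word, freq[word] | sort
        close(sort)
    }' "$1"
}
```

It would be called as top_words input.txt for the first 10 rows, or top_words input.txt 5 for the first 5.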
Thanks in advance!