Frequency of Words in a File, sed script from 1980

tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | sed ${1:-25} < book7.txt

This is not my script; it dates back to 1980. It used to work fine and gave me the most frequently used words in a text file.
Now the shell complains about an error in sed:

sed: -e expression #1, Character 2: missing command

The instructions for this one-liner say to put it into an executable script, but being lazy I am asking first, because in my former configuration it worked fine for finding the most used words in a large text file. Can anyone give me a hint about the sed error and its missing command? I am running this in the very directory where book7.txt is located.
Thanks in advance.

One might guess that a current sed would work if you change:

sed ${1:-25}

in that pipeline to:

sed -n "1,${1:-25}p"

which would print the first 25 lines if no command-line arguments are given to your script, or the top X lines if the first argument to your script is X.
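
As a minimal sketch of that default-value expansion, the snippet below wraps the sed call in a function. The name top_lines is made up for illustration; inside a function, $1 is the function's first argument, just as it is the script's first argument inside a script.

```shell
# top_lines is a hypothetical helper: print the first N lines of stdin,
# falling back to 25 when no argument is given.
top_lines() {
    sed -n "1,${1:-25}p"
}

seq 1 100 | top_lines      # no argument: lines 1 through 25
seq 1 100 | top_lines 3    # argument 3: lines 1 through 3
```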

Did you follow the instructions and run the executable script with a suitable parameter? In Bourne-compatible shells, ${1:-25} expands to the contents of the first positional parameter or, if that is missing, to 25; cf. the Parameter Expansion section of man bash.

With no parameter given, I get the same error message as you do, as sed can't cope with a 25 as the sole "command". With a first positional parameter of, e.g., 1,15!d the above script will print the 15 topmost words in the text presented.
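
To illustrate the point, here is a small sketch on three lines of input. A bare number is only an address; sed needs a command after it, such as q (quit) or !d (delete everything outside the address). The exact error text depends on the sed implementation, so it is discarded here.

```shell
# A bare address such as "2" is not a complete sed program, so sed exits
# with an error (the wording of the message varies between seds).
if printf '%s\n' a b c | sed 2 2>/dev/null; then
    echo "sed accepted a bare address"
else
    echo "sed rejected a bare address"
fi

# An address followed by a command is a valid program.
printf '%s\n' a b c | sed 2q        # prints a and b, then quits
printf '%s\n' a b c | sed '1,2!d'   # deletes every line outside 1-2, leaving a and b
```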

I'm a bit surprised that script should have ever run with no parameters given.

Where do you think tr is getting its input?

Good point. A better chance at a working script might be any one of the following three commands:

{ tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n ${1:-25}
} < book7.txt

or:

(tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n ${1:-25}) < book7.txt

or:

tr -cs A-Za-z\' '\n' < book7.txt | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n ${1:-25}
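
The reason the redirection has to move: in a pipeline, a redirection applies only to the command it is written next to, so a trailing < book7.txt feeds the last command while tr keeps reading from the terminal. A minimal sketch of the third variant as a function follows; the name word_freq and the file-name argument are assumptions for illustration, not part of the original script.

```shell
# word_freq is a hypothetical name. $1 is the file to read,
# $2 the number of words to show (default 25).
word_freq() {
    # The redirection sits on the first tr, the command that needs the file.
    tr -cs A-Za-z\' '\n' < "$1" |
        tr A-Z a-z |
        sort |
        uniq -c |
        sort -k1,1nr -k2 |
        head -n "${2:-25}"
}
```

With book7.txt in the current directory, word_freq book7.txt 10 should list the ten most frequent words.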

@Rudi C It worked under Debian squeeze.
@Don Cragun I will try the given options, many thanks, really.
@cfajohnson I do not know; I thought it would be supplied by the < character.
I am in the middle of moving house, so it will take a few days before I can try these hints and report which one solved the problem.
@Don Cragun

All I get as an answer, in all three cases, is that there is a wrong modifier:

$ (-)

So I guess it would be better to make it an executable script to test it. Taking out the "-" character did not work either.
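
A "modifier" complaint that points at $ and - resembles what csh-family shells (csh, tcsh) report when they meet ${1:-25}, which is Bourne-shell syntax. That is only a guess, since the thread does not say which shell is in use. If it is the case, an executable script with a #!/bin/sh line would sidestep the problem. A minimal sketch, with wordfreq.sh as a made-up file name:

```shell
# wordfreq.sh is a hypothetical file name. The #!/bin/sh line asks the
# system to run the script with a Bourne-compatible shell, whatever the
# interactive shell happens to be.
cat > wordfreq.sh <<'EOF'
#!/bin/sh
# usage: ./wordfreq.sh FILE [COUNT]
tr -cs A-Za-z\' '\n' < "$1" | tr A-Z a-z | sort | uniq -c |
    sort -k1,1nr -k2 | head -n "${2:-25}"
EOF
chmod +x wordfreq.sh
```

With book7.txt in the current directory, ./wordfreq.sh book7.txt 10 should then print the ten most frequent words.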




Well, I am probably just unable to apply a script. The following is from 2009 and should also work for counting the frequency of words in a given text file.


cat test.file | tr -d '[:punct:]' | tr ' ' '\n' | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn

I put in another .txt file and it works fine.
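
Two differences from the 1980 version are worth knowing. The 2009 pipeline splits only on single spaces, so runs of spaces produce empty "words" and tabs do not separate words at all. It also deletes apostrophes along with the rest of the punctuation, so "don't" is counted as "dont". The sketch below is one possible variant that squeezes all whitespace and limits the output; the name word_freq_simple and its arguments are assumptions for illustration.

```shell
# word_freq_simple is a hypothetical name for a variant of the 2009 pipeline.
# $1 is the file to read, $2 the number of words to show (default 25).
word_freq_simple() {
    tr -d '[:punct:]' < "$1" |     # read the file directly, drop punctuation
        tr -s '[:space:]' '\n' |   # any run of whitespace becomes one newline
        tr 'A-Z' 'a-z' |
        sort | uniq -c | sort -rn |
        head -n "${2:-25}"
}
```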