Using sed with a foreach loop

So I am back again beating my head against the wall with a shell script and getting a headache! I want to change each year in a file (1980, 1981, 1982, 1983, etc.) to the same year followed by a tab.

The input is "blah blah (1980) blah blah".

I want to get "blah blah (1980 ) blah blah".

foreach year (1980 1981 1982)
sed -e 's/$year/$year     )/g' input_file > output_file

only changes the last year. Does sed need something special with foreach?

Thanks much to anyone in advance, Peggy

I don't know which shell you are using but this way it shouldn't work at all:

First, in the standard (Bourne-type) shells a "for" loop looks like this:

for year in 1980 1981 1982 ; do
     sed ....
done

You might have a shell which understands what you have written, but I suggest writing shell scripts that are as portable as possible. Sooner or later you will want to use a script written for system x on system y, only to find that it does not work there because it relies on some speciality of system x.

Don't get me wrong: this is a trap almost all beginners fall into, mainly because they cannot yet distinguish between system-specific extensions and commonly available tools. This is not meant to embarrass you, just to help you start proper scripting habits as soon as possible - you can't start too early doing it the right way.

Second, you repeatedly read from "input_file" and write to "output_file". Every pass starts again from the unchanged input and overwrites what the previous pass has written - this is why only the last change is visible to you. To have all the changes in your output, let each pass read what the previous pass wrote:

lastyear="input_file"
for year in 1980 1981 1982 ; do
     sed ..... "$lastyear" > "output_file.$year"
     lastyear="output_file.$year"       # next pass reads what this pass wrote
done
mv "output_file.$year" output_file      # <- keep the result of the last pass
rm -f output_file.198[012]              # <- remove intermediate files

To better understand the logic go through it manually and write down the content of the "$year" and "$lastyear" variables as they change throughout the execution.
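
If you would rather let the shell do the writing down, here is a small dry run. It is only a sketch: no sed is executed, and it assumes that each pass is meant to read the file written by the pass before it. The file names are the ones used in this thread.

```shell
#!/bin/sh
# Dry run: nothing is changed on disk, we only print what each pass would do.
lastyear="input_file"
for year in 1980 1981 1982 ; do
     echo "pass $year: read $lastyear, write output_file.$year"
     lastyear="output_file.$year"       # the next pass continues from here
done
echo "finished: the result is in $lastyear"
```

Only the first pass touches "input_file"; every later pass continues from the previous intermediate file.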

Third, you use single quotes to surround your sed script. This is usually very sound practice, because sed scripts tend to contain all sorts of characters which are special to the shell and need to be protected. In this case, though, the single quotes are "protecting" the variable from being expanded as well. If you write

$year

the shell will expand this to the content of the variable "year", but if you write

'$year'

(note the single quotes) this is just the literal string "$year". Again, your shell might tolerate this and still expand "$year" to its value, but the standard shells will not. To be portable you had better not rely on this mechanism but use the following:

sed 's/'"$year"'/'"$year"' )/g' input_file > output_file

I know, this looks a bit awkward, but it will work in every standard shell (Bourne, Korn, Bourne Again, etc.) there is.
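
To see the effect of the quoting without involving sed at all, you can store the script text in variables and print it. This is just an illustration; the variable names "single", "mixed" and "double" and the replacement "X" are made up for the example.

```shell
#!/bin/sh
year=1980
single='s/$year/X/g'             # single quotes: $year is NOT expanded
mixed='s/'"$year"'/X/g'          # quotes closed around "$year": it is expanded
double="s/$year/X/g"             # double quotes only: expanded as well

# Print the three script texts exactly as sed would receive them.
printf '%s\n' "$single" "$mixed" "$double"
```

The last two produce the same text. Plain double quotes are shorter, but then the shell also interprets any other "$", backquote or backslash in the sed script, which is why the mixed form is the more defensive one.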

There are two last points, which are more optional. First, if you match some string in sed and want to use the complete matched string in the replacement, you can use the "&" metacharacter. Second, you could phrase the regular expression so that you do not need a loop at all. The following variant of the sed script will do the same as yours, but without the need of a shell loop (replace the "<tab>" with literal tab characters, of course):

sed 's/198[012]/&<tab>)/g' input_file > output_file

Even if you can't pack it into one regex you can execute several commands at once in a sed script:

sed 's/1980/&<tab>)/g
     s/1981/&<tab>)/g
     s/1982/&<tab>)/g' input_file > output_file
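
If typing a literal tab is a nuisance, one way around it is to let printf produce the tab and hand it to sed in a variable. The following is a sketch under two assumptions: the sample line is invented, and the ")" is left out of the replacement on purpose. The input already has a ")" right after the year, so repeating it in the replacement would give "))" instead of the "(1980<tab>)" asked for.

```shell
#!/bin/sh
tab=$(printf '\t')                           # one literal tab character
input='blah blah (1980) blah blah (1983)'    # sample line, made up for the example

# "&" stands for whatever the pattern matched; the tab is appended to it.
# 1983 is outside 198[012], so it is left alone.
result=$(printf '%s\n' "$input" | sed "s/198[012]/&$tab/g")
printf '%s\n' "$result"
```

Some sed versions also accept "\t" in the replacement, but not all of them do, so the printf route is the safer one.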

I hope this helps.

bakunin

THANKS! Absolutely wonderful! :)