I'm taking a tutorial - intro to Unix Shell Scripting - and the first exercise is a walk-through of writing a script to achieve the following:
"As a network administrator you are collecting configuration information about all FTP servers in the organisation. You need to write a script to collect information about:
The FTP server that is running.
The configuration file and its location on the FTP server.
Permission information of the configuration file.
Useful content of the configuration file."
I'm sure this is probably a naive script that would make some of you gosus vomit, but please just go along with it for now. So anyway, it has me enter the following commands to start the script:
#!/bin/bash
clear
ps ax | grep -v grep | grep ftpd 1> /dev/null || echo "NO FTP service running"
#### end of snippet #####
Now obviously when I finish the script the tutorial prints the desired output. However, I don't understand why there are so many greps in this command. For example, I ran the same command on an Ubuntu VM:
Why do I need so many greps in that command? Again - I realise this is some tutorial that is presenting this script, not you, but can you guess why they would require this? Just fyi - here's the whole script:
#!/bin/bash
clear
ps ax | grep -v grep | grep ftpd 1> /dev/null || echo "NO FTP service running"
CONFIG_FILE=$(ps ax | grep -v grep | grep ftpd | awk '{print $6}')
echo "FTP configuration file on `hostname` is"
basename $CONFIG_FILE
echo "It is located in the directory"
dirname $CONFIG_FILE
echo ________________Detailed information about the file_______________
ls -l $CONFIG_FILE
echo
echo
echo "_______________Non-commented lines in the ${CONFIG_FILE}__________"
cat $CONFIG_FILE | grep -v ^#
Oh I forgot about that - kind of like when I run ps there will be a "ps" process showing in the list of processes simply because I ran the ps command. Ok I should have caught that.
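To make the self-match concrete, here is a small sketch. The process list is a made-up sample of `ps ax` output (the PIDs and command lines are invented for illustration), so it behaves the same on any machine:

```shell
#!/bin/bash
# Hypothetical `ps ax` output: no FTP daemon is running, but the grep
# that is searching for "ftpd" appears in the process list itself.
sample_ps_output() {
    cat <<'EOF'
  PID TTY      STAT   TIME COMMAND
    1 ?        Ss     0:01 /sbin/init
 2210 pts/0    Ss     0:00 -bash
 2305 pts/0    R+     0:00 ps ax
 2306 pts/0    S+     0:00 grep ftpd
EOF
}

# Without the filter, grep counts its own entry (a false positive).
# `|| true` is there because grep -c exits non-zero when the count is 0.
sample_ps_output | grep -c ftpd || true                   # prints 1
# With `grep -v grep` first, the grep entry is dropped and nothing matches.
sample_ps_output | grep -v grep | grep -c ftpd || true    # prints 0
```

So the first grep in the tutorial's pipeline is only there to throw away the line for the second one.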
Better still, use pgrep/pkill. They may not be universal, but they're not rare either. A lot of systems have them.
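A rough sketch of what the first line of the tutorial script might look like with pgrep. It assumes your pgrep supports -x (exact match on the process name), and the helper name is_running and the daemon name ftpd are only placeholders here:

```shell
#!/bin/bash
# Sketch: report whether a daemon is running, using pgrep instead of ps | grep.
# Assumes a pgrep that supports -x (match the process name exactly).
is_running() {
    # $1 is the process name to look for, e.g. "ftpd" (a placeholder here).
    pgrep -x "$1" > /dev/null
}

if command -v pgrep > /dev/null; then
    is_running ftpd || echo "NO FTP service running"
else
    echo "pgrep is not available on this system" >&2
fi
```

pgrep does not report itself, so there is no need for the extra `grep -v grep` step.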
If not, your ps probably has a way to modify its output format to only include the command's name in the output (argv[0]). This way, you don't have to defend against command line arguments colliding with command names (or user names, or group names, or ... you get the picture).
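For example, something along these lines, assuming a ps that accepts the POSIX-style options -e and -o comm=. The helper name has_command_name is made up for this sketch, and what the comm field contains varies (on some systems it may be a path rather than a bare name), so check it on your platform first:

```shell
#!/bin/bash
# Sketch: match on the command name only, never on its arguments.
# Reads one command name per line on stdin; succeeds on an exact,
# whole-line, fixed-string match.
has_command_name() {
    grep -qxF -- "$1"
}

# Assumes a ps that accepts -e (all processes) and -o comm=
# (print only the command name, with no header line).
if command -v ps > /dev/null; then
    ps -e -o comm= | has_command_name ftpd || echo "NO FTP service running"
fi
```

Because only command names reach grep, the grep process itself (whose name is "grep") can never match "ftpd", and neither can an editor that has ftpd.conf open.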
Linux's ps seems to have a -C option for filtering on command name, which could also be of use. But, if using Linux, pgrep is almost certainly available.
Sadly, ps is one of those commands that is not at all portable. ps on a *BSD system, ps on a Linux box, and ps on one of the commercial unices tend to require different arguments for everything but the most trivial invocations.
Which is exactly why it is extensively used in combination with grep, in order to make it work no matter what the underlying system supports in the ps command.
It may work, mostly, regardless of the underlying system, but it's not a robust solution. False positives remain a possibility and could result in killing the wrong process or mistakenly concluding that a dead process is still running.
A few years ago I helped someone troubleshoot an intermittent failure which I found to be caused by grepping ps output for a short command name which sometimes occurred as a substring in a related command's arguments.
You could improve the situation by tightening up the regular expression a bit, or perhaps using AWK to search for the term in a specific field, but at that point you may be depending on non-portable output format characteristics. May as well make life easier and just use the non-portable ps options.
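Here is a sketch of that kind of false positive and the AWK workaround. The two process lines are invented, and the assumption that the command sits in field 5 is exactly the sort of output-format dependency mentioned above; pids_by_command is a made-up helper name:

```shell
#!/bin/bash
# Hypothetical `ps ax` lines: the real daemon plus an editor whose
# argument happens to contain the string "ftpd".
sample_ps_output() {
    cat <<'EOF'
  812 ?        Ss     0:00 /usr/sbin/ftpd -D
 2410 pts/1    S+     0:00 vim /etc/ftpd.conf
EOF
}

# Print the PID of each line whose command (field 5 in this layout)
# has the given base name. Arguments in later fields are ignored.
pids_by_command() {
    awk -v name="$1" '{
        n = split($5, parts, "/")      # drop any leading directory
        if (parts[n] == name) print $1
    }'
}

sample_ps_output | grep -c ftpd            # naive substring match: prints 2
sample_ps_output | pids_by_command ftpd    # field-based match: prints 812
```

The naive grep counts the vim session as an FTP server; the field-based match does not, but it breaks as soon as the column layout differs.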
For this task, I'm not sure you can achieve portability and correctness using ps. I think with pgrep you can, but I haven't taken a close look at many pgrep implementations. That said, I realize that many systems and tasks are not mission critical, and so an 80% solution is often sufficient.