The script below works and emails me the log once it completes. What I'm trying to do is have it also notify me by email as soon as it encounters any error.
cat test.list
hdisk0 00a6351a2c832da1 rootvg active
hdisk1 00a6351a2c832f66 rootvg active
hdisk2 00a6351a2c833311 optvg active
hdisk3 00a6351a2c8334a5 optvg active
hdisk4 00a6351a2cbf3049 optvg active
##########################################
cat NEWDISK.list
rootvg hdisk6 hdisk0
rootvg hdisk7 hdisk1
optvg hdisk8 hdisk2
optvg hdisk9 hdisk3
optvg hdisk10 hdisk4
cat replacepv.sh
#!/bin/ksh
cat test.list | while read DISK1 PVID VG
do
if grep $DISK1 NEWDISK.list | read VG DISK2
then
echo "Replace DISK $DISK1 to $DISK2........" | tee -a LOG.OUT
sudo replacepv $DISK1 $DISK2 | tee -a ERR.OUT
fi
done
echo "" | tee -a LOG.OUT
echo "" | tee -a ERR.OUT
mail -s "Disk replaced" user@abc.com< LOG.OUT
Please be more explicit about what errors you want to cause to send an additional email.
Is it an error that you are invoking replacepv with three operands while the man page only specifies what happens when two operands are given? For example, when you read the 1st line from test.list (setting DISK1 to hdisk0), you will be running the command:
sudo replacepv hdisk0 hdisk6 hdisk0 | tee -a ERR.OUT
Is that an error? Does the above replacepv command complete successfully? Does it print any diagnostic messages? Does it write anything into ERR.OUT?
Is it an error when your grep command matches two or more lines and you only process one of those lines? For example, when you read the 2nd line from test.list, the command:
grep hdisk1 NEWDISK.list
will match the following lines from NEWDISK.list:
rootvg hdisk7 hdisk1
optvg hdisk10 hdisk4
but you only read the 1st matched line and run the command:
sudo replacepv hdisk1 hdisk7 hdisk1
ignoring the 2nd line and not running the command:
sudo replacepv hdisk1 hdisk10 hdisk4
Is that an error? Or is it an error that the above grep command matched hdisk10 on the 2nd line, when I assume you only wanted exact matches for the word hdisk1?
Or, did you just want to mail any lines appended to ERR.OUT (even though ERR.OUT does not contain any diagnostic messages that might have been written to explain what errors had been detected)?
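The substring-matching problem described above can be reproduced without touching any disks. The sketch below rebuilds NEWDISK.list in a scratch directory and compares a plain grep, a whole-word grep -w, and an exact comparison on the third column with awk. The awk lookup is my own suggestion rather than something from the original script, and the variable names are made up for the illustration.

```shell
# Recreate the NEWDISK.list from the thread in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/NEWDISK.list" <<'EOF'
rootvg hdisk6 hdisk0
rootvg hdisk7 hdisk1
optvg hdisk8 hdisk2
optvg hdisk9 hdisk3
optvg hdisk10 hdisk4
EOF

# Plain grep treats hdisk1 as a substring, so the hdisk10 line matches too.
substring_hits=$(grep -c 'hdisk1' "$workdir/NEWDISK.list")

# -w accepts hdisk1 only as a whole word, but still searches every column.
word_hits=$(grep -cw 'hdisk1' "$workdir/NEWDISK.list")

# awk compares the third column exactly and prints the new disk (column 2).
new_disk=$(awk -v old=hdisk1 '$3 == old { print $2 }' "$workdir/NEWDISK.list")

echo "substring matches:  $substring_hits"
echo "whole-word matches: $word_hits"
echo "replacement for hdisk1: $new_disk"

rm -rf "$workdir"
```

Matching on a specific column also avoids a false hit if an old disk name ever shows up in the new-disk column of another line.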
My apologies for leaving a value out of the script. It should have been as below:
if grep -w $DISK1 NEWDISK.list | read VG DISK2 DISK3
Since I'm replacing multiple disks, I want to get an email (or set up an alert) whenever the "replacepv" command fails for any reason, such as a disk not being found, before the script tries to run "replacepv" on the next disk in the sequence. In other words, notify me as soon as it fails at any point during the script's execution. I also noticed that ERR.OUT doesn't log anything if I use a disk that doesn't exist, even though an error is shown on screen.
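The empty ERR.OUT is consistent with how pipes work: only standard output goes through the pipe to tee, while diagnostics are normally written to standard error and go straight to the screen. A small sketch of the difference follows, using a made-up fake_replacepv function as a stand-in for a failing replacepv (the real command's messages and exit codes are not reproduced here).

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for a failing replacepv: it writes a diagnostic
# to standard error and returns a non-zero exit status.
fake_replacepv() {
    echo "fake_replacepv: disk $1 not found" >&2
    return 1
}

# Only stdout enters the pipe, so the message appears on screen and the
# log file that tee creates stays empty.
fake_replacepv hdisk99 | tee -a "$workdir/without.OUT" >/dev/null

# 2>&1 merges stderr into stdout before the pipe, so tee can log it.
fake_replacepv hdisk99 2>&1 | tee -a "$workdir/with.OUT" >/dev/null

without_lines=$(wc -l < "$workdir/without.OUT" | tr -d ' ')
with_lines=$(wc -l < "$workdir/with.OUT" | tr -d ' ')

echo "lines logged without 2>&1: $without_lines"
echo "lines logged with 2>&1:    $with_lines"

rm -rf "$workdir"
```

Applied to the script in question, that would mean writing the sudo line as: sudo replacepv $DISK1 $DISK2 2>&1 | tee -a ERR.OUT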
What system hardware are we talking about?
What RAID controller?
Are you asking....
How to interrogate the RAID controller?
How to trap the error in the script?
When you say to mail you as soon as it fails, are you saying that you want to receive an email that very second? Email systems (daemons) don't work that fast, and if your inbox isn't on the same system the mail relays add further delay. It could take a few minutes.
This may work, assuming I understand your requirements correctly.
Add the following lines to the start of your program:
err_code() {
mail -s "Problem with disk replacement" user@abc.com< ERR.OUT
}
trap 'err_code' ERR
set -e
set -o pipefail
What should now happen is that if the disk replacement program fails, the shell fires its ERR trap (a condition the shell raises itself, not a signal from another process), runs the function err_code, and then exits.
The command set -e tells the shell to exit with an error if a command fails (you may want to put your loop, or just the sudo line, between a set -e ... set +e pair of commands so that a failure elsewhere doesn't abort the script; note that the ERR trap itself stays active until it is reset with trap - ERR). This isn't enough on its own, however, because the exit status of a pipeline is that of its last command, and tee will exit with no error. The set -o pipefail command makes the pipeline fail if the sudo command fails.
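If the shell in use turns out not to support pipefail (worth checking, since older ksh versions may lack it), another option is to avoid the pipe and test the exit status of each replacement directly. The following is only a sketch under stated assumptions: run_replace and notify are hypothetical stand-ins so the example runs anywhere, and the loop reads NEWDISK.list directly instead of looking each disk up from test.list, which is a simplification of the original script.

```shell
# Sketch only: on the real system run_replace would call "sudo replacepv"
# and notify would call mail; both are stand-ins here.
workdir=$(mktemp -d)
LOG="$workdir/LOG.OUT"
ERR="$workdir/ERR.OUT"
ALERTS="$workdir/alerts.txt"
: > "$ALERTS"

run_replace() {
    # Pretend hdisk3 cannot be replaced; every other disk succeeds.
    if [ "$1" = "hdisk3" ]; then
        echo "replace failed: $1 -> $2" >&2
        return 1
    fi
    echo "replaced $1 with $2"
}

notify() {
    # Real version (assumption): mail -s "$1" user@abc.com < "$2"
    echo "$1" >> "$ALERTS"
}

cat > "$workdir/NEWDISK.list" <<'EOF'
rootvg hdisk6 hdisk0
rootvg hdisk7 hdisk1
optvg hdisk8 hdisk2
optvg hdisk9 hdisk3
optvg hdisk10 hdisk4
EOF

failures=0
# Redirecting the file into the loop (instead of piping cat into it) keeps
# the loop in the current shell, so $failures is still set after "done".
while read -r VG NEW OLD
do
    echo "Replace DISK $OLD with $NEW ..." >> "$LOG"
    # No pipe here, so the "if" sees the exit status of run_replace itself.
    # </dev/null stops the command from consuming the loop's input.
    if run_replace "$OLD" "$NEW" >> "$LOG" 2> "$workdir/last_err" < /dev/null
    then
        :
    else
        failures=$((failures + 1))
        cat "$workdir/last_err" >> "$ERR"
        notify "replacepv failed for $OLD" "$workdir/last_err"
    fi
done < "$workdir/NEWDISK.list"

alert_count=$(wc -l < "$ALERTS" | tr -d ' ')
echo "failures: $failures, alerts sent: $alert_count"

rm -rf "$workdir"
```

Because the alert is sent inside the loop, it goes out before the next disk is attempted. Adding a break after notify would stop the run at the first failure instead of continuing.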