Monitoring processes in parallel and processing each log file after its process exits

I am writing a script that kicks off a log-gathering process on multiple nodes in parallel using "&". These processes create individual log files, which I would like to filter and convert to CSV format after they are complete. I am facing the following issue:

  1. Monitor all processes in parallel: whichever process completes first, I would like to convert that process's output log file to CSV format. Sometimes a process can take more than 30 minutes to complete. The problem with my code is that it converts the files serially, rather than handling whichever process completes (and releases its file) first. Please help.
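For the "handle whichever finishes first" requirement, one option (assuming bash 4.3+, which provides `wait -n`) is to block until *some* child exits and then check which PIDs are gone. A minimal self-contained sketch; `process_one` and the `sleep` commands are hypothetical stand-ins for the real conversion step and collect_log.sh:

```shell
#!/bin/bash
# Sketch only: process_one and the sleeps stand in for the real work.

converted=()
process_one() {                    # stand-in for the CSV conversion step
    converted+=("$1")
}

declare -A opfile                  # map: PID -> its output file
for ip in 10.0.0.1 10.0.0.2; do
    sleep $((RANDOM % 2 + 1)) &    # stand-in for collect_log.sh "$ip"
    opfile[$!]=${ip}_syslog.txt
done

while (( ${#opfile[@]} )); do
    wait -n                        # block until some background child exits
    for pid in "${!opfile[@]}"; do
        if ! kill -0 "$pid" 2>/dev/null; then   # this one has finished
            process_one "${opfile[$pid]}"
            unset "opfile[$pid]"
        fi
    done
done
```

The point is that completion order, not submission order, drives the conversion: a 30-minute collector no longer blocks conversion of files that finished earlier.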

I have the following code:

#!/bin/sh -x

infile="$1"
PWD=`pwd`
shift
exec 2>&1
for IPADD in `cat $infile|awk '{print $1}' |tr -d '\015'`
do
OPFILE=${IPADD}_syslog.txt
nohup ${PWD}/collect_log.sh $IPADD > ${OPFILE} &
DPID=$!
echo -e "$DPID $OPFILE" >> pid_pfile.txt
done

##### Loop to parse and rename the files after data collection is complete.
check_palive()
{
PALIVE=`ps cax | grep $DPID | grep -o '^[ ]*[0-9]*'`
if [ -z $PALIVE ];then
HNAME=`grep -i hostname |awk '{print $NF}'`
awk '/name/,/exit/'  $OPFILE |head -n -1 |awk '{print $1,$2,$3,$4,$5}'> ${HNAME}.txt
fi
}


PCOUNT=`pgrep collect_log.sh |wc -l`

while [ $PCOUNT -gt 0 ];do

      for DPID in `pgrep collect_log.sh |awk '{print $1}'`
       do

       PALIVE=`ps -p $DPID --no-headers | wc -l`
          if [ $PALIVE == 0 ];then
            wait $DPID
            OPFILE=`grep $DPID pid_pfile.txt|awk '{print $2}'`
            check_palive
           # sed "/$DPID/d" pid_pfile.txt > pid_pfile.txt
        else
          sleep 120
       fi
done

PCOUNT=`pgrep collect_log.sh |wc -l`
done
rm -f pid_pfile.txt


$infile contains a list of node IP addresses.

Stupid question, but why not something like

nohup ${PWD}/collect_log.sh $IPADD | awk '/name/,/exit/{print $1,$2,$3,$4,$5}'|head -n -1> ${OPFILE} &

The output files would then be post-processed on the fly.

I'm not up on job control, but by using bash or ksh I suspect you could change the DPID line to

DPID="$DPID $!"

and at the end of the loop wait for all processes:

wait $DPID
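Putting those two pieces together, a self-contained sketch (with `sleep` standing in for the nohup'd collect_log.sh, and assuming bash or ksh):

```shell
# Accumulate the PID of each background job, then wait for all of them.
DPID=""
for IPADD in 10.0.0.1 10.0.0.2 10.0.0.3; do
    sleep 1 &                  # stand-in for: nohup collect_log.sh $IPADD ...
    DPID="$DPID $!"
done

wait $DPID                     # unquoted on purpose: word-splits into the PIDs
```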

Andrew

I think the problem might be the wait and the commands that follow it. They are all serial, i.e. you wait and then do something for each PID in sequence.

You might need to do something more like this:-

while read PID OPFILE
do
    (wait $PID ; process_OPFILE $OPFILE ) &
done < pid_pfile.txt              # Read each line of the file in a loop

wait    # Make sure all have completed before continuing to any end of script process.

This would create a watcher for each main process whose report you want to process afterwards. You can see that I read the two variables from the file together at the start of the loop, and you can then use them as you wish. This is just an example; writing a function for process_OPFILE would keep the code neater if you have a lot to do.
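One caveat with the watcher idea: `wait` can only wait on children of the shell that calls it, and each watcher runs in its own backgrounded subshell, so the collector PID is not its child. Polling with `kill -0` sidesteps that. A self-contained sketch, where `sleep` and a stub `process_opfile` are stand-ins for the real collector and post-processing:

```shell
# Stand-ins so the sketch runs on its own; replace with the real pieces.
process_opfile() { echo "$1" >> processed.txt; }

: > pid_pfile.txt
: > processed.txt
for ip in 10.0.0.1 10.0.0.2; do
    sleep 1 &                       # stand-in for collect_log.sh "$ip"
    echo "$! ${ip}_syslog.txt" >> pid_pfile.txt
done

watch_one() {                       # poll until PID $1 is gone, then process $2
    while kill -0 "$1" 2>/dev/null; do
        sleep 1
    done
    process_opfile "$2"
}

while read -r PID OPFILE
do
    watch_one "$PID" "$OPFILE" &
done < pid_pfile.txt

wait    # make sure all watchers (and collectors) have completed
```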

Additionally, I've seen you try a sed on a file, writing the output back to the same file. While probably no longer required, this will fail because the redirection > opens and empties the file before you have had a chance to read it. If you want to update a file in place like this, try:

sed -i "/$DPID/d" pid_pfile.txt

Does this help you?

Robin

Hi Andrew,

Your method works only if all that is needed is the format filtering. I also need to rename the output file based on HNAME.

HNAME=`grep -i hostname ${OPFILE} |awk '{print $NF}'`

Not sure if we can incorporate that in the same line.

nohup ${PWD}/collect_log.sh $IPADD | awk '/name/,/exit/{print $1,$2,$3,$4,$5}'|head -n -1> ${OPFILE} &
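One way to fold the rename in is to wrap the whole per-node pipeline in one backgrounded subshell, so the grep for the hostname and the mv run only after the filtered file is complete. A sketch, assuming the hostname line survives the awk filter; `collect_log` here is a stub standing in for the real ${PWD}/collect_log.sh:

```shell
# Stand-ins so the sketch is self-contained; swap in the real script and IPs.
IPADD=10.0.0.1
OPFILE=${IPADD}_syslog.txt
collect_log() {                 # stub for ${PWD}/collect_log.sh
    printf 'name\nhostname node1\nA B C D E F\nexit\n'
}

(
    collect_log "$IPADD" \
        | awk '/name/,/exit/{print $1,$2,$3,$4,$5}' | head -n -1 > "$OPFILE"
    HNAME=$(grep -i hostname "$OPFILE" | awk '{print $NF}')
    [ -n "$HNAME" ] && mv "$OPFILE" "${HNAME}.txt"    # rename once complete
) &
wait
```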

You can make your script a lot simpler by using more of the shell's own basic features. You can split input into fields in the shell without the help of awk, sed, and tr. You can open a file once and keep writing to it, instead of reopening it 37 times to append.

Also, why bother with all the pgrep stuff when you made yourself such a nicely formatted list of PIDs to read?

#!/bin/bash

infile="$1" ; shift
PWD=`pwd`
exec 2>&1

NEWIFS=`printf "\r\n\t "` # Make sure read splits on carriage returns too

while IFS="$NEWIFS" read IPADD JUNK
do
        OPFILE=${IPADD}_syslog.txt
        nohup ${PWD}/collect_log.sh $IPADD > ${OPFILE} &
        DPID=$!
        echo "$DPID $OPFILE"
done < "$infile" > pid_pfile.txt

while read DPID OPFILE
do
        wait "$DPID"  # Waiting for a specific PID may require bash or ksh
        HNAME=`grep -i hostname |awk '{print $NF}'` # ???? What is this reading from?
        awk '/name/,/exit/'  $OPFILE |head -n -1 |awk '{print $1,$2,$3,$4,$5}'> ${HNAME}.txt
done < pid_pfile.txt

rm -f pid_pfile.txt

Also, if we knew what your data looked like, we could probably streamline that awk / grep / awk / head / awk down into one awk call.
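Without seeing the real log, here is one guess at collapsing that chain into a single awk pass. It assumes a "... hostname <name>" line somewhere in the file and a block running from /name/ to /exit/ whose last line is dropped (mirroring the head -n -1); the sample input is hypothetical:

```shell
# Hypothetical sample input, since the real log format wasn't shown.
OPFILE=sample_syslog.txt
printf 'name\nhostname node2\nA B C D E F\nexit\n' > "$OPFILE"

# One awk pass replacing grep | awk | head | awk.
awk '
    tolower($0) ~ /hostname/ { hname = $NF }     # remember the host name
    /name/,/exit/ { buf[n++] = $1 " " $2 " " $3 " " $4 " " $5 }
    END {
        out = hname ".txt"
        for (i = 0; i < n - 1; i++)              # n-1: drop the /exit/ line
            print buf[i] > out
    }
' "$OPFILE"
```

Buffering the block in an array is what lets a single pass both drop the last line and learn the output filename before anything is written.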


That is wonderful, Corona!! It works for me. Thanks for your help.