pid.cleanup script.

Hi guys!

I have a directory in the production environment from which I have to delete files with the .pid extension that are older than 40 minutes. I wrote the script below for the purpose.

#!/bin/bash
#
# Script to delete specific file older than N minutes.
# OLDERTHAN="40" #40 minutes

FOLDER="home/optima/pids/"
PID="*.pid"
OLDERTHAN="40"
if [ -e ${FOLDER}/${PID} ]
then
     ls_time=`ls -l ${FOLDER}/$PID`
     pid_h=`echo $ls_time | cut -d\ -f 8 | cut -d\: -f 1`
     pid_m=`echo $ls_time | cut -d\ -f 8 | cut -d\: -f 2`
     echo pid time=$pid_h:$pid_m
     let pid_time=(10#$pid_h*60)+10#$pid_m
     curr_h=`date | cut -d\ -f 4 | cut -d\: -f 1`
     curr_m=`date | cut -d\ -f 4 | cut -d\: -f 2`
     let curr_time=(10#$curr_h*60)+10#$curr_m
     echo curr_time=`date | cut -d\ -f 4`
     let diff=10#$curr_time-10#$pid_time
     #echo pid_time=$pid_time
     #echo curr_time=$curr_time
     echo diff=$diff minutes
     if [ $diff -ge $OLDERTHAN ]
     then
          echo "${PID} is older than $OLDERTHAN minutes"
          echo "Deleting ${PID}..."
          rm -f ${FOLDER}/${PID}
     else
          echo -e "${PID} is not older than $OLDERTHAN minutes"
     fi
else
     echo -e "${PID} not found."
fi

I created a test pid in /home/somefolder/pids to check whether the script would delete old pids. I ran it and it showed the following result: "*.pid not found." Any advice?

# ls -ltr
total 96
-rw-r----- 1 root sys 0 Feb 28 09:00 test.pid
-rwxr-xr-x 1 root sys 15 Feb 28 11:52 optimamd_opx_LOD_GEN_110_00111010L.pid
-rwxr-xr-x 1 root sys 15 Feb 28 11:52 optimamd_opx_LOD_GEN_110_001110107.pid
-rwxr-xr-x 1 root sys 15 Feb 28 11:52 optimamd_opx_LOD_GEN_110_001110109.pid
-rwxr-xr-x 1 root sys 16 Feb 28 11:52 optimamd_opx_LOD_GEN_110_00111010H.pid
-rwxr-xr-x 1 root sys 15 Feb 28 11:52 optimamd_opx_LOD_GEN_110_00111009A.pid
-rwxr-xr-x 1 root sys 15 Feb 28 11:52 optimamd_opx_LOD_GEN_110_001110106.pid
#
 
/home/optima/run
 
-rwxrwxrwx 1 optima dba 1716 Jun 2 2010 NSN_MGWtest_DAP.sh
-rwxrwxrwx 1 optima dba 2828 Jul 7 2010 RunValidators_HUAWEI_BSC6000.sh.orig
-rwxrwxrwx 1 optima dba 5482 Jul 7 2010 NSN_MSS_DAP.sh
-rwxrwxrwx 1 optima dba 1729 Jul 7 2010 NSN_MGW_DAP.sh
-rwxrwxrwx 1 optima dba 244 Jul 7 2010 Run_FTP.sh
-rwxrwxrwx 1 optima dba 1914 Jul 7 2010 NSN_MGW_VAL.sh
-rwxrwxrwx 1 optima dba 6013 Jul 7 2010 NSN_MSS_VAL.sh
-rwxrwxrwx 1 optima dba 5138 Jul 9 2010 NSN_MSS_loader.sh
-rwxrwxrwx 1 optima dba 1715 Jul 9 2010 NSN_MGW_loader.sh
-rwxrwxrwx 1 optima dba 1460 Jul 10 2010 MOTOROLA_BSS_VALIDATOR.sh
-rwxrwxrwx 1 optima dba 6100 Oct 11 08:37 RunLoaders_HUAWEI_BSC6000.sh
-rwxrwxrwx 1 optima dba 2828 Oct 18 12:40 RunValidators_HUAWEI_BSC6000.sh
-rwxrwxrwx 1 optima dba 945 Nov 29 16:27 MOTOROLA_BSS_CMB.sh
-rwxrwxrwx 1 optima dba 2120 Dec 6 11:39 MOTOROLA_BSS_LOADERS.sh
-rwxrwxrwx 1 optima dba 228 Jan 10 11:31 OLD_RunProcessMonitor_004.sh
-rwx------ 1 root sys 252 Jan 19 12:34 Neighbour_single.sh
-rwx------ 1 root sys 228 Jan 26 12:07 RunProcessMonitor_004.sh
-rwxrwxrwx 1 root sys 957 Feb 28 11:39 pid_cleanup.sh
# ./pid_cleanup.sh
*.pid not found.

Will changing

PID="*.pid"

to

PID="*pid* *.pid*".

help?

---------- Post updated at 06:44 AM ---------- Previous update was at 05:19 AM ----------

I replaced PID="*.pid" with PID="*pid* *.pid*", ran the script, and got the result below. Kindly advise.

# ./pid_cleanup.sh
./pid_cleanup.sh: script:  not found.
./pid_cleanup.sh[10]: Syntax error at line 16 : `(' is not expected.
#

The test

if [ -e ${FOLDER}/${PID} ]

does not accept multiple values this way. Putting a set -x at the head of your script shows the following:

+ '[' -e home/optima/pids/optimamd_opx_LOD_GEN_110_00111009A.pid home/optima/pids/optimamd_opx_LOD_GEN_110_001110106.pid
 home/optima/pids/optimamd_opx_LOD_GEN_110_001110107.pid home/optima/pids/optimamd_opx_LOD_GEN_110_001110109.pid
 home/optima/pids/optimamd_opx_LOD_GEN_110_00111010H.pid home/optima/pids/optimamd_opx_LOD_GEN_110_00111010L.pid ']'
./mach.ksh: line 10: [: too many arguments
+ echo -e '*.pid not found.'
*.pid not found.

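I suggest putting it this way and not testing the whole expansion with -e. The loop runs once per file that matches the pattern, and those files exist by definition. The one exception is that, if nothing matches at all, the shell passes the unexpanded pattern through once: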

for f in ${FOLDER}/*pid; do
   ls_time=`...
   ...
done
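
For what it's worth, here is a minimal sketch of such a loop, assuming bash. The helper name list_pid_files and the scratch directory are made up for illustration. It keeps a per-file guard because an unmatched pattern reaches the loop as a literal string.

```shell
#!/bin/bash
# Sketch: visit *.pid files one at a time.
# list_pid_files is a hypothetical helper name, not part of the original script.
list_pid_files() {
    local folder=$1 f
    for f in "$folder"/*.pid; do
        # With no match the pattern stays literal, so skip anything that does not exist.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
    done
}

# Demo on a scratch directory so nothing in production is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.pid" "$demo_dir/b.pid" "$demo_dir/notes.txt"
list_pid_files "$demo_dir"    # prints the two .pid paths, not notes.txt
rm -rf "$demo_dir"            # demo cleanup
```

The age check from the original script would go where the printf is.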

You will also get problems with this kind of cut:

cut -d\ -f 8 | cut -d\: -f 1

If you want a blank as delimiter, write it like this:

cut -d" " -f 8 ...
#or
cut -d\  -f 8 ...

In the second line there are 2 blanks, so cut knows that only the blank right after the backslash is the delimiter, not the rest of the command too. I'd suggest you take the first solution as it is clearer to read.
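
A small sketch of the quoted-delimiter form, using one line from the listing in this thread as sample input. The variable names are only illustrative. It also shows that date can print the hour and minute directly, which avoids cutting its default output.

```shell
#!/bin/bash
# Sample "ls -l" style line, single blanks between columns (from the listing in this thread).
# Real ls output may pad columns with several blanks; the script's unquoted
# `echo $ls_time` collapses them, which is what makes counting fields work.
line='-rw-r----- 1 root sys 0 Feb 28 09:00 test.pid'

# Quoted blank as the delimiter: field 8 is the HH:MM column.
hhmm=$(echo "$line" | cut -d" " -f8)
hour=$(echo "$hhmm" | cut -d: -f1)
minute=$(echo "$hhmm" | cut -d: -f2)
echo "hour=$hour minute=$minute"    # prints: hour=09 minute=00

# For the current time, date can print the fields itself, so no cut is needed.
curr_h=$(date +%H)
curr_m=$(date +%M)
echo "now=$curr_h:$curr_m"
```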

Another, more general point: the programs or scripts that generate those pid files should be able to clean them up themselves. It seems that something is not working properly and that this script is just a workaround for the real problem.


Referring to post #1 (we'll ignore post #2).
The script contains a fundamental logic error. If more than one file matches the pattern *.pid, the unquoted ${PID} expands to several file names. From that point on the script fails.

I'm not quite clear what you are trying to do here. Reading between the lines, beware that deleting a ".pid" file will not stop the process with that Process ID.
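
To illustrate that point, here is a rough sketch of checking whether the process behind a pid file is still alive before treating the file as stale. It assumes, hypothetically, that the first line of the file holds a numeric process ID; the thread does not show what these .pid files contain. The function names are made up.

```shell
#!/bin/bash
# Sketch: only call a pid file stale when its process is gone.
# Assumption: the file's first line is a numeric process ID (not confirmed in this thread).
pid_is_alive() {
    # kill -0 sends no signal; it only checks whether the process can be signalled.
    # A process owned by another user also fails this check unless run as root.
    kill -0 "$1" 2>/dev/null
}

is_stale() {
    local pid
    read -r pid < "$1" || [ -n "$pid" ] || return 1   # unreadable or empty: leave it alone
    case $pid in
        ''|*[!0-9]*) return 1 ;;                      # not a plain number: leave it alone
    esac
    ! pid_is_alive "$pid"
}

demo=$(mktemp)
echo "$$" > "$demo"           # this shell is certainly running
if is_stale "$demo"; then echo "stale"; else echo "still running"; fi
rm -f "$demo"                 # demo cleanup
```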

The script needs restructuring to process each file in turn.

e.g.

#!/bin/bash
#
# Script to delete specific file older than N minutes.
# Only hours and minutes are compared, so files from a previous day are misjudged.

FOLDER="/home/optima/pids"   # absolute path; the leading slash was missing
OLDERTHAN="40"               # 40 minutes
# ls prints each match with its full path, so ${PID} already includes ${FOLDER}
ls -1d "${FOLDER}"/*.pid 2>/dev/null | while read -r PID
do
if [ -e "${PID}" ]
then
     ls_time=`ls -l "${PID}"`
     pid_h=`echo $ls_time | cut -d' ' -f8 | cut -d: -f1`
     pid_m=`echo $ls_time | cut -d' ' -f8 | cut -d: -f2`
     echo pid time=$pid_h:$pid_m
     # $(( )) instead of let: unquoted parentheses after let are a syntax error
     pid_time=$(( 10#$pid_h * 60 + 10#$pid_m ))
     curr_h=`date | cut -d' ' -f4 | cut -d: -f1`
     curr_m=`date | cut -d' ' -f4 | cut -d: -f2`
     curr_time=$(( 10#$curr_h * 60 + 10#$curr_m ))
     echo curr_time=`date | cut -d' ' -f4`
     diff=$(( curr_time - pid_time ))
     #echo pid_time=$pid_time
     #echo curr_time=$curr_time
     echo diff=$diff minutes
     if [ "$diff" -ge "$OLDERTHAN" ]
     then
          echo "${PID} is older than $OLDERTHAN minutes"
          echo "Deleting ${PID}..."
          # Remove echo when thoroughly tested
          echo rm -f "${PID}"
     else
          echo "${PID} is not older than $OLDERTHAN minutes"
     fi
else
     echo "${PID} not found."
fi
done

Keep the "rm" line as an "echo" until you are happy that your script does what you want.
Noted zaxxon's comments and corrected the "cut" statements.
Noted that the "date" line could be better but didn't change it.


This can be done with a single find command:

 
find /home/somefolder/pids -maxdepth 1 -type f -name "*.pid" -amin +40 -exec rm {} \;

Change it to -mmin (the access time tested by -amin also changes whenever someone merely reads the file, e.g. with cat) and xoops' solution will be the best, if your find supports that switch. Still, you might want to go over the points raised by methyl and me to reduce errors in future shell scripts.
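
A cautious way to try this out, as a sketch only: list the candidates with -print first and switch to rm once the list looks right. It assumes a find that supports -mmin and -maxdepth (GNU and BSD find do; some older Unix finds do not). The helper name stale_pids and the scratch directory are hypothetical.

```shell
#!/bin/bash
# Sketch: dry run of the find approach using modification time.
# stale_pids is a hypothetical helper name.
stale_pids() {
    # $1 = directory, $2 = age in minutes; prints matches instead of deleting them
    find "$1" -maxdepth 1 -type f -name '*.pid' -mmin +"$2" -print
}

demo_dir=$(mktemp -d)
touch "$demo_dir/fresh.pid"
touch -t 200001010000 "$demo_dir/old.pid"   # backdate the modification time
stale_pids "$demo_dir" 40                   # lists only old.pid
# Once the list looks right, replace -print with: -exec rm -f {} \;
rm -rf "$demo_dir"                          # demo cleanup
```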


You got me there, mate. This is a workaround. I am trying to support a customer who is on the other side of the planet, and I can't have a remote session on his production server due to their IT policy. I will look into your suggestion and give you an update. Thank you very much for your support.

---------- Post updated at 09:19 AM ---------- Previous update was at 08:43 AM ----------

Thanks methyl. That was very educational. You are right about the process thingy. I might end up with a zombie process.

Let me explain what is going on a bit. There is a process, or a program, called the Process Monitor. Its job is to remove the failed programs.

Whenever a process is initiated, it generates an associated monitor file (i.e. <exename>_<programid>.pid). The purpose of this file is to make sure that new instances of a program are not started if the previous instance is still running.

Before a program starts an instance, it checks if an associated monitor file exists. If one does exist, then this indicates that an instance is already running and so the program immediately exits. If a monitor file does not exist, the program starts and creates a monitor file. This file is uniquely associated to the program instance using the PRID in a common directory (the default is $OPTIMA/PIDS). When the program has run, it removes the monitor file. The Process Monitor ensures that monitor files are removed if programs crash or hang.
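
The protocol described above can be sketched roughly as follows. This is not the vendor's actual code: PID_DIR, run_once and the file naming are hypothetical stand-ins, and the sketch uses a scratch directory instead of $OPTIMA/PIDS.

```shell
#!/bin/bash
# Sketch of the monitor-file protocol described above; not the vendor's actual code.
# PID_DIR, run_once and the file naming are hypothetical stand-ins.
PID_DIR=$(mktemp -d)

run_once() {
    local name=$1
    shift
    local pidfile="$PID_DIR/$name.pid"
    # noclobber makes the redirection fail if the file already exists,
    # so the existence check and the creation happen in one step.
    if ! ( set -o noclobber; echo "$$" > "$pidfile" ) 2>/dev/null; then
        echo "$name: monitor file exists, exiting"
        return 1
    fi
    "$@"                 # the actual work
    local status=$?
    rm -f "$pidfile"     # normal exit: remove the monitor file
    return $status
}

run_once loader_001 echo "loader running"
rm -rf "$PID_DIR"        # demo cleanup
```

If the work crashes or hangs before the rm is reached, the monitor file stays behind, which is the case the Process Monitor is meant to handle.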

By the way, I tried the following set of lines and it worked but with a catch.

#!/bin/ksh
# filetimes
# Print the modification time of file $1 as seconds since the epoch.
filetime()
{
    perl -e '
    $mtime = (stat $ARGV[0])[9];
    print $mtime
    ' "$1"
}
now=$(date +%s)
limit=$(( now - 1800 )) # one half hour ago.
find /home/optima/pids -type f |
while read -r filename
do
    dt=$(filetime "$filename")
    if [[ $dt -lt $limit ]] ; then
        rm -f "$filename" # delete ALL files older than 30 minutes
    fi
done

I have realized that the PIDs in the /pids folder all have the same updated time, like the figures below.

The script will not be able to delete any of the pids since the time is the same, and the loader will see that another process is running. I have hit the wall here.
Any advice?

---------- Post updated at 09:23 AM ---------- Previous update was at 09:19 AM ----------

Thanks mate. I tried this before even posting here. This particular flavor of Solaris doesn't support -amin or -mmin in the find command.

---------- Post updated at 09:38 AM ---------- Previous update was at 09:23 AM ----------

Standard Unix file systems don't store creation time. The only times
Unix stores for files are access time, modification time, and inode
change time.

Oh GOD help ... I am not cut out for this.

So, can you use "touch" to change the access time?