Retry every ten seconds while lockfile present

Hi,

I have written the lockfile-checking script below, but it needs some tweaking.

If a lockfile is present, I need the script to retry every 10 seconds to see whether the lockfile is still there. After 120 seconds it should send an email.

In my current version, once the script has encountered the lockfile it keeps counting to 120 seconds and sends an email anyway, even if the lockfile is no longer present.

Can someone give me some pointers on how to tweak the second while ; do ; done loop?

LOCK1=$TRANSOUT/lock1.txt             # name and place of first lockfile
LOCK2=$TRANSOUT/lock2.txt             # name and place of second lockfile
HEARTBEATFILE=$TRANSOUT/heartbeat       # name and place of heartbeat file for
while :                                          # start an infinite loop here
 do
    LOGFILE=`date +$LOGDIR/COPY_LOG_%Y%m%d.log` # name and place of our daily logfile
    HEARTBEAT=`date +%Y%m%d%H%M`                 # heartbeat timestamp
    TIMESTAMP="`date +%H:%M:%S`"                 # timestamp to add to our log
    
    typeset -i count                             # declare the count variable as an integer
    rm -f $LOCK1                              # remove our lockfile if we didn't exit cleanly              
    
     while [ -f $LOCK2 ]                      # while Others are busy
      do
    
          unset TIMESTAMP                        # clear timestamp so we will see the right time for the next log entry
          sleep 10                               # wait for 10 seconds and retry
          TIMESTAMP="`date +%H:%M:%S`"           # new timestamp to add to our log
          count=$((count + 1))                   # increment counter by one
    
           if [ $count -eq 12 ] ; then           # if we waited 120 seconds
           
            echo "Lockfile present" | mailx -m -s "WARNING LOCKFILE PRESENT, WAITED 120 SECONDS" john@doe.com # email the culprit
            echo "$LOCK2 PRESENT, WAITED 120 SECONDS" | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add message to logfile
           
            count=0                              # restart the counter
           fi
      done                                       # if no second lockfile exists (anymore)
                             
     echo $$ > $LOCK1                         # create our own lockfile with our own process ID so that others know we are working
                                                 
     cd /interface          # go to initial directory where files are
   
     ls -1 * > move.lst                      # (ONE not L!) create a list of file names in single column format in directory
                                                 # and put them into one file called move.lst
                                
      if [ -s move.lst ] ; then              # check to see if move.lst is not empty so we can continue

       while read N                              # while we read lines in move.lst
        do
         case $N in
           *) echo >>run_move.lst mv $N $TRANSOUT/ ;;                # add move command and put this in run_move.lst
          esac
        done <move.lst

       chmod 777 run_move.lst                   # make run_move.lst executable
       run_move.lst                             # and execute it
                                                 
       rm move.lst                           # remove original list
       cat run_move.lst | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add our move list to a logfile
       rm run_move.lst                          # remove move list

      else                                       # if move.lst is empty in the first place
       echo "NO FILES TO MOVE" | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add a message to the logfile
       rm move.lst                           # remove original list
      fi

    COPYFILE="meert.log"
    cp $LOCK1 $LOGDIR/$COPYFILE               # put the current process id to the logdir
    echo $HEARTBEAT > $HEARTBEATFILE             # put heartbeat timestamp into heartbeat file
    rm -f $LOCK1                              # remove our lockfile so the next program can start
                                                 
    unset TIMESTAMP                              # clear timestamp so we will see the right time for the next log entry
    sleep 180                                    # wait another 180 seconds and start with the whole process again
    TIMESTAMP="`date +%H:%M:%S`"                 # new timestamp to add to our log
    echo "WAITED 180 SECONDS FOR NEXT MOVE" | sed -e 's/^/'"$TIMESTAMP"' /g' >> $LOGFILE # add a message to the logfile
    unset TIMESTAMP                              # clear timestamp so we will see the right time for the next log entry
    unset HEARTBEAT                              # clear heartbeat timestamp
    unset LOGFILE                                # clear logfile name so this program keeps logging to the most current logfile
 done

count=0                                  # reset the counter before each wait so an old count never carries over
while [ -f $LOCK2 ]                      # while Others are busy
do
  if [ $count -eq 12 ] ; then            # if we waited 120 seconds
    TIMESTAMP="$(date +%H:%M:%S)"        # new timestamp to add to our log

    echo "Lockfile present" | .......
    echo "$LOCK2 PRESENT, WAITED 120 SECONDS" | .......

    count=0
  fi
  sleep 10                               # wait for 10 seconds and retry
  count=$((count + 1))                   # increment counter by one
done
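
The loop above could also be wrapped in a function so the counter is reset on every call. This is only a rough sketch: wait_while_locked and notify_lock_stuck are made-up names, and the notify hook just prints a line where the real script would call mailx and append to $LOGFILE.

```shell
#!/bin/sh
# Hypothetical hook: the real script would send the mail and write the log here.
notify_lock_stuck() {            # usage: notify_lock_stuck FILE SECONDS
    echo "$(date +%H:%M:%S) $1 PRESENT, WAITED $2 SECONDS"
}

# Poll every INTERVAL seconds while FILE exists; call the hook after
# every TRIES consecutive polls, then start counting again.
wait_while_locked() {            # usage: wait_while_locked FILE INTERVAL TRIES
    lockfile=$1 interval=$2 tries=$3
    count=0                      # reset on every call, so earlier waits never carry over
    while [ -f "$lockfile" ]; do
        if [ "$count" -eq "$tries" ]; then
            notify_lock_stuck "$lockfile" $((interval * tries))
            count=0
        fi
        sleep "$interval"
        count=$((count + 1))
    done
}

# Intended call inside the main loop (10 seconds x 12 tries = 120 seconds):
# wait_while_locked "$LOCK2" 10 12
```

Because count is set to 0 at the top of the function, a lockfile that vanished after 50 seconds no longer contributes to the next wait.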

Hello,
The common atomic operation on a filesystem is mv.
This code:
"while [ -f $LOCK2 ]" will eventually fail some day under heavy load with concurrent access.

Correct locking should look like this:

while :; do
   mv "$COMMON" "$PRIVATE"
   if [ $? -eq 0 ]; then
      # critical section here:
      # do the work, then release the lock
      mv "$PRIVATE" "$COMMON"
   else
      # bad luck: someone else holds the lock, so wait and retry
      sleep 120
   fi
done

This code will work on almost any combination of filesystems (NFS, AFS, GFS, OCFS, etc.).
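
For anyone who wants to try the idea in isolation, here is a minimal sketch of the same mv-based scheme. It assumes the shared token file is created once before anyone tries to lock, and that COMMON and PRIVATE sit on the same filesystem so that mv is a plain rename. The helper names, the temp directory, and the bounded retry count are made up for the example.

```shell
#!/bin/sh
# Token-file locking by rename. COMMON and PRIVATE follow the post above;
# the helper names and the temp directory are hypothetical.

acquire_lock() {                 # usage: acquire_lock COMMON PRIVATE
    mv "$1" "$2" 2>/dev/null     # succeeds only for whoever still finds the token
}

release_lock() {                 # usage: release_lock COMMON PRIVATE
    mv "$2" "$1"                 # put the token back for the next process
}

dir=$(mktemp -d)
COMMON=$dir/token                # shared token, created once before any locking
PRIVATE=$dir/token.$$            # per-process name on the same filesystem
: > "$COMMON"

tries=0
result=busy
while [ "$tries" -lt 3 ]; do     # bounded retries instead of an endless loop
    if acquire_lock "$COMMON" "$PRIVATE"; then
        result=done              # the critical section would go here
        release_lock "$COMMON" "$PRIVATE"
        break
    fi
    tries=$((tries + 1))
    sleep 1                      # the post above waits 120 seconds here
done
echo "$result after $tries failed attempts"
```

A second process calling acquire_lock while the token is moved away simply gets a failing mv, which is what makes the test and the grab a single step.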

Scott: thank you, this resolved my issue.

rrstone: I don't understand your reply.

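My claim is:
Checking for file existence is open to a race condition. Between the moment you test for the lockfile and the moment you create your own, another process can run the same test and also conclude that it may proceed.

My code provides a solution for this race condition.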
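
To make the race concrete, here is a small deterministic sketch. It replays, in one script, what two workers A and B would do if both ran the existence check before either created the file. The second half uses mkdir as a stand-in for an atomic test-and-create step; that is a swapped-in technique for illustration, not the mv code from my earlier post. All paths are hypothetical.

```shell
#!/bin/sh
# Deterministic illustration of the check-then-create race.
# Everything lives in a throwaway temp directory.
dir=$(mktemp -d)
lock=$dir/lock.txt

# Both workers run the existence check before either one creates the file.
a_may_enter=no; b_may_enter=no
[ -f "$lock" ] || a_may_enter=yes
[ -f "$lock" ] || b_may_enter=yes

# Each worker now writes "its" lockfile; the second silently overwrites the first.
echo A > "$lock"
echo B > "$lock"
echo "check-then-create: A=$a_may_enter B=$b_may_enter"

# mkdir tests and creates in one step, so only one caller can succeed.
if mkdir "$dir/lock.d" 2>/dev/null; then a_owns=yes; else a_owns=no; fi
if mkdir "$dir/lock.d" 2>/dev/null; then b_owns=yes; else b_owns=no; fi
echo "mkdir: A=$a_owns B=$b_owns"
```

The first line printed shows both workers allowed in (A=yes B=yes), while the mkdir variant lets only the first caller through (A=yes B=no).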

And this is already provided by a utility called flock.
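
A rough sketch of what that could look like, assuming the flock command from util-linux is installed (it is not available on every Unix). The lock path is made up for the example, and the 120-second timeout mirrors the wait asked for in the original question.

```shell
#!/bin/sh
# Hypothetical lock path; any file on a local filesystem would do.
lock=${TMPDIR:-/tmp}/copy_demo.$$.lock

(
    # Wait up to 120 seconds for an exclusive lock on file descriptor 9.
    if flock -w 120 9; then
        echo "lock acquired"
        # the critical section would go here
    else
        echo "could not get the lock within 120 seconds" >&2
        exit 1
    fi
) 9>"$lock"
# The lock is released when the subshell exits and descriptor 9 is closed.
```

The kernel drops the lock when the holder exits, so a crashed script does not leave a stale lock behind the way a leftover lockfile does.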

flock on NFS?
You should be very careful with flock on NFS.
Bad things happen when you mix different Unixes and rely on flock.