Script to check for the existence of files inside a directory.

Hello folks,

In the script below, I want to add a condition: check the directory for files, and if none are found, keep checking. Once the directory has files, do the calculation and finish.

In other words: check for the existence of files inside a directory; if there are none, keep checking in a loop until files are found, then apply the rest of the logic and exit.

How can I achieve that?

Kindly guide.


#!/bin/bash
DATE=$(date +"%d-%m-%Y-%H:%M")
FLAG=FIRST
stream=IUCS
path="/bigpfstest/DPI_INVESTIG/IUCS"
for files in $(printf "%s\n" "$path"/* | tail -5)
do
    TTHEX=$(awk -F ',' 'END{print $4}' "$files")
    TIMESTAMP=$(date +'%H:%M:%S' -r "$files")
    TRANS_TIME=$(date -d @$(( $(printf "%d" 0x$TTHEX) / 1000 )) | awk '{print $4}')
    TIME_LAG=$(date +%H:%M:%S -ud @$(( $(date -u -d "$TIMESTAMP" +"%s") - $(date -u -d "$TRANS_TIME" +"%s") )))
    echo "${DATE} ${stream} ${FLAG} $(basename "$files") ${TIMESTAMP} ${TRANS_TIME} ${TIME_LAG}" >> IUCS_TEST.csv
done

You can wrap your for-loop in an infinite loop, and exit the script once you have done your calculation:

while true
do
    for ...
    do
       ...
       exit 0
    done
    sleep 123
done
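To make that concrete, here is one sketch of the wrapped loop as a function (the function name is made up, and I've added a retry cap so it can't spin forever; the echo stands in for the real calculation):

```shell
# Poll the directory given in $1 until at least one entry appears,
# "process" the first one, then stop. The cap and echo are placeholders.
poll_and_process() {
    local dir=$1 tries=0
    while true
    do
        for f in "$dir"/*
        do
            [ -e "$f" ] || continue    # unmatched glob: keep waiting
            echo "processing $f"
            return 0                   # work done: leave both loops
        done
        tries=$((tries + 1))
        [ "$tries" -ge 5 ] && return 1 # safety cap instead of spinning forever
        sleep 1                        # nothing yet; pause and re-check
    done
}
```

In a standalone script the `return 0` would be the `exit 0` shown above.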

This might be neater as a while or until loop:-

while [ ! -f $path/* ]
do
    sleep 5              # 5 second pause, or whatever else you want to do
done

# Rest of script here

... or ...

until [ -f $path/* ]
do
    sleep 5              # 5 second pause, or whatever else you want to do
done

# Rest of script here

It depends which best suits the flow of your logic when reading it.

Does that help?

Robin


It is of course a matter of taste. Personally I don't like your solution so much, because it globs for the files twice (once in the while-loop and once in the for-loop), which makes maintenance harder: if the requirement changes and you have to look for files matching, say, *.log instead of *, you have to remember to make the change in two places.

Of course if we start thinking in this way, there is much more which could be improved in this script .....
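To illustrate that point, one sketch (the function names are made up) keeps the pattern in a single helper that both the wait-loop and the processing loop call:

```shell
path="/bigpfstest/DPI_INVESTIG/IUCS"   # directory from the script above

# Single source of truth for the file pattern: change it here, and only here.
matching_files() {
    local f
    for f in "$path"/*
    do
        [ -e "$f" ] && printf '%s\n' "$f"
    done
}

# Both loops go through the helper, so the pattern exists in one place.
wait_for_files() {
    until [ -n "$(matching_files)" ]
    do
        sleep 5
    done
}

process_last_five() {
    local file
    for file in $(matching_files | tail -5)
    do
        echo "would process $file"
    done
}
```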


^^^ he didn't take my suggestions either in the other thread he started.


Some people are more choosy than others! :smiley:


@rbatte1: while your proposal will work fine in an empty directory or with one single matching file, it will fail should there be more than one match. That is not too far-fetched given what the requestor posted in post#1, so additional measures should be taken to avoid the error condition.
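For completeness, a glob-safe variant of the test (a sketch; `any_exists` is a made-up name) lets the shell expand the pattern into a function's arguments, so zero, one, or many matches all behave:

```shell
# Succeeds if at least one argument names an existing file. The shell
# expands the glob into the argument list before the function runs, so
# multiple matches cannot break the test expression.
any_exists() {
    [ -e "$1" ]   # an unmatched glob leaves the literal pattern, and -e fails
}

# Usage in the wait-loop would be:
#   until any_exists "$path"/*.RTM
#   do
#       sleep 5
#   done
```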



#!/bin/sh
DATE=`date +"%d-%m-%Y-%H:%M:%S"`
stream=S1
FLAG=FIRST
path=/bigpfstest/INVESTIG/S12
CNT=0
while [[ $CNT -le 10 ]]
do
    if [[ -f $path/*.RTM ]]
    then
        for files in `printf "%s\n" $path/*.RTM | tail -3`
        do
            TT=`awk -F ',' 'NR==1{print $1}' $files`
            FILENAME=`ls $files | cut -d '/' -f5`
            TIMESTAMP=$(date +'%H:%M:%S' -r $files)
            TRANS_TIME=$(date -d @$(printf '%.0f\n' $TT) | awk '{print $4}')
            TIME_LAG=$(date +%H:%M:%S -ud @$((`expr $(date -u -d "$TIMESTAMP" +"%s") - $(date -u -d "$TRANS_TIME" +"%s")`)))
            if [ "${FILENAME}" == "" ]
            then
                echo "" >/dev/null
            else
                echo "${DATE} ${stream} ${FLAG} ${FILENAME} ${TIMESTAMP} ${TRANS_TIME} ${TIME_LAG}" >> /bigpfstest/DPI_INVESTIG/check.csv
            fi
        done
        ((CNT=100))
    else
        sleep 2s
        (( CNT++ ))
        echo "$CNT"
    fi
done


Kindly help: what can I change in it so that it catches the files, runs the for-loop to do the calculation, and exits?

If no files are found, it should at least keep iterating the counter, looking for files until the counter limit is reached.


Requirement:

Say the directory receives about 1000 files every 5 minutes, and they quickly move on to another place. I need help correcting my script so that it keeps checking the directory for files every 10 seconds, up to 20 times.

If it finds files inside the directory at any point within those 20 attempts, it should execute the for-loop part of my script and, with the if condition, echo the result.

Kindly guide.

#!/bin/sh
DATE=`date +"%d-%m-%Y-%H:%M:%S"`
stream=S1
FLAG=FIRST
path=/bigpfstest/INVESTIG/S1
CNT=0
while [[ $CNT -le 10 ]]
do
    if [[ ! -f $path/*.RTM ]]
    then
        for files in `printf "%s\n" $path/*.RTM | tail -3`
        do
            TT=`awk -F ',' 'NR==1{print $1}' $files`
            FILENAME=`ls $files | cut -d '/' -f5`
            TIMESTAMP=$(date +'%H:%M:%S' -r $files)
            TRANS_TIME=$(date -d @$(printf '%.0f\n' $TT) | awk '{print $4}')
            TIME_LAG=$(date +%H:%M:%S -ud @$((`expr $(date -u -d "$TIMESTAMP" +"%s") - $(date -u -d "$TRANS_TIME" +"%s")`)))
            if [ "${FILENAME}" == "" ]
            then
                echo "" >/dev/null
            else
                echo "${DATE} ${stream} ${FLAG} ${FILENAME} ${TIMESTAMP} ${TRANS_TIME} ${TIME_LAG}" >> /bigpfstest/DPI_INVESTIG/check.csv
            fi
        done
        ((CNT=100))
    else
        sleep 2s
        (( CNT++ ))
        echo "$CNT"
    fi
done
if [ -r job.pid ]
then
    echo $(cat job.pid) did not terminate
    mail -s "$(cat job.pid) did not terminate" admin
    exit
fi
echo $$ >job.pid
list=$(ls *.job)
if [ -n "$list" ]
then
    for file in $list
    do
        fuser $file
        if [ $? -eq 1 ]
        then
            echo $file $(date) >>job.log
            # your stuff
            mv $file done
        fi
    done
fi
echo $(cat job.pid) finished at $(date) >>job.log
rm job.pid

Generally I create a requests directory to hold the input files, and a done directory to hold the completed input files. If a file is in use, it is left for the next invocation. Create a cron job to run this as frequently as you need.
Add a line to one of your rc scripts to remove any job.pid files on system startup.
You need to stop the process that removes the files from the directory and replace it with this process; otherwise you will never be able to guarantee that you have processed every file. Also, with this many files being added, you will need to create a dedicated incoming directory and have a cron job delete and recreate it on a regular basis.
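The job.pid handling in the script above can be factored into a pair of small helpers (a sketch; the function names are made up):

```shell
# Refuse to start if a previous run's pid file is still present;
# otherwise record our own pid in it.
acquire_pidfile() {
    local pidfile=$1
    if [ -r "$pidfile" ]
    then
        echo "$(cat "$pidfile") did not terminate" >&2
        return 1          # caller should mail the admin and exit
    fi
    echo $$ > "$pidfile"
}

# Remove the pid file on normal completion (or from an rc script at boot).
release_pidfile() {
    rm -f "$1"
}
```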


Hey jgt,

I cannot hold the files in another directory, as per my requirement.

I want the last 5 files, and from every file only the first record. I also need the file creation time, so if I copy the files to another location, the creation time will change in the new directory.

Each file contains records; from the first record of each file I am selecting a column holding time data.

If you look at my script, the last three attributes echoed are the file creation time, the time inside the record, and the time lag, which I calculate from the file creation time and the time inside the first record of each file.

Kindly let me know if more clarification is required.

My for-loop is essential to my requirement, and I need a script where it fits: if the directory doesn't contain files, it should not enter the for-loop but keep looping until files are found. Only when files are found should it run the for-loop.

You have to explain how these files arrive. What process assigns the file names, and what process removes the files from the directory?
How large are these files?

The files land in my environment's directory from a destination server. A shell script delivers them to our landing directory, so files keep arriving every 5 minutes, and from there they move on to a different server within a short time frame.

Once a file has landed, it is moved to the other server.

Each file is roughly 256 KB.

My requirement is to catch the last 3 files, perform the for-loop of my script, echo the values, and exit.

If it cannot find any files, it should check again every 10 seconds. Once files are found, it should perform the rest of the logic, change the flag, and exit successfully.

So my requirement is now:

Check the directory for files.
If found, run the for-loop logic.
Else, sleep for a few seconds and check for files again, looping
for a certain number of counts, say 20, with 10 seconds of sleep per iteration.

If nothing is found within that count, the script should exit successfully.
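That requirement maps onto a bounded polling loop; here is one sketch (the function name and arguments are made up, with the count and pause passed in so they are easy to change):

```shell
# Check the directory ($1) up to $2 times, sleeping $3 seconds between
# checks, and succeed as soon as it holds at least one regular file.
wait_for_any_file() {
    local dir=$1 max=${2:-20} pause=${3:-10} try f
    for (( try = 1; try <= max; try++ ))
    do
        for f in "$dir"/*
        do
            [ -f "$f" ] && return 0   # found one: run the for-loop logic next
        done
        sleep "$pause"                # nothing yet; wait and re-check
    done
    return 1                          # limit reached without finding anything
}

# Usage: if wait_for_any_file /path/to/landing 20 10; then ...for-loop...; fi
```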

From the time that you start your process to find the last three files to the time that you actually try to read a file, how will you know that the file is still on your system?
Can you predict the file names?
My opinion is that without access to the code that either sends the files or removes them, you will not be able to accomplish your requirement with any degree of certainty.
You may be able to find a solution using a packet sniffer.

Consider that the files stay there for 10 minutes, and the number of files is large, around 1000, so I can catch them within that frame.

Manually I am able to do so, because I keep looking for files in that directory. But I don't want to babysit it.

Just let me know how I can check whether files exist in a directory, trying 20 times with a sleep of 10 seconds between checks. If it catches them within that frame, fine; else, exit.

I just need to catch them once in those 20 attempts. If I do, execute the rest of the code and exit.

Can I achieve that using a flag, while, until, or if? Any of these would work if it gets me there.

How do you know when all the files have been received? You could continually build the list of the last 3 files until the list no longer changes.

wait=0
cat /dev/null >previous
while true
do
    printf "%s\n" * | tail -3 >current
    diff current previous >/dev/null
    if [ $? -eq 0 ]
    then
        # files are the same
        let wait=wait+1
    else
        wait=0
        cp current previous
    fi
    if [ $wait -gt ?? ]   # how long do you want to wait after there are no changes in the list?
    then
        # no files have been received for wait time
        break
    fi
    sleep 5
done
# now do your stuff

Why do you need the "last 5 files" when that can change at any moment? Why not just a representative sample?

That line lists the files in alphabetical order, which may not be the order that is expected.
Even using "ls -tr" will not necessarily give the last few files if they arrive within a very short time frame.
Consider the output of the split command: 'ls' does not always produce the same list as 'ls -tr', even though the lists should be identical.
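A quick way to see the difference (a throwaway demonstration in a temp directory; the file names are chosen so alphabetical order and modification-time order disagree):

```shell
# Alphabetical order (the default glob / ls order) and modification-time
# order can differ; create two files whose name order inverts their age.
tmp=$(mktemp -d)
touch "$tmp/b_newer_name_older_time"
sleep 1                              # ensure distinct modification times
touch "$tmp/a_older_name_newer_time"

alpha=$(ls "$tmp" | head -1)         # alphabetical: the "a_" file comes first
bytime=$(ls -tr "$tmp" | head -1)    # oldest-first by mtime: "b_" comes first
echo "$alpha vs $bytime"
```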