Script to check for existence of a file (or else sleep for x time)

Hi Forum.

I have a script that accepts 3 input parameters (source directory, list file, sleep time) and checks for the presence of files. If the files are not there, the script sleeps for the amount of time given by the 3rd parameter.

The list file currently contains 1 entry (a file prefix) but may contain more:
MF_FF

#!/usr/bin/ksh

# Setup working variables
source_directory="$1"
src_list_files="$2"
sleep_time=$3
UPLD_DIR_BASE=/data/data2/staging

while read line
do
   source_file=`echo $line | awk '{print $1}'`
   while [[ ! -e ${UPLD_DIR_BASE}/${source_directory}/${source_file}*[0-9]*.[dD][aA][tT]* ]]
   do
      echo "sleeping..."
      sleep ${sleep_time}
   done
 
done < ${UPLDSH_DIR}/${src_list_files}
Input:
$ pwd
/data/data2/staging/mf
$ ls MF*
MF_FF_20110228.dat

Output:
$ edw_check_source_files.sh mf MF_load_files.txt 55
sleeping...
sleeping...
.....

I cannot understand why my script is going to sleep even though the file exists.

Can anyone help me with my code and suggest ways to improve it?

Thanks Forum.

Globbing like that will also cause it to throw an error if there's more than one file that matches.

You can eliminate the use of awk by using read's built-in argument splitting features.
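As a minimal sketch of that splitting: read puts the first whitespace-separated field into the first variable and the rest of the line into the last one. The variable names and the sample line are illustrative assumptions, not the real list file.

```shell
# Feed read one sample line; the here-document stands in for the list file.
while read -r prefix rest
do
    # Copy the fields out, because the final read that hits end-of-file
    # may clear both variables before the loop exits.
    saved_prefix=$prefix
    saved_rest=$rest
    echo "prefix=[$prefix] rest=[$rest]"
done <<'EOF'
MF_FF 1 extra
EOF
```

Here the first field ends up in prefix and everything after it, spaces included, in rest, so no awk call is needed to pick out the first column.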

Without knowing the contents of that file I can't tell why it's not matching those files, but echoing the string when it doesn't will at least tell you what it is actually looking for.
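One possible explanation, which the thread never confirms: ksh and bash do not perform filename generation on words inside [[ ]], so the -e test would be looking for a file whose name literally contains the * and [ characters. The sketch below compares the two behaviours in a scratch directory; the directory and the use of mktemp are assumptions made for the demonstration, and it needs ksh or bash because of the [[ ]] syntax.

```shell
# Hypothetical scratch directory and file, used only for this demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir/MF_FF_20110228.dat"

# Inside [[ ]] the pattern is not expanded, so -e tests for a file
# literally named "MF_FF*[0-9]*.[dD][aA][tT]*".
if [[ -e $demo_dir/MF_FF*[0-9]*.[dD][aA][tT]* ]]; then
    inside_brackets="found"
else
    inside_brackets="not found"
fi

# Expanded as ordinary command arguments, the same pattern matches the file.
set -- "$demo_dir"/MF_FF*[0-9]*.[dD][aA][tT]*
if [ -e "$1" ]; then
    as_arguments="found"
else
    as_arguments="not found"
fi

echo "inside [[ ]]: $inside_brackets"
echo "as arguments: $as_arguments"
rm -rf "$demo_dir"
```

If this is what is happening, expanding the pattern first and then testing one of the resulting names avoids the problem.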

#!/usr/bin/ksh

# Setup working variables
source_directory="$1"
src_list_files="$2"
sleep_time=$3
UPLD_DIR_BASE=/data/data2/staging

# 'read' is capable of splitting by itself.
# the first field goes into source_file, everything else into G.
while read source_file G
do
        # Shove it all into an array.  If there's more than one that's not an error.
        ARR=( "${UPLD_DIR_BASE}/${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]* ]] )

        while [[ -z "${ARR[0]}" ]]
        do
                echo "No match for ${UPLD_DIR_BASE}/${source_directory}/${source_file}*[0-9]*.[dD][aA][tT]*"
                sleep 10
                ARR=( "${UPLD_DIR_BASE}/${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]* ]] )

        done
done < "${UPLDSH_DIR}/${src_list_files}"

Thank you Corona688.

I will give your code a try.

---------- Post updated at 12:44 PM ---------- Previous update was at 12:32 PM ----------

Hi Corona688.

I'm getting the following error when executing the script:

0403-057 Syntax error at line 29 : `(' is not expected.

This is the line in question:

ARR=( "${UPLD_DIR_BASE}/${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]* ]] )

Please help.

Can you also explain why, after the sleep command, we capture the entries into the ARR array again?

Thanks.

I accidentally left ]] in. Remove it.

The array is filled again after the sleep to see if the file was created yet. No point checking the same data every loop!

Thanks for your quick reply.

After I removed the ']]' from the ARR syntax, I'm still getting the same error -

0403-057 Syntax error at line 29 : `(' is not expected.

See my code below:

# Setup working variables
source_directory="$1"
src_list_files="$2"
sleep_time=$3
 
while read source_file G
do
 
  # Insert all entries into an array.  If there's more than one that's not an error.
  ARR=( "${UPLD_DIR_BASE}${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]* )
 
  while [[ -z "${ARR[0]}" ]]
  do
     echo "No match for ${UPLD_DIR_BASE}${source_directory}/${source_file}*[0-9]*.[dD][aA][tT]*"
 
     # Sleep for x amount of time
     sleep $sleep_time
 
     ARR=( "${UPLD_DIR_BASE}${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]* )

  done
 
done < "${UPLDSH_DIR}/${src_list_files}"

Also, I wasn't clear in my first post. Basically, the script should check for the existence of the file(s) and, if they are not there, go to sleep for x mins.

After x mins, the script should terminate.
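One way to read that requirement is a bounded polling loop: check, sleep, check again, and give up after a fixed number of attempts. The sketch below is only an illustration of that idea. The helper name wait_for_file, its three arguments, and the short timings are all assumptions, not part of the original script.

```shell
# wait_for_file PATTERN MAX_TRIES INTERVAL  (hypothetical helper, not from the thread)
# Returns 0 as soon as PATTERN matches an existing file, 1 after MAX_TRIES checks.
wait_for_file() {
    pattern=$1
    max_tries=$2
    interval=$3
    tries=0
    while [ "$tries" -lt "$max_tries" ]
    do
        # Left unquoted so the shell re-expands the pattern on every pass.
        # This assumes the path contains no whitespace.
        set -- $pattern
        if [ -e "$1" ]; then
            echo "found: $1"
            return 0
        fi
        tries=$((tries + 1))
        echo "no match yet for $pattern (check $tries of $max_tries)"
        sleep "$interval"
    done
    return 1
}

# Demonstration in a scratch directory with short, made-up timings.
demo_dir=$(mktemp -d)
if wait_for_file "$demo_dir/MF_FF*[0-9]*.[dD][aA][tT]*" 2 1; then
    result_missing="found"
else
    result_missing="timed out"
fi

touch "$demo_dir/MF_FF_20110228.dat"
if wait_for_file "$demo_dir/MF_FF*[0-9]*.[dD][aA][tT]*" 2 1; then
    result_present="found"
else
    result_present="timed out"
fi
echo "before file arrives: $result_missing; after: $result_present"
rm -rf "$demo_dir"
```

A caller could exit with a non-zero status when the helper returns 1, so that the downstream process knows the file never arrived.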

Thanks for helping out a fellow Canadian/Torontonian.

Could you please post your file MF_load_files.txt, i.e. $2?

Here you go Pravin27 - thanks

Only 1 entry in the file with 2 columns. First column represents a file prefix (as we get filename with different dates on a daily basis).

The intent of the script: sometimes we don't get files on time, so the script is supposed to delay our process a bit if the file(s) arrive late.

$ more MF_load_files.txt
MF_FF 1

you forgot the line setting up the UPLD_DIR_BASE

UPLD_DIR_BASE=/data/data2/staging

sorry ctsgnb.

I didn't include the first few lines of the script since it's setting a bunch of environment variables.

#!/usr/bin/ksh

# Setup environment variables
. /data/informatica/ming/Scripts/EDW_ENV_PARM.sh

The script is complaining about the array syntax, not the missing path.

thanks.

You may have an old version of ksh that doesn't support the () array syntax... Try this for size:

set -A ARR "${UPLD_DIR_BASE}${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]*

or:

set -- "${UPLD_DIR_BASE}${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]*
for i in "$@"
do
...
done

... though some limits on the number of arguments may apply.

Using set -- has the side effect of destroying your command-line parameters. That is okay since you saved all the ones you cared about already, but just so you know.

Also, with set -- you'd need to test if [ -f "$1" ], not if [ -f "${ARR[0]}" ], because the matches land in the positional parameters rather than in ARR.

---------- Post updated at 11:56 AM ---------- Previous update was at 11:52 AM ----------

The same limitation may exist for arrays anyway. Since we're only checking if the first element exists it's probably immaterial.
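To illustrate the set -- variant: after set --, the first match is in "$1" and ARR is never filled. By default a pattern that matches nothing is left as the literal pattern text, so a file test on the first word is a more reliable check than looking for an empty string. The scratch directory and file names below are made up for the demonstration.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/MF_FF_20110228.dat"

# Matching pattern: the positional parameters now hold the expanded paths.
set -- "$demo_dir"/MF_FF*[0-9]*.[dD][aA][tT]*
if [ -f "$1" ]; then matched="yes"; else matched="no"; fi
echo "MF_FF: first=${1##*/} exists=$matched"

# Non-matching pattern: $1 is the unexpanded pattern, not an empty string.
set -- "$demo_dir"/NO_SUCH*[0-9]*.[dD][aA][tT]*
if [ -n "$1" ]; then non_empty="yes"; else non_empty="no"; fi
if [ -f "$1" ]; then unmatched="yes"; else unmatched="no"; fi
echo "NO_SUCH: non-empty=$non_empty exists=$unmatched"
rm -rf "$demo_dir"
```

The second case shows why a -z test on the first word would never report a missing file: the word is non-empty even though nothing matched.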

Try this; it works in bash:

while [[ ! -e "${ARR[0]}" ]]

Excellent guys - it's working now:

set -A ARR "${UPLD_DIR_BASE}${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]*

I'm busy with other things right now so I will test out the script later.

I might need your help later on.

---------- Post updated at 04:43 PM ---------- Previous update was at 02:01 PM ----------

sorry - I don't get why I need to check for

if [ -f "$1" ] instead of if [ -f "${ARR[0]}" ]

what's $1? Don't I need to check if the array is empty?

Also if there are 2 sets of files "MF_FF_201011.dat" and "MF_FF_201012.dat", what will the array contain with this statement?

set -A ARR "${UPLD_DIR_BASE}${source_directory}/${source_file}"*[0-9]*.[dD][aA][tT]*

thanks.
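The thread leaves this last question open. Since the shell expands the pattern before set -A sees it, ARR should receive one element per matching file, in sorted order, so ARR[0] would be the path to MF_FF_201011.dat and ARR[1] the path to MF_FF_201012.dat. The sketch below shows the same expansion using the positional parameters as a stand-in for ARR (an assumption made so it also runs in shells without set -A); the scratch directory is made up.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/MF_FF_201012.dat" "$demo_dir/MF_FF_201011.dat"

# Positional parameters stand in for ARR here; the glob expansion that
# produces the list is the same one set -A would receive.
set -- "$demo_dir"/MF_FF*[0-9]*.[dD][aA][tT]*
match_count=$#
names=""
for path in "$@"
do
    # Strip the directory part so only the file name is collected.
    names="$names ${path##*/}"
done
echo "matches: $match_count ->$names"
rm -rf "$demo_dir"
```

With set --, "$1" plays the role that "${ARR[0]}" plays with set -A, which is why the test changes from one to the other depending on which form is used.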