Check to see if a file is generating

Hi guys,

I am pulling my hair out here. I have a file that comes in, and once it finishes I want to move it to a new location. This all sounds very easy, but my solution is failing: it moves the file before it has finished generating.

isrun=`ps -ef | grep -i filename | grep -v grep | wc -l`

while [ $isrun -gt 0 ]
do
sleep 60
done

When I test this on the command line it works fine: while the file is generating, the result is greater than 0. But I just ran it and it grabbed the file before it finished generating. Can someone explain what I am doing wrong and/or give me a simple solution? I'm sure there may be an easier way (maybe using fuser), but it drives me insane when everything tells me it should work and it fails to. I am self-taught on UNIX so I'm still learning; any help would be appreciated.

Is the file being generated by your account, i.e., does the process run under your username?
If this is Linux, you can use the /proc filesystem to see if there exists a file descriptor open for that file. When the descriptor goes away, the file is done generating.

You can also use inotify, lsof, or fuser to see if that process has the file open. Depends on the OS.

Please identify the OS and shell you are using. Then we can give you exact help.
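As for why the posted loop misbehaves: isrun is filled in once, before the loop starts, and nothing inside the loop runs the ps pipeline again, so the test keeps looking at a stale number. Also, ps only shows command lines, so the grep only matches if the file name happens to appear in the writer's arguments. A rough sketch of a loop that repeats the check on every pass is below. The helper names is_running and wait_while are made up for illustration, and a shell with function support is assumed.

```shell
#!/bin/sh
# Sketch only: is_running and wait_while are made-up helper names.

# True while some process command line matches the pattern in $1.
# Note that ps shows command lines, not the files a process has open.
is_running() {
  count=`ps -ef | grep -i "$1" | grep -v grep | wc -l`
  [ "$count" -gt 0 ]
}

# wait_while SECONDS COMMAND [ARGS...]
# Runs COMMAND again on every pass and sleeps while it keeps succeeding.
wait_while() {
  interval=$1
  shift
  while "$@"
  do
    sleep "$interval"
  done
}

# Example call, commented out so nothing blocks when this is sourced:
# wait_while 60 is_running filename
```

The point is only that the condition is evaluated again each time round; whether ps is the right thing to test is a separate question.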

First you need to precisely define "done". Then how do you handle error conditions?

And you need to take into account the only entity that knows for certain that the file is "done" - and correct - is whatever is writing it.

Even checking if a process has the file open isn't going to tell you the file is "done" with certainty - that ignores error conditions, such as when a network connection fails.

Thanks for the replies. Jim, the system I am using is Solaris, with very strict restrictions on plugins. So I think fuser may be the way to go? Could you suggest a way to use that?
Also achenle, thanks for your input. Good points made. I will have to consider the error handling when I get past this little snag.

---------- Post updated at 10:39 PM ---------- Previous update was at 10:30 PM ----------

Just had another thought. Could I use cksum in some way?
Run cksum on the file, grab the checksum using awk '{print $1}', then run it again later and compare the two checksums? I'm not sure how to do that, but maybe it's an approach?
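For what it's worth, the compare-two-checksums idea could look roughly like the sketch below. is_stable is a made-up helper name, and the pause length is something you would have to pick yourself. Reading the file on stdin makes cksum print only the checksum and the byte count, so no awk is needed. Bear in mind that a writer that stalls for longer than the pause will look "finished" to this test.

```shell
#!/bin/sh
# Sketch only: is_stable is a made-up helper name.

# is_stable FILE SECONDS
# Returns 0 if the checksum and size of FILE are unchanged after SECONDS,
# 1 if they changed, 2 if the file could not be read.
is_stable() {
  before=`cksum < "$1"` || return 2
  sleep "$2"
  after=`cksum < "$1"` || return 2
  [ "$before" = "$after" ]
}

# Example call, commented out so nothing blocks when this is sourced:
# until is_stable filename 60; do :; done
```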

The easiest way to know that a file is complete is to also write a marker file, so if you are sending it with ftp or similar, you would do the following on the sending side:-

cd $target_dir
put $file
put /dev/null $file.OK

On the receiving server, you write your script to look for $file.OK and then you know you have the file.

Of course, you should put an error check after you think you have sent the data file and before writing the flag file, otherwise the receiving server will just grab it.
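On the receiving side, the "look for $file.OK" part might be sketched like this. wait_for_flag is a made-up name, the retry limit is my own addition so the script cannot wait forever, and a POSIX shell is assumed for the $(( )) arithmetic (ksh, bash, or /usr/xpg4/bin/sh on Solaris).

```shell
#!/bin/sh
# Sketch only: wait_for_flag is a made-up helper name.

# wait_for_flag FILE SECONDS TRIES
# Returns 0 once both FILE and FILE.OK exist, 1 if TRIES checks pass without that.
wait_for_flag() {
  tries=$3
  while [ "$tries" -gt 0 ]
  do
    [ -f "$1.OK" ] && [ -f "$1" ] && return 0
    tries=$((tries - 1))
    sleep "$2"
  done
  return 1
}

# Example call, commented out so nothing blocks when this is sourced:
# wait_for_flag "$file" 60 30 && mv "$file" "$target_dir"
```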

If you want to use a checksum, you could use this as content to the flag file, assuming that the sender and the receiver are using the same check-sum algorithm, i.e. the output from sum is very different to cksum and it depends what each side has.

So, on the sender, something like:-

cksum $file >$file.OK
ftp ......... # whatever you usually put here
   cd
   put $file
   put $file.OK
   quit

And on the receiver:-

cksum $file > $file.cksum
diff -q $file.OK $file.cksum
if [ $? -ne 0 ]
then
   echo "Transfer error"
   exit
fi

Of course, you could use md5sum to make this neater if you wish, with:-

md5sum $file>$file.OK
ftp .......etc.

...and...

if [ ! -z "`md5sum --quiet -c $file.OK 2>&1`" ]
then
   echo "Transfer error"
fi

Does this logic help?

Robin

You didn't reply to jim's first question: "Is the file being generated by your account, i.e., does the process run under your username?"

If yes, you need to implement this the correct way, i.e. only move the complete file to the final directory. That means using a temp location while the file is being written and, once it has finished, moving it to the final location.
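If you do control the writer, a minimal sketch of that pattern is below. publish_atomically is a made-up name and the .tmp.$$ suffix is just an example. The temp file sits in the same directory as the final name, so the mv is a plain rename and a reader never sees a half-written file under the final name.

```shell
#!/bin/sh
# Sketch only: publish_atomically is a made-up helper name.

# publish_atomically FINAL_PATH
# Copies stdin into a temp file next to FINAL_PATH, then renames it into place.
publish_atomically() {
  tmp="$1.tmp.$$"
  cat > "$tmp" || { rm -f "$tmp"; return 1; }
  mv "$tmp" "$1"
}

# Example call (generate_report is a placeholder for whatever writes the data):
# generate_report | publish_atomically /final/dir/report.dat
```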

Thanks, great replies. Alas, the file is sent from a 3rd party and all I do with it is wait till it's complete and send it on to someone else. This is a manual task, so I thought I would automate it. I am surprised at the problems I ran into. In my head I am thinking "easy". Got to love unix and how you can do 1 thing several ways.

  1. is the file there
  2. has it finished generating
  3. send the file.

Why can't I think of an elegant way to do that?
Thanks guys for taking the time to help a self-taught noob.

As rbatte1 posted, you have to design the "done" part into the way you transmit the file.

Without that, there is no elegant way to do it.

As said before, fuser is appropriate here.
Much better than comparing times with ls -l (which is less overhead than comparing the contents with cksum).

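file=filename
while :
do
  # fuser writes the PIDs that have the file open to stdout
  pid=`fuser "$file" 2>/dev/null`
  # no PIDs means nothing has the file open any more
  [ -z "$pid" ] && break
  echo "$file is in use by process"
  ps -p "$pid"
  sleep 60
done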
echo "$file is completed."
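
Putting the three steps from earlier in the thread together (is the file there, has it finished, send it), a rough sketch could look like the following. send_when_closed is a made-up name, and the final mv stands in for whatever "send" really means in your setup. As already pointed out above, "nothing has the file open" is not quite the same thing as "the file is complete and correct", so treat this as a starting point.

```shell
#!/bin/sh
# Sketch only: send_when_closed is a made-up helper name.

# send_when_closed FILE DEST SECONDS
send_when_closed() {
  [ -f "$1" ] || return 1            # 1. is the file there?
  while :
  do
    pids=`fuser "$1" 2>/dev/null`
    [ -z "$pids" ] && break          # 2. nothing has it open any more
    sleep "$3"
  done
  mv "$1" "$2"                       # 3. pass it on
}

# Example call, commented out so nothing blocks when this is sourced:
# send_when_closed filename /new/location 60
```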