Ensure file copy is complete before starting process

Hi experts,

I have a requirement wherein a user uploads a file to the Landing directory on one of our Linux servers. A cron job is scheduled to run every 5 minutes; it picks up the files from the source (Landing) dir, copies them to the target dir, and, once a file has been copied successfully, removes it from the source dir. To ensure that files which are still being uploaded when the cron script starts aren't deleted, I first create a file list, and the copy command runs in a loop, copying every file in the list to the target. Once the copy command has exited successfully for all the files, I capture the byte count of each file in both the source and target dirs, and if the sizes match, I delete the source file.

The issue I face is with large files. I tried to upload a file called abc (a binary file) to the server using WinSCP, and I noticed that until the file is completely uploaded to the source dir, it is named "abc.filepart". So, if the file is being uploaded when the cron job starts, the script will record the filename as "abc.filepart" and copy its contents to the target dir. But if the upload completes before the copy command finishes, the file name changes back to "abc". Consequently, the loop which compares the byte counts won't be able to locate "abc.filepart" in the source dir, and the script fails.

I could also try to ignore the "filepart" extension, but then I'm not sure which other extensions might be appended to files while they are being uploaded.

Can anyone give me a pointer on how to check that a file has been completely uploaded before the script processes it?

Regards,
Sriram

Use lsof, strace or a similar tool to check which processes are accessing the particular file. If no process has it open, the upload is complete; otherwise it is still being transmitted.

This way you don't have to keep file lists or anything of the sort.

The following is NOT a runnable script, just a sketch to demonstrate the logic:

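typeset fSrc="/path/to/sourcedir"
typeset fTgt="/path/to/targetdir"

while : ; do
     for src in "$fSrc"/* ; do
          [ -f "$src" ] || continue     # skip non-files and an unmatched glob
          file=${src##*/}
          # lsof exits 0 when at least one process has the file open
          if lsof "$src" >/dev/null 2>&1 ; then
               echo "file $file still loading, skipping it"
          else
               echo "file $file completed upload, moving it"
               mv "$src" "$fTgt/$file"
          fi
     done
     sleep 60     # pause between scans instead of spinning
done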

I hope this helps.

bakunin

How are you running WinSCP on Linux?

If your hosting system (where the web server is running) is Linux, I would suggest looking at inotify. A companion tool, incron, can then be used to set up a trigger that runs a script when a file is closed after writing (the IN_CLOSE_WRITE event). Alternatively, you can use a PHP program on the server side to handle the file upload; it would move or rename the file once the upload finishes, and your cron job would then pick up only the received files.
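
As a rough illustration of the event-driven approach, here is a sketch that uses inotifywait from the inotify-tools package rather than incron (the idea is the same, only the tool differs). The function names and the two directory arguments are hypothetical, and it assumes the uploader either writes the final name directly or renames "abc.filepart" to "abc" inside the watched directory.

```shell
#!/bin/sh
# Sketch: react to finished uploads instead of polling from cron.
# Assumes inotifywait (inotify-tools) is installed; names are illustrative.

# handle_finished SRC_DIR TGT_DIR NAME: move one completed file.
handle_finished() {
    case $3 in
        *.filepart) return 0 ;;    # WinSCP temporary name, not final yet
    esac
    [ -f "$1/$3" ] || return 0     # already gone or not a regular file
    mv "$1/$3" "$2/$3" && echo "moved $3"
}

# watch_landing SRC_DIR TGT_DIR: block forever, handling each event.
# close_write fires when a writer closes the file; moved_to catches the
# rename from abc.filepart to abc within the watched directory.
watch_landing() {
    inotifywait -m -q -e close_write -e moved_to --format '%f' "$1" |
    while IFS= read -r name; do
        handle_finished "$1" "$2" "$name"
    done
}
```

Running watch_landing /path/to/sourcedir /path/to/targetdir as a long-lived process would replace the 5-minute cron job. Skipping "*.filepart" here covers only the WinSCP case mentioned in the thread; other upload clients may use different temporary names.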

Here is a link that discusses the use of inotify:

Thanks All,

I got the solution using the lsof command.

Regards,
Sriram