Help with retrieving files via SFTP

I have a process that generates a file and places it on my sftp server.

Then I connect to the sftp to retrieve the file. However for some reason, I keep getting an incomplete file.

These are the steps.

  1. Submit a request file to my sftp server to start generating the output file
  2. That file is picked up and processed on a backend server, and the output is placed back on the sftp server
  3. Meanwhile my script runs frequently to check whether the output file is present
  4. Once the file is present, the script picks it up

For some reason my process is picking up an incomplete file. It's as if it picks up the file while the output is still being transferred from the backend server to the sftp server.

How do you make sure the creating process is finished generating the file?

I can view that via a UI that is set up to monitor it.

Let's take for granted that sftp works flawlessly, so the problem must be somewhere else. If you are sure the file is completely created, what differences do you find? Lines missing? At the end? Something else? Please post.

And - "present" does NOT mean "complete", "finished"!
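One way to approximate "finished" from the retrieving side is to wait until the file size stops changing. This is only a rough sketch and a heuristic, not a guarantee: a stalled upload would look stable too. It polls a local path with os.stat; over sftp you would call whatever stat function your client library offers instead. The function name and the interval, required_checks and timeout parameters are made up for illustration.

```python
import os
import time


def wait_until_stable(path, interval=1.0, required_checks=3, timeout=60.0):
    """Return True once the size of `path` was identical on
    `required_checks` consecutive comparisons, False on timeout."""
    deadline = time.monotonic() + timeout
    last_size = None
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.stat(path).st_size
        except FileNotFoundError:
            size = None  # file is not present yet
        if size is not None and size == last_size:
            stable += 1
            if stable >= required_checks:
                return True
        else:
            stable = 0  # size changed (or file missing): start counting again
        last_size = size
        time.sleep(interval)
    return False
```

Only start the download once this returns True, and pick an interval longer than any pause you expect in the backend's upload.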

The generating server could upload the file with a temporary filename and then, after the upload is complete, rename it to the final name:

sftp user@server.example.com
    put output output.tmp
    rename output.tmp output
    ....

This way, whenever your script finds a file named output, you know that the upload is complete.
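The same write-then-rename idea, sketched on a local filesystem so it can be tried without a server. The names publish and ready_files and the .tmp suffix are assumptions for illustration; on the sftp side the equivalent is the put/rename pair shown above.

```python
import os


def publish(data, final_path):
    """Write to a temporary name first, then rename to the final name."""
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before renaming
    # On POSIX the rename is atomic when both names are on the same filesystem.
    os.replace(tmp_path, final_path)


def ready_files(directory):
    """List finished files only, skipping anything still named *.tmp."""
    return sorted(n for n in os.listdir(directory) if not n.endswith(".tmp"))
```

The retrieving script then never needs to guess: it ignores *.tmp and fetches whatever else is there.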

Sure.

My incomplete file gets truncated.

Only half the data is retrieved, whereas on the sftp server the entire file is present. I know that the entire file is present because I compare the timestamp of the completed file on the sftp server with the timestamp of my script's pull. They match.

The data is missing towards the end of the file. There are some garbage characters.
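Matching timestamps only show when the file was last touched, not that the bytes are the same. A small sketch for checking the local copy by size and SHA-256 instead, assuming you can get the expected size and checksum for the server copy from somewhere (for example, the generating process could write them out). The function names are hypothetical.

```python
import hashlib
import os


def size_and_sha256(path, chunk_size=65536):
    """Return (size in bytes, SHA-256 hex digest) of a local file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def is_complete(local_path, expected_size, expected_sha256):
    """True only if the local copy matches both the expected size and digest."""
    return size_and_sha256(local_path) == (expected_size, expected_sha256)
```

A truncated pull fails the size check right away, and garbage in the middle or at the end shows up as a digest mismatch.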

Have the process that puts the file on the sftp server put two files, the .dat first and the .job last:

put april_2_0900.dat
put april_2_0900.job

On the retrieving machine, get the *.job files first.
Then, from the job files retrieved, create a job to get the specific list of .dat files.
When the .dat files have been successfully processed, delete those .job files on the server.
The .job file can contain any minimal amount of data, for example:

ls -l april_2_0900.dat >april_2_0900.job
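If the .job file holds the ls -l line as above, the retrieving side can also use it to check the size of the .dat it pulled. A rough sketch, assuming the usual nine-field ls -l layout (permissions, links, owner, group, size, month, day, time, name) and no spaces in the owner, group or filename; the function names are made up.

```python
import os


def parse_job(job_text):
    """Extract (filename, size in bytes) from a single `ls -l` line."""
    fields = job_text.split()
    if len(fields) < 9:
        raise ValueError("unexpected ls -l format: %r" % job_text)
    # Field 5 (index 4) is the size; the last field is the filename.
    return fields[-1], int(fields[4])


def dat_matches_job(job_text, directory):
    """True if the retrieved .dat exists in `directory` with the size the job file promised."""
    name, expected_size = parse_job(job_text)
    path = os.path.join(directory, name)
    return os.path.isfile(path) and os.path.getsize(path) == expected_size
```

Only delete the .job file on the server once this check passes for its .dat file; otherwise leave it there and retry on the next run.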