Automated File Transfer Script

We receive data feed files in an SFTP location daily. The folder structure in the SFTP location is abc/def/studyname_1/outbound/zipped files
We will be getting different studies, and a folder is created for each study: abc/def/studyname_2/outbound/zipped files, abc/def/studyname_3/outbound/zipped files

The script needs to go into each study folder and then fetch the zipped files to another Windows location.

Here I would like to have a reference table in SQL that lists all the incoming zipped files and their SFTP locations. The script would then cross-check against the SQL table before picking up the files.

Or

We can fetch the zip filenames, insert them into the SQL table, and mark them as processed.

I understand we can use MGET to fetch all the files in an SFTP location. I am unsure how to track the zip files and how to go through each study and fetch its files.

if [ "$ftpmethod" = "SFTP_MGET_NOEXT" ]
then
/usr/bin/expect -f - <<EOFMGET 
spawn sftp $remoteusername@$remotehost
expect "password:"
send "$remotepwd\n" 
expect "sftp> "
send "cd $remotedir\n"
expect "sftp> "
send "lcd $localdir\n"
expect "sftp> "
send "mget $filename*\n"
expect "sftp> "
send "sleep 120\n"
expect "sftp> "
send "exit\n"
interact
EOFMGET
fi

The "sleep 120" does nothing since sftp has no such command, by the way.

What do you mean 'tracking'?

You can examine the contents of zip files with unzip -l

You should not be using expect to cram passwords into sftp. You should be using shared keys. That way sftp will work as intended with no need to force-feed it via expect.

By tracking I mean that the zipped files coming from the source need to be recorded in a SQL database, so that we know which files have been processed.

also thanks for the inputs

Thanks again
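
The tracking idea described above could be sketched roughly as follows. The thread does not name the database, so this sketch swaps the SQL table for a plain text file purely for illustration; TRACK_FILE, filter_unprocessed and mark_processed are hypothetical names, not something from the original posts.

```shell
#!/bin/bash
# Sketch only: a flat file stands in for the SQL tracking table.
# TRACK_FILE and both function names are hypothetical.
TRACK_FILE=${TRACK_FILE:-./processed.list}

# Read candidate zip names on stdin (one per line) and print only
# those that are not yet recorded as processed.
filter_unprocessed() {
    touch "$TRACK_FILE"
    # -F fixed strings, -x whole-line match, -v invert the match.
    # grep exits non-zero when nothing is left, which is not an error here.
    grep -Fxv -f "$TRACK_FILE" || true
}

# Record one zip name as processed.
mark_processed() {
    printf '%s\n' "$1" >> "$TRACK_FILE"
}
```

With a real database, the two functions would presumably become a SELECT and an INSERT issued through that database's command-line client; the flow (filter, fetch, mark) would stay the same.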

Rather than bludgeoning your way past good security and causing yourself more complications with expect, could you not use SSH keys to get password-less authentication in place? This will simplify things tremendously and you would end up with code more like this:

pushd "$localdir"

sftp "$remoteusername@$remotehost" <<-EOMGET
  cd $remotedir
  mget $filename*
EOMGET

popd

No need for a sleep attempt or anything. The process will exit when the mget completes.

I have added pushd before to set the required local directory & popd afterwards to put you back where you were beforehand, but these may be OS/shell specific and you haven't told us either. Another way would be OLDPWD=$PWD ; cd $localdir before and cd $OLDPWD afterwards, but it looks messier to me.
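
A third option, if the shell is bash or another POSIX-style shell, is to do the cd inside a subshell, so there is nothing to undo afterwards. This is only a sketch; run_in_dir is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: run a command in another directory without touching the
# caller's working directory. run_in_dir is a hypothetical helper name.
run_in_dir() {
    local dir=$1
    shift
    # The parentheses start a subshell, so the cd is forgotten on return.
    ( cd "$dir" && "$@" )
}
```

It might be called as run_in_dir "$localdir" sftp -b batchfile "$remoteusername@$remotehost", where batchfile is an assumed file holding the sftp commands.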

There are plenty of threads about setting up password-less authentication. Basically, the client making the sftp request has to create a key pair with ssh-keygen and send the *.pub file it creates to the server it wants to connect to. The content has to be added to the correct place on the server, and then the connection can be opened. Have a search and let us know how you get on.
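
As a rough sketch of those steps on the client side: the function name and the default key path are assumptions, and ssh-copy-id only works if the account is allowed a normal SSH login on the server. On an SFTP-only account the *.pub content would have to be handed to the server's administrator instead.

```shell
#!/bin/bash
# Sketch only: one-time key setup on the client side.
# setup_sftp_key and the default key path are hypothetical choices.
setup_sftp_key() {
    local target=$1                          # e.g. "$remoteusername@$remotehost"
    local keyfile=${2:-$HOME/.ssh/id_rsa}

    # Make sure the key directory exists with safe permissions.
    mkdir -p "$(dirname "$keyfile")"
    chmod 700 "$(dirname "$keyfile")"

    # Create the key pair only if it does not exist yet (-N "" = no passphrase).
    [ -f "$keyfile" ] || ssh-keygen -t rsa -b 4096 -N "" -f "$keyfile"

    # Append the public key to ~/.ssh/authorized_keys on the server.
    # This asks for the password one last time.
    ssh-copy-id -i "$keyfile.pub" "$target"
}
```

After a successful run, sftp "$remoteusername@$remotehost" should connect without prompting for a password.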

I hope that this helps,
Robin


Thank you for your answer. I need help with picking up zipped files from subdirectories.

The folder structure in the SFTP location is abc/def/studyname_1/outbound/zipped files,
abc/def/studyname_2/outbound/zipped files, abc/def/studyname_3/outbound/zipped files

So the script needs to go to abc/def/ on the SFTP server, pick up the zipped files from each sub-directory, and move them to a different directory on Unix.
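
One possible shape for the per-study loop is to generate an sftp batch with one get line per study folder. This is a sketch under assumptions: build_sftp_batch is a hypothetical helper, study names are assumed to contain no spaces, and the list of study folders is assumed to arrive on stdin, one per line.

```shell
#!/bin/bash
# Sketch only: build an sftp batch that pulls *.zip from every study folder.
# build_sftp_batch and its arguments are hypothetical names.
# Usage: build_sftp_batch <remote base dir> <local target dir> < study list
build_sftp_batch() {
    local base=$1 localdir=$2 study
    printf 'lcd %s\n' "$localdir"
    while IFS= read -r study; do
        # Skip prompt lines that sftp echoes in batch mode.
        case $study in
            sftp\>*) continue ;;
        esac
        study=${study%/}          # drop a trailing slash, if any
        study=${study##*/}        # keep only the last path component
        case $study in
            ''|.|..) continue ;;  # skip blanks and dot entries
        esac
        # The leading "-" lets the batch continue when a folder has no zip files.
        printf -- '-get %s/%s/outbound/*.zip\n' "$base" "$study"
    done
}
```

The study list could come from echo "ls -1 abc/def" | sftp -b - "$remoteusername@$remotehost", and the generated batch can be piped into a second sftp -b - call. Both rely on key-based login, because batch mode cannot answer a password prompt.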

OK, that is at least a starting point.

First, the standard questions:

  • What is your OS and its version? (so that we can anticipate which tools will be there and what their quirks may be)

  • What is your shell and its version? (for pretty much the same reasons)

Some project-related questions:

  • Does the directory structure of the source dirs need to be preserved? i.e. when you want to "pick the zipped files from each sub-directory which are placed in SFTP server and move to different directory" as you said, will the files
/somewhere/abc/def/studyname_1/outbound/zipped1.zip
/somewhere/abc/def/studyname_2/outbound/zipped2.zip

go to

/other_place/studyname_1/zipped1.zip
/other_place/studyname_2/zipped2.zip

or should they go to

/other_place/zipped1.zip
/other_place/zipped2.zip

If the latter, what should be done about identical filenames?

  • What should be done about different versions? i.e. suppose there is a file
/somewhere/abc/def/studyname_1/outbound/zipped1.zip

and in the target directory there is already a file named

/other_place/zipped1.zip

What should the script do in this case? Use the newer file? Use the file in the source directory, regardless? Leave the file in the target directory alone? None of the above?

I hope this helps.

bakunin

$ uname -a
Linux sosuhenhl01 2.6.32-696.10.1.el6.x86_64 #1 SMP Wed Jul 19 08:18:21 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux
$ ps
  PID TTY          TIME CMD
15392 pts/0    00:00:00 ps
41789 pts/0    00:00:00 bash
$ bash --version
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
$ ksh --version
-bash: ksh: command not found
$ sh --version
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
There is NO WARRANTY, to the extent permitted by law.
$ ls -l /bin/sh
lrwxrwxrwx 1 root root 4 Mar 31  2017 /bin/sh -> bash
$ awk --version
GNU Awk 3.1.7
Copyright (C) 1989, 1991-2009 Free Software Foundation.
  • Does the directory structure of the source dirs need to be preserved? i.e. when you want to "pick the zipped files from each sub-directory which are placed in SFTP server and move to different directory" as you said, will the files

##From the SFTP location, we just need to pick up the zipped files from each sub-directory and move them to a different location, as /other_place/zipped1.zip and /other_place/zipped2.zip, which is in a Unix environment.

If the latter, what should be done about identical filenames?

##This scenario is not applicable here, as the source system names each zipped file differently. I will get more information about this situation.

  • What should be done about different versions? i.e. suppose there is a file

##If the same file already exists in the target location, then use the file in the source directory and leave the file in the target directory. The location we are moving the files to is an intermediate layer, so the zipped files will be moved on completely from the target location to another server.

Once we have moved the zipped files to the target location, we need to check that no file inside each zipped file is 0 KB. If we find a 0 KB file, the script should send an email with some content.
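
The 0 KB check could be sketched by filtering the listing that unzip -l prints, as suggested earlier in the thread. zero_byte_entries is a hypothetical helper name, and the sketch assumes the usual four-column listing layout (length, date, time, name).

```shell
#!/bin/bash
# Sketch only: report zero-byte members of a zip from `unzip -l` output.
# zero_byte_entries is a hypothetical helper name; it reads the listing on stdin.
zero_byte_entries() {
    awk '
        # Member lines look like: <length> <date> <hh:mm> <name>
        $1 == 0 && $3 ~ /^[0-9][0-9]:[0-9][0-9]$/ {
            # Strip the three leading columns so names with spaces survive.
            sub(/^[ \t]*[0-9]+[ \t]+[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")
            if ($0 !~ /\/$/) print     # ignore directory entries
        }
    '
}
```

For each fetched file, something like empty=$(unzip -l "$zip" | zero_byte_entries) would collect the offending names, and a non-empty result could then be piped to a mail command such as mail -s "Empty file in $zip" with a recipient address, assuming a mail client is installed on the host.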