Archiving files using shell script

Dear Team,

I am transferring files between local and remote servers using SFTP commands. Currently the script uses the mget and mput commands to copy the files. When moving files from the local to the remote server, I would also like to archive (save a copy of) them on the local system. Could someone suggest suitable commands for this?
Currently the script is like this:

---Copy remote files to local file system

echo cd ${FILE_SOURCE_PATH} > ${TMP_SFTP_SCRPT}

echo mget ${FILE_MASK} >> ${TMP_SFTP_SCRPT}

echo rm ${FILE_MASK} >> ${TMP_SFTP_SCRPT}  ## this has to be modified to archive the files to the local system directory and then remove the file##

echo bye >> ${TMP_SFTP_SCRPT}

sftp -oPort=${FILE_SOURCE_PORT} -b ${TMP_SFTP_SCRPT} ${FILE_SOURCE_ACCOUNT} 2>/dev/null 1>&2

Thanks in advance for your help.

Heyas Rads

Please use code tags (not none / not icode) for multiline code examples.

When you download a file with mget, it is already copied locally, so to archive it locally you would have to finish the sftp session and then make a copy of the file you just downloaded.
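For instance (all paths and the mask below are made-up example values, not taken from your script), once the sftp batch has returned, the local copies could be tucked away like this:

```shell
# Sketch only: archive the local copies after the sftp batch has finished.
# All paths and the mask are example values - adjust to your environment.
FILES_PATH=/tmp/Files2Transfer_demo          # where mget dropped the files
ARCHIVE_DIR=/tmp/Files2Transfer_demo/archive
FILE_MASK="*.csv"

mkdir -p "${FILES_PATH}" "${ARCHIVE_DIR}"
cd "${FILES_PATH}" || exit 1

# ... the "sftp -b ${TMP_SFTP_SCRPT} ..." run would happen here;
# for the sketch we simulate two downloaded files:
touch sample1.csv sample2.csv

# copy every downloaded file matching the mask into the archive,
# preserving timestamps; only after this would a remote rm be safe
for f in ${FILE_MASK}; do
    [ -f "$f" ] && cp -p "$f" "${ARCHIVE_DIR}/"
done
```

Only once the local archive copy is confirmed would you let the remote rm go ahead.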

Obviously most of your commands don't work as shown, since you have put an echo in front of them.
I suggest removing all those echos, but leaving the echo rm ... line so the remote files do not get deleted (just yet).

Otherwise, your code looks 'ok'.
That said, I don't really understand your question; could you please elaborate a bit more?

hth


Firstly I would streamline the code a little. You open the file quite a few times. You can easily do this:-

{
   echo "First line"
   echo "Second line"
} > filename
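Applied to the batch-file build in the posted script, that might look something like this (the variable values here are made up for the example; in the real script they come from the config file):

```shell
# Example values - in the real script these come from the config file
FILE_SOURCE_PATH=/remote/source/path
FILE_MASK="*.csv"
TMP_SFTP_SCRPT=/tmp/sftp_script_demo.ftp

# one open of the batch file instead of one redirection per command
{
    echo "cd ${FILE_SOURCE_PATH}"
    echo "mget ${FILE_MASK}"
    echo "bye"
} > "${TMP_SFTP_SCRPT}"
```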

If you are driving commands based on what is on a remote server, then you may have to break this into several sftp steps:-

  • Get list of files on remote server
  • Loop for each file
      • SFTP get the file
      • Confirm file okay
      • SFTP delete the file
      • Confirm delete okay (perhaps try to list the file with SFTP)
  • Report outcome of the overall job

Does thinking about breaking up your overall process with this logic help?
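A very rough skeleton of that loop, with the sftp invocations left as comments since they depend on your account, port and paths (the file list below is a stand-in):

```shell
# Skeleton only - the sftp calls are sketched as comments and the
# remote file list is a stand-in; wire in your real account/port/paths.
REMOTE_LIST=/tmp/remote_files_demo.txt

# 1) Get list of files on remote server, e.g.
#      echo "ls -1 ${FILE_MASK}" | sftp -oPort=${PORT} ${ACCOUNT} > ${REMOTE_LIST}
printf '%s\n' one.csv two.csv > "${REMOTE_LIST}"

status=0
# 2) Loop for each file
while read -r f; do
    # 3) sftp get "$f"
    # 4) [ -s "$f" ] || status=1        # confirm file okay
    # 5) sftp rm "$f"                   # delete the remote copy
    # 6) try to list "$f" again to confirm the delete
    echo "processed: $f"
done < "${REMOTE_LIST}"

# 7) Report outcome of the overall job
echo "overall status: ${status}"
```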

Robin

Hi Robin,

Thanks for the information. I am new to shell scripting and this archiving script is a bit challenging for me. Your help would be much appreciated. Please find the code below:

#!/bin/ksh
# Purpose                      : Transfer files from Local to Remote/Remote to Local
#Name of the file that is executed
SCRIPT_NAME=`basename ${0} .ksh`    
#Lock file/Temporary file to ensure that the files are not run multiple times                                         
LCK_FILE="/tmp/${SCRIPT_NAME}_${2}.run"  
#File used for build of SFTP tasks                                    
TMP_SFTP_SCRPT="/tmp/sftp_script_${2}_$$.ftp"  
#Denotes the current directory                              
DIR_NAME=`dirname ${0}`    
#File that defines the source and target specifications                                                  
FILE_CONFIG="${DIR_NAME}/${1}.config"                                        
FOUND=false
TIMESTAMP=`date '+%d-%b-%Y_%R'`
FILES_PATH="/tmp/Files2Transfer_${2}"

#
# Copy files from local system to remote system
#
Local2Remote() {

	cd ${FILES_PATH} 2>/dev/null
	if [[ ${?} -gt 0 ]];then
          print -u2 "LocalTarget path not found";
          rm ${LCK_FILE}
          exit -1;
        fi

	echo cd ${FILE_TARGET_PATH} > ${TMP_SFTP_SCRPT}
	echo mput ${FILE_MASK} >> ${TMP_SFTP_SCRPT}
	echo bye >> ${TMP_SFTP_SCRPT}

	sftp -oPort=${FILE_TARGET_PORT} -b ${TMP_SFTP_SCRPT} ${FILE_TARGET_ACCOUNT} 2>/dev/null 1>&2

	rm ${TMP_SFTP_SCRPT}

}

#
# Copy files from remote system to local system
#
Remote2Local() {

	cd ${FILES_PATH} 2>/dev/null
	if [[ ${?} -gt 0 ]];then
          print -u2 "LocalSource path not found";
          rm ${LCK_FILE}
          exit 1;
        fi

	echo cd ${FILE_SOURCE_PATH} > ${TMP_SFTP_SCRPT}
	echo mget ${FILE_MASK} >> ${TMP_SFTP_SCRPT}
	
# The below line needs to be modified to enable archive functionality. 	
  echo rm ${FILE_MASK} >> ${TMP_SFTP_SCRPT}    

# Added below line to copy and remove the files to the desired folder 
# echo mv ${file_mask} >> ${tmp_sftp_scrpt}                                  	
		          
# End of change 
	echo bye >> ${TMP_SFTP_SCRPT}

	sftp -oPort=${FILE_SOURCE_PORT} -b ${TMP_SFTP_SCRPT} ${FILE_SOURCE_ACCOUNT} 2>/dev/null 1>&2

	rm ${TMP_SFTP_SCRPT}
			
}

#
# Main function
#

if [ ! -f ${FILE_CONFIG} ]; then
	print -u2 "Configuration file not found"
	rm ${LCK_FILE}
        exit 1
fi

if [ -f ${LCK_FILE} ]; then
	print -u2 "Script with parameter ${CL_NAME} already running, exiting..."
	exit 1
else
	touch ${LCK_FILE}
fi

exec < ${FILE_CONFIG}
	while IFS=\| read -r name source_account source_port source_path target_account target_port target_path mask; do
	case ${name} in
		\#* | "")
			continue
			;;
	esac
		
	eval FILE_SOURCE_ACCOUNT=${source_account}
	eval FILE_SOURCE_PORT=${source_port}
	eval FILE_SOURCE_PATH=${source_path}
	eval FILE_TARGET_ACCOUNT=${target_account}
	eval FILE_TARGET_PORT=${target_port}
	eval FILE_TARGET_PATH=${target_path}
	eval FILE_MASK=${mask}
	FOUND=true
	break

done

if [ ${FOUND} = false ]; then
	print -u2 "File to transfer not configured"
    rm ${LCK_FILE}
	exit 2
fi

mkdir ${FILES_PATH}

set -f
Remote2Local
Local2Remote
set +f

rm -Rf ${FILES_PATH}

rm ${LCK_FILE}

exit 0

I am following below process to test the files:
This script and the config file exist on one server. I create source and target folders on the FTP server and put a sample CSV into the source folder. I then log in via PuTTY and execute the script as name.ksh "config file name". Upon execution, I should be able to see the CSV file in the target folder.

I would like to know if I am using the correct method for the archival process. Also, is it possible to log during the archival process (i.e., between archive start and end), including file name and timestamp information?

---------- Post updated 06-18-15 at 11:11 AM ---------- Previous update was 06-17-15 at 04:07 PM ----------

Hello,

I have now modified the script to include functionality to archive the files. Please find the code below. Could somebody suggest how I can log the information (start time of archival, files archived, folder to which the files are moved, archival end time) in the script?

CL_NAME=${1}
SCRIPT_NAME=`basename ${0} .ksh`                                   # Name of the shell script which is executed
LCK_FILE="/tmp/${SCRIPT_NAME}_${2}.run"                            # Lock file/Temporary file to ensure that the files are not run multiple times
TMP_SFTP_SCRPT="/tmp/sftp_script_${2}_$$.ftp"                      # File used for build of SFTP tasks
DIR_NAME=`dirname ${0}`                                            # Directory from which shell script is executed
FILE_CONFIG="${DIR_NAME}/${1}.config"                              # File that defines the source and target specifications
FOUND=false
TIMESTAMP=`date '+%d-%b-%Y_%R'`
FILES_PATH="/tmp/Files2Transfer_${2}"  
ARCHIVE_PATH="/tmp/iBuyFilesTransfered/archive"

#
# Copy local files to remote file system
#
Local2Remote() {

	cd ${FILES_PATH} 2>/dev/null 
	if [[ ${?} -gt 0 ]];then 
          print -u2 "LocalTarget path not found";
          rm ${LCK_FILE}
          exit -1;
        fi

	echo cd ${FILE_TARGET_PATH} > ${TMP_SFTP_SCRPT} 
	echo mput ${FILE_MASK} >> ${TMP_SFTP_SCRPT}  
	echo bye >> ${TMP_SFTP_SCRPT}

	sftp -oPort=${FILE_TARGET_PORT} -b ${TMP_SFTP_SCRPT} ${FILE_TARGET_ACCOUNT} 2>/dev/null 1>&2 

	rm ${TMP_SFTP_SCRPT}

}


#
# Copy remote files to local file system
#
Remote2Local() {

	cd ${FILES_PATH} 2>/dev/null
	if [[ ${?} -gt 0 ]];then
          print -u2 "LocalSource path not found";
          rm ${LCK_FILE}
          exit 1;
        fi

	echo cd ${FILE_SOURCE_PATH} > ${TMP_SFTP_SCRPT}
	echo mget ${FILE_MASK} >> ${TMP_SFTP_SCRPT}
	# Changes added to archive the files
	# echo rm ${FILE_MASK} >> ${TMP_SFTP_SCRPT} Commented this line as the below zip command moves the files into archive and deletes the files from the system
	echo zip -m archive.zip ${FILE_MASK} >> ${TMP_SFTP_SCRPT} 
    # End of change
	echo bye >> ${TMP_SFTP_SCRPT}

	sftp -oPort=${FILE_SOURCE_PORT} -b ${TMP_SFTP_SCRPT} ${FILE_SOURCE_ACCOUNT} 2>/dev/null 1>&2 

	rm ${TMP_SFTP_SCRPT}
			
}

#
# Main function
#

if [ ! -f ${FILE_CONFIG} ]; then
	print -u2 "Configuration file not found"
	rm ${LCK_FILE}
        exit 1
fi

if [ -f ${LCK_FILE} ]; then
	print -u2 "Script with parameter ${CL_NAME} already running, exiting..."
	exit 1
else
	touch ${LCK_FILE}
fi

exec < ${FILE_CONFIG}
	while IFS=\| read -r name source_account source_port source_path target_account target_port target_path mask; do
	case ${name} in
		\#* | "")
			continue
			;;
	esac
		
	eval FILE_SOURCE_ACCOUNT=${source_account}
	eval FILE_SOURCE_PORT=${source_port}
	eval FILE_SOURCE_PATH=${source_path}
	eval FILE_TARGET_ACCOUNT=${target_account}
	eval FILE_TARGET_PORT=${target_port}
	eval FILE_TARGET_PATH=${target_path}
	eval FILE_MASK=${mask}
	FOUND=true
	break

done

if [ ${FOUND} = false ]; then
	print -u2 "File to transfer not configured"
    rm ${LCK_FILE}
	exit 2
fi

mkdir ${FILES_PATH}

set -f
Remote2Local
Local2Remote
set +f

rm -Rf ${FILES_PATH}

rm ${LCK_FILE}

exit 0

Thanks in advance.

Regards,
Radhika.

Before i start giving you tips on how to improve your script there is a general point to discuss:

Archiving files is usually done with the intent of being able to restore some previous state of a system (at least locally, say, in a certain directory). This means you not only have to restore a file's (or directory's) contents but also some meta-information along with it. Let us look at a certain file:

# ls -lai /some/directory
total 2832
40960 drwxr-xr-x    5 root     system         4096 May 27 15:09 .
    2 drwxr-xr-x   31 bin      bin            4096 Apr 27 13:38 ..
40961 -rw-r--r--    1 myuser   mygroup        1333 Oct  7 2013  somefile

Obviously there is the content of the file - the 1333 bytes it takes on the disk. There is also the file mode ("rw-r--r--" or "644"), there is the ownership (owner=myuser, group membership="mygroup"), the inode number (40961) and (not completely visible) three time stamps: the last status (inode) change date/time, the last modification date/time and the last access date/time. Anything beyond the file content is stored in the inode of the file.

You may need some or even all of these metadata to go along with the raw content to form an archive enabling you to restore the file.

This is why there are special archiving programs (namely "tar", "cpio" and their successor "pax") which do exactly that, and why you should consider creating tar- (or cpio-, pax-, ...) archives first and only as a last step transferring these archives to remote systems. This way you do not need to meddle with "mget" and other sftp options. A single "get <archive.file>" would suffice.
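A minimal sketch of that idea (the paths are invented for the example, and the transfer itself is only indicated since it depends on your server details):

```shell
# Create some demo content to archive (example paths throughout)
mkdir -p /tmp/archive_demo_src
date > /tmp/archive_demo_src/somefile

# tar stores file contents plus metadata (mode, ownership, timestamps)
tar -cf /tmp/archive_demo.tar -C /tmp archive_demo_src

# A single sftp command can then move the whole archive, e.g.:
#   echo "put /tmp/archive_demo.tar" | sftp user@remote-host
# and on the receiving side "tar -xf archive_demo.tar" restores
# the files together with their metadata.
```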

I hope this helps.

bakunin

Hi,

Thanks for the detailed information and the tips.

I am completely new to this, and it would be a great help if you could suggest the changes to make to the script for archiving the files and also for logging the information.

My peers are not recommending the use of tar or similar tools.

Regards,
Radhika.

You need to define clearly the aims, then break them down into smaller tasks. Try to draw the process or write it out. Keep clear that the three basic logical flow types are:-

  • Sequence
  • Branch
  • Loop

The sequence can be any number of statements (including the other two) and calls to commands etc.

Is there a reason your peers are 'not recommending' tar? Do they not trust it, or is it just that they haven't suggested it?

It is nice that your script uses functions (Local2Remote etc.), but as they are called just once, it seems a little odd. Perhaps it makes the main part of the script easier to read, but I also worry that you are trying to call zip within sftp, and I'm not sure that will work.

I would suggest that your two tests at the beginning of the main part are the wrong way round. You should test for the lock-file first, then the configuration file.

I'm not too sure why you exec < ${FILE_CONFIG} in your code and then read it in your loop. Could you simply . ${FILE_CONFIG}, which will execute the file in the current shell, i.e. any variables or functions set are then available to the script that called it?
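For example, if the config file can be written as plain variable assignments (the names and values here are only illustrative), sourcing it is all you need:

```shell
# Write a demo config of plain shell assignments (illustrative values)
cat > /tmp/demo.config <<'EOF'
FILE_SOURCE_PATH=/remote/source
FILE_TARGET_PATH=/remote/target
FILE_MASK="*.csv"
EOF

# Sourcing executes it in the current shell, so the variables are
# available to the rest of the script - no exec/read/eval needed
. /tmp/demo.config

echo "mask is ${FILE_MASK}"
```

This does assume you are free to change the config file format from pipe-delimited fields to assignments.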

Your loop with a break and the eval statements are a little perplexing too.

Can you explain what your input file FILE_CONFIG will contain and the eventual process you want to achieve? There may be a simpler way to get there.

There is definitely some good stuff in the script though, so don't lose confidence. I'm just trying to understand the overall objective and trying to get the best outcome for you in a way that you can maintain.

Robin

Hi Robin,

Thanks for the tips and the suggestions.

Please find answers to your questions as below:

  1. Reasons for not recommending tar: I have suggested this functionality to them, but they do not recommend it.
  2. To answer your other questions: there is an existing script that copies files from local to remote and vice versa. I am just adding the zip command to archive files to the local system, so I don't think it is advisable for me to change the complete script.
  3. Regarding input file 'FILE_CONFIG': It contains the details of source and target - host server, port, path and filemask (in this case it is *.csv).

Please suggest a suitable way to do the archival with the existing functionality, and also a way to log the information.

In that case, I would suggest simply moving the zip command outside the sftp but keep it within the function. Perhaps:-

:
:
echo bye >> ${TMP_SFTP_SCRPT}

sftp -oPort=..................

zip -m ................

rm ${TMP_SFTP_SCRPT}
:
:

Robin

I tried using zip outside the sftp but inside the function, and it is not working.

Can you elaborate? Are there any error messages for example, or does the function end before you get to the zip?

Robin

I am doing the testing as follows:

I create a source folder (testSRC) with a few CSV files and a destination folder (testTRGT), both on the same server. I put my shell script on another server along with the configuration details of the source and target hosts. When I run the shell script, it should put the CSV files from the testSRC folder into the testTRGT folder. But now, with this change, the script runs without errors yet does not put the files into the testTRGT folder.

I hope it is clear. Let me know in case further information is required.

Can you capture the output from the sftp rather than sending it to /dev/null so you can look at that perhaps?

Also, with the zip outside the sftp but within the function, are there any errors?

Robin

Could you please guide me as to how to capture output from SFTP?

There are no errors thrown after moving zip outside the sftp call.

At the end of the line with the sftp command in, there is a redirection to /dev/null for all standard errors and then standard output is sent 'to the same place' so if you change this:-

sftp -oPort=${FILE_TARGET_PORT} -b ${TMP_SFTP_SCRPT} ${FILE_TARGET_ACCOUNT} 2>/dev/null 1>&2

..... to this:

sftp -oPort=${FILE_TARGET_PORT} -b ${TMP_SFTP_SCRPT} ${FILE_TARGET_ACCOUNT} 2>/tmp/sftp_log.txt 1>&2

.... you should capture the messages to /tmp/sftp_log.txt and you can have a read of that.

Robin


Thanks for that. I was able to do it and got the below message in the /tmp/sftp_log.txt file:

Host key verification failed.
Couldn't read packet: Connection reset by peer

Is this error due to any change in the host key details?

---------- Post updated at 03:48 PM ---------- Previous update was at 03:01 PM ----------

When I execute the script, it gives a "zip is an invalid command" error. Can I use another command to archive the files? If I use tar, where and how should I use it in the script?

Sadly we're back to the beginning and a question I should have asked then.

What OS & version are you using? The output from uname -a would be good to see. Feel free to obscure any sensitive bits if you are worried about them, e.g. if a server serial number appears or it gives the full DNS name of a public-facing server.

Robin

This ultimately means that the sftp connection to the remote host did not take place.

The most probable cause is this: ssh/sftp connections rely either on a password (which would have to be provided interactively, as in telnet, rlogin, ...) or on exchanged keys. You generate a key pair on one system under a certain user, put the public (authenticating) key into the "keyring" of a user on the remote host, and from then on the user ID trying to connect is authenticated not by providing a password but by presenting the right key - hence passwordless.

How you produce and exchange the key(s) depends on the exact software you use as sftp-server/client but with these pointers you should be able to search for it.
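With OpenSSH, for example, the one-time setup is roughly as follows (the key path is a demo location and the remote account is a placeholder):

```shell
# Generate a passphrase-less key pair at a demo path (a placeholder;
# normally this would live under ~/.ssh/)
rm -f /tmp/demo_sftp_key /tmp/demo_sftp_key.pub
ssh-keygen -q -t rsa -N "" -f /tmp/demo_sftp_key

# One-time: install the public key on the remote account, e.g.
#   ssh-copy-id -i /tmp/demo_sftp_key.pub user@remote-host
# After that, sftp/ssh using this key no longer asks for a password.

# "Host key verification failed" is a different check: the *server's*
# key is missing from, or conflicts with, ~/.ssh/known_hosts.
# Connecting once interactively and accepting the key, or clearing a
# stale entry with "ssh-keygen -R remote-host", normally fixes it.
```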

Regarding "tar": tar has some limitations, mostly because it is a really old piece of software (from the seventies, to be more or less precise ;-)) ) and because of the "USTAR" standard, to which a conformant tar implementation has to adhere. Among other things it restricts the size of a single file backed up by tar to 8 GB. If you want to overcome that, you will have to use cpio or pax or some other software with equivalent functionality.

Still, this does not change anything I said above. To correctly archive data and be able to restore it as it was, you need to store metadata along with the data, and you need a tool that will do exactly that. A file copy, remote or local, will NOT do that.

I hope this helps.

bakunin

Hi thanks,

After a discussion with my peer: the !zip command is not working at the SFTP level, and I would like to use an mv command for the same purpose. Could you please help me with that?

---------- Post updated at 06:15 PM ---------- Previous update was at 04:10 PM ----------

I will try to make the requirement more understandable and clear:

The script uses two functions, Remote2Local and Local2Remote - it copies files from the remote server, saves them locally and keeps them temporarily until they are transferred back to the remote server.
My requirement is: in the function Remote2Local, I want to connect to an SFTP server, GET some files, then move those files to a different directory on the same SFTP server. But there doesn't seem to be a way to move files between directories on the remote server from SFTP. The zip functionality doesn't work at the SFTP level, and the mv command doesn't work at this level either. Could you please provide a workaround?

Thanks in advance.

Is the overall objective to get a file moved on the remote server?

Perhaps you could use ssh to drive a mv command on the remote server. If you already have password-less authentication set up for the sftp and the account is allowed to ssh to a shell prompt, you should be able to do this:-

ssh remote-server "mv source-file  target-file"
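In the script that might be wrapped up like this (the account and directories are placeholders, and the ssh line stays commented out until the key authentication is confirmed to work):

```shell
# Placeholders - substitute your real account and directories
REMOTE_ACCOUNT=user@remote-server
SRC_DIR=/remote/source
ARC_DIR=/remote/archive

# Build the command once so it can be logged before it is run
MOVE_CMD="mv ${SRC_DIR}/*.csv ${ARC_DIR}/"
echo "`date '+%d-%b-%Y_%R'` about to run on ${REMOTE_ACCOUNT}: ${MOVE_CMD}"

# ssh "${REMOTE_ACCOUNT}" "${MOVE_CMD}"   # enable once key auth works
```

Logging the timestamp and command like this would also cover the earlier question about recording what was archived and when.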

Am I missing the point?

Robin