How to download images and a JSON file from a server (GoDaddy) to a local machine (Ubuntu 14.04)?

Hi Guys,

Just entering the Linux world, so I need help writing a script on my local machine (Ubuntu 14.04) that continuously checks a particular folder (containing images) and a JSON file on the server, and downloads whenever new images are added to that folder or the JSON file changes.

I prefer to use wget.

Thanks,

Something like :

#!/usr/bin/ksh
LOCKFILE=/path/to/lock/wget.lock
LOGFILE=/path/to/log/wget.log
DDIR=/path/to/destination

if [ -f "$LOCKFILE" ]; then
    printf "%s\n" "Lock file exists on date $(date "+%Y%m%d_%H%M")" >> "$LOGFILE"
    exit 127
fi

cd "$DDIR" && touch "$LOCKFILE" || exit 127
wget -m -c -nH --output-file="$LOGFILE" --timeout=3 --tries=3 'ftp://user:pass@yourftpsite/*' && rm "$LOCKFILE" || exit 127
cd -

This will mirror all the files from the FTP site (.../*) to your $DDIR.
It will not download a file if the remote copy is not newer than the local one.

Also, the $LOGFILE will be overwritten on each execution of wget; you can change this by adding the date to the log name.
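The date-in-the-log-name idea above can look like this (the /path/to/log directory is a placeholder):

```shell
# Put a timestamp in the log name so each wget run keeps its own log
# instead of overwriting the previous one.
LOGFILE=/path/to/log/wget_$(date "+%Y%m%d_%H%M").log
echo "$LOGFILE"
```

Each run then produces a name like wget_20140101_1230.log.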

Hope that helps
Regards
Peasant.

1 Like

Thanks a lot, Peasant!
I will try this one today and let you know how it goes.
Much appreciated :slight_smile:

---------- Post updated at 06:45 PM ---------- Previous update was at 09:38 AM ----------

It works very well.
But the text file that I am downloading from the FTP server to my local machine is not exactly the same as the one on the server (I mean some words from the file are missing). Is there any way I can use a checksum to verify that the files are not corrupted?

---------- Post updated at 09:59 PM ---------- Previous update was at 06:45 PM ----------

Could you please tell me what $LOCKFILE is and why we need it?

The lock file is used so the script will not run multiple times in parallel, for instance from crontab.
If you have this script in cron to run every 5 minutes and the download lasts 7 minutes, the lock file prevents it from starting again during the download (unwanted in any case).
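As a side note, the check-then-create pattern ([ -f ] followed by touch) has a small race window if two instances start at almost the same moment. A hypothetical variant using mkdir, which is atomic, avoids that (the lock path is assumed):

```shell
#!/bin/sh
# mkdir either creates the directory and succeeds, or fails because it already
# exists -- both in one atomic step, so two instances cannot both get the lock.
LOCKDIR=/tmp/wget.lock.d          # assumed lock path
if mkdir "$LOCKDIR" 2>/dev/null; then
    trap 'rmdir "$LOCKDIR"' EXIT  # release the lock however the script exits
    echo "lock acquired, safe to run wget"
else
    echo "another instance is running"
    exit 1
fi
```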

As for checksumming, unfortunately wget cannot do this; you will have to do it manually or switch to SSH with rsync (which has checksumming built in).

If you could somehow generate an md5sum of all files, named for instance mysum.txt, on the server you are fetching the data from, and transfer that file as well, you could run a check from the script after the download, like:

cd $DDIR && md5sum -c mysum.txt

Have you tried to fetch those files manually with any FTP client, to confirm that the same thing happens?

Hope that helps
Regards
Peasant.

1 Like

I don't want to use a cron job; instead I'm trying to use a while loop in the script itself. In that case, where should I put the following statements: inside the while loop or outside?

if [ -f $LOCKFILE ]; then
    printf "%s\n" "Lock file exists on date $(date "+%Y%m%d_%H%M")" >> $LOGFILE
    exit 127
fi

---------- Post updated at 01:24 AM ---------- Previous update was at 01:14 AM ----------

Also, for security reasons I do not want to write my FTP username and password in the script. Is there any way I can fetch the username and password from another file or script in encrypted form?

Can you post your entire code, if possible?
Without the && || syntax the code would be longer:

cd $DDIR
if [ $? -eq 0 ]; then            # successfully changed directory to $DDIR
    touch $LOCKFILE
    wget <options>
    if [ $? -eq 0 ]; then        # wget finished successfully
        rm $LOCKFILE
    else
        exit 127                 # wget failed; exit with an error, LOCKFILE remains
    fi
else
    exit 127                     # could not switch to $DDIR, so exit with error 127
fi

On a second look, exit 127 might not be a good choice for an exit code, so change this to exit 1. Sorry for that.

As for security, FTP is insecure by default. This is why protocols such as SFTP/SSH and FTPS (FTP over SSL with certificates) were established.

You will have a hard time securing FTP (if it is possible at all). FTP is also susceptible to man-in-the-middle attacks, since usernames and passwords travel over the network unencrypted.
A simple tcpdump on your local network would reveal users and passwords...
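That said, wget can at least read FTP credentials from ~/.netrc instead of the URL, which keeps them out of the script and out of `ps` output. This is not encryption, the file is still plain text, so it must be readable only by you (the host name below is the placeholder from this thread):

```shell
#!/bin/sh
# Create a ~/.netrc entry that wget consults automatically for this host.
cat > "$HOME/.netrc" <<'EOF'
machine mywesite.com
login usr
password pwd
EOF
chmod 600 "$HOME/.netrc"   # required: wget refuses world-readable credentials on some systems
# The URL then needs no embedded credentials:
# wget -m -c -nH ftp://mywesite.com/www/test/display.json
```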

Can you use ssh and rsync with keys ?

Regards
Peasant.

#!/bin/sh
LOCKFILE=/home/gav/Desktop/wgetTest/lock/wget.lock
LOGFILE=/home/gav/Desktop/wgetTest/log/wget.log
DDIR=/home/gav/Desktop/wgetTest/
JSONDIR=/home/gav/Desktop/wgetTest/text/new.json
IMAGEDIR=/home/gav/Desktop/wgetTest/display/

if [ -f $LOCKFILE ]; then
    printf "%s\n" "Lock file exists on date $(date "+%Y%m%d_%H%M")" >> $LOGFILE
    exit 127
fi

cd $DDIR && touch $LOCKFILE || exit 127
wget -m -c -nH --cut-dirs=2 --output-file=$LOGFILE --timeout=3 --tries=3 --passive-ftp 'ftp://usr:pwd@mywesite.com/www/test/images/*' && rm $LOCKFILE || exit 127
mv images/* $IMAGEDIR

cd $DDIR && touch $LOCKFILE || exit 127
wget -m -c -nH --cut-dirs=2 --output-file=$LOGFILE --timeout=3 --tries=3 --passive-ftp ftp://usr:pwd@mywesite.com/www/test/display.json && rm $LOCKFILE || exit 127
mv display.json $JSONDIR

sleep 60

cd $DDIR && touch $LOCKFILE

while true; do
    wget -m -c -nH --cut-dirs=2 --output-file=$LOGFILE --timeout=3 --tries=3 --passive-ftp ftp://usr:pwd@mywesite.com/www/test/display.json && rm $LOCKFILE

    if cmp -s display.json $JSONDIR; then
        echo "same files"
        sleep 15
    else
        echo "different files, downloading new files"
        wget -m -c -nH --cut-dirs=2 --output-file=$LOGFILE --timeout=3 --tries=3 --passive-ftp 'ftp://usr:pwd@mywesite.com/www/test/images/*' && rm $LOCKFILE
        mv images/* $IMAGEDIR
        mv display.json $JSONDIR
    fi
    sleep 60
done

---------- Post updated at 02:22 AM ---------- Previous update was at 02:19 AM ----------

Sorry, I am new to Linux and was just trying to work out the logic, so I wrote many statements multiple times in my code :confused:

No need to apologize, no one was born with Linux/Unix skills.

There is no need to use cmp.
This is why we are using wget mirroring based on timestamps: wget will know if something has changed and copy it if it has (check the log, it will say so).

Try this for the entire operation:

#!/bin/sh
# On Linux systems /bin/sh is often a link to dash or bash; on other unix
# systems it will be a POSIX shell and some things here might not work.
LOCKFILE=/home/gav/Desktop/wgetTest/lock/wget.lock
LOGFILE=/home/gav/Desktop/wgetTest/log/wget.log
DDIR=/home/gav/Desktop/wgetTest/
JSONDIR=/home/gav/Desktop/wgetTest/text/new.json
IMAGEDIR=/home/gav/Desktop/wgetTest/display/

if [ -f $LOCKFILE ]; then
    printf "%s\n" "Lock file exists on date $(date "+%Y%m%d_%H%M")" >> $LOGFILE
    exit 1
fi

while true
do
    cd $DDIR || exit 1
    touch $LOCKFILE
    wget -m -c -nH --cut-dirs=2 --output-file=$LOGFILE --timeout=3 --tries=3 --passive-ftp 'ftp://usr:pwd@mywesite.com/www/test/images/*'
    WEXIT1=$?                      # catch the exit code of wget, 0 expected (success)
    wget -m -c -nH --cut-dirs=2 --output-file=$LOGFILE --timeout=3 --tries=3 --passive-ftp ftp://usr:pwd@mywesite.com/www/test/display.json
    WEXIT2=$?                      # catch the exit code of wget, 0 expected (success)
    ERROR=$(( WEXIT1 + WEXIT2 ))   # if the sum of both exit codes
    if [ $ERROR -eq 0 ]; then      # is 0 (both succeeded, as expected), proceed with mv
        mv images/* $IMAGEDIR      # not checking mv status, we hope the filesystem will work :) (you could check $? here if required)
        mv display.json $JSONDIR   # not checking mv status, we hope the filesystem will work :) (you could check $? here if required)
        rm $LOCKFILE               # remove the $LOCKFILE for the next iteration, since everything succeeded
    else
        printf "%s\n" "A wget error has happened; check $LOGFILE, remove $LOCKFILE by hand and rerun"
        exit 1
    fi
    sleep 60
done

I would rather recommend using cron every minute, with less shell code and no while loop.
Also, I haven't run this code, I just wrote it directly on the forums, so give it a try.

Hope this is what you are looking for, but again, can you use SSH with rsync?
With that and exchanged SSH keys this entire script would be a couple of lines.

Best regards
Peasant.

1 Like

I will test this code and let you know the results soon :slight_smile: Thanks

I can use SSH with rsync, but I don't know much about how to set it up on GoDaddy web hosting.

I am using a while loop because I want to create an upstart service for this later, which starts the script during boot and keeps it running in the background. Is that a good idea, or do you prefer a cron job?

---------- Post updated at 04:03 AM ---------- Previous update was at 03:53 AM ----------

One question: can I put

rm $LOCKFILE 

in the else branch, just after the error message, so that I don't need to remove it manually?

I would prefer a cron job.
If you run the script from cron every minute, there is no need for a while loop.
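The cron version would be a single crontab entry (edit with `crontab -e`; the script name fetch.sh is assumed here):

```shell
# Run the fetch script every minute; the LOCKFILE inside the script already
# guards against overlapping runs if one takes longer than a minute.
* * * * * /home/gav/Desktop/wgetTest/fetch.sh >> /home/gav/Desktop/wgetTest/log/cron.log 2>&1
```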

Regarding LOCKFILE...
It is there to stop the script from starting while another instance of it is still running, and for a human to check what went wrong and, once fixed, remove it.

I have no experience with upstart so far, so I cannot advise on that subject, but it looks like overhead to me.

I'm not acquainted with GoDaddy and their limits or features; you will have to examine that yourself using Google.

1 Like

Thanks a lot Peasant for your help :slight_smile: