Using wget and checking for non-zero files

weatherboys · October 18, 2007, 5:59am

Hello,

I'm using a shell script containing a wget command that copies HTML files from a website to my ISP's server. I want to check that the downloaded file exists and that its size is larger than, e.g., 100 bytes. Presumably this can be done with something like a file-exists test and a file-size test.

The wget command line looks like:

wget http://www.test.nl/test.html -O /home/myname/domains/myname.nl/public_html/data/test2.html --quiet

But I only have a beginner's experience in shell scripting, so can you help me out?!

kavera · October 18, 2007, 7:02am

e.g.: wget --spider -v http://images.ucomics.com/comics/ga/2007/ga071017.gif

$ man wget
When invoked with the --spider option, Wget will behave as a Web spider, which means that it will not download the pages, just check that they are there.

-v is for verbose output, as always
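
In a script, the result of the spider check is easiest to use through wget's exit status, which is non-zero when the request fails (for example on a 404). A minimal sketch, assuming a POSIX shell; the function name url_exists is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if wget --spider can reach
# the URL, fail otherwise. --quiet suppresses the log output, so only the
# exit status is used.
url_exists() {
    wget --spider --quiet "$1"
}

# Example call, using the placeholder URL from this thread:
# if url_exists "http://www.test.nl/test.html"; then
#     echo "remote file is there"
# else
#     echo "remote file is missing or unreachable"
# fi
```

Note that this only says whether the remote file is there; it says nothing about its size.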

weatherboys · October 18, 2007, 7:24am

Thanks for the quick reply!

So the script will now look like:

wget --spider -v http://www.test.nl/test.html
wget http://www.test.nl/test.html -O /home/myname/domains/myname.nl/public_html/data/test2.html --quiet

But it still fetches files when they are zero bytes and overwrites the existing files. That's not what I want. I want to check that the file exists (using your spider option), and I want to allow overwriting only when the files to be copied are not empty.

Can you help me out?!
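
One possible way to get the behaviour asked for above is to download to a temporary file first and replace the existing file only when the download is large enough. This is a sketch under assumptions, not a tested production script: the function names big_enough and fetch_if_big_enough and the ".tmp.$$" suffix are invented for illustration, and a POSIX shell is assumed.

```shell
#!/bin/sh
# Succeed only if file $1 exists and is larger than $2 bytes.
big_enough() {
    [ -f "$1" ] || return 1
    # Arithmetic expansion strips the padding some wc versions print.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -gt "$2" ]
}

# Download URL $1 to a temporary file next to target $2, and move it over
# the target only if wget succeeded and the result is larger than $3 bytes.
# Otherwise the temporary file is removed and the old target is left alone.
fetch_if_big_enough() {
    url=$1
    target=$2
    min_bytes=$3
    tmp="$target.tmp.$$"
    if wget --quiet -O "$tmp" "$url" && big_enough "$tmp" "$min_bytes"; then
        mv "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Example call with the URL and path from this thread:
# fetch_if_big_enough "http://www.test.nl/test.html" \
#     /home/myname/domains/myname.nl/public_html/data/test2.html 100
```

Because the check runs on the file that was actually downloaded, it does not depend on the server reporting a size up front.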

trey85stang · August 12, 2008, 2:17am

Can you post the output of wget --spider -v http://www.test.nl/test.html? Perhaps you just need to redirect that output to a text file, run a few awk lines over it, check the result with an if for the information you want, and then download the file, or not.
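
The suggestion above could look roughly like the sketch below. It assumes the saved wget log contains a line starting with "Length:" followed by a byte count, which is how wget usually reports a size in verbose mode, but the exact log format differs between wget versions and some servers report no length at all. The function names reported_length and length_over and the file name spider.txt are made up for illustration.

```shell
#!/bin/sh
# Print the byte count from the last "Length:" line of the saved wget log
# $1, or print nothing if no numeric length was reported. Commas are
# stripped because some wget versions print "1,270" instead of "1270".
reported_length() {
    awk '/^Length:/ { len = $2 }
         END { gsub(/,/, "", len); if (len ~ /^[0-9]+$/) print len }' "$1"
}

# Succeed only if the log $1 reports a length larger than $2 bytes.
length_over() {
    len=$(reported_length "$1")
    [ -n "$len" ] && [ "$len" -gt "$2" ]
}

# Example use with the URL and path from this thread (wget writes its log
# to stderr, hence the 2>&1):
# wget --spider -v http://www.test.nl/test.html > spider.txt 2>&1
# if length_over spider.txt 100; then
#     wget --quiet http://www.test.nl/test.html \
#         -O /home/myname/domains/myname.nl/public_html/data/test2.html
# fi
```

If the server does not send a length, length_over fails and nothing is downloaded, so this check errs on the side of keeping the existing file.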

Edit: sorry for posting this. I didn't realize it was a year-old thread. Not sure how I even came across it.