Downloading with Wget

Hello everyone. I'm new both to the forum and to unix scripting, and this website has been very useful in putting together a script I am working on. However, I have run into a bit of a snag, which is why I have come here seeking help. First I will say what I am trying to do, and then what I have done so far.

I am trying to download weather model data for a meteorological program called GEMPAK. The models update every so often, and the filenames are based on timestamp and model type. An example filename is 2011051800_nam242.gem
These files take up a LOT of space (this directory alone is 21 GB!), so as you can imagine, after downloading the files I want to delete the old ones. What's more, I do not need the *entire* directory, just the last 2 days' worth at most.

So, here is what I have done so far.

cd / ;
wget -S -N -l1 -r -np -A.gem http://metfs1.agron.iastate.edu/data/gempak/model/nam ;
find /metfs1.agron.iastate.edu/data/gempak/model/nam2 -Btime +4 -exec rm {} \; 

Now I have the entire directory downloaded in the proper hierarchy, and the script will remove items older than 4 days. However, when the script runs again, it downloads the files I just deleted all over again, and I am trying not to take up too much space with files I don't need. So, my question is, can anyone help me find a way to either download only 'x' number of files, or only files newer than 'x' timestamp? Or maybe another way I do not know about? Thank you very much in advance.

I haven't tried this, but I think you could add -nc (or --no-clobber) to prevent it from downloading a file if one already exists, and then truncate rather than delete.

You'll end up with a bunch of zero-length files, but at least they don't use up disk space.

To truncate, this should work: -exec /bin/cp -f /dev/null {} \;

(I use /bin/cp instead of cp because I have cp aliased to "cp -i", and the "-i" overrides the "-f".)
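
For anyone following along, here is a minimal sketch of what that truncate step does, run against a throwaway directory rather than the real download tree. The temporary directory, the second filename, and the use of -mtime (modification time, in place of the -Btime from the original script) are assumptions for illustration only. find runs the command after -exec once per matching file and substitutes that file's path for {}. /dev/null is an always-empty device file, so copying it over a file leaves that file at zero length. -f tells cp to force the copy if the destination cannot be opened for writing. As far as I know, find starts the program directly rather than through an interactive shell, so aliases should not apply and plain cp is used here.

```shell
#!/bin/sh
# Throwaway directory standing in for the real download tree (hypothetical).
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT

old_file="$demo_dir/2011051800_nam242.gem"   # filename from the original post
new_file="$demo_dir/2011052200_nam242.gem"   # made-up newer file
printf 'model data\n' > "$old_file"
printf 'model data\n' > "$new_file"

# Backdate the first file to 18 May 2011 so it counts as older than 4 days.
touch -t 201105180000 "$old_file"

# For every .gem file last modified more than 4 days ago, copy the empty
# /dev/null over it. The file stays in place but shrinks to zero bytes.
find "$demo_dir" -type f -name '*.gem' -mtime +4 -exec cp -f /dev/null {} \;

# Report which files still have content.
for f in "$old_file" "$new_file"; do
    if [ -s "$f" ]; then
        echo "kept:      ${f##*/}"
    else
        echo "truncated: ${f##*/}"
    fi
done
```

The old file should still exist afterwards, just with a size of zero, which is what lets -nc treat it as already downloaded.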

Thanks for the response!
As far as -nc goes, I am not having trouble with it re-downloading the files while they exist, but after they are deleted. Unless of course I am misunderstanding you.

You'll have to pardon my noobishness, but what does -exec /bin/cp -f /dev/null {} \; do exactly? I'm trying to learn as I go. I know -exec is for execute, but what is /bin/cp? Or -f? Or /dev/null? Thanks for your patience.

EDIT: So, I modified the code, and unfortunately wget (still running with -N) now compares file sizes and re-downloads the data whenever the local size does not match the size on the server. Any thoughts?
EDIT2: Got it all figured out. Now I know what -nc is for, and replaced -N with -nc. Thank you!
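
For reference, the combination described in this thread might look roughly like the sketch below. The URL and the wget options come from the original script, with -N swapped for -nc. The names base_dir, keep_days, run, and sync_models, the DRY_RUN switch, and the use of -mtime instead of -Btime are hypothetical choices made for this example, and it has not been run against the real server.

```shell
#!/bin/sh
# Sketch only: wget with -nc, then truncate old files instead of deleting them.
url='http://metfs1.agron.iastate.edu/data/gempak/model/nam'
base_dir=/          # the original script ran from /
keep_days=4         # truncate files older than this many days

# Helper (hypothetical): with DRY_RUN=1, print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

sync_models() {
    cd "$base_dir" || return 1

    # -nc skips any file that already exists locally, including the
    # zero-length leftovers, so old data is not fetched again.
    run wget -S -nc -l1 -r -np -A.gem "$url"

    # Truncate rather than delete, so the filenames stay behind for -nc.
    run find metfs1.agron.iastate.edu/data/gempak/model/nam \
        -type f -name '*.gem' -mtime +"$keep_days" \
        -exec cp -f /dev/null {} \;
}
```

Calling sync_models runs both steps. Setting DRY_RUN=1 first prints the two commands without touching the network or the disk, which is a cheap way to check them before a real run.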