Hi,
I basically need to get a list of all the tarballs located at a URI.
I am currently doing a wget on the URI to get the index.html page.
Now this index page contains the list of URIs that I want to use in my bash script.
Can someone please guide me?
I am new to Linux and shell scripting.
Thanks,
M
You want to look at wget's recursive download options, in particular -r (recursive) and -l (level).
Typically wget -r -l 1 http://my.site.com/index.html
This creates a directory structure of the site itself. I do not want to create a directory structure. Basically, just like index.html, I want to have another text file that contains all the URLs present on the site.
Thanks,
M
Oh I see, how about this:
awk 'BEGIN{ RS="<a *href *= *\""} NR>1 {sub(/".*/,""); print }' index.html
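If it helps to see how that works, here is a self-contained sketch of the extraction run against a throwaway index.html (no network needed; the file contents are made up for illustration, and multi-character RS as a regex is a GNU awk extension, so this assumes gawk or a compatible awk):

```shell
# Create a tiny sample index page to test the extraction against.
printf '%s' '<html><body><a href="foo-1.0.tar.gz">foo</a><a href="bar-2.1.tar.gz">bar</a></body></html>' > index.html

# RS splits the input at every occurrence of <a href=" ; record 1 is
# the preamble before the first link, so NR>1 keeps only the records
# that begin with a URL. sub() then trims everything from the closing
# quote onward, leaving just the link target.
awk 'BEGIN{ RS="<a *href *= *\""} NR>1 {sub(/".*/,""); print }' index.html > urls.txt
cat urls.txt
```

This should print foo-1.0.tar.gz and bar-2.1.tar.gz, one per line, which is the plain text file of URLs you were after.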
Thank you! That helped a lot
lynx -dump http://my.site.com/index.html | grep -A999 "^References$" | tail -n +3 | awk '{print $2}'
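For the original goal of actually fetching the tarballs, a small follow-up sketch: filter the extracted URL list down to tarball names, then hand the file to wget with -i (read URLs from a file) and -nd (do not recreate the site's directory tree). The urls.txt contents here are made-up placeholders for illustration:

```shell
# Sample URL list, as the extraction step above might produce it.
printf '%s\n' \
  'http://my.site.com/foo-1.0.tar.gz' \
  'http://my.site.com/ChangeLog' \
  'http://my.site.com/bar-2.1.tar.bz2' > urls.txt

# Keep only the tarball links.
grep -E '\.tar\.(gz|bz2|xz)$' urls.txt > tarballs.txt
cat tarballs.txt

# Then fetch them flat into the current directory (not run here,
# since it needs network access):
#   wget -nd -i tarballs.txt
```

Note that wget -i expects absolute URLs; if the index page contains relative links, prefix them with the site's base URL first.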