Hello. I want to write an awk script that searches an HTML file and outputs all the links inside it (e.g. .html, .htm, .jpg, .doc, .pdf, etc.). I also want the output split into 3 groups separated by an empty line: the first group with links to other web pages (.html, .htm, etc.), the second with links to images (.jpg, .jpeg), and the third with links to .pdf, .doc, or other downloadable files. Next to each link I want to print how many times it occurs in the HTML file.
To write an awk script you need at least some knowledge of awk. Do you have any?
What have you done to attempt to solve this problem yourself?
Post your sample script, and we'll see how we can assist.
No, I haven't done much. I only know a few things. So far I have just thought about setting FS or RS to something like FS="< >" and then searching within the fields for /http/ or something similar, and then for /html/. But I don't know much awk, so I just want to do the basics.
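The RS idea can be taken a bit further. Below is a rough sketch, not a finished solution: it sets RS to "<" so that every record starts at a tag, pulls out double-quoted href and src values, counts them in an array, and prints the three groups in END. The function name list_links and the file name page.html are made up for illustration. It also assumes that attribute values are always in double quotes, that .png and .gif count as images alongside .jpg and .jpeg, and that anything that is neither a web page nor an image belongs in the third group.

```shell
#!/bin/sh
# list_links FILE: print every href/src value found in FILE with its count,
# grouped as web pages, images, then everything else.
# (list_links is a hypothetical name chosen for this sketch.)
list_links() {
    awk '
    # Return 1 for web pages, 2 for images, 3 for anything else.
    # The extension lists are assumptions; extend them as needed.
    function group(link,    name) {
        name = tolower(link)
        sub(/[?#].*$/, "", name)              # ignore query string and fragment
        if (name ~ /\.(html|htm)$/)           return 1
        if (name ~ /\.(jpg|jpeg|png|gif)$/)   return 2
        return 3
    }

    BEGIN { RS = "<" }                        # each record begins at a tag name

    {
        rest = $0
        gt = index(rest, ">")                 # keep only the tag itself
        if (gt) rest = substr(rest, 1, gt - 1)

        # Pick up every href="..." or src="..." inside this tag.
        while (match(rest, /(href|src)[ \t\n]*=[ \t\n]*"[^"]*"/)) {
            attr = substr(rest, RSTART, RLENGTH)
            rest = substr(rest, RSTART + RLENGTH)
            sub(/^[^"]*"/, "", attr)          # strip the name and opening quote
            sub(/"$/, "", attr)               # strip the closing quote
            if (attr == "") continue
            if (!(attr in count)) order[++n] = attr   # remember first-seen order
            count[attr]++
        }
    }

    END {
        for (g = 1; g <= 3; g++) {
            for (i = 1; i <= n; i++)
                if (group(order[i]) == g)
                    print order[i], count[order[i]]
            if (g < 3) print ""               # empty line between groups
        }
    }
    ' "$@"
}
```

After sourcing or pasting the function, you would call it as list_links page.html. Each output line is a link followed by the number of times it appears, and links within a group are listed in the order they were first seen. Single-quoted or unquoted attribute values are not picked up by this sketch, so the match() pattern would need extending for pages that use them.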