HTML parsing with Perl

avik1983 · February 21, 2007, 4:29am

I have an HTML report file. Part of the report is attached as "input html.doc", and its source is attached as "report source code.txt".

I just want to separate the data fields, so that the first line, for example, becomes:

NHTEST-3848498958-NHTEST-10.2-no-baloo a
and so on for the whole report.

I have already done that with a Perl script, which is also attached as "perl coding for parsing.txt" for reference.

Now suppose I have more than one file, say 20 reports in HTML format, and I have to compare values from the tables across the different report files (for example, the buffer cache values from each report).

How can I do that? Please give me some ideas.
I need a script for this in Unix shell or Perl. Can you help me with this?
Waiting for your reply.

anbu23 · February 23, 2007, 7:58am

sed -n "s/.*Buffer Cache:<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\)<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\).*/\1 \2/p" "report source code.txt"

This prints the buffer cache values from "report source code.txt". The file name is quoted because it contains spaces.

avik1983 · February 23, 2007, 8:04am

Thanks for replying.
That works for a single report file, but how can it be done for 500 report files? I need a script that handles all of them; only then can I compare the values.

anbu23 · February 23, 2007, 8:25am

If you keep all the report files in one directory, then run this from that directory:

sed -n "s/.*Buffer Cache:<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\)<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\).*/\1 \2/p" *
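
One thing to watch with the one-liner on many files: the output does not say which report each pair of values came from, which makes comparing them harder. A minimal sketch of one way around that is below. It reuses the same sed expression and prints the file name next to the values. The function name buffer_cache_values and the .html extension in the example call are assumptions for illustration, not something taken from the attached files.

```shell
#!/bin/sh
# Sketch: print "filename<TAB>first-value second-value" for each report file.
# buffer_cache_values is a hypothetical helper name, not part of any tool.
buffer_cache_values() {
    for f in "$@"; do
        # Same sed expression as in the posts above; quoting "$f" keeps
        # file names that contain spaces in one piece.
        vals=$(sed -n 's/.*Buffer Cache:<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\)<\/TD><[^>]*> *\([0-9,]*[A-Za-z]*\).*/\1 \2/p' "$f")
        # Only print a line for files where the pattern actually matched.
        if [ -n "$vals" ]; then
            printf '%s\t%s\n' "$f" "$vals"
        fi
    done
}

# Example call (assumes the reports end in .html and are in the current directory):
# buffer_cache_values *.html | sort
```

With the file name in the first column, the output can be sorted or redirected to a file and opened in a spreadsheet for side-by-side comparison.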