Pattern matching: extracting URLs from RSS with shell scripts

Hi all, how could I do this?

I have an RSS file and want to extract only the URLs (there are many) that match a given URL prefix from that file and copy them into another file.


<pubDate>Wed, 29 Apr 2009 00:00:00 PST</pubDate>
<content:encoded><![CDATA[<table><tr valign="top"><td width="67"><a href="Apple - Movie Trailers - The Hangover"><img src="" width="65" height="97" border="0"></a></td><td> </td><td><a href="Apple - Movie Trailers - The Hangover/"><span style="font-size: 16px; font-weight: 900; text-decoration: underline;">The Hangover - Trailer 2</span></a><br /><span style="font-size: 12px;">Two days before his wedding, Doug and his three friends drive to Las Vegas for a blow-out bachelor party they'll never forget. But, in fact, when the three groomsmen wake up the ustin Bartha</span></td></tr></table>]]></content:encoded> .....

Everything should be done in a bash script.

Thanks for the help!


# Demo on the sample data; for a real feed, replace the here-document with: cat your_filename |
cat << 'EOF' |
<pubDate>Wed, 29 Apr 2009 00:00:00 PST</pubDate>
<content:encoded><![CDATA[<table><tr valign="top"><td width="67"><a href="Apple - Movie Trailers - The Hangover"><img src="" width="65" height="97" border="0"></a></td><td> </td><td><a href="Apple - Movie Trailers - The Hangover/"><span style="font-size: 16px; font-weight: 900; text-decoration: underline;">The Hangover - Trailer 2</span></a><br /><span style="font-size: 12px;">Two days before his wedding, Doug and his three friends drive to Las Vegas for a blow-out bachelor party they'll never forget. But, in fact, when the three groomsmen wake up the ustin Bartha</span></td></tr></table>]]></content:encoded> .....
EOF
tr '<' '\012' |
tr '>' '\012' |
grep '^a href' |
sed -e 's/a href=.//' \
    -e 's/.$//'
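
To cover the "only the matching URLs, copied into another file" part of the question, the same tr/grep/sed idea can be wrapped in a small function that reads a file and filters by a pattern. This is only a sketch: the function name extract_hrefs, the file names sample.rss and urls.txt, and the example.com URLs are made up for illustration and are not from the original feed.

```shell
#!/usr/bin/env bash
# Sketch: pull every href value out of a feed and keep only the matching ones.
# extract_hrefs, sample.rss, urls.txt and the example.com URLs are hypothetical.

extract_hrefs() {
    # $1 = input file, $2 = basic regex that the URL must match
    tr '<>' '\n\n' < "$1" |                        # put every tag on its own line
        grep '^a href=' |                          # keep only the anchor tags
        sed -e 's/^a href="//' -e 's/".*$//' |     # strip everything but the URL
        grep -- "$2"                               # keep only the wanted URLs
}

# Build a small sample feed in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/sample.rss" << 'EOF'
<content:encoded><![CDATA[<td><a href="http://www.example.com/trailers/one/"><img src="" width="65"></a></td><td><a href="http://www.example.org/other/">Other</a></td><td><a href="http://www.example.com/trailers/two/">Two</a></td>]]></content:encoded>
EOF

# Write the matching URLs to a second file, then show them.
extract_hrefs "$workdir/sample.rss" 'example\.com/trailers/' > "$workdir/urls.txt"
cat "$workdir/urls.txt"
```

Call it with your own feed and pattern, for example: extract_hrefs your_filename 'trailers' > urls.txt. Note that this assumes each href is written as a href="..." with double quotes directly after the opening bracket.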

Maybe something like this, which prints each match of the pattern on its own line:

perl -ne 'print "$1\n" while m{(http://www\..*?\.com/trailers/)}gi' your_filename
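
If you would rather stay in plain shell, a roughly equivalent sketch uses grep -o, which prints only the matched text. Assumptions here: -o is supported by GNU and BSD grep but is not required by POSIX; the file names feed.rss and urls.txt and the example.com URLs are placeholders. Unlike the Perl pattern above, this regex keeps the whole URL up to the closing quote, and sort -u drops duplicates.

```shell
#!/usr/bin/env bash
# Sketch: print only the matching part of each line with grep -o.
# feed.rss, urls.txt and the example.com URLs are hypothetical placeholders.

workdir=$(mktemp -d)
feed="$workdir/feed.rss"
out="$workdir/urls.txt"

cat > "$feed" << 'EOF'
<td><a href="http://www.example.com/trailers/one/"><img src=""></a></td><td><a href="http://www.example.com/trailers/one/">One</a></td>
<td><a href="http://www.example.com/about/">About</a></td><td><a href="http://www.example.com/trailers/two/">Two</a></td>
EOF

# -o prints each match on its own line, -E enables extended regex syntax.
# [^"<> ]* cannot cross a quote, bracket or space, so a match stays inside one attribute.
grep -oE 'http://www\.[^"<> ]*\.com/trailers/[^"<> ]*' "$feed" | sort -u > "$out"

cat "$out"
```

Pointing feed at your own RSS file and adjusting the pattern should be all that is needed.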


Thanks a lot, that's fine, it works well.
