I have a file, input.txt, which contains lots of weird characters, HTML tags, and some useful material. I want to write the 35 characters that follow the word "description" to a new file, output.txt, excluding weird characters like $&lmp and with the HTML tags stripped. I have attached the input file. Please help, it's urgent. Thanks in advance.

Input sample:
</image>
<title>A Londoner Looks Back: Were The Olympics Awesome?</title>
<link>http://www.askmen.com/sports/fanatic/london-olympics-post-mortem.html</link>
<description rdf:parseType="Literal">
The other evening I walked out of London’s <a
href="http://www.askmen.com/fashion/watch_100/135_olympic-watches.html">Olympic
stadium onto the new “Javelin” train into town.
Output should be like this:
The other evening I walked out of London Olympic
stadium onto the new Javelin train into town. (The journey from east to
central London, quite recently still
If you thought moustaches were solely to distinguish regular males from porn stars and
hipsters, think again. A new study suggests that
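
One possible way to approach this, sketched in Python because the post does not name a language. It rests on several assumptions that are not stated above: that "the word description" means the opening `<description ...>` tag, that "weird characters" means anything outside a small whitelist of letters, digits, and basic punctuation, and that the file is UTF-8. The names `description_snippets` and `convert` are hypothetical helpers made up for this illustration.

```python
import html
import re


def description_snippets(text, length=35):
    """Return the cleaned text that follows each <description ...> tag."""
    snippets = []
    # Every piece after the first one follows an opening description tag.
    for chunk in re.split(r"<description\b[^>]*>", text)[1:]:
        # Stop at the closing tag if the description has one.
        chunk = chunk.split("</description>", 1)[0]
        # Replace complete HTML tags (even ones spanning lines) with a space.
        chunk = re.sub(r"<[^>]+>", " ", chunk)
        # Decode entities such as &amp; into plain characters.
        chunk = html.unescape(chunk)
        # Assumed whitelist: anything else counts as a "weird character".
        chunk = re.sub(r"[^A-Za-z0-9.,?!' -]", " ", chunk)
        # Collapse runs of whitespace left behind by the removals.
        chunk = " ".join(chunk.split())
        snippets.append(chunk[:length])
    return snippets


def convert(in_path="input.txt", out_path="output.txt", length=35):
    """Read in_path and write one cleaned snippet per line to out_path."""
    # UTF-8 is an assumption; undecodable bytes are replaced, not fatal.
    with open(in_path, encoding="utf-8", errors="replace") as src:
        snippets = description_snippets(src.read(), length)
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(snippets) + "\n")
```

Calling `convert()` with its defaults would read input.txt and write output.txt. The expected output above is longer than 35 characters per item, so the `length` argument may need to be raised to match it.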