I'm trying to extract the text between these anchor tags and ignore everything else using grep. I managed to strip the tags, but I'm unable to remove the "href" attribute and its value from my output. This is the code I used:
Would you use the butter knife to carve the turkey at dinner time?
grep is not the tool for what you want to learn.
What you want to learn is regular expressions, which, ironically, are not the best tool for parsing HTML either, except in simple cases.
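For anything beyond the simple cases, an actual HTML parser is usually the safer choice. Here is a minimal sketch using Python's standard-library `html.parser`. The class name `AnchorTextParser`, the helper `anchor_texts`, and the sample markup are my own assumptions for illustration, since the original input file isn't shown.

```python
from html.parser import HTMLParser


class AnchorTextParser(HTMLParser):
    """Collect the text found inside <a>...</a> elements."""

    def __init__(self):
        super().__init__()
        self.in_anchor = False  # True while we are inside an <a> element
        self.texts = []         # one string per anchor encountered

    def handle_starttag(self, tag, attrs):
        # Attributes such as href are passed in attrs and simply ignored here.
        if tag == "a":
            self.in_anchor = True
            self.texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        # Text nodes are only kept while inside an anchor.
        if self.in_anchor:
            self.texts[-1] += data


def anchor_texts(html):
    """Return the text of every anchor in the given HTML string."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.texts


# Hypothetical sample input, assumed to resemble the asker's file.
sample = '''<a href="http://example.com/linux">Linux</a>
<a href="http://example.com/unix">Unix</a>'''
print(anchor_texts(sample))
```

Unlike a regex, this does not care how many attributes the tag has or whether the anchor contains nested markup.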
[akshay@nio tmp]$ grep -oP '(?<=>).*(?=</a>)' file
Linux
Unix
Oracle
Perl
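To see why the lookbehind on a bare `>` works where one on `<a>` would not, the same pattern can be tried in Python's `re` module, which supports this lookaround syntax. This is only a sketch: the sample line and the function names are assumptions, not the asker's actual file or command.

```python
import re

# Same idea as the grep -oP pattern: text preceded by ">" and followed by "</a>".
LOOSE = re.compile(r'(?<=>).*(?=</a>)')
# A lookbehind on the literal "<a>" only matches tags with no attributes.
STRICT = re.compile(r'(?<=<a>).*(?=</a>)')


def extract_loose(line):
    """Return matches for the pattern that looks behind for '>' only."""
    return LOOSE.findall(line)


def extract_strict(line):
    """Return matches for the pattern that looks behind for '<a>'."""
    return STRICT.findall(line)


# Hypothetical input line with an href attribute.
line = '<a href="http://example.com/linux">Linux</a>'
print(extract_loose(line))
print(extract_strict(line))
```

The loose pattern matches because the `>` that closes the opening tag sits right before the text, whatever attributes come earlier. Note that `.*` is greedy, so a line with several anchors would be matched as one long span.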
I didn't know that all I had to do was remove the "a" from the first tag; I had been trying several combinations of regular expressions there instead. There's more for me to learn.