Extract lines of text based on a specific keyword

I regularly extract lines of text from files based on the presence of a particular keyword; I place the extracted lines into another text file. This takes about 2 hours to complete using the "sort" command then Kate's find & highlight facility.

I've been reading the forum & googling and can find scripts and shell commands which extract a particular string from a file but nothing that extracts a complete line based on a keyword/string within a line.

Here's an example of the lines of data I'm using:

<li><a href="http://some-website1.com/"><b>CategoryOne: </b>Description defgh</a></li>
<li><a href="http://some-website2.com/"><b>CategoryThree: </b>Description cdefg</a></li>
<li><a href="http://some-website3.com/"><b>CategoryTwo: </b>Description bcdef</a></li>
<li><a href="http://some-website3.com/"><b>CategoryOne: </b>Description abcde</a></li>
<li><a href="http://some-website2.com/"><b>CategoryOne: </b>Description zabcd</a></li>

The data is always a list item.

I need something that will find the line containing a specified category which will then extract the complete line and move it to a new text file (preferably named after that category). For example:

If I search for "<b >CategoryOne</b >" then I need it to move every line containing "<b >CategoryOne</b >" to text file categoryone.txt

Please help...

Something like this?

awk -F'[<|:|>]' '{f=tolower($8)".txt";print >> f;close(f)}' file

Ahem... I think you are overcomplicating this. This is what "grep" was built for! Your solution is a one-liner:

grep "your-criteria" /path/to/source > your-criteria.html

Replace the value of the criteria with a variable, put some error handling in, and you are done. You could also refine the search criteria to include the list tag in the line, etc., but that is all just bells and whistles.
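A minimal sketch of that idea, assuming the data looks like the sample lines in the question. The function name extract_category, the lowercased output name, and the "<b>KEYWORD:" pattern are illustrative choices, not part of the original one-liner:

```shell
#!/bin/sh
# Hypothetical helper: extract_category KEYWORD SOURCE
# Copies every line of SOURCE that contains "<b>KEYWORD:" into keyword.txt.
extract_category() {
    keyword=$1
    source_file=$2

    # Basic error handling: both arguments are needed, and the source must be readable.
    if [ -z "$keyword" ] || [ ! -r "$source_file" ]; then
        echo "usage: extract_category KEYWORD SOURCE" >&2
        return 2
    fi

    # Output file name: the keyword in lower case, e.g. CategoryOne -> categoryone.txt
    out=$(printf '%s' "$keyword" | tr '[:upper:]' '[:lower:]').txt

    # -F matches the pattern as a fixed string rather than a regular expression.
    # grep exits non-zero when nothing matched (or on error).
    if grep -F -- "<b>${keyword}:" "$source_file" > "$out"; then
        echo "wrote $(wc -l < "$out") line(s) to $out"
    else
        rm -f "$out"
        echo "no lines matched '$keyword' in $source_file" >&2
        return 1
    fi
}
```

Called as "extract_category CategoryOne /path/to/source", this would write the matching lines to categoryone.txt in the current directory. Note that grep copies the lines; the source file is left unchanged.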

I hope this helps.

bakunin

Thanks for your replies.

Danmero, to be honest, I haven't a clue what you've posted means. I know it's an Awk script and I believe it uses regular expressions but I know little about either of them. I'll try and work it out.

Bakunin, worked a dream :) thank you so, so much.

It looks like you are trying to split a large file into multiple files named after the category, and that is what the awk one-liner will do.

bakunin's grep solution will work if you want to extract one category at a time.
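For anyone puzzled by the awk one-liner, here is a commented sketch of the same approach. The wrapper name split_by_category is made up for illustration, and the separator set is written as [<:>] (inside brackets the | characters of the original are literal rather than alternation, and the sample data does not need them). It assumes every line has the same layout as the sample; a URL containing an extra colon, such as a port number, would shift the field numbers.

```shell
#!/bin/sh
# Hypothetical wrapper: split_by_category SOURCE
# Appends each line of SOURCE to a file named after its category, in lower case.
split_by_category() {
    awk -F'[<:>]' '
    {
        # With <, : and > as field separators, a line such as
        #   <li><a href="http://some-website1.com/"><b>CategoryOne: </b>...
        # splits so that field 8 is the category name, "CategoryOne".
        f = tolower($8) ".txt"   # e.g. categoryone.txt
        print >> f               # append the complete, unmodified line
        close(f)                 # close each time so many categories cannot exhaust open files
    }' "$1"
}
```

Because the output is appended with >>, running it twice on the same input would duplicate the lines, so it may be worth removing old output files first.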