Help: a fast way of finding line numbers for a regex match

Hello,

I am trying to find the line numbers where a regex matches and write them to a file with the command below:

 awk '/'$pat'/ {print NR}' $fileName >> temp.txt

where $pat is the regex

But this command takes a long time to run on bigger files (more than 5000000 KB).

Is there an alternative command that would make this faster?

Note: I am writing a script to extract a section of a file based on a regex match, and there can be more than one regex match. For example, n lines above the match and n lines below it.

Try fgrep:

$ fgrep -n "$pat" infile
 
grep -n "$pat" "$fileName" | cut -d: -f1 > temp.txt

Of course it's slow. What you're doing is akin to trying to repair a watch using a hammer: it's possible, but frustrating. In this case, the regex as you're applying it has to scan each line completely, checking each character on each line for a match.

So the first question is: is it really a regex, or is it a fixed string?
And the second: can the regex be anchored in some way? E.g., the start of the line, or the only word on the line, or something else to minimize the search cost?
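
If it turns out to be a fixed string, something like the sketch below skips the regex engine entirely, which may help. It assumes a grep that accepts -F and -e (POSIX; on Solaris that is probably /usr/xpg4/bin/grep rather than /usr/bin/grep). The function name and the sample file are made up for illustration.

```shell
# Hypothetical helper: print the line numbers of lines containing a literal string.
# usage: line_numbers_fixed STRING FILE
line_numbers_fixed() {
    # -F: treat the pattern as a fixed string, not a regex
    # -n: prefix each matching line with its line number
    # -e: keeps a pattern starting with '-' from being parsed as an option
    grep -F -n -e "$1" "$2" | cut -d: -f1
}

# Sample data, for demonstration only
sample=$(mktemp)
printf '%s\n' 'alpha' 'foo.bar' 'fooXbar' 'beta foo.bar' > "$sample"

# The dot is literal here, so 'fooXbar' on line 3 is not reported
line_numbers_fixed 'foo.bar' "$sample"
```

With a regex grep the same pattern would also hit line 3, because an unescaped dot matches any character.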

Hello pludi,

The regex is user input and can be any string; we just need to find the line numbers where it matches anywhere on a line.

Jayan,

fgrep -n "$pat" infile is not working ... :(

---------- Post updated at 04:41 AM ---------- Previous update was at 04:37 AM ----------

Maybe this clarifies it more:

I am trying to write a script which will give n lines above (user input) and n lines below (user input) the matched pattern (user input).
The pattern may occur any number of times in the file, so I am collecting all occurrences separately and asking the user which occurrence they want the lines above and below for.

$ fgrep -n "$pat" "$fileName" | cut -d: -f1 > temp.txt
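
Building on that, here is a rough sketch of picking one occurrence and printing the lines around it. It assumes a POSIX shell (ksh or /usr/xpg4/bin/sh on Solaris, not the old /bin/sh) and a grep with -F and -e. The function name show_occurrence and its argument order are invented for this example.

```shell
# Hypothetical helper: show n lines of context around the k-th match.
# usage: show_occurrence STRING FILE K N
show_occurrence() {
    pat=$1 file=$2 k=$3 n=$4
    # Line number of the k-th matching line (fixed-string search)
    line=$(grep -F -n -e "$pat" "$file" | cut -d: -f1 | sed -n "${k}p")
    [ -n "$line" ] || return 1        # fewer than k matches
    start=$((line - n))
    [ "$start" -lt 1 ] && start=1     # clamp at the top of the file
    end=$((line + n))
    # Print the range, then quit so the rest of a big file is not read
    sed -n "${start},${end}p;${end}q" "$file"
}

# Sample data, for demonstration only
sample=$(mktemp)
printf '%s\n' 'one' 'two foo' 'three' 'four' 'five foo' 'six' 'seven' > "$sample"

# Second occurrence of "foo" with one line above and below
show_occurrence foo "$sample" 2 1
```

The file is still scanned once in full to number the matches, so this only saves time on the second pass.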

With GNU grep you can use -B (before) and -A (after) options.
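
A small illustration of those options, assuming GNU grep (or another grep that supports -A and -B); the sample file is made up:

```shell
# Sample data, for demonstration only
sample=$(mktemp)
printf '%s\n' 'one' 'two foo' 'three' 'four' > "$sample"

# -B 1: one line before each match, -A 1: one line after
# -n:   line numbers; matching lines are marked "N:", context lines "N-"
# -F:   fixed-string search, no regex
grep -n -B 1 -A 1 -F -e 'foo' "$sample"
```

This prints lines 1 to 3 of the sample, with only line 2 marked as the match.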

Let me check if I get this right: the user input will probably be a regular string, without any funky regex stuff (e.g. foo.+ *b[aA]r\t(baz|BAZ)) in it?

I am using Solaris.

fgrep takes almost the same time as awk.

---------- Post updated at 05:16 AM ---------- Previous update was at 05:15 AM ----------

Yes, the user input is plain text: simple words or numbers, no special characters.
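
For plain strings on a system without grep -A/-B, one option is a single pass with awk that tests each line with index() instead of a regex and keeps the previous n lines in a small buffer. This is only a sketch: it assumes an awk that supports -v (nawk or /usr/xpg4/bin/awk on Solaris), the function name context_awk is made up, and whether it beats fgrep on a 5 GB file would have to be measured.

```shell
# Hypothetical helper: print each line containing STRING plus N lines of
# context on either side, reading the file only once.
# usage: context_awk STRING FILE N
context_awk() {
    awk -v s="$1" -v n="$3" '
        index($0, s) {                      # literal substring test, no regex
            for (i = NR - n; i < NR; i++)   # replay buffered lines before the match
                if (i > last && i >= 1) print buf[i % n]
            after = n + 1                   # this line plus n lines after it
        }
        after > 0 { print; last = NR; after-- }
        n > 0 { buf[NR % n] = $0 }          # ring buffer of the previous n lines
    ' "$2"
}

# Sample data, for demonstration only
sample=$(mktemp)
printf '%s\n' 'one' 'two foo' 'three' 'four' > "$sample"

context_awk foo "$sample" 1
```

Note that awk -v interprets backslash escapes in the value, so this is only safe for the simple words and numbers described above.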