I am trying to find the line numbers where a regex matches and write them to a file with the command below:
awk '/'$pat'/ {print NR}' $fileName >> temp.txt
where $pat is the regex.
However, this command takes a long time to run on larger files (more than 5000000 KB).
Is there an alternative command that would make this faster?
Note: I am writing a script to extract a section of a file based on a regex match, and there can be more than one regex. For example, n lines above the match and n lines below it.
Of course it's slow. What you're doing is akin to trying to repair a watch with a hammer: it's possible, but frustrating. In this case, the regex as you're applying it is unanchored, so it has to scan each line completely, checking each character on each line for a possible match.
So the first question is: is it really a regex, or is it a fixed string?
And the second: can the regex be anchored in some way? E.g., to the start of the line, or as the only word on the line, or something else to minimize the search cost?
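To make the fixed-string question concrete, here is a minimal sketch, assuming a POSIX grep is available. The function name match_lines and its "fixed" argument are made up for illustration. With -F the pattern is taken as a literal string, which is usually cheaper than regex matching, and forcing the C locale often helps on large files, though how much depends on the grep implementation and the data.

```shell
# Hypothetical helper: print the line numbers where a pattern matches.
# Usage: match_lines PATTERN FILE [fixed]
match_lines() {
    pat=$1
    file=$2
    mode=${3:-regex}
    if [ "$mode" = fixed ]; then
        # -F: treat the pattern as a literal string, not a regex
        LC_ALL=C grep -n -F -e "$pat" -- "$file" | cut -d: -f1
    else
        # -E: extended regex, the same flavour awk uses
        LC_ALL=C grep -n -E -e "$pat" -- "$file" | cut -d: -f1
    fi
}
```

Something like `match_lines "$pat" "$fileName" >> temp.txt` would then stand in for the awk command, and `match_lines "$pat" "$fileName" fixed` for the literal-string case.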
The regex is user input and can be any string; we just need the line numbers where it matches anywhere on a line.
Jayan,
fgrep -n "$pat" infile is not working ....
---------- Post updated at 04:41 AM ---------- Previous update was at 04:37 AM ----------
Maybe this clarifies it further:
I am trying to write a script that prints n lines above (user input) and n lines below (user input) the matched pattern (user input).
The pattern may occur any number of times in the file, so the script lists each occurrence separately and asks the user which occurrence they want the surrounding lines for.
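One possible shape for that script, as a rough sketch only: the function name context_of and its argument order are made up here, and it assumes POSIX grep and sed. It finds the line number of the k-th match, then prints the window around it and quits, so the rest of a large file is not read a second time.

```shell
# Hypothetical helper: print N lines above and below the K-th match.
# Usage: context_of PATTERN FILE K N
context_of() {
    pat=$1
    file=$2
    k=$3
    n=$4
    # Line number of the k-th matching line (empty if there are fewer matches)
    line=$(grep -n -E -e "$pat" -- "$file" | sed -n "${k}{s/:.*//p;q;}")
    if [ -z "$line" ]; then
        return 1
    fi
    start=$((line - n))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    end=$((line + n))
    # Print the window, then quit instead of reading to the end of the file
    sed -n "${start},${end}p;${end}q" "$file"
}
```

For example, `context_of "$pat" "$fileName" 2 5` would print the second occurrence with five lines on either side. If GNU grep is available, its -C option prints context too, but for every match rather than one chosen occurrence.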