Below is a sample script:
jb>cat search
#!/bin/bash
set -x
PATTERN='"'`awk -F: '{printf $2 "|"} END {printf "\b"}' RULE`'"'
echo 'Pattern ['$PATTERN']'
ls -tr *$1* |xargs egrep "$PATTERN"
Sample RULE file
jb>cat RULE
P1 :pone
P2 :ptwo
Sample Output
jb>./search f
++ awk -F: '{printf $2 "|"} END {printf "\b"}' RULE
+ PATTERN='"pone|ptwo"'
+ echo 'Pattern ["pone|ptwo"]'
Pattern ["pone|ptwo"]
+ ls -tr f1 f1.txt f2 f3 f4
+ xargs egrep '"pone|ptwo"'
f2:ptwo
jb>
Directly searching
jb>ls -tr *f* |xargs egrep "pone|ptwo"
f1:000pone000
f2:ptwo
Sample file content
jb>cat f1
11
000pone000
1111
jb>cat f2
222
ptwo
22222222
jb>
Why doesn't 'search' list f1?
Try the pattern without the double quotes. Replace this line:
PATTERN='"'`awk -F: '{printf $2 "|"} END {printf "\b"}' RULE`'"'
with:
PATTERN=`awk -F: '{printf $2 "|"} END {printf "\b"}' RULE`
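To illustrate why the extra quotes matter: quotes typed on the command line are removed by the shell before egrep runs, but quotes stored inside the variable's value are passed to egrep as ordinary characters. The sketch below rebuilds the sample files in a scratch directory (the directory and the variable names are made up for this demonstration) and uses grep -E, the newer spelling of egrep.

```shell
#!/bin/bash
# Scratch copies of the sample files from the question.
dir=$(mktemp -d)
printf 'P1 :pone\nP2 :ptwo\n'   > "$dir/RULE"
printf '11\n000pone000\n1111\n' > "$dir/f1"
printf '222\nptwo\n22222222\n'  > "$dir/f2"

# Case 1: quotes stored in the value. The alternatives are  "pone  and  ptwo"
quoted='"pone|ptwo"'
quoted_hits=$(cd "$dir" && grep -E "$quoted" f1 f2)

# Case 2: the original assignment. awk's printf "\b" emits a real backspace
# character; it does not delete the trailing "|". The alternatives are
# "pone ,  ptwo  and  <backspace>"
built='"'$(awk -F: '{printf $2 "|"} END {printf "\b"}' "$dir/RULE")'"'
built_hits=$(cd "$dir" && grep -E "$built" f1 f2)

# Case 3: no embedded quotes. The alternatives are  pone  and  ptwo
plain='pone|ptwo'
plain_hits=$(cd "$dir" && grep -E "$plain" f1 f2)

printf 'quoted: [%s]\n' "$quoted_hits"
printf 'built:  [%s]\n' "$built_hits"
printf 'plain:  [%s]\n' "$plain_hits"
rm -rf "$dir"
```

Case 2 should reproduce the output in the question (only f2:ptwo), because the first alternative needs a literal double quote in front of pone. The backspace also explains why set -x appears to show a clean "pone|ptwo": the terminal overprints the trailing "|". Without the quotes the pattern still carries a trailing "|" plus a backspace, which is harmless for these files, but building the pattern without the trailing separator would be cleaner.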
Thanks, it works.
I have one more question.
I have about 20 patterns ORed together: 'pat1|pat2|.....pat20'
Each pattern is around 20 characters long,
and I search around 30 files, each about 10 MB in size.
It takes 2 minutes to complete!
Is there a faster way to do this?
You can try something like:
awk -F: '{print $2}' RULE > tmp.file
grep -f tmp.file $(ls -tr *$1*)
1. Actually, the RULE file gives a name to each pattern (the RULE file as such is not a pattern file). Example:
>cat RULE
SUCCESS :Message sent successfully to
FAILURE :Message send failed for
Acknowledged :Got ack from
#Goes on
2. This line extracts the second column and forms an ORed pattern:
PATTERN=`awk -F: '{printf $2 "|"} END {printf "\b"}' RULE`
#PATTERN will be 'Message sent successfully to|Message send failed for|Got ack from'
3. Search for all the patterns in all the files and redirect the output to tmp.txt (takes 2 minutes):
ls -tr *$1* |xargs egrep "$PATTERN" > tmp.txt
4. From tmp.txt, do some more filtering and collect statistics (count each pattern).
(I don't have a problem with this step; it takes very little time.)
Sample Output :
SUCCESS :1423
FAILURE : 432
Acknowledged : 764
But step 3 alone takes most of the time (around 2 minutes).
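For reference, step 4 could be sketched roughly as below. The thread does not show the poster's actual filtering, so this is only an assumption about what "count each pattern" means: for every RULE line, count the lines of the results file that contain that rule's text. The function name count_rules is hypothetical.

```shell
#!/bin/bash
# count_rules RULE_FILE RESULTS_FILE   (hypothetical helper, not from the thread)
# Prints "LABEL:COUNT" for each rule, where COUNT is the number of lines in
# RESULTS_FILE that contain the rule's text as a fixed string.
count_rules() {
    local rules=$1 results=$2 label text
    while IFS=: read -r label text; do
        # Skip lines with nothing after the colon, or with no colon at all
        # (for example a "#Goes on" comment line).
        [ -n "$text" ] || continue
        # grep -c counts matching lines, not individual occurrences.
        printf '%s:%s\n' "$label" "$(grep -c -F -e "$text" "$results")"
    done < "$rules"
}
```

It would be called as: count_rules RULE tmp.txt
Because the label keeps the space that precedes the colon in RULE, the output has the same "SUCCESS :1423" shape as the sample. This scans tmp.txt once per rule, which should be acceptable while tmp.txt is small compared with the original files.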
Have you tried my last 2 commands for steps 2 and 3?
awk -F: '{print $2}' RULE > tmp.file
grep -f tmp.file $(ls -tr *$1*) > tmp.txt
Once again, thanks. It works.
The search completed in less than 30 seconds.
My grep version doesn't support -f, but fgrep does.
awk -F: '{print $2}' RULE > tmp.txt
fgrep -f tmp.txt $(ls -tr *26_06_2009*) > tmp2.txt
(Sorry, I thought you hadn't understood my question, but actually I had misunderstood your solution.)
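One caveat with the pattern-file approach: an empty line in the pattern file is an empty pattern, and an empty pattern matches every line. If RULE contains blank lines or lines without a colon (such as the "#Goes on" line in the example), awk -F: '{print $2}' writes an empty line for each of them. A minimal sketch that filters those out is below; the function names are hypothetical, and it assumes a grep that accepts -F and -f (grep -F is the standard spelling of fgrep).

```shell
#!/bin/bash
# build_patterns RULE_FILE   (hypothetical helper)
# Print the second colon-separated field of each rule, one per line,
# skipping rules whose second field is empty.
build_patterns() {
    awk -F: '$2 != "" {print $2}' "$1"
}

# search_fixed PATTERN_FILE FILE...   (hypothetical helper)
# Fixed-string search: every line of PATTERN_FILE is a literal string,
# so characters such as "." or "*" in a message are not treated as regex.
search_fixed() {
    local patterns=$1
    shift
    grep -F -f "$patterns" -- "$@"
}
```

Usage would look like: build_patterns RULE > tmp.file, then search_fixed tmp.file *"$1"* > tmp.txt
Passing the glob directly avoids word splitting on file names that contain spaces, at the cost of the time ordering that ls -tr gave. The speedup reported above is plausibly due to fixed-string matching being cheaper than a 20-way regex alternation, though that will depend on the grep implementation.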