pooga17
February 13, 2008, 5:06am
1
Hi All,
I am using the grep command to find the string "abc" in a file.
The content of the file is:
***********
abc = xyz
def= lmn
************
I have used the command below to redirect the output to a temp file:
grep abc file | sort -u | awk '{print #3 }' > out_file
Then I am searching for the contents of out_file in multiple files, using the command below:
grep -f out_file l*view_data_file
But this is very slow. Is there any way I can improve grep performance?
Thanks in advance
otheus
February 13, 2008, 6:19am
2
I think you mean $3, not #3.
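With that fixed, the first step of your own pipeline would read:
grep abc file | sort -u | awk '{print $3}' > out_file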
What's with the l* ? Is that a typo?
Do you need to know which file contains the string? If not, it would be faster to merge all the files together, and then do the grep.
cat *data_files.dat | grep -f out_file
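You can also skip the cat and hand the files to grep directly; a rough equivalent, assuming your grep supports -h to suppress the filename prefix so the output matches the piped version:
grep -h -f out_file *data_files.dat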
Otherwise, you can do a parallelized search, assuming you can take advantage of a multi-CPU system:
# one background grep per file, all appending to one scratch file
for f in *data_files.dat ; do
    grep -f out_file $f >>grep-out.$$ &
done
wait    # let all the background greps finish
cat grep-out.$$
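Note that all the greps append to the same grep-out.$$ scratch file, so lines from different input files arrive interleaved; remove the scratch file (rm grep-out.$$) once you are done with it.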
Of course, if there are thousands of dat files, this might bring the system "to its knees". In that case, you can have each grep do 5 at a time.
# hand each background grep five files at a time
ls -1 *data_files.dat |
while read f1; do
    read f2
    read f3
    read f4
    read f5
    grep -f out_file $f1 $f2 $f3 $f4 $f5 >>grep-out.$$ &
done
If any files contain spaces or strange characters, you'll need to enclose each variable in double-quotes.
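Putting that together, a sketch of the quoted variant, assuming a Bourne-style shell. Two details to watch: ${f2:+"$f2"} passes the quoted name only when the read actually got one, so a short final batch doesn't hand grep empty arguments; and the wait has to sit inside the same subshell as the loop, because the pipeline runs the while in a subshell and wait only sees children of its own shell:
ls -1 *data_files.dat | {
    while read f1; do
        read f2; read f3; read f4; read f5
        grep -f out_file "$f1" ${f2:+"$f2"} ${f3:+"$f3"} ${f4:+"$f4"} ${f5:+"$f5"} >>grep-out.$$ &
    done
    wait    # same subshell as the loop, so it sees the background greps
}
cat grep-out.$$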
Tytalus
February 13, 2008, 6:34am
3
Depending on the number and size of the files you are searching, and assuming your patterns are fixed strings (no regular expressions), you may see better performance with fgrep rather than grep.
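For example, the slow lookup from the first post would simply become:
fgrep -f out_file l*view_data_file
fgrep treats every line of out_file as a literal string instead of a regular expression, which spares the regex engine; on most modern systems the same thing is spelled grep -F.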