Search for patterns in thousands of files

Hi All,

I want to search for a certain string in thousands of files, which are spread across directories that are created daily. I wrote a small bash script for this, but when I run it I get the error below:

./ms.sh: xrealloc: subst.c:5173: cannot allocate 268435456 bytes (536977408 bytes allocated)

Pasting the code that I wrote:

#!/usr/local/bin/bash

for i in `cat msisdn_u.txt`
do

cd /comptel4/elink/backup1/output/vas/NG0/20130301
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130302
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130303
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130304
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130305
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130306
find ./*GPX.Z|xargs zcat|grep $i; cd ..
cd /comptel4/elink/backup1/output/vas/NG0/20130307
find ./*GPX.Z|xargs zcat|grep $i; cd ..
..
..
..
done

This is in the patterns file:

more msisdn_u.txt
0564891888
0500555401
0563433343
0561132174
0562714661
0543210172
0503588147
0541400224
0564445889
0544998887
0564543055
0544095240
0563211334

Please advise as I need to find out and report it to the management.

Thanks

Danish

Why are you not using a single find command for all the directories?

find /comptel4/elink/backup1/output/vas/NG0/ -name "*.GPX.Z" -exec zgrep -il $i {} \;

or

find /comptel4/elink/backup1/output/vas/NG0/ -name "*.GPX.Z" -print | xargs zgrep -il $i

Hope this helps :)

Thanks for your suggestions, PikK45. But does the command descend into subdirectories? I don't see any output; the command just returns to the prompt.

Can you show us what you did?

I am on HP-UX.

/comptel/elink> find /comptel4/elink/backup1/output/vas/NG0/ -name "*.GPX.Z" -exec zgrep -il $i {} \;
/comptel/elink>
/comptel4/elink/backup1/output/vas/NG0> find /comptel4/elink/backup1/output/vas/NG0/ -name "*.GPX.Z" -print
/comptel4/elink/backup1/output/vas/NG0>

Are there files with the .GPX.Z extension in the "/comptel4/elink/backup1/output/vas/NG0/" directory or its subdirectories?

I have run it again but I am getting the same error. There are subdirectories under it, so I have given a pattern for them.
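
One way to answer that question without touching anything on the box is to count what each -name filter matches. The command suggested above filters on "*.GPX.Z" (with a dot), while the original script globbed *GPX.Z (without one); whether the dot matters depends on the real file names, which are not shown in this thread. A minimal sketch, where count_files is a made-up helper name:

```shell
# count_files DIR NAME_PATTERN
# Hypothetical helper: prints how many regular files under DIR
# match NAME_PATTERN. It only reads the directory tree.
count_files() {
    # tr strips the padding that some wc implementations add
    find "$1" -type f -name "$2" | wc -l | tr -d ' '
}

# Possible usage with the directory from this thread:
#   count_files /comptel4/elink/backup1/output/vas/NG0 '*.GPX.Z'
#   count_files /comptel4/elink/backup1/output/vas/NG0 '*GPX.Z'
```

If the first count is 0 and the second is not, the files simply have no dot before GPX and the looser pattern is the one to use.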

/comptel4/elink/backup1/output/vas/NG0> ./ms.sh
./ms.sh: xrealloc: subst.c:5173: cannot allocate 268435456 bytes (536936448 bytes allocated)
/comptel4/elink/backup1/output/vas/NG0> more ms.sh
#!/usr/local/bin/bash

for i in `cat msisdn_u.txt`
do

find /comptel4/elink/backup1/output/vas/NG0/201303* -name "*GPX.Z" -print | xargs zgrep -il $i
done

Try the other command. Do not use xargs.

I am not sure what could be causing the error.

find /comptel4/elink/backup1/output/vas/NG0/ -name "*.GPX.Z" -exec zgrep -il $i {} \;

Same problem :(

/comptel4/elink/backup1/output/vas/NG0> ./ms.sh
./ms.sh: xrealloc: subst.c:5173: cannot allocate 268435456 bytes (536936448 bytes allocated)
/comptel4/elink/backup1/output/vas/NG0> more ms.sh
#!/usr/local/bin/bash

for i in `cat msisdn_u.txt`
do

find /comptel4/elink/backup1/output/vas/NG0/ -name "*GPX.Z" -exec zgrep -il $i {} \;

done

I found something on this

Can you give it a try and see?

This is a production box and I can't change any values unfortunately.

Are the files matching the pattern *GPX.Z compressed text files, or directories?
You could try:

find /comptel4/elink/backup1/output/vas/NG0/2013030[1-7]/ -type f -name '*GPX.Z' | xargs zgrep -f msisdn_u.txt 

or:

find /comptel4/elink/backup1/output/vas/NG0/2013030[1-7]/ -type f -name '*GPX.Z' -exec zgrep -f msisdn_u.txt {} +

if your find implementation supports the {} + construct.
What's your operating system?

And, by the way, you could try your script with a different shell (different than bash).
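
A minimal sketch of the single-pass idea above, for the case where zgrep -f is not usable. It assumes zcat can read the .Z files and that grep accepts -F and -f (both are POSIX options). The function name search_all_patterns and its two arguments are made up for illustration. Each file is decompressed once and every line is checked against all the numbers in the pattern file, so the shell never has to expand the pattern file itself. It uses -exec ... \; instead of {} + in case find lacks the latter.

```shell
# search_all_patterns DIR PATTERN_FILE
# Hypothetical helper: prints every line from the compressed files
# under DIR that contains any of the strings in PATTERN_FILE.
search_all_patterns() {
    # -F treats each pattern line as a fixed string, not a regex.
    # Caveat: an empty line in PATTERN_FILE would match every line.
    find "$1" -type f -name '*GPX.Z' -exec zcat {} \; |
        grep -F -f "$2"
}

# Possible usage with the names from this thread:
#   search_all_patterns /comptel4/elink/backup1/output/vas/NG0 msisdn_u.txt
```

Like the original zcat | grep pipeline, this prints the matching lines but not the name of the file each one came from.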


You will continue to trigger that error as long as the contents of the txt file in the command substitution (`cat msisdn_u.txt` in the for loop) exceed your system's available memory (or the per-process memory limit).

Regards,
Alister


Not working. Seems like it's a ulimit issue :( I am on HP-UX.
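
Since nothing can be changed on a production box, a read-only check of the numbers involved may still help. The sketch below prints the size of the pattern file next to the data segment limit of the current shell. report_limits is a made-up name, and ulimit -d is a shell extension (bash supports it, POSIX only requires -f), so treat that line as an assumption about the shell in use.

```shell
# report_limits FILE
# Hypothetical helper: shows how big FILE is and what data segment
# limit the current shell runs under. Nothing is modified.
report_limits() {
    file=$1
    bytes=$(wc -c < "$file" | tr -d ' ')
    lines=$(wc -l < "$file" | tr -d ' ')
    echo "file_bytes=$bytes"
    echo "file_lines=$lines"
    # Reported in kilobytes by bash, or "unlimited"
    echo "data_limit_kb=$(ulimit -d)"
}

# Possible usage:
#   report_limits msisdn_u.txt
```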

Good find :)

Can this be eliminated by using a while loop like the one below?

while read i
do
find command
done < msisdn_u.txt

Will this cause the same error?
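
For reference, a filled-in version of the loop above might look like the sketch below. The find arguments are taken from the earlier posts, and it assumes zcat can read the .Z files. search_each_pattern is a made-up name. The point is that read takes one line at a time from the file, instead of the shell expanding the entire file into the for word list first.

```shell
# search_each_pattern DIR PATTERN_FILE
# Hypothetical helper: for each non-empty line of PATTERN_FILE,
# prints the lines of the compressed files under DIR containing it.
search_each_pattern() {
    dir=$1
    patterns=$2
    while IFS= read -r msisdn; do
        # Skip blank lines; an empty pattern would match everything
        [ -n "$msisdn" ] || continue
        find "$dir" -type f -name '*GPX.Z' -exec zcat {} \; |
            grep -F -e "$msisdn"
    done < "$patterns"
}

# Possible usage with the names from this thread:
#   search_each_pattern /comptel4/elink/backup1/output/vas/NG0 msisdn_u.txt
```

Note that this decompresses every file once per number, so with many numbers it will be much slower than a single pass over the files.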


Thanks. At least that takes me forward. Maybe I can split the file and run the find command in parts.

Since a while loop eliminates the fatal command substitution, the error should not occur.

Regards,
Alister


danish0909:

Did you try radoulov's suggestion?

Regards,
Alister

Thanks alister and PikK45.

Especially PikK45. Thank you very much for helping me all along. Please let me know why running the while loop did not cause the memory error.


Yes Alister. Radoulov's suggestion did not work. And yes, thank you too radoulov