Recursive grep

Hello,

First time post - I have no formal unix training and could use some help with this. I have a list of strings in File1 that I want to use to do a recursive search (grep) under a specific directory.

Here is an example of the string I need to search:

/directory/dire ctory/directory/dire ctory/filename

I'm trying to illustrate that the string is a full directory path of a file where some of the directories have spaces in their names.

I then have the following script:

for h in `cat file1`; do grep -rl "$h" /../../../../../ >> /../../file2 ; done

So, I'm trying to say for each string in file1, do a recursive grep in the specified directory and print the results to file2.

The problem (I think) I'm running into is the format of the string I'm searching, the cat I'm doing is treating the spaces as escapes which throws the grep off. I've tried putting the string in single and double quotes but it's still not working.

Sorry for the lack of technical terminology - I hope I was clear enough.

If anyone can offer any help on making it work with what I have or a simpler alternative to what I have, it would be a great help.

Thanks - upstate boy

find /path/to/search/in -type f | \
while IFS= read -r filename
do
       grep -f /path/to/strings.txt "$filename"
done  > /home/upstate_boy/results.txt

grep -f <file> means to use the strings in <file> as search strings for grep.
The done > filename part writes the output of the loop to filename
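
To make the loop above concrete, here is a small self-contained sketch under assumed names (the temporary directory, the two sample files and the sample string are all invented for illustration). It quotes "$filename" so that paths with spaces survive, and it prints the matching lines, as plain grep -f does.

```shell
#!/bin/sh
# Hypothetical sandbox with a directory name that contains a space.
tmp=$(mktemp -d)
mkdir -p "$tmp/search/dir one"
printf '%s\n' 'link to /sample/string in/strings file/title.jsp' > "$tmp/search/dir one/a.xml"
printf '%s\n' 'nothing relevant here' > "$tmp/search/b.xml"
printf '%s\n' '/sample/string in/strings file/title.jsp' > "$tmp/strings.txt"

# Feed every regular file to grep; the redirection after "done"
# collects the output of the whole loop in one file.
find "$tmp/search" -type f |
while IFS= read -r filename
do
    # grep exits non-zero for files without a match;
    # "|| true" keeps that from counting as a failure.
    grep -f "$tmp/strings.txt" "$filename" || true
done > "$tmp/results.txt"

results=$(cat "$tmp/results.txt")
printf '%s\n' "$results"
rm -rf "$tmp"
```

With grep -l in place of plain grep, the loop would print the names of the matching files rather than the matching lines.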

The relative path to file2 seems wrong; the output redirection is relative to the current directory, not the directory of the file you are grepping.

The relative path you are grepping seems wrong too; /../ is equivalent to / is equivalent to /../../../../../

The backticks in the for loop are what are splitting up stuff on whitespace. Use a construct which is less sensitive to spacing issues, or use proper quoting.

for h in "`cat file1`"; do grep -rl "$h" pathtodir >>file2; done

or

while IFS= read -r h; do grep -rl "$h" pathtodir >>file2; done <file1
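
A minimal sketch of the difference, assuming a throwaway file in a temporary directory (the file name and sample string are made up for illustration): the unquoted backtick loop iterates once per whitespace-separated word, while the read loop iterates once per line.

```shell
#!/bin/sh
# Hypothetical sandbox: one search string containing spaces.
tmp=$(mktemp -d)
printf '%s\n' '/sample/string in/strings file/title.jsp' > "$tmp/file1"

# Unquoted command substitution: the shell splits on spaces, tabs and newlines.
for_count=0
for h in `cat "$tmp/file1"`; do
    for_count=$((for_count + 1))
done

# read loop: IFS= keeps leading/trailing blanks, -r keeps backslashes literal.
read_count=0
while IFS= read -r h; do
    read_count=$((read_count + 1))
    last_line=$h
done < "$tmp/file1"

echo "for loop iterations:  $for_count"
echo "read loop iterations: $read_count"
rm -rf "$tmp"
```

In this sketch the for loop sees three separate words, and the read loop sees the one intact line.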

Thank you both for the replies. I don't think I'm executing your suggestions correctly, I've tried all 3.

Jim,

I'm definitely confused by which files go where when I read yours.

assume:
strings.txt = file with strings I want find
results.txt = output file of search results

I am trying:

find /directory/I/want to/search/ -type f | \
while read results.txt
do
grep -f strings.txt $results.txt
done

When I use this, I get:

read: `results.txt': not a valid identifier

era,

I didn't get any errors with your suggestions, but the strings I'm searching for are still being broken up, meaning the spaces or '/' in the strings are being handled as breaks, turning 1 string into several small strings that are each getting searched.

A better example of what I was originally trying to do is:

for h in `cat strings.txt`; do grep -rl "$h" /directory/path/I want/to/search/ >> /home/directory/results.txt ; done

using /../../ in my original post was not the best choice on my part when they are the equivalent of back ticks.

I'm going to continue to fiddle with all the suggestions, if any further guidance can be offered it would be a great help.

Thanks upstate boy

The variable in Jim's example can't be named results.txt; just change it to e.g. "file" and you should be fine.

Anything with significant spaces in it should be double-quoted.

I've changed it to:

find /directory/I/want to/search/ -type f | \
while read file
do
grep -f strings.txt $results.txt
done

Results now are:

grep: .txt: No such file or directory

Can someone spell out exactly how I should have it based on the example I've been using?

Thanks upstate boy

See edit above, in red: the grep line has to use the loop variable, "$file", instead of $results.txt.

Jim, thanks for spelling it out for me. I got it to work but it's not producing the results I need. The results going to the results.txt are the actual contents of the files, and they are not matching my string fully. I need the files that contain the strings I'm searching - which I realize I didn't state clearly initially.

The 2 scripts I've come up with are:

for h in `cat strings.txt`; do echo "**$h**" ; grep -rl $h /path/to/search/ >> results.txt ; done

and

for h in `cat strings.txt`; do find /path/to/search/ -name \*xml -exec grep -l "$h" {} \; >> results.txt ; done

The grep and the find are working fine, it's the `cat` that is giving me trouble. The strings in strings.txt are getting broken up into smaller strings - which I verified by putting that echo in on the grep script.

Example of string in strings.txt is:

/sample/string in/strings file/title.jsp

The cat (and grep -f) is breaking it up into:

/sample/string
in/strings
file/title.jsp

I've tried putting the string in strings.txt in both single and double quotes:

"/sample/string in/strings file/title.jsp"
'/sample/string in/strings file/title.jsp'

and have also tried putting single and double quotes in the scripts:

for h in "`cat strings.txt`"; do echo "**$h**" ; grep -rl "$h" /path/to/search/ >> results.txt ; done

And the echo still shows the string being split into 3 smaller strings.

Thanks upstate boy

Try to change the field separator in your script:

OIFS=$IFS
IFS=""

# Do your stuff here

IFS=$OIFS
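
As a hedged sketch of that idea: setting IFS to a single newline (rather than the empty string) makes the unquoted `cat` split on line breaks only, whereas an empty IFS switches splitting off altogether, so the whole file arrives as one string. The file name and contents below are assumptions for illustration.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf '%s\n' '/sample/string in/a.jsp' '/sample/other dir/b.jsp' > "$tmp/strings.txt"

OIFS=$IFS
# A literal newline between the quotes: split on line breaks only.
IFS='
'
newline_count=0
for h in `cat "$tmp/strings.txt"`; do
    newline_count=$((newline_count + 1))
done

# Empty IFS: no field splitting at all, so the loop runs once
# with the entire file contents (both lines) in $h.
IFS=""
empty_count=0
for h in `cat "$tmp/strings.txt"`; do
    empty_count=$((empty_count + 1))
done
IFS=$OIFS

echo "IFS=newline: $newline_count iterations"
echo "IFS=empty:   $empty_count iterations"
rm -rf "$tmp"
```

A pattern that still contains line breaks is treated by grep as several patterns, one per line, so the empty-IFS variant can behave differently from a manual grep for a single string.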

Thanks for the suggestion Franklin52. I do see the echo showing the full string now, but the results of the grep are off.

If I do the grep manually - I get 3 files returned which is correct.

If I use my script - I get 1588 files returned.

Script now:

OIFS=$IFS
IFS=""

for h in `cat strings.txt`; do echo $h ; grep -rl "$h" /path/to/search/ >> results.txt ; done

IFS=$OIFS

Why don't you use the -f option?

grep -rl -f strings.txt /path/to/search/*

I tried grep -rl -f strings.txt /path/to/search/* > result.txt

Same problem, the string in strings.txt is being split up:

/sample/string in/strings file/title.jsp

I'm guessing it is being split into these 3 strings:

/sample/string
in/strings
file/title.jsp

I know that if I do this grep, I get only 3 results as opposed to the 1588 results I get with the grep -rl -f strings.txt method.

grep -rl "/sample/string in/strings file/title.jsp" /path/to/search/*

Thanks upstate boy

Could the string in strings.txt actually contain something other than plain spaces? Can you inspect it with a hex dump tool (xxd, od, what have you)?
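
For example, od -c prints every byte, so a tab shows up as \t, a line break as \n, and blank lines as consecutive \n. A minimal sketch, using a made-up sample file in place of the real strings.txt:

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Hypothetical file: a blank line, then one search string.
printf '\n%s\n' '/sample/string in/title.jsp' > "$tmp/strings.txt"

# -c shows printable characters as themselves and the rest as escapes.
dump=$(od -c "$tmp/strings.txt")
# printf rather than echo, so the backslashes in the dump are printed as-is.
printf '%s\n' "$dump"
rm -rf "$tmp"
```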

Era - I'm not sure how to inspect in the way you are asking, but I've deleted strings.txt and created a new one with vi, adding the string back - no copy/paste. When trying grep -rl -f strings.txt I'm still seeing the same behavior as already described.

Thanks upstate boy

Do you have any hex dump tools at your disposal?

Problem found. I feel kind of foolish now but after so many people offered help and suggestions I feel you should see my error.

My strings.txt file had several blank lines in it prior to and after the string I wanted to search. These blank lines were what was throwing grep -f off.

So, I went from this:

<blank line>
<blank line>
/sample/string in/strings file/title.jsp
<blank line>
<blank line>

to just:

/sample/string in/strings file/title.jsp

and it resolved the problem. I can't say I fully understand it but it's working.

Sorry to have wasted people's time in trying to resolve my problem, but I appreciate it a great deal.

Thanks again for the time and help.

upstate boy

A blank line in the file given to grep -f is an empty pattern, and an empty pattern matches any line, so that's why you were getting so many matches.
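
A small sketch of that effect and one possible way around it, assuming a grep that supports -r (GNU and BSD grep do) and invented file names: the empty pattern from a blank line matches every file that has at least one line, and stripping blank lines from the pattern file first restores the expected result. -F is added here on the assumption that the strings are meant literally, not as regular expressions.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir "$tmp/search"
printf '%s\n' 'see /sample/string in/strings file/title.jsp' > "$tmp/search/hit.xml"
printf '%s\n' 'unrelated text' > "$tmp/search/miss.xml"

# Pattern file with blank lines around the real string, as in the thread.
printf '\n\n%s\n\n\n' '/sample/string in/strings file/title.jsp' > "$tmp/strings.txt"

# Blank lines act as empty patterns, so every file matches.
with_blanks=$(grep -rl -f "$tmp/strings.txt" "$tmp/search" | wc -l | tr -d ' ')

# Drop empty lines, then search for the remaining strings literally.
grep -v '^$' "$tmp/strings.txt" > "$tmp/clean.txt"
without_blanks=$(grep -rlF -f "$tmp/clean.txt" "$tmp/search" | wc -l | tr -d ' ')

echo "files matched with blank lines:    $with_blanks"
echo "files matched without blank lines: $without_blanks"
rm -rf "$tmp"
```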