This is my problem: I am using the following command to list the files that contain the string 0.01:
find ./ -name "*.txt" -exec grep -H '0.01' {} +
It works wonders with a small sample. However, when I use it in a real scenario it produces an empty file, even though I am sure there are files with the expected string.
Am I missing something here? I am pretty sure there is a better alternative using AWK, but I could not come up with one.
Thanks in advance
I couldn't find any example of -H in reference books and grep warned me of "illegal option" when I tried it. Perhaps just "-l" instead. Also, the "find" commands that execute something end in "\;" instead of a plus sign. Putting a backslash in front of your decimal point might be worth a look.
It would be highly surprising if grep failed in "real scenarios". Does "I am sure there are files" guarantee 100% that such files exist? Why don't you add a test file to the real scenario to check for correct operation? @wbport: There are grep versions (including those on Linux and FreeBSD) that provide the -H option to print file names. And most (if not all) find commands allow the -exec action to be terminated with a ; (to run the command once for every single file found) OR with a + (to run it on as many files as will fit on one command line). The unescaped dot in the regex matches any character, including a decimal point, so the missing matches will NOT be due to this.
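To make the difference between the two -exec terminators concrete, here is a minimal sketch. The scratch directory and the three sample files are made up for illustration; echo stands in for grep so that each invocation shows up as one output line.

```shell
# Scratch directory with three hypothetical sample files.
tmp=$(mktemp -d)
printf 'x\n' > "$tmp/a.txt"
printf 'x\n' > "$tmp/b.txt"
printf 'x\n' > "$tmp/c.txt"

# Terminated with \; the command runs once per file found,
# so echo prints three lines with one file name each.
per_file=$(find "$tmp" -name '*.txt' -exec echo {} \; | wc -l)

# Terminated with + the file names are batched onto one command line,
# so echo runs once and prints a single line holding all three names.
batched=$(find "$tmp" -name '*.txt' -exec echo {} + | wc -l)

echo "lines with \\; : $per_file"
echo "lines with +  : $batched"

rm -rf "$tmp"
```

Either form hands every matching file to the command; the + form simply starts fewer processes, which matters on large directory trees.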
I did. That's why I know there is something wrong with the performance of the script. In reality, I wanted to list all files that contain values between 0.011 and 0.019, but I could not come up with a better solution.
As I said, it seems to work on a subset of files but fails miserably on real datasets.
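For the range mentioned here, one possible approach is an extended regular expression instead of the plain string 0.01. This is only a sketch under stated assumptions: the values are written with exactly three decimals (0.011 through 0.019), and the sample files and the pattern variable are invented for illustration.

```shell
# Scratch directory with hypothetical sample data.
tmp=$(mktemp -d)
printf 'sample 0.015\n'   > "$tmp/in_range.txt"
printf 'sample 0.021\n'   > "$tmp/out_of_range.txt"
printf 'sample 10.0155\n' > "$tmp/lookalike.txt"

# 0\.01[1-9] matches 0.011 to 0.019 with a literal decimal point.
# The leading group rejects a preceding digit or dot (so 10.0155 is skipped),
# and the trailing group rejects further digits (so 0.0155 is skipped).
pattern='(^|[^0-9.])0\.01[1-9]([^0-9]|$)'

# -E selects extended regular expressions, -l prints only the file names.
matches=$(find "$tmp" -name '*.txt' -exec grep -El "$pattern" {} +)
echo "$matches"

rm -rf "$tmp"
```

If the real data uses more decimals or scientific notation, the pattern would need adjusting, and a numeric comparison in awk may be the more robust choice.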
If you just want the names of files whose names end in .txt and whose contents include the string 0.01 (without printing the contents of lines that contain that string), I would try:
find . -name '*.txt' -exec grep -Fl 0.01 {} +
PS: Note that since I'm using grep -F (AKA fgrep ) instead of grep without the -F option, we are looking for a fixed string instead of looking for a match to a basic regular expression. Therefore, we don't need to escape the <period> in the string 0.01 to keep it from matching any character as it would in a BRE match.
Expanding a little on what I said in post #7, grep 0.1 and grep -E 0.1 use basic regular expression and extended regular expression matching, respectively, and in both cases the <period> in 0.1 matches any character. So, the RE 0.1 matches the text in red in the output:
nor why it didn't report many of the files found by Aia's perl script.
My suggestion worked because grep -F 0.1 performs a fixed string search; not a regular expression search, and in a fixed string search the <period> in 0.1 only matches a <period>.
And, using grep -l just prints the name of a file that contains a match (without displaying the matching text) and moves on to the next file instead of looking for all possible matches in a single file.
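The two points above (regular expression versus fixed string, and what -l prints) can be checked with a small experiment. This is a sketch with invented sample files, not the original poster's data.

```shell
# Scratch directory with two hypothetical sample files.
tmp=$(mktemp -d)
printf 'id 0x01\nid 0y01\n'   > "$tmp/no_decimal.txt"  # no literal "0.01"
printf 'p = 0.01\nq = 0.01\n' > "$tmp/decimal.txt"     # "0.01" on two lines

# As a regular expression the period matches any character,
# so "0x01" counts as a match and both files are listed.
regex_hits=$(find "$tmp" -name '*.txt' -exec grep -l '0.01' {} + | wc -l)

# With -F the period is literal, so only decimal.txt is listed.
fixed_names=$(find "$tmp" -name '*.txt' -exec grep -Fl '0.01' {} +)
fixed_hits=$(printf '%s\n' "$fixed_names" | wc -l)

echo "files listed by regex search : $regex_hits"
echo "files listed by fixed search : $fixed_hits"
# decimal.txt appears once even though two of its lines match,
# because -l prints a file name once and moves on to the next file.
echo "$fixed_names"

rm -rf "$tmp"
```

Note that the regular expression search lists more files than the fixed string search, never fewer, which agrees with the earlier remark that the unescaped period cannot explain missing matches.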