Unexpected Results (at least I did not expect them)

I have two scripts running in bash. The first one uncompresses log files into a working directory, using uncompress -c with the output redirected (>) to the new directory. It then creates one control record to ensure our search returns a record. Finally, it calls the second script, which is a grep for a literal string (IP and/or URL).

My control record for this test was 251.251.251.251, and the data returned was my expected control record plus the following:

Accessed URL xx.xxx.xxx.xxx:/maps?stat_m=tiles:219,219,220,235,235,235,235,235,251,251,251,251,251,266,266,266,282

I am supposing that it matched on the 251,251,251,251. Here is my code:

#!/usr/bin/bash
#
# SearchLogsB
#
# SearchLogsB is called by SearchLogs and will look through the Search Files
# locating the Search Criteria requested.
#
# SEARCHDIR is the directory we created in Part A for doing our search work.
#
# SearchCriteria is the file holding the strings to search for. Normally these
# will be ip sets and urls.
#
# SearchFiles is the file holding the dates for the logs you are searching.

# Echo the info to our status file so that we can be sure runs with a large
# number of Search Files are executing.

echo "grepping from $SearchFiles" >> $SEARCHDIR/search.status
echo "grepping from $SearchFiles" >> $SEARCHDIR/search.output

# Loop through your Search Criteria, placing info in our status file and writing match info
# to our output file.

while read SearchCriteria
do
  echo $SearchCriteria >> $SEARCHDIR/search.status
  grep "$SearchCriteria" $SearchFiles >> $SEARCHDIR/search.output 2>>$SEARCHDIR/search.error
done < SearchCriteria

rm $SEARCHDIR/$SearchFiles

exit

Thanks for your input.
JB

Sorry if I am stating the obvious, but a dot in regular expressions matches any character. To match a literal dot, use the regular expression [.] or backslash-escape the dot.
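
To make the point concrete, here is a minimal sketch of the difference. The two sample lines are made up for illustration (they only mimic the control record and the tile line from the question), and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Two made-up log lines: the control record, and a line that only resembles
# it because commas sit where the dots would be.
sample_lines='control 251.251.251.251
tiles:235,251,251,251,251,266'

# Unescaped dots: each "." matches any single character, commas included.
regex_hits=$(printf '%s\n' "$sample_lines" | grep -c '251.251.251.251')

# Backslash-escaped dots: only a literal "." matches.
escaped_hits=$(printf '%s\n' "$sample_lines" | grep -c '251\.251\.251\.251')

# Bracketed dots: the [.] form behaves the same way as the escaped form.
bracket_hits=$(printf '%s\n' "$sample_lines" | grep -c '251[.]251[.]251[.]251')

echo "unescaped: $regex_hits  escaped: $escaped_hits  bracketed: $bracket_hits"
```

The unescaped pattern counts both lines, while the escaped and bracketed patterns count only the control record.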

Thanks era.

Yes, that is where I am stuck. I guess my real question should have been "How do you make a literal from an IP or URL string?" I have tried the following, but both of the sed commands give the same results:

while read SearchCriteria
do
#  sed "s/\./\\./g" $SearchCriteria
  sed "s/\./[.]/g" $SearchCriteria
  echo "$SearchCriteria" >> $SEARCHDIR/search.status
  grep "$SearchCriteria" $SearchFiles >> $SEARCHDIR/search.output 2>>$SEARCHDIR/search.error
done < SearchCriteria

The simplest workaround would be to use fgrep instead of grep if your search criteria are always static strings. If you are lucky enough to have one which supports the -f option, you also don't need the loop at all.

fgrep -f SearchCriteria $SearchFiles >>$SEARCHDIR/search.output 2>>$SEARCHDIR/search.error
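
As a small self-contained sketch of the same idea: fgrep is the traditional name for grep -F, so the example below spells it grep -F. The file names under the temporary directory and the sample lines are assumptions made up for the demonstration, not the real log layout.

```shell
#!/usr/bin/env bash
# Build a throwaway criteria file and log file (contents are made up).
workdir=$(mktemp -d)
printf '%s\n' '251.251.251.251' 'example.com/maps' > "$workdir/criteria"
printf '%s\n' \
  'control 251.251.251.251' \
  'tiles:235,251,251,251,251,266' \
  'GET example.com/maps?x=1' \
  'GET exampleXcom/maps' > "$workdir/log"

# -F treats every pattern as a fixed string; -f reads one pattern per line.
fixed_matches=$(grep -F -f "$workdir/criteria" "$workdir/log")
fixed_count=$(grep -F -c -f "$workdir/criteria" "$workdir/log")

printf '%s\n' "$fixed_matches"
rm -r "$workdir"
```

Only the control record and the real example.com/maps line are printed. One thing to watch for: with GNU grep, at least, an empty line in the criteria file acts as an empty pattern and matches every line, so it is worth making sure the file has no blank lines.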

If you want to neutralize any regex specials in the search string, try something like

grep `echo "$SearchCriteria" | sed -e 's/[][\\.*$^]/\\&/g'` $SearchFiles

Your attempt at using sed for this was not doing anything useful, I'm afraid. The above should hopefully work better, although it's completely off the top of my head (so I probably forgot a few of the regex specials) and different versions of sed use slightly different regex syntax (so yours probably has a slightly different set of special characters than mine).

The main misunderstanding was how to pass something to sed. It expects a file name as an argument (not a string to use as input), or else it reads standard input. It also simply prints its output, so to use that output in your script, you have to capture it with backquotes or something similar.
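
A minimal sketch of that point, using the simpler [.] substitution from the earlier attempt. The variable names unchanged and bracketed are hypothetical, and $( ) is used here as the "or something" alternative to backquotes.

```shell
#!/usr/bin/env bash
SearchCriteria='251.251.251.251'

# Feeding the string on standard input works, but sed only prints the
# result; the shell variable itself is never modified.
printf '%s\n' "$SearchCriteria" | sed -e 's/[.]/[.]/g' >/dev/null
unchanged=$SearchCriteria

# Capturing sed's output with command substitution gives the script a
# value it can pass on to grep.
bracketed=$(printf '%s\n' "$SearchCriteria" | sed -e 's/[.]/[.]/g')

echo "before: $unchanged"
echo "after:  $bracketed"
```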

Just to top it off, here is a slightly more elegant and efficient way to code the loop:

sed -e 's/[][\\.*$^]/\\&/g' SearchCriteria |
while read regex; do
  grep "$regex" $SearchFiles
done >>output 2>>error

Getting close now.... era

grep `echo "$SearchCriteria" | sed -e 's/[][\\.*$^]/\\&/g'` $SearchFiles

found no results, not even the control record, which is what happened when I was trying sed.

sed -e 's/[][\\.*$^]/\\&/g' SearchCriteria |
while read regex; do
  grep "$regex" $SearchFiles
done >>output 2>>error

found the control records plus the original problem record with the 251,251,251, etc.

The fgrep, though, worked like a champ.

fgrep -f SearchCriteria $SearchFiles >>$SEARCHDIR/search.output 2>>$SEARCHDIR/search.error

I thank you very much. I was spinning my wheels for a couple of days trying to figure out how to make the input records literal strings.

I would really like to be able to run in a loop for only one reason. The case that caused me to work on the scripts involved uncompressing 75 days' worth of logs (3-5 GB uncompressed) and searching for 36 IP ranges and 20 URL strings.

When you start the script you never know where you are, or if the script is working (except with a ps -ef) until it completes and you view the output. I broke the original search files into one week ranges so the job would complete during my work hours. I was able to find the records I needed, but got a few that matched due to the "dot" wildcarding.
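
One way to keep the per-criterion status output and still match literally is to loop with a fixed-string grep. This is only a sketch under assumptions: the function name search_with_status, its argument order, and the sample data are all made up for illustration, and grep -F is assumed to be available as the modern spelling of fgrep.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search one log file for each criterion in turn,
# noting progress in a status file as it goes.
search_with_status() {
  local criteria_file=$1 log_file=$2 status_file=$3 output_file=$4
  local criterion
  while IFS= read -r criterion; do
    # Skip blank lines; an empty pattern would match every log line.
    [ -n "$criterion" ] || continue
    printf '%s searching for %s\n' "$(date '+%H:%M:%S')" "$criterion" >> "$status_file"
    # grep exits 1 when nothing matches, which is not an error here.
    grep -F -e "$criterion" "$log_file" >> "$output_file" || true
  done < "$criteria_file"
  return 0
}

# Demonstration on made-up data in a temporary directory.
demo_dir=$(mktemp -d)
printf '%s\n' '251.251.251.251' '10.0.0.1' > "$demo_dir/criteria"
printf '%s\n' 'control 251.251.251.251' 'tiles:235,251,251,251,251,266' > "$demo_dir/log"
search_with_status "$demo_dir/criteria" "$demo_dir/log" \
  "$demo_dir/search.status" "$demo_dir/search.output"
status_lines=$(grep -c '' "$demo_dir/search.status")
output_text=$(cat "$demo_dir/search.output")
rm -r "$demo_dir"
```

While a long run is in progress, tail -f on the status file shows which criterion is currently being searched. The trade-off is that the log is read once per criterion, whereas the single grep -F -f call reads it only once.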

Do you think maybe the loop would work differently in a shell other than bash? I will test later. Now I need to get back to herding some cats too. Man, that is an extremely hard task!

Thanks again,
JB

Probably my sed script is not completely correct, in the general case or for your particular version of sed. Glad you sorted it out, anyway.

Oh, and you should probably use read -r when reading the regex from the sed script!
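
A short sketch of why this matters: without -r, read treats a backslash as an escape character and removes it, so the escaping that sed added is lost before grep ever sees it. The variable names below are hypothetical.

```shell
#!/usr/bin/env bash
# A line as the sed script would emit it, with every dot escaped.
escaped_line='251\.251\.251\.251'

# Plain read consumes the backslashes, turning "\." back into ".".
plain_read=$(printf '%s\n' "$escaped_line" | { read regex; printf '%s' "$regex"; })

# read -r keeps the backslashes, so the regex reaches grep intact.
raw_read=$(printf '%s\n' "$escaped_line" | { IFS= read -r regex; printf '%s' "$regex"; })

echo "read:    $plain_read"
echo "read -r: $raw_read"
```

This would explain why the loop version still returned the 251,251,251,251 line: the pattern grep received had unescaped dots again.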

Also, try adding double quotes around the backticks.
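
A possible explanation for the backtick version finding nothing at all, offered as a sketch rather than something confirmed in this thread: inside old-style backquotes, the shell itself processes a backslash that is followed by another backslash, even within single quotes, so sed receives \& (a literal ampersand) as the replacement instead of \\&. The $( ) form does not do this. The variable names are hypothetical.

```shell
#!/usr/bin/env bash
SearchCriteria='251.251.251.251'

# Old-style backquotes: each "\\" is reduced to "\" before sed runs, so
# the replacement becomes "\&", which sed treats as a literal "&".
backtick_result=`echo "$SearchCriteria" | sed -e 's/[][\\.*$^]/\\&/g'`

# $( ) leaves the backslashes alone, so sed sees the script as written.
dollar_result=$(echo "$SearchCriteria" | sed -e 's/[][\\.*$^]/\\&/g')

echo "backticks: $backtick_result"
echo "dollar:    $dollar_result"
```

If this is what happened, grep was searching for a string with ampersands in place of the dots, which would match neither the control record nor the tile line. Quoting the substitution, as in grep "$(...)" $SearchFiles, also protects the pattern from word splitting and globbing.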

Just out of curiosity, do you see what I see?

vnix$ echo 12.34.56.78 '^ick\\y.poo$' | sed -e 's/[][\\.*$^]/\\&/g'
12\.34\.56\.78 \^ick\\\\y\.poo\$

I don't think changing shells would change anything significantly, except if you switch to a shell which doesn't have support for the -r option to read. Maybe you have the line command instead; if so, try that.

For what it's worth, my cats are the cat(1) kind, not the felines (-:


Big cheezy grins from this side of the console. The -r was all that was needed to exclude special treatment of the \'s.
Now I can incorporate it into our log search tool and publish it for our firewall admin.

As for seeing what you see, yes, I do. So we are evidently using close, if not identical, versions.

Thank you very much for your help. Folks like you make folks like me look good to the boss! Credits have already been given to you inside of the tool in our repository.

JB

Glad yours are not the feline versions... you would never get the time to help us out in the forums.