Search string or words in logs without using Grep

I need some kind of script that will search for a string in each log file in a directory, but we don't want to use grep. grep seems to use too much of our memory, causing the server to use up a lot of swap space. Our log files are bigger than 500M on a daily basis. We recently started rotating our logs when they reach 1G to help, but grep still seems to use a lot of memory and push the server into swap.

Any recommendations on how we can search for strings/words in each log file that's within a directory without having to use grep?

You might try something like this (untested):

SEARCH="this string"
awk -v S="$SEARCH" '/S/ {print $0}' *

However, I'm pretty sure the issue is NOT grep.
In fact, it is probably due to incorrect usage (e.g., too many pipes).
The more commands you chain together for a single search/compare, the longer it takes and the more memory it uses.

hth

You can probably do it with sed or awk. What characters are in your search pattern? Are you looking for a fixed string, a match for a basic regular expression, or a match for an extended regular expression? Are you hoping to get a count of matching lines, a list of the matching lines, or the line numbers of the lines that match?
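For example, something like this (untested against your logs; sample.log and the 1215456 string are just stand-ins for your own files and pattern). Both sed and awk read input line by line, so memory use stays small no matter how big the file is:

```shell
# Create a stand-in log file for the demo.
printf 'abc 1215456 def\nno match here\n' > sample.log

SEARCH="1215456"

# sed: print only the lines containing the string
sed -n "/$SEARCH/p" sample.log

# awk: match the whole line against the string
awk -v S="$SEARCH" '$0 ~ S' sample.log

# both print: abc 1215456 def
```

Note that both of these treat the search string as a regular expression, so this only works as-is if the string contains no regex metacharacters.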

What options are you giving to grep when you use it to search for a string in each log file in a directory?

What operating system and shell are you using?

That won't work! The /S/ will print lines containing the character S. If the search string doesn't contain any characters that are special in an ERE, the following should work:

awk -v S="$SEARCH" '$0 ~ S' *.log

if the goal is just to print the matching lines and the target OS is not a Solaris/SunOS system.

The characters vary. Sometimes we need to find a particular transaction ID, look for errors, or search for a particular word.

> uname -a
Linux xx.xx.xx.com 2.6.18-371.6.1.el5 #1 SMP Tue Feb 18 11:42:11 EST 2014 x86_64 x86_64 x86_64 GNU/Linux

> ps -p $$
  PID TTY          TIME CMD
 5140 pts/0    00:00:00 ksh

I repeat: What grep command-line are you using to get the output you want?

here are some examples

zgrep 11263511 *.trace.log*
grep 1215456 *.log

We are not sure which grep command is the one that crashes the server. Our performance team can only see that it's a grep command that causes the spike in swap space.

perl -ne '/1215456/ and print' *.log

There are lots of utilities that can emulate grep. Few, if any, of them will be faster or use less memory than grep for the options you're using. Since you're searching for fixed strings rather than regular expressions, you could make grep run faster by using the -F option. But you didn't mention searching compressed and/or gzipped files until post #7 in this thread. Uncompressing or unzipping a file and searching the converted plain text is obviously going to take a LOT more memory and/or swap space and a LOT more time than searching a plain text file.

The fact that you're processing your compressed files, your gzipped files, and your plain text files with:

zgrep 11263511 *.trace.log*

and then processing the plain-text files whose names contain .trace a second time with:

grep 1215456 *.trace.log

probably won't affect swap space, but it will make processing the files in your directory take longer.

If you have the space, you'd be better off keeping your log files uncompressed until you're done searching them. With uncompressed, unzipped files, grep shouldn't need a lot of memory or swap space. I have no idea whether zgrep takes a lot more memory and swap, but I wouldn't be surprised if it does.

If you want to search for two (or more) patterns in all of your log files, it will take a minuscule bit more space and run a LOT faster if you combine them in a single invocation of grep.

For example:

zgrep -Fe 11263511 -e 1215456 *.trace.log.*
grep -Fe 11263511 -e 1215456 *.log

If the zgrep commands really are taking too much swap, you might need less if you uncompress your compressed files, gunzip your gzipped files, grep the files you uncompressed and unzipped, and then compress or gzip them again. (But, that will be slower, and uncompress and gunzip may eat up as much swap as zgrep.)