Single grep to multiple strings with separate output per string

I need to grep multiple strings from a particular file.

I found that I can use egrep "String1|String2|String3" file.txt | wc -l

What I'm really after is a separate count for each string found, and I would like to run grep only once.

Can you guys help?


To explain myself better, I'm after output similar to this:

String1 2 
String2 1
String3 4.....

Hello nms,

Your complete requirement is not clear, so based on your explanation, could you please try the following and let me know if it helps.

awk '/String1/{count1++} /String2/{count2++} /String3/{count3++} END{print "String1", count1+0; print "String2", count2+0; print "String3", count3+0}'  Input_file
 

Thanks,
R. Singh

How about (untested)

grep -Eo "String1|String2|String3" file.txt | sort | uniq -c

I am running Solaris 11, where grep -o does not work.

What you provided works, but I'm looking for a command that behaves like 'grep -o' and runs on Solaris.

Basically, I have a log file in which each line contains a timestamp. Therefore, when I run the uniq command, every line that contains the search string is displayed separately.

To give you a better idea:

These are lines from the log file:

   1 Sep 14 11:00:01 ccsWalletExpiry: [ID 848595 user.crit] ccsWalletExpiry(28635) CRITICAL: ABORTING: Cannot connect to Oracle as '/'
   1 Sep 14 11:00:06  ccsPeriodicCharge: [ID 848595 user.crit] ccsPeriodicCharge(28632) CRITICAL: Error: failed to initialise database connection, cannot continue.
   1 Sep 14 11:10:00  ccsWalletExpiry: [ID 848595 user.crit] ccsWalletExpiry(12949) CRITICAL: ABORTING: Cannot connect to Oracle as '/'

I need to grep specifically for CRITICAL only, and the output should be the string I'm searching for followed by the number of times it was matched:

CRITICAL 3
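
The timestamp effect described above can be reproduced with a small sketch. The sample lines are shortened, hypothetical copies of the log lines in this post, and the variable names are only for illustration. It also shows that for a single string, grep -c (which POSIX specifies, so it does not depend on grep -o) already gives the number of matching lines.

```shell
# Hypothetical sample data modelled on the log lines quoted above.
log=$(mktemp)
cat > "$log" <<'EOF'
Sep 14 11:00:01 ccsWalletExpiry: CRITICAL: ABORTING: Cannot connect to Oracle as '/'
Sep 14 11:00:06 ccsPeriodicCharge: CRITICAL: Error: failed to initialise database connection
Sep 14 11:10:00 ccsWalletExpiry: CRITICAL: ABORTING: Cannot connect to Oracle as '/'
EOF

# Whole lines differ by timestamp, so uniq -c reports each line on its own.
distinct=$(grep CRITICAL "$log" | sort | uniq -c | wc -l | tr -d ' ')

# grep -c counts the lines that match, regardless of the rest of the line.
critical=$(grep -c CRITICAL "$log")

echo "distinct lines after uniq -c: $distinct"
echo "CRITICAL $critical"
rm -f "$log"
```

Note that grep -c counts matching lines, not individual occurrences, and handles one pattern per run, so it does not cover the multi-string case.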

Hello nms,

Could you please try the following and let me know if it helps you.

awk '/CRITICAL/{count++} END{print "CRITICAL ",count}'   Input_file

On a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk , /usr/xpg6/bin/awk , or nawk . Let me know how it goes then.

Thanks,
R. Singh
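
The advice above about choosing a different awk on Solaris could be wrapped in a small helper. This is only a sketch: pick_awk and the AWK variable are hypothetical names, and the candidate list simply follows the paths mentioned in the post, skipping any that do not exist on the machine.

```shell
# pick_awk: print the first awk from the candidate list that exists,
# preferring the POSIX-style ones named in the post above.
pick_awk() {
    for candidate in /usr/xpg4/bin/awk /usr/xpg6/bin/awk nawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }

# Same counting command as above, run through the selected awk.
printf 'a CRITICAL line\nanother line\n' |
    "$AWK" '/CRITICAL/{count++} END{print "CRITICAL", count+0}'
```

The count+0 forces a numeric 0 instead of an empty field when nothing matches.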


Hi RavinderSingh13,

Yes that works.
The output is
CRITICAL 919.

However I would like to know if it's possible to include multiple strings.

Hello nms,

I believe I already gave this answer in post #2 of this thread (Post: 303003374). Kindly have a look at it and let me know if you have any questions about it. You can also hit the THANKS button at the left corner of each post to thank anyone for a useful post.

PS: In post #2 I used placeholder names such as String1 and String2. You can put your actual strings in their place.

Thanks,
R. Singh
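
Rather than editing one pattern and one counter per string, the substitution suggested above can be generalised. The following is a minimal sketch, not the poster's command: count_lines_per_string is a hypothetical helper, it treats each argument as a fixed substring rather than a regex, and it counts lines containing the string (not individual occurrences). It relies on ARGV, so on Solaris it presumably needs nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk.

```shell
# count_lines_per_string FILE STRING...
# Prints "STRING COUNT" for every STRING, where COUNT is the number of
# lines of FILE containing it, using a single pass over FILE.
count_lines_per_string() {
    file=$1
    shift
    awk '
        BEGIN {
            # ARGV[1] is the file; everything after it is a search string.
            # Blanking ARGV[i] stops awk from treating it as an input file.
            for (i = 2; i < ARGC; i++) { want[i - 1] = ARGV[i]; ARGV[i] = "" }
            n = ARGC - 2
        }
        {
            for (i = 1; i <= n; i++)
                if (index($0, want[i])) count[i]++
        }
        END {
            for (i = 1; i <= n; i++) print want[i], count[i] + 0
        }
    ' "$file" "$@"
}

# Usage on hypothetical sample data.
sample=$(mktemp)
printf 'String1 String2\nString1\nString3\n' > "$sample"
result=$(count_lines_per_string "$sample" String1 String2 String3 String4)
echo "$result"
rm -f "$sample"
```

Strings that never occur are still listed, with a count of 0, and the output keeps the order in which the strings were given.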


Hi.

If you have the GNU utilities installed:

#!/usr/bin/env bash

# @(#) s1       Demonstrate extraction of multiple strings, egrep.

g=grep  # Linux
g=ggrep # Solaris, gnu-grep
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C ggrep

FILE=${1-data1}
N=${FILE//[A-Za-z]/}
E=expected-output$N

pl " Input data file $FILE:"
cat $FILE

pl " Expected output:"
cat $E

pl " Results:"
# grep -Eo "String1|String2|String3" file.txt | sort | uniq -c
$g -Eo "String1|String2|String3|CRITICAL" $FILE |
tee f3 |
sort |
tee f2 |
uniq -c |
awk '{ print $2,$1}' |  # Swap fields
tee f1

pl " Verify results if possible:"
C=$HOME/bin/pass-fail
[ -f $C ] && $C f1 $E || ( pe; pe " Results cannot be verified." ) >&2

pl " Some details on $(which ggrep):"
man ggrep 2>/dev/null | head -7

exit 0

producing:

$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: SunOS, 5.11, i86pc
Distribution        : Solaris 11.3 X86
bash GNU bash 4.1.17
ggrep (GNU grep) 2.14

-----
 Input data file data1:
1 Sep 14 11:00:01 ccsWalletExpiry: [ID 848595 user.crit] ccsWalletExpiry(28635) CRITICAL: ABORTING: Cannot connect to Oracle as '/'
1 Sep 14 11:00:06  ccsPeriodicCharge: [ID 848595 user.crit] ccsPeriodicCharge(28632) CRITICAL: Error: failed to initialise database connection, cannot continue.
1 Sep 14 11:10:00  ccsWalletExpiry: [ID 848595 user.crit] ccsWalletExpiry(12949) CRITICAL: ABORTING: Cannot connect to Oracle as '/'

-----
 Expected output:
CRITICAL 3

-----
 Results:
CRITICAL 3

-----
 Verify results if possible:

-----
 Comparison of 1 created lines with 1 lines of desired results:
 Succeeded -- files (computed) f1 and (standard) expected-output1 have same content.

-----
 Some details on /usr/bin/ggrep:
GREP(1)                     General Commands Manual                    GREP(1)



NAME
       grep, egrep, fgrep - print lines matching a pattern

Best wishes ... cheers, drl

Try this for an unlimited number of search strings compiled in the SRCH variable, separated by the pipe character:

awk 'match ($0, SRCH) { CNT[substr($0, RSTART, RLENGTH)]++} END {for (c in CNT) print c, CNT[c]}' SRCH="CRITICAL|ABORTING" file
ABORTING 1
CRITICAL 7

It doesn't work correctly if more than one search string occurs in a line; in that case, only the first (leftmost) match in the line will be counted.
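
The limitation just described can be seen on two hypothetical input lines, the first of which contains both search strings. This sketch passes SRCH with -v instead of as a trailing operand, only so that it can read from a pipe.

```shell
# Line 1 contains CRITICAL and ABORTING; line 2 contains only CRITICAL.
result=$(
    printf 'CRITICAL: ABORTING: first\nCRITICAL: second\n' |
    awk -v SRCH="CRITICAL|ABORTING" '
        match($0, SRCH) { CNT[substr($0, RSTART, RLENGTH)]++ }
        END { for (c in CNT) print c, CNT[c] }'
)
echo "$result"
```

Only CRITICAL is reported, with a count of 2: match() returns the leftmost match in each line, so the ABORTING on the first line is never seen.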


Try this adaptation for matching any number of search strings in a line:

awk '
                {s = $0
                 # Count the leftmost match, then continue in the text right after it.
                 while (match (s, SRCH)) {
                        CNT[substr (s, RSTART, RLENGTH)]++
                        s = substr (s, RSTART + RLENGTH)
                 }
                }
END             {for (c in CNT) print c, CNT[c]
                }
' SRCH="CRITICAL|ABORTING|ABC|CDE|DEF|GHI|XYZ|ZYX" file