Dynamic Grep

Experts,

Hi there.

Feel like kicking myself today as I wish I could have paid more attention to learning while I was in college. If only wishes were horses ....

Anyway, I'm parsing the contents of a log file.

Now what I need to do here is search for a pattern, pick the top 3 occurrences of, let's say, each one of the TEXTs (for example "PROVISIONED", "AVAILABLE", "EARL SIGNO"), and then list the corresponding 10 occurrences of it.

So, here is what I'm doing:

cat $LOGFILE | grep 'Deleting Text' |  cut -d \" -f2 | more

... but this is only giving me text as below.

PROVISIONED
LOGGED NOR
EARL SIGNO
TURNED OFF
AVAILABLE

What I'm looking for is as follows -
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Deleting Text on line 124     (ESN: 665367B8760830) because the number wasnt  "PROVISIONED" and is not found in the database
Deleting Text on line  2474 (ESN: 607765B8577434) because the number wasnt  "PROVISIONED" and is not found in the database
Deleting Text on line  2474 (ESN: 877765B8343429) because the number wasnt  "PROVISIONED" and is not found in the database

Deleting Text on line 47764 (ESN: 3214567B8765430) because a number wasnt  "AVAILABLE" and is not found in the database
Deleting Text on line 12400 (ESN: 7683567Z8776055) because a number wasnt  "AVAILABLE" and is not found in the database
Deleting Text on line  12073 (ESN: 3213468Z4735412) because a number wasnt  "AVAILABLE" and is not found in the database

Deleting Text on line 47764 (ESN: 5754567B8765430) because a number wasnt  "TURNED ON" and is not found in the database
Deleting Text on line  12400 (ESN: 0334567B8765430) because a number wasnt  "TURNED ON"  and is not found in the database
Deleting Text on line  12073 (ESN: 8213467B8765430) because a number wasnt  "TURNED ON"  and is not found in the database

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Please advise !

regards,
Lee

---------- Post updated at 09:42 PM ---------- Previous update was at 08:20 PM ----------

I guess I did not make myself clear enough here.

Anyway, here is how far I could push this ...

cat $LOGFILE | grep -i 'Deleting Text' | cut -d'(' -f2 |  sed -e 's/because a number wasnt //g' | sed -e 's/ because the number wasnt //g' | sed -e 's/and is not found in the database//g'
ESN: 607765B8577434) "PROVISIONED" 
ESN: 877765B8343429) "PROVISIONED" 
ESN: 3214567B8765430) "AVAILABLE" 
ESN: 7683567Z8776055) "AVAILABLE" 
ESN: 3213468Z4735412) "AVAILABLE" 
ESN: 5754567B8765430) "TURNED ON" 
ESN: 0334567B8765430) "TURNED ON" 
ESN: 8213467B8765430) "TURNED ON"

Now, from the above lines, how do I get the following -

PROVISIONED - 607765B8577434
PROVISIONED - 877765B8343429

AVAILABLE - 3214567B8765430
AVAILABLE - 7683567Z8776055
AVAILABLE - 3213468Z4735412

TURNED ON - 5754567B8765430
TURNED ON - 0334567B8765430
TURNED ON - 8213467B8765430

Please advise.

regards,
Lee.

sed -n 's/Deleting Text .*ESN: \(.*\)).*"\(.*\)".*/\2 - \1/p' $LOGFILE
awk -F \" '/Deleting/{split($1,a,"[) ]+");print $2 " - " a[7];next}1' infile

Rdcwayx & Chubler_XL,

Thank you for your help. I appreciate it. What you wrote works exactly the way I explained it here, but my explanation was off. Sorry, my bad. Please bear with me.

What I wanted was something like a random sample of data for each type of error message along with 3 examples of ESN related to that specific error message. When I say Error Type I mean, each specific error type that you see here, namely, PROVISIONED, AVAILABLE, TURNED ON, LOGGED NOR & EARL SIGNO.

Like I said, we have over 10,000 types of error messages and I can't write a grep for each of the 10,000 different Error Message Types and then pipe it to tail -3.

I hope I have explained myself clearly enough this time around.

Please help.

best regards,
Lee.

RAW DATA -

PROVISIONED - 607765B8577434
PROVISIONED - 877765B8343429

AVAILABLE - 3214567B8765430
AVAILABLE - 7683567Z8776055
AVAILABLE - 3213468Z4735412

TURNED ON - 5754567B8765430
TURNED ON - 0334567B8765430
TURNED ON - 8213467B8765430
TURNED ON - 8283277B2992311

LOGGED NOR - 0334567B8765112
LOGGED NOR - 3772372B3345677 
LOGGED NOR - 0817272B4873848
LOGGED NOR - 8388311B1200192
LOGGED NOR - 3723288B8788211

EARL SIGNO - 0334567B8348880
EARL SIGNO - 1226676B1210002
EARL SIGNO - 9923883B8212121
EARL SIGNO - 0232233B1255122
EARL SIGNO - 2377288B1162666

EXPECTED OUTPUT -

PROVISIONED - 607765B8577434
PROVISIONED - 877765B8343429

AVAILABLE - 3214567B8765430
AVAILABLE - 7683567Z8776055
AVAILABLE - 3213468Z4735412

TURNED ON - 5754567B8765430
TURNED ON - 0334567B8765430
TURNED ON - 8213467B8765430

LOGGED NOR - 0334567B8765112
LOGGED NOR - 3772372B3345677 
LOGGED NOR - 0817272B4873848

EARL SIGNO - 9923883B8212121
EARL SIGNO - 0232233B1255122
EARL SIGNO - 2377288B1162666

Head 3 preserving blank lines ...

awk '!$1 || d[$1]++ < 3 {print}' <filename> 

Head 3 losing blank lines ...

awk '$1 && d[$1]++ < 3 {print}' <filename> 
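Note that both one-liners key on $1, which is only the first word of the message, so two types such as "TURNED ON" and "TURNED OFF" would share one counter. Here is a minimal sketch that keys on the whole message instead, assuming every data line has the form MESSAGE - ESN. The function name head3_by_message and the ESN values in the demo are made up for illustration.

```shell
# Hypothetical helper: print the first 3 lines for each full message text.
# Assumes each data line looks like "MESSAGE - ESN"; blank lines are dropped.
head3_by_message() {
    awk -F ' - ' '
        NF < 2 { next }      # skip blank or malformed lines
        count[$1]++ < 3      # $1 is now the whole message, e.g. "TURNED ON"
    ' "$@"
}

# Demo with made-up ESNs: two message types that share their first word.
printf '%s\n' \
    'TURNED ON - A1' 'TURNED OFF - B1' 'TURNED ON - A2' \
    'TURNED ON - A3' 'TURNED ON - A4' 'TURNED OFF - B2' |
    head3_by_message
```

With a file instead of stdin it would be called as head3_by_message yourfile.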

Tail 3 losing blank lines is a little more involved (note that the order of output sections is unspecified) ...

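awk '$1 { d[$1]=d[$1] " " $NF }     # collect the ESNs per key ($1 is the first word of the message)
     END {
        for (e in d)
        {
            n=split(d[e], a)
            # print the last 3 ESNs for this key (or all of them if there are fewer than 3)
            for (i=(n<4 ? 1 : n-2); i<=n; i++) print e " - " a[i]
        }
     }' <filename> 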
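
If the order of the sections matters, or the message is more than one word, the tail variant could be sketched as below. It assumes every data line has the form MESSAGE - ESN, and it holds all ESNs in memory until the end of the input. The function name tail3_by_message is made up for illustration.

```shell
# Hypothetical helper: print the last 3 lines for each full message text,
# with the message types in the order they first appear in the input.
tail3_by_message() {
    awk -F ' - ' '
        NF < 2 { next }                        # skip blank or malformed lines
        !($1 in n) { order[++types] = $1 }     # remember first-seen order
        { n[$1]++; esn[$1, n[$1]] = $2 }       # store every ESN per message
        END {
            for (t = 1; t <= types; t++) {
                msg = order[t]
                start = (n[msg] > 3) ? n[msg] - 2 : 1
                for (i = start; i <= n[msg]; i++)
                    print msg " - " esn[msg, i]
            }
        }
    ' "$@"
}
```

Usage would be tail3_by_message yourfile, or as the last stage of a pipeline.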

Regards,
Mark.


Mark,

That worked like a charm.

Many a "THANKS" for the help.

regards,
Lee.

---------- Post updated at 03:31 AM ---------- Previous update was at 03:26 AM ----------

Mark,

Can you please decipher this for me ?

What is this "d[$1]++" doing ?

what is the purpose of using d before incrementing ?

Once again, thanks a million.

regards,
Lee.

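d[$1]++: 

d is an associative array indexed by the first field ($1). d[$1]++ returns how many times that value of $1 has been seen so far (0 the first time) and then adds 1 to it. So d[$1]++ < 3 is true only for the first three lines with a given $1, and only those lines get printed.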
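
To watch the counter at work, here is a small sketch that prints the value the test sees for each line. The function name show_counts and the sample lines are made up for illustration.

```shell
# Hypothetical demo: show what d[$1]++ returns on each input line.
show_counts() {
    awk '{
        before = d[$1]++     # post-increment: yields the old count, then adds 1
        print $1, "count before:", before, "after:", d[$1], (before < 3 ? "printed" : "skipped")
    }' "$@"
}

printf '%s\n' 'AVAILABLE - 1' 'AVAILABLE - 2' 'AVAILABLE - 3' 'AVAILABLE - 4' | show_counts
```

The first three lines report counts 0, 1 and 2 before the increment and so pass the < 3 test. The fourth reports 3 and is skipped.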