How to 'improve' this script and also 'fix' the pattern matching part?

Hi all,

Below is my script. It currently works, but I would like some advice on improving it, and I need some help with the pattern matching.

xx.ksh:

#!/bin/ksh
#
# -------------------------------------------------------------------------------------------------
#
#Fatal NI connect error 12170.
#
#  VERSION INFORMATION:
#        TNS for Linux: Version 11.2.0.4.0 - Production
#        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
#        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
#  Time: 18-OCT-2019 04:33:41
#  Tracing not turned on.
#  Tns error struct:
#    ns main err code: 12535
#
#TNS-12535: TNS:operation timed out
#    ns secondary err code: 12606
#    nt main err code: 0
#    nt secondary err code: 0
#    nt OS err code: 0
#  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=12.123.12.12)(PORT=49931))
#WARNING: inbound connection timed out (ORA-3136)
#Fri Oct 18 04:33:44 2019
#
# -------------------------------------------------------------------------------------------------
#

echo
echo "- Parsing $1 ..."
echo
file_to_parse=$1

cat /dev/null > tmpfile.00
cat /dev/null > tmpfile.01
cat /dev/null > tmpfile.02

grep -in "Fatal NI connect error 12170" ${file_to_parse} | awk -F":" '{ print $1"^"$1+17 }' > tmpfile.00

while read line
do
   start=`echo $line | awk -F"^" '{ print $1 }'`
   end=`echo $line | awk -F"^" '{ print $2 }'`
   sed -n "${start},${end}p" ${file_to_parse} > tmpfile.01
   line_time=`sed -n "7p" tmpfile.01 | sed 's/^ *//;s/ *$//;s/  */ /;'`
   line_client=`sed -n "17p" tmpfile.01 | sed 's/^ *//;s/ *$//;s/  */ /;'`
   line_warning=`grep "^WARNING: inbound connection timed out (ORA-3136)" tmpfile.01`

   if [[ -z "${line_warning}" ]] ; then
      line_warning="NO MATCHING WARNING of ORA-3136"
   fi

   line_detail="${line_time}^${line_client}^${line_warning}"
   echo $line_detail | tee -a tmpfile.02
done < tmpfile.00

So basically I pass a log file and need to search for a block of text.

Fatal NI connect error 12170.

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
  Time: 18-OCT-2019 04:33:41
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535

TNS-12535: TNS:operation timed out
    ns secondary err code: 12606
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=12.123.12.12)(PORT=49931))
WARNING: inbound connection timed out (ORA-3136)

As you can see from the script, I search for the string "Fatal NI connect error 12170." and add 17 to its line number; that range is the block of text that I am after. Then I print this block to a file and grep the three pieces of information that I want: Time, Client address and the WARNING. The WARNING line sometimes exists and sometimes doesn't.

So far, running the script does most of what I wanted.

Sample output below:

Time: 19-OCT-2019 11:00:18^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.46)(PORT=60771))^WARNING: inbound connection timed out (ORA-3136)
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production^Tracing not turned on.^NO MATCHING WARNING of ORA-3136
^Tns error struct:^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 11:08:47^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=53555))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 12:22:21^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.36)(PORT=61857))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 13:51:51^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.36)(PORT=62520))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 15:27:38^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.46)(PORT=62541))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 15:59:01^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.45)(PORT=62200))^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 16:00:57^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=55824))^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 16:00:57^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=55828))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 16:02:33^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=55995))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 16:33:00^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.45)(PORT=62409))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 17:29:40^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.46)(PORT=63168))^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 17:29:42^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.46)(PORT=63176))^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 17:53:52^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.45)(PORT=62812))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 17:55:33^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.46)(PORT=63299))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 18:40:15^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.45)(PORT=63152))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 19:09:19^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=57065))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 19:31:03^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=57160))^WARNING: inbound connection timed out (ORA-3136)
Time: 19-OCT-2019 19:39:20^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=57230))^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 19:39:20^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=57234))^WARNING: inbound connection timed out (ORA-3136)
VERSION INFORMATION:^ TNS for Linux: Version 11.2.0.4.0 - Production^NO MATCHING WARNING of ORA-3136
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production^ns main err code: 12535^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 20:27:18^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.36)(PORT=64251))^NO MATCHING WARNING of ORA-3136
Time: 19-OCT-2019 20:34:50^Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=57593))^NO MATCHING WARNING of ORA-3136

Sometimes the output is malformed in some way: see the lines above that do not start with "Time:". This happens because the log sometimes contains something like the following, which I was not expecting:

Fatal NI connect error 12170.

Fatal NI connect error 12170.

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
  Time: 19-OCT-2019 11:06:04
  Time: 19-OCT-2019 11:06:04
  Tracing not turned on.
  Tracing not turned on.
  Tns error struct:
  Tns error struct:
    ns main err code: 12535
    ns main err code: 12535


TNS-12535: TNS:operation timed out
TNS-12535: TNS:operation timed out
    ns secondary err code: 12606
    ns secondary err code: 12606
    nt main err code: 0
    nt main err code: 0
    nt secondary err code: 0
    nt secondary err code: 0
    nt OS err code: 0
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.37)(PORT=53542))
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.37)(PORT=53538))
WARNING: inbound connection timed out (ORA-3136)
Sat Oct 19 11:08:47 2019

Or something like below:

Fatal NI connect error 12170.
Sat Oct 19 20:05:49 2019


***********************************************************************

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production

Fatal NI connect error 12170.
  Time: 19-OCT-2019 20:05:49
  Tracing not turned on.

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
  Tns error struct:
  Time: 19-OCT-2019 20:05:49
    ns main err code: 12535

  Tracing not turned on.
TNS-12535: TNS:operation timed out
  Tns error struct:
    ns secondary err code: 12606
    ns main err code: 12535
    nt main err code: 0

    nt secondary err code: 0
TNS-12535: TNS:operation timed out
    nt OS err code: 0
    ns secondary err code: 12606
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.22.36)(PORT=64132))
    nt main err code: 0
    nt secondary err code: 0
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.22.36)(PORT=64136))
WARNING: inbound connection timed out (ORA-3136)
Sat Oct 19 20:10:26 2019

As a workaround for these malformed blocks, I am changing the script so that a block is excluded if its 7th line does not match "^Time". I can't find a way to match a block of text running from "Fatal NI connect error 12170" to "WARNING: inbound connection timed out (ORA-3136)". Unfortunately, the block does not always end with "WARNING: inbound connection timed out (ORA-3136)"; sometimes the last line is "Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.22.36)(PORT=64136))" instead, and HOST and PORT change from block to block.

Below is the script with the check for the malformed blocks (the check_line_time test inside the loop).

echo
echo "- Parsing $1 ..."
echo
file_to_parse=$1

cat /dev/null > tmpfile.00
cat /dev/null > tmpfile.01
cat /dev/null > tmpfile.02

grep -in "Fatal NI connect error 12170" ${file_to_parse} | awk -F":" '{ print $1"^"$1+17 }' > tmpfile.00

while read line
do
   start=`echo $line | awk -F"^" '{ print $1 }'`
   end=`echo $line | awk -F"^" '{ print $2 }'`
   sed -n "${start},${end}p" ${file_to_parse} > tmpfile.01
   line_time=`sed -n "7p" tmpfile.01 | sed 's/^ *//;s/ *$//;s/  */ /;'`

   check_line_time=`echo ${line_time} | grep "^Time"`
   if [[ -z "${check_line_time}" ]] ; then      # It's a malform then ???
      continue
   fi

   line_client=`sed -n "17p" tmpfile.01 | sed 's/^ *//;s/ *$//;s/  */ /;'`
   line_warning=`grep "^WARNING: inbound connection timed out (ORA-3136)" tmpfile.01`

   if [[ -z "${line_warning}" ]] ; then
      line_warning="NO MATCHING WARNING of ORA-3136"
   fi

   line_detail="${line_time}^${line_client}^${line_warning}"
   echo $line_detail | tee -a tmpfile.02
done < tmpfile.00

Anyway, please advise if there is any better way of doing this the way it is now.

Your script seems a bit intricate. E.g. when using the > redirection, you don't need to create / truncate the file upfront. And, with that many temp files, there must be a better approach. How far would

sed -n '/Fatal NI connect error 12170./,/WARNING: inbound connection timed out (ORA-3136)/p;' file

get you for the first part of your task, in the good case?

Now, if the "WARNING" line is missing sometimes, but the "client address" line is always there, why not use that and add a condition to print the "WARNING" line individually? Is the "WARNING" line always immediately following the "Client" line?
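To see why the "Client" line may be the safer end marker, here is a small sketch. The two-block sample is made up for illustration, and range_to_warning and range_to_client are hypothetical helper names, not part of your script. It relies on one fact about sed address ranges: once a range has started, it keeps going until the end pattern matches, even if that is several blocks later.

```shell
# Illustrative helpers (names are made up for this sketch).
range_to_warning() {
    sed -n '/Fatal NI connect error 12170\./,/WARNING: inbound connection timed out (ORA-3136)/p' "$1"
}
range_to_client() {
    sed -n '/Fatal NI connect error 12170\./,/Client address:/p' "$1"
}

# Made-up sample: the first block has no WARNING line, the second has one.
sample=$(mktemp) || exit 1
cat > "$sample" <<'EOF'
Fatal NI connect error 12170.
  Time: 19-OCT-2019 15:59:01
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.45)(PORT=62200))
Sat Oct 19 16:00:57 2019
Fatal NI connect error 12170.
  Time: 19-OCT-2019 16:00:57
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=55828))
WARNING: inbound connection timed out (ORA-3136)
EOF

# Ending on WARNING: a single range runs through both blocks (all 8 lines),
# including the date line that sits between them.
range_to_warning "$sample" | wc -l

# Ending on Client: two separate 3-line ranges (6 lines), date line left out.
range_to_client "$sample" | wc -l

rm -f "$sample"
```

With the WARNING line as the end marker, a block that lacks it silently merges into the next one, so any fixed line positions after that are off.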

And it looks like your logger sometimes misbehaves by mixing the logs of two independent events. If you can't remedy that at the source, you'll need additional coding on the receiving side, and that can't easily be done in sed; it needs tools like awk, perl, or similar. Are the log lines in relative order, i.e. does the second log line consistently belong to the second event?

Hi RudiC

Thanks for your reply. Very helpful as usual. Yes, I shouldn't need to pre-create the temp files; I was just being paranoid, I guess. I've since changed the script to use $$ in the file names as well, since I need to run it against several log files.
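For what it's worth, one alternative to $$ in the file names is a private scratch directory per run. This is only a sketch: it assumes mktemp is available (it is not part of POSIX, though most Linux systems ship it), and make_workdir and the "nilog" prefix are made-up names.

```shell
# make_workdir is a hypothetical helper; "nilog" is just an illustrative prefix.
make_workdir() {
    mktemp -d "${TMPDIR:-/tmp}/nilog.XXXXXX"
}

workdir=$(make_workdir) || exit 1
# Remove the whole scratch directory when the script exits.
trap 'rm -rf "$workdir"' EXIT

# Per-run scratch files live under the private directory, so several
# runs against different log files cannot overwrite each other.
blocks="$workdir/blocks"
result="$workdir/result"
: > "$result"
echo "scratch files for this run are in $workdir"
```

The trap also covers the clean-up that would otherwise need an explicit rm at every exit point.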

The sed syntax

sed -n '/Fatal NI connect error 12170./,/WARNING: inbound connection timed out (ORA-3136)/p;' file

sure speeds things up, and it does get most of the pattern matches that I am after.

Yes, the Client line is always there. The WARNING line is there most of the time; I mainly want to include it to check whether the error that comes with the WARNING is always ORA-3136. The HOST and PORT will differ in most cases, though. I tried using

sed -n '/Fatal NI connect error 12170./,/Client/p;' file

and that works fine so I'll stick to doing that.

How do I get it to print only the Time, Client and WARNING lines, though? At the moment I run the same command two or three times, depending on whether I want to include WARNING, grep for Time, Client and WARNING, and then combine the two or three output files.

When you say

, are you referring to the if/then/else check of where I expect the Time, Client or WARNING line to appear? If so, that is because the log is sometimes malformed: when too many of these errors happen at once, they are not printed in the log file in the order that I expect.

Try

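awk '
# Within each block from the "Fatal" header to the "Client" line:
/^ *Fatal/,/^ *Client/  {if (/Time/) printf "%s", $0            # first field, so no leading "^"
                         if (/Client/)  {printf "^%s", $0
                                         getline                 # look at the line after "Client"
                                         if (/WARNING/) printf "^%s\n", $0
                                         else printf "^No Warning encountered\n"
                                        }
                        }
' file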

Doesn't help if the log is messed up with e.g. two events being logged one on top of the other.
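If the interleaved lines do stay in relative order, something along the following lines might cope with them. It is a rough sketch under assumptions the thread has not confirmed: Time lines and Client lines of overlapping events each arrive in event order, and a WARNING line, when present, directly follows the Client line it belongs to. summarise_ni_errors is a made-up name. Time lines are queued and handed out first in, first out, and each Client line waits one line to see whether a WARNING follows.

```shell
# summarise_ni_errors is a made-up name; it reads the log file(s) given as
# arguments, or standard input when there are none.
summarise_ni_errors() {
    awk '
    function trim(s) { sub(/^[ \t]+/, "", s); sub(/[ \t]+$/, "", s); return s }

    # Print one record: oldest queued Time, the pending Client line, warning text.
    function emit(w,    t) {
        t = "Time: unknown"
        if (head < tail) { t = times[++head]; delete times[head] }
        print t "^" client "^" w
        client = ""
    }

    # A Client line is pending: the current line decides the warning field.
    client != "" {
        if (/^WARNING:/) { emit($0); next }
        emit("NO MATCHING WARNING of ORA-3136")
    }

    /Fatal NI connect error 12170/ { expected++; next }             # one more event open
    expected > 0 && /^ *Time:/ { times[++tail] = trim($0); next }    # queue its Time line
    expected > 0 && /^ *Client address:/ { client = trim($0); expected-- }

    END { if (client != "") emit("NO MATCHING WARNING of ORA-3136") }
    ' "$@"
}

# Made-up sample: one clean block without a WARNING, then two overlapping events.
summarise_ni_errors <<'EOF'
Fatal NI connect error 12170.
  Time: 19-OCT-2019 15:59:01
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.45)(PORT=62200))
Sat Oct 19 16:00:57 2019
Fatal NI connect error 12170.
Fatal NI connect error 12170.
  Time: 19-OCT-2019 16:00:57
  Time: 19-OCT-2019 16:00:57
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=55824))
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=11.123.99.37)(PORT=55828))
WARNING: inbound connection timed out (ORA-3136)
EOF
```

On this sample it prints three records, with the WARNING attached to the last Client line only. Which of two overlapping events a WARNING really belongs to cannot be told from the log, so treat that pairing as a guess.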
