Help/advice on parsing these lines of text

Hi,

Can anyone please advise how to parse the following lines?

14-OCT-2012 06:38:59 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1552)) * establish * test * 0
14-OCT-2012 06:39:15 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\RWRBE60.exe)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1574)) * establish * test * 0
14-OCT-2012 06:40:48 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8200XXX138060Z)(USER=mouse))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.217.35.94)(PORT=2525)) * establish * test * 0
14-OCT-2012 07:01:04 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=server911)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=test)(VERSION=135296000)) *
status * 0
  • I want to parse or deconstruct each line so that I can get the HOST, PROGRAM and USER values.

  • As far as I can tell, awk's -F and cut only accept a single delimiter, so I am lost on how to parse these strings.

  • Any feedback or advice is much appreciated. Thanks in advance.

You posted the same question again!

Please don't do that.

awk -F'[=|\(|\)]' '{
 for(i=1;i<NF;i++) {
  if($i=="PROGRAM") p=$(i+1);
  if($i=="HOST") h=$(i+1);
  if($i=="USER") u=$(i+1);
 }
 printf "Program: %s Host: %s User: %s\n", p,h,u;
} ' file
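
If it helps to see why the loop works, here is a minimal sketch that prints every non-empty field of one sample line together with its index. It assumes an awk that accepts a regular expression as the field separator (nawk, gawk or mawk), and it writes the separator as [=()], which should match the same three characters on this data. The variable names line and fields are only for illustration.

```shell
# One sample line from the log above; single quotes keep the backslashes literal.
line='14-OCT-2012 06:38:59 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1552)) * establish * test * 0'

# Split on "=", "(" and ")" and print each non-empty field with its index.
fields=$(printf '%s\n' "$line" | awk -F'[=()]' '{
  for (i = 1; i <= NF; i++)
    if ($i != "") printf "%d:%s\n", i, $i
}')
printf '%s\n' "$fields"
```

Each key (PROGRAM, HOST, USER) lands in one field and its value in the next one, which is why the script reads $(i+1).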

Sorry Pikk45,

I thought they were different questions, so I posted twice. The first one was from when I could not even read a single line of the file.

My apologies.

---------- Post updated at 09:46 PM ---------- Previous update was at 09:39 PM ----------

Hi,

I tried your suggestion and it does not print anything.

 
$: cat x
14-OCT-2012 06:38:59 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1552)) * establish * test * 0
14-OCT-2012 06:39:15 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\RWRBE60.exe)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1574)) * establish * test * 0
14-OCT-2012 06:40:48 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8200XXX138060Z)(USER=mouse))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.217.35.94)(PORT=2525)) * establish * test * 0
14-OCT-2012 07:01:04 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=server911)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=test)(VERSION=135296000)) *
status * 0
$: awk -F'[=|\(|\)]' '{
>  for(i=1;i<NF;i++) {
>   if($i=="PROGRAM") p=$(i+1);
>   if($i=="HOST") h=$(i+1);
>   if($i=="USER") u=$(i+1);
>  }
>  printf "Program: %s Host: %s User: %s\n", p,h,u;
> } ' x
Program:  Host:  User:
Program:  Host:  User:
Program:  Host:  User:
Program:  Host:  User:
Program:  Host:  User:
$:
 

I am on Solaris. I also tried setting FS, as shown below:

 
$: awk 'BEGIN { FS="(PROGRAM=" } { print $2 }' x
CONNECT_DATA=
CONNECT_DATA=
CONNECT_DATA=
CONNECT_DATA=
 
  • Looks like my awk doesn't like multiple delimiters :(

---------- Post updated at 09:48 PM ---------- Previous update was at 09:46 PM ----------

Hi,

I didn't realize I can simply run awk on the file and don't need to read it line by line.

Looks like my first post was not necessary at all.

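On Solaris or SunOS, use nawk. The default /usr/bin/awk there is the old awk, which does not treat the field separator as a regular expression.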
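
If the same script has to run on more than one system, something like the sketch below may help. It assumes that nawk or /usr/xpg4/bin/awk is present on Solaris and that plain awk is good enough elsewhere. The variable names AWK and result are made up for this example.

```shell
# Pick an awk that accepts a regular expression as the field separator.
# Assumption: Solaris provides nawk and/or /usr/xpg4/bin/awk.
if command -v nawk >/dev/null 2>&1; then
  AWK=nawk
elif [ -x /usr/xpg4/bin/awk ]; then
  AWK=/usr/xpg4/bin/awk
else
  AWK=awk
fi

# Quick check: split a small sample on "=", "(" and ")" and take the value.
result=$(printf '%s\n' '(USER=mickey)' | "$AWK" -F'[=()]' '{ print $3 }')
echo "$AWK -> $result"
```

If the check prints an empty value, the chosen awk is probably not splitting on the bracket expression.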

Thanks a lot, using nawk works like a charm.

---------- Post updated at 10:05 PM ---------- Previous update was at 09:56 PM ----------

Hi,

How do I include the timestamp in the printf output? Also, if the program is null/zero-length, I would like to print LOCAL instead.

Currently the output is as below:

Program: Z:\Ora6i\BIN\ifrun60.EXE Host: 11.90.24.239 User: mickey
Program: Z:\Ora6i\BIN\RWRBE60.exe Host: 11.90.24.239 User: mickey
Program: Z:\Ora6i\BIN\ifrun60.EXE Host: 11.217.35.94 User: mouse
Program:  Host: server911 User: oracle

How can I make it look like the output below?

14-OCT-2012 06:38:59 - Program: Z:\Ora6i\BIN\ifrun60.EXE Host: 11.90.24.239 User: mickey
14-OCT-2012 06:39:15 - Program: Z:\Ora6i\BIN\RWRBE60.exe Host: 11.90.24.239 User: mickey
14-OCT-2012 06:40:48 - Program: Z:\Ora6i\BIN\ifrun60.EXE Host: 11.217.35.94 User: mouse
14-OCT-2012 07:01:04 - Program: LOCAL Host: server911 User: oracle

Thanks again for your help.

nawk -F'[=|\(|\)]' '/^[0-9]/{
 for(i=1;i<NF;i++) {
  d=$1; sub(/\*/,"",d);
  if($i=="PROGRAM") { p=$(i+1); p=(p=="")?"LOCAL":p; }
  if($i=="HOST")  h=$(i+1);
  if($i=="USER") u=$(i+1);
 }
 printf "%s - Program: %s Host: %s User: %s\n", d,p,h,u; d=p=h=u="";;
} ' file

Hi,

Marvelous! It works like a charm. I had been trying to work out how it should be done. Can you point me to a link where I can try some more examples like this? It is becoming a lot of fun.

$: cat x
14-OCT-2012 06:38:59 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1552)) * establish * test * 0
14-OCT-2012 06:39:15 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\RWRBE60.exe)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1574)) * establish * test * 0
14-OCT-2012 06:40:48 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8200XXX138060Z)(USER=mouse))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.217.35.94)(PORT=2525)) * establish * test * 0
14-OCT-2012 07:01:04 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=server911)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=test)(VERSION=135296000)) * status * 0
$: nawk -F'[=|\(|\)]' '/^[0-9]/{
>  for(i=1;i<NF;i++) {
>   d=$1; sub(/\*/,"",d);
>   if($i=="PROGRAM") { p=$(i+1); p=(p=="")?"LOCAL":p; }
>   if($i=="HOST")  h=$(i+1);
>   if($i=="USER") u=$(i+1);
>  }
>  printf "%s - Program: %s Host: %s User: %s\n", d,p,h,u; d=p=h=u="";;
> } ' x
14-OCT-2012 06:38:59   - Program: Z:\Ora6i\BIN\ifrun60.EXE Host: 11.90.24.239 User: mickey
14-OCT-2012 06:39:15   - Program: Z:\Ora6i\BIN\RWRBE60.exe Host: 11.90.24.239 User: mickey
14-OCT-2012 06:40:48   - Program: Z:\Ora6i\BIN\ifrun60.EXE Host: 11.217.35.94 User: mouse
14-OCT-2012 07:01:04   - Program: LOCAL Host: server911 User: oracle
  • Do you mind explaining how it works, or checking whether I understand it correctly below?

nawk -F'[=|\(|\)]' '/^[0-9]/{

  • Does this line specify the delimiters? I am not sure what /^[0-9]/ means, though.

  • The block of code below parses each line, correct?
    for(i=1;i<NF;i++) {
    d=$1; sub(/\*/,"",d); <-- this removes the * (asterisk)?
    if($i=="PROGRAM") { p=$(i+1); p=(p=="")?"LOCAL":p; } <-- this checks whether p is blank and, if so, sets p=LOCAL.
    if($i=="HOST") h=$(i+1);
    if($i=="USER") u=$(i+1);
    }
    printf "%s - Program: %s Host: %s User: %s\n", d,p,h,u; d=p=h=u="";;
    }

  • Thanks a lot again, you've been very helpful.

-F'[=|\(|\)]' - Specifies = ( ) as field separators. Since ( ) are meta-characters they should be escaped.

/^[0-9]/ - Processes only lines that start with a digit. I added this pattern because I saw a different kind of line: status * 0

You are absolutely correct about the rest of the code.
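
To put the explanation in one place, here is a commented sketch of the same idea. It is not the exact script from above: it takes the timestamp once per line instead of inside the loop, also trims the spaces around the asterisk, and writes the separator as [=()]. It assumes a regex-capable awk (nawk on Solaris) and that mktemp is available. The names sample and out are only for illustration.

```shell
# Two lines from the log above, plus the stray continuation line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
14-OCT-2012 06:38:59 * (CONNECT_DATA=(SID=test)(GLOBAL_NAME=test.mydb.com.ch)(CID=(PROGRAM=Z:\Ora6i\BIN\ifrun60.EXE)(HOST=8000XXX05004RV)(USER=mickey))) * (ADDRESS=(PROTOCOL=tcp)(HOST=11.90.24.239)(PORT=1552)) * establish * test * 0
14-OCT-2012 07:01:04 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=server911)(USER=oracle))(COMMAND=status)(ARGUMENTS=64)(SERVICE=test)(VERSION=135296000)) *
status * 0
EOF

out=$(awk -F'[=()]' '
/^[0-9]/ {                       # skip lines such as "status * 0"
  d = $1                         # everything before the first "("
  sub(/ *\* *$/, "", d)          # drop the trailing " * " after the timestamp
  p = h = u = ""                 # reset for every line
  for (i = 1; i < NF; i++) {     # a value is the field right after its key
    if ($i == "PROGRAM") p = $(i + 1)
    if ($i == "HOST")    h = $(i + 1)   # the last HOST on the line wins
    if ($i == "USER")    u = $(i + 1)
  }
  if (p == "") p = "LOCAL"       # print LOCAL when PROGRAM is empty
  printf "%s - Program: %s Host: %s User: %s\n", d, p, h, u
}' "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

Because the last HOST wins, lines with an ADDRESS part report the address host rather than the one inside CID, which matches the output shown earlier.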

Do these characters have to be escaped in awk? I don't think so. :)

You are right: inside square brackets, the standard meta-characters lose their special meaning. Outside brackets, they must be escaped. Here is an example:

$ cat infile
bipin(ajith

$ awk -F\( '{ print $1 }' infile
bipin

$ awk -F'[(]' '{ print $1 }' infile
bipin

$ awk -F( '{ print $1 }' infile
sh: Syntax error: `(' is not expected.

I hope that makes it clear.
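
One more small sketch along the same lines, assuming a regex-capable awk. The input line is made up for this example (it is not from the log) to show that a | inside square brackets is a literal pipe character and not alternation.

```shell
# Hypothetical input, only to illustrate the point about brackets.
line='(USER=mickey|mouse)'

# "|" listed inside the brackets: the pipe is one more separator character.
with_pipe=$(printf '%s\n' "$line" | awk -F'[=()|]' '{ print $3 "," $4 }')

# "|" not listed: the pipe stays part of the value.
without_pipe=$(printf '%s\n' "$line" | awk -F'[=()]' '{ print $3 }')

echo "with pipe as separator:    $with_pipe"
echo "without pipe as separator: $without_pipe"
```

So in a separator like [=|\(|\)], the | characters are just extra separator characters, and on this log they make no difference because the data contains no pipes.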
