Modify log files to get uniq data

asirohi · August 24, 2009, 3:56pm

Hello,

I have a log file with the following output:

LAP.sun5 CC
LAP.sun5 CQ
perl.sun5 CC
perl.sun5 CQ
TSLogger.sun5 CC
TSLogger.sun5 CQ
TSLogger.sun5 KR
WAS.sun5 CC
WAS.sun5 MT
WAS.sun5 CQ

I want the output to look like the following. I tried using awk but could not do it.

LAP.sun5 CC CQ
perl.sun5 CC CQ
TSLogger.sun5 CC CQ KR
WAS.sun5 CC MT CQ

Please help. How do I do that?

Thanks
Adsi

Smiling_Dragon · August 24, 2009, 8:21pm

#!/bin/sh
for entry in `awk '{ print $1 }' < your.log.file | sort -u`
do
  string=""
  for result in `egrep "^$entry " your.log.file | awk '{ print $2 }'`
  do
    # append, so values stay in file order (prepending would reverse them)
    string="$string $result"
  done
  echo "$entry$string"
done

The exact grep syntax depends on which grep you have available: egrep on Solaris, grep -E on most others, plain grep on the rest.

durden_tyler · August 24, 2009, 10:53pm

$ 
$ cat data.txt
LAP.sun5 CC
LAP.sun5 CQ
perl.sun5 CC
perl.sun5 CQ
TSLogger.sun5 CC
TSLogger.sun5 CQ
TSLogger.sun5 KR
WAS.sun5 CC
WAS.sun5 MT
WAS.sun5 CQ
$ 
$ awk '{if (prev == ""){printf("%s",$0)}
        else if($1 == prev){printf(" %s",$2)}
             else {printf("\n%s",$0)} prev=$1
       } END {printf "\n"}' data.txt
LAP.sun5 CC CQ
perl.sun5 CC CQ
TSLogger.sun5 CC CQ KR
WAS.sun5 CC MT CQ
$ 
$

tyler_durden

asirohi · August 25, 2009, 9:28am

Hello Tyler,

Thanks for the wonderful explanation, that was superb. I have a question about the if (prev == "") case: what do I do if I don't want to display anything? I ask because in the log files there are $1 fields that occur only once, and I don't want to display them. For example:

Linix CQ
LAP.sun5 CC
LAP.sun5 CQ
perl.sun5 CC
perl.sun5 CQ
TSLogger.sun5 CC
TSLogger.sun5 CQ
TSLogger.sun5 KR
WAS.sun5 CC
WAS.sun5 MT
WAS.sun5 CQ
Sun.log CC

So the output should be as below, leaving out any $1 field that occurs only once.

LAP.sun5 CC CQ
perl.sun5 CC CQ
TSLogger.sun5 CC CQ KR
WAS.sun5 CC MT CQ

$ awk '{if (prev == ""){printf("%s",$0)}
        else if($1 == prev){printf(" %s",$2)}
             else {printf("\n%s",$0)} prev=$1
       } END {printf "\n"}' data.txt

danmero · August 25, 2009, 11:44am

A different approach

awk 'NF{_[$1]=$0}END{for(i in _)print _[i]}' file | sort

asirohi · August 25, 2009, 12:28pm

Hello Danmero,

That does not work. Did it work for you?

danmero · August 25, 2009, 12:41pm

Works for me. Try GNU awk (gawk), new awk (nawk), or POSIX awk (/usr/xpg4/bin/awk).

asirohi · August 25, 2009, 1:05pm

$$$$$

vgersh99 · August 25, 2009, 1:17pm

nawk '
  { a[$1]=($1 in a)?a[$1] OFS $NF:$0}
  END {
    for(i in a)
      if(split(a[i],t, OFS)>2) print a[i]
  }' myFile

asirohi · August 25, 2009, 6:01pm

$$$$$$

vgersh99 · August 25, 2009, 7:09pm

asirohi:

Can you please explain what is happening in the above code?
nawk '
  { a[$1]=($1 in a)?a[$1] OFS $NF:$0}
  END {
   for(i in a)
   if(split(a[i],t, OFS)>2) print a[i]
  }' myFile
Thanks
Adsi

Do you know how 'associative arrays' work in awk?
If not, try reading up starting with 'man nawk' first.

asirohi · August 25, 2009, 7:43pm

Hello,

I understand what an associative array is. But if you could tell me what is happening in the line below, it would be great.

{ a[$1]=($1 in a)?a[$1] OFS $NF:$0}

vgersh99 · August 26, 2009, 11:21am

one can re-write the above as:

if ($1 in a)
   a[$1]=a[$1] OFS $NF
else
   a[$1]=$0

Is this clearER?
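
If it helps to watch it happen, here is a small sketch that runs the expanded form and prints the array entry after every input line. The `trace_accumulate` name and the trace format are made up for illustration, and it uses plain `awk` rather than nawk.

```shell
#!/bin/sh
# trace_accumulate: hypothetical helper showing how a[$1] grows line by line.
trace_accumulate() {
  awk '{
    if ($1 in a)
      a[$1] = a[$1] OFS $NF     # key seen before: append the last field
    else
      a[$1] = $0                # first time for this key: store the whole line
    print NR ": a[" $1 "] = \"" a[$1] "\""
  }' "$@"
}

printf 'WAS.sun5 CC\nWAS.sun5 MT\nWAS.sun5 CQ\n' | trace_accumulate
# 1: a[WAS.sun5] = "WAS.sun5 CC"
# 2: a[WAS.sun5] = "WAS.sun5 CC MT"
# 3: a[WAS.sun5] = "WAS.sun5 CC MT CQ"
```

So by the time END runs, each entry holds the key followed by all of its values, and the split() count tells you how many words that is.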