Modify log files to get uniq data

Hello,

I have a log file with the following contents:

LAP.sun5 CC
LAP.sun5 CQ
perl.sun5 CC
perl.sun5 CQ
TSLogger.sun5 CC
TSLogger.sun5 CQ
TSLogger.sun5 KR
WAS.sun5 CC
WAS.sun5 MT
WAS.sun5 CQ

I want the output to look like the lines below. I tried using awk but could not get it to work.

LAP.sun5 CC CQ
perl.sun5 CC CQ
TSLogger.sun5 CC CQ KR
WAS.sun5 CC MT CQ 

Please help. How do I do that?

Thanks
Adsi

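#!/bin/sh
# Collect every second-column value for each distinct first-column entry.
for entry in `awk '{ print $1 }' < your.log.file | sort -u`
do
  string=""
  for result in `egrep "^$entry " your.log.file | awk '{ print $2 }'`
  do
    string="$string $result"   # append, so values keep their file order
  done
  echo "$entry$string"
done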

The exact grep syntax depends on the version of grep available: egrep on Solaris, grep -E on most others. The pattern here uses no extended syntax, so plain grep works too.

$ 
$ cat data.txt
LAP.sun5 CC
LAP.sun5 CQ
perl.sun5 CC
perl.sun5 CQ
TSLogger.sun5 CC
TSLogger.sun5 CQ
TSLogger.sun5 KR
WAS.sun5 CC
WAS.sun5 MT
WAS.sun5 CQ
$ 
$ awk '{if (prev == ""){printf("%s",$0)}
        else if($1 == prev){printf(" %s",$2)}
             else {printf("\n%s",$0)} prev=$1
       } END {printf "\n"}' data.txt
LAP.sun5 CC CQ
perl.sun5 CC CQ
TSLogger.sun5 CC CQ KR
WAS.sun5 CC MT CQ
$ 
$ 

tyler_durden

Hello Tyler,

Thanks for the wonderful explanation, that was superb. I have a question: in the if (prev == "") case, what do I do if I don't want to display anything? I ask because the log files contain $1 values that occur only once, and I don't want to display them. For example:

Linix CQ
LAP.sun5 CC
LAP.sun5 CQ
perl.sun5 CC
perl.sun5 CQ
TSLogger.sun5 CC
TSLogger.sun5 CQ
TSLogger.sun5 KR
WAS.sun5 CC
WAS.sun5 MT
WAS.sun5 CQ
Sun.log CC

So the output should be as below, leaving out any $1 value
that occurs only once (Linix and Sun.log here).

LAP.sun5 CC CQ
perl.sun5 CC CQ
TSLogger.sun5 CC CQ KR
WAS.sun5 CC MT CQ
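
For what it's worth, here is a minimal sketch of one way to drop the entries that occur only once while keeping the same line-by-line style as the prev-based script. It assumes lines with the same $1 are adjacent, as in the sample data, and the function name group_repeated is made up for illustration. Each group is buffered and printed only once it is known to have at least two lines.

```shell
#!/bin/sh
# Hypothetical helper: join the $2 values per $1 and skip keys seen only once.
# Assumption: lines sharing the same $1 are adjacent in the input.
group_repeated() {
  awk '
    !NF { next }                   # ignore blank lines
    $1 != prev {                   # a new key starts here
      if (n > 1) print line        # flush the previous group if it had 2+ lines
      prev = $1; line = $0; n = 1
      next
    }
    { line = line " " $2; n++ }    # same key: append the second field
    END { if (n > 1) print line }  # flush the last group
  ' "$@"
}
```

Usage would be something like group_repeated data.txt. With the twelve-line sample above, the Linix and Sun.log lines are dropped and the other four keys are printed in their original order.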

A different approach

awk 'NF{_[$1]=($1 in _)?_[$1] FS $2:$0}END{for(i in _)print _[i]}' file | sort

Hello Danmer,

That does not work, did it work for you?

Works for me. Try GNU awk (gawk), New awk (nawk), or POSIX awk (/usr/xpg4/bin/awk).

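nawk '
  # Append the last field to the entry stored under key $1, or start a new entry.
  { a[$1]=($1 in a)?a[$1] OFS $NF:$0}
  END {
    # Print only the entries that collected more than one value.
    for(i in a)
      if(split(a[i],t, OFS)>2) print a[i]
  }' myFile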
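
One caveat with this approach: for (i in a) visits the keys in an unspecified order, so the output lines may not come out in the order they appear in the file. Below is a rough sketch of a variant that remembers the first-seen order in a second array. The function name group_repeated_ordered and the helper arrays order and cnt are my own additions for illustration, not part of the script above.

```shell
#!/bin/sh
# Hypothetical variant: same grouping, but output follows first-seen key order.
# Unlike a prev-based script, the input does not need to be grouped by $1.
group_repeated_ordered() {
  awk '
    !NF { next }                              # ignore blank lines
    !($1 in a) {                              # first time this key is seen
      order[++n] = $1; a[$1] = $0; cnt[$1] = 1
      next
    }
    { a[$1] = a[$1] OFS $NF; cnt[$1]++ }      # key seen before: append last field
    END {
      for (k = 1; k <= n; k++)                # walk keys in first-seen order
        if (cnt[order[k]] > 1) print a[order[k]]
    }
  ' "$@"
}
```

Called as group_repeated_ordered myFile, this should make the trailing sort unnecessary when the original file order is what you want.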

Do you know how 'associative arrays' work in awk?
If not, try reading up, starting with 'man nawk'.

Hello,

I understand what an associative array is. But if you could tell me what is happening in the line below, that would be great.

a[$1]=($1 in a)?a[$1] OFS $NF:$0}

One can rewrite the above as:

if ($1 in a)
   a[$1]=a[$1] OFS $NF
else
   a[$1]=$0

Is this clearer?
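
If it helps to see it in motion, here is a small sketch that runs the same if/else on every line and prints what is stored under the key afterwards. The function name trace_join and the printf line are only there for illustration.

```shell
#!/bin/sh
# Hypothetical tracer: apply the if/else from above and show a[$1] after each line.
trace_join() {
  awk '{
    if ($1 in a)
      a[$1] = a[$1] OFS $NF    # key already stored: append the last field
    else
      a[$1] = $0               # first occurrence: store the whole line
    printf "line %d: a[%s] = \"%s\"\n", NR, $1, a[$1]
  }' "$@"
}

printf '%s\n' 'WAS.sun5 CC' 'WAS.sun5 MT' 'WAS.sun5 CQ' | trace_join
# line 1: a[WAS.sun5] = "WAS.sun5 CC"
# line 2: a[WAS.sun5] = "WAS.sun5 CC MT"
# line 3: a[WAS.sun5] = "WAS.sun5 CC MT CQ"
```

The first line of each key takes the else branch, and every later line with the same $1 takes the if branch and grows the stored string by one field.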