Counting similar lines

marcinnnn · February 1, 2010, 1:07pm

Hi,

I have a small problem with counting lines. I have read similar topics on this forum, but they don't solve my problem. I have a file with lines like this:

2009-05-25 16:55:32,143 some text some regular expressions ect.
2009-05-25 16:55:32,144 some text.
2009-05-28 18:15:12,148 some error logs.
2009-05-28 18:15:12,148 some error logs.
2009-05-28 18:15:12,149 some error logs.
2009-05-28 18:15:12,150 some error logs.
2009-06-25 16:55:32,222 some text some regular expressions ect.
2009-06-25 16:55:32,223 some text some regular expressions ect.

The time stamps can differ, so I want to count only the lines whose message text is the same. I found this on the forum:

#!/bin/ksh

cnt=0
while read line
do
   # set the variable contents
   ((cnt+=1))
   eval a${cnt}="\$line"

   if [[ $cnt -eq 1 ]]
   then
      continue
   fi

   i=0
   ((snum=cnt-1))
   v2=$(eval echo \"\$a$cnt\")
   while (($i < $snum))
   do
      ((i+=1))
      v1=$(eval echo \"\$a${i}\")

      # skip the line if it's equal to another
      if [[ "$v1" = "$v2" ]]
      then
         ((cnt-=1))
      fi
   done
done < my_file.log

# display the variables a1, a2,...a10
x=1
while (($x <= $cnt))
do
   eval echo \"\$a${x}\"
   ((x+=1))
done

But how do I cut off the first two columns, given that each line can have a different number of words?

Thanks in advance!

jim_mcnamara · February 1, 2010, 1:53pm

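awk ' { tmp=""
           # join fields 3..NF, skipping the date and time columns
           for(i=3; i<=NF; i++) {
               tmp=(i==3 ? $i : tmp " " $i)
           }
           arr[tmp]++
        }
        END {
            # print each distinct message with its count
            for(i in arr) {
              print i, arr[i]
            }
        } ' inputfile > outputfile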

Try something like this
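
A minimal sketch of the same counting idea, assuming the first two whitespace-separated columns are always the date and the time. Rather than rebuilding the message field by field, it deletes the two leading columns with sub(), so the spacing inside the message is kept as it was. The function name count_messages and the final sort are illustrative choices, not something from the thread.

```shell
#!/bin/sh
# count_messages (hypothetical name): read log lines on stdin and print
# "<count> <message>" for every distinct message, smallest count first.
# Assumption: columns 1 and 2 are the date and the time.
count_messages() {
    awk '{
        # delete the first two columns and the blanks that follow them
        sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]*/, "")
        count[$0]++
    }
    END {
        for (msg in count)
            print count[msg], msg
    }' | sort -k1,1n -k2
}

# Example run on a few of the sample lines from the first post
count_messages <<'EOF'
2009-05-25 16:55:32,144 some text.
2009-05-28 18:15:12,148 some error logs.
2009-05-28 18:15:12,149 some error logs.
2009-06-25 16:55:32,222 some text some regular expressions ect.
EOF
```

For a real log the call would be something like count_messages < my_file.log.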

Durimar · February 2, 2010, 12:54pm

I modified the script to cut the time stamp from the front of the message:

#!/bin/ksh
cnt=0
while read line
do
   # set the variable contents
   ((cnt+=1))
   # strip everything through the first comma and the milliseconds after it
   SHORTLN=$(echo "$line" | sed 's/^[^,]*,[0-9]*//')
   eval a${cnt}="\$SHORTLN"
   # Added for test
   echo "After eval 1 a${cnt}"
   if [[ $cnt -eq 1 ]]
   then
      continue
   fi
   
   i=0
   ((snum=cnt-1))
   v2=$(eval echo \"\$a$cnt\" )
   #Added for test
   echo "v2 set to $v2"
   while (($i < $snum))
   do
      ((i+=1))
      v1=$(eval echo \"\$a${i}\")
      #Added for test
      echo "v1 set to $v1"
      
      # skip the line if it's equal to another
      if [[ "$v1" = "$v2" ]]
      then
         ((cnt-=1))
      fi
   done
done < my_file.log
# display the variables a1, a2,...a10
x=1
while (($x <= $cnt))
do
   eval echo \"\$a${x}\"
   ((x+=1))
done
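
As an alternative to calling sed once per line, the time stamp can also be removed with plain shell parameter expansion. This is only a sketch, assuming the date and the time are the first two columns and are separated by single spaces; strip_timestamp is a made-up name.

```shell
#!/bin/sh
# strip_timestamp (hypothetical name): print only the message part of each
# log line read from stdin.
# Assumption: "date time message...", separated by single spaces.
strip_timestamp() {
    while IFS= read -r line
    do
        msg=${line#* }    # drop the shortest prefix ending in a space (the date)
        msg=${msg#* }     # drop the next such prefix (the time)
        printf '%s\n' "$msg"
    done
}

# Example: a message that itself contains a comma is left intact
printf '%s\n' '2009-05-28 18:15:12,148 error, code 42' | strip_timestamp
```

The stripped lines can then be counted with strip_timestamp < my_file.log | sort | uniq -c.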

rdcwayx · February 2, 2010, 6:59pm

Try this:

$ awk '{$1=$2=""}1' urfile |sort |uniq -c |sort -n
      1   some text.
      3   some text some regular expressions ect.
      4   some error logs.

daptal · February 2, 2010, 7:35pm

cut -f3- -d ' ' abc.txt | sort | uniq -c

HTH,
PL
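
A small sketch of the same cut pipeline wrapped in a function, with the most frequent message listed first. It assumes the columns are separated by exactly one space, because cut -d ' ' treats every single space as a field boundary. The name top_messages is illustrative only.

```shell
#!/bin/sh
# top_messages (hypothetical name): read log lines on stdin, drop the first
# two space-separated columns, and list distinct messages by descending count.
# Assumption: exactly one space between the date, the time and the message.
top_messages() {
    cut -d ' ' -f3- | sort | uniq -c | sort -rn
}

# Example run on a few of the sample lines from the first post
top_messages <<'EOF'
2009-05-25 16:55:32,144 some text.
2009-05-28 18:15:12,148 some error logs.
2009-05-28 18:15:12,149 some error logs.
2009-05-28 18:15:12,150 some error logs.
EOF
```

The width of the count column printed by uniq -c differs between implementations, so scripts that parse the output should not rely on a fixed amount of leading padding.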