Read first column and count lines in second column using awk

Padavan · October 24, 2015, 2:14pm

Hello all,

I would like to ask your help here:

I've a huge file that has 2 columns. A part of it is:

sorted.txt:

kss23 rml.67lkj
kss23 zhh.6gf
kss23 nhd.09.fdd
kss23 hp.767.88.89
fl67 nmdsfs.56.df.67
fl67 kk.fgf.98.56.n
fl67 bgdgdfg.hjj.879.d
fl66 kl..hfh.76.ghg
fl66 loedg.fdgdfg.hdfh
fl66 pi.ccxb.879..fh
fl66 jy.dggdg.8.76.436.dgdf
fl66 rt.dgd.577
nk45 uyfdhgfh.36.65
nk45 ihddfsdg.346

I want to count how many lines share each first-column value. The output should look like this:

kss23 4
fl67 3
fl66 5
nk45 2

So I would like an awk command that reads all the first-column values dynamically, counts the second-column lines for each one, and prints the result as shown.

Please help. Thanks in advance !

Scrutinizer · October 24, 2015, 2:23pm

Hi, try something like:

awk '{T[$1]++} END{for(i in T) print i,T[i]}' file

or

awk '$1!=p{if(NR>1)print p,t; t=0; p=$1}{t++} END{print p,t}' file
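For reference, here is a minimal sketch of the array approach run against the sample rows from the question. The `for (i in T)` loop visits keys in an unspecified order, so this variant also records the order in which keys first appear. The names `order`, `n`, `counts` and the temporary file are illustrative assumptions, not part of the suggestions above.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
kss23 rml.67lkj
kss23 zhh.6gf
kss23 nhd.09.fdd
kss23 hp.767.88.89
fl67 nmdsfs.56.df.67
fl67 kk.fgf.98.56.n
fl67 bgdgdfg.hjj.879.d
fl66 kl..hfh.76.ghg
fl66 loedg.fdgdfg.hdfh
fl66 pi.ccxb.879..fh
fl66 jy.dggdg.8.76.436.dgdf
fl66 rt.dgd.577
nk45 uyfdhgfh.36.65
nk45 ihddfsdg.346
EOF

# Count lines per first-column value.
# order[] remembers each key the first time it is seen, so the END loop
# can print in input order instead of awk's unspecified array order.
counts=$(awk '!($1 in T) { order[++n] = $1 }
              { T[$1]++ }
              END { for (k = 1; k <= n; k++) print order[k], T[order[k]] }' "$tmp")

printf '%s\n' "$counts"
rm -f "$tmp"
```

Unlike the second one-liner, this does not require the input to be grouped by the first column, at the cost of holding one array entry per distinct key in memory.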

Padavan · October 24, 2015, 2:52pm

Hi Scrutinizer,

Unfortunately, none of your solution works. Any other suggestion?

Thank you in advance for your kind help.

jgt · October 24, 2015, 4:46pm

count=0
#gt=0
first_sw="Y"
prev_code=""
while read col1 col2
do
if [ "$first_sw" = "Y" ]
then
    prev_code=$col1
    first_sw="N"
fi
if [ "$col1" != "$prev_code" ]
then
    echo  $prev_code $count
    #gt=`expr $gt + $count`
    count=0
    prev_code=$col1
fi
count=`expr $count + 1`
done <sorted.txt
echo $prev_code $count
#gt=`expr $gt + $count`
#echo total records processed $gt

Elapsed time 3 minutes.
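One likely contributor to that run time is that `expr` inside backticks starts a separate process for every input line. A hedged sketch of the same loop using the shell's built-in arithmetic expansion follows; the function name `count_groups` and the small stand-in file are assumptions for illustration, and no timing was measured for it.

```shell
# Small stand-in for sorted.txt (assumption: input is already grouped by column 1).
tmp=$(mktemp)
printf '%s\n' 'kss23 a' 'kss23 b' 'fl67 c' 'nk45 d' 'nk45 e' > "$tmp"

# Print "key count" for each run of identical first-column values in file $1.
count_groups() {
    count=0
    prev_code=
    while read -r col1 col2
    do
        # A change of key closes the previous group (skipped on the first line).
        if [ -n "$prev_code" ] && [ "$col1" != "$prev_code" ]
        then
            echo "$prev_code $count"
            count=0
        fi
        prev_code=$col1
        count=$((count + 1))    # built-in arithmetic, no external expr process
    done < "$1"
    # Print the final group, unless the file was empty.
    if [ -n "$prev_code" ]
    then
        echo "$prev_code $count"
    fi
}

result=$(count_groups "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Even so, a shell `while read` loop is usually much slower than a single awk process on a huge file.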

Don_Cragun · October 24, 2015, 5:59pm

Telling us "none of your solution works" without telling us how they don't work gives us no way to help resolve your problem.

If Scrutinizer's suggestions printed diagnostic messages, what were they?

If Scrutinizer's suggestions produced no diagnostics and produced output different from what you wanted, what output did they produce?

Whenever you ask for help in these forums, it helps us help you if you tell us what operating system and shell you're using. If you're using a Solaris/SunOS system, Scrutinizer would have told you to change awk in both of his suggestions to /usr/xpg4/bin/awk or nawk. What operating system and shell are you using?
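As a rough sketch of that advice, a script could choose the awk binary based on what `uname -s` reports. The variable name `AWK` and the three-line test input are assumptions for illustration; the paths are the ones named above.

```shell
# Default to plain awk; on SunOS prefer /usr/xpg4/bin/awk, falling back to nawk.
AWK=awk
if [ "$(uname -s)" = "SunOS" ]; then
    if [ -x /usr/xpg4/bin/awk ]; then
        AWK=/usr/xpg4/bin/awk
    else
        AWK=nawk
    fi
fi

# Run the array-based count with whichever awk was selected.
# sort is only here to make the output order predictable.
result=$(printf 'a x\na y\nb z\n' |
         "$AWK" '{T[$1]++} END{for(i in T) print i, T[i]}' | sort)
printf '%s\n' "$result"
```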

RudiC · October 25, 2015, 5:57am

Would this help:

cut -d" " -f1 file | sort | uniq -c
      5 fl66
      3 fl67
      4 kss23
      2 nk45
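The output above is sorted by key and puts the count first, which differs from the layout requested in the question. Since `uniq -c` counts runs of adjacent identical lines, a file that is already grouped by its first column (as sorted.txt appears to be) can skip the `sort` and keep its original order. A minimal sketch, assuming single-space separators and a small made-up input file:

```shell
# Small stand-in for the input (assumption: already grouped by column 1).
tmp=$(mktemp)
printf '%s\n' 'kss23 a' 'kss23 b' 'fl67 c' 'fl66 d' 'fl66 e' 'fl66 f' > "$tmp"

# cut extracts column 1, uniq -c counts each run of adjacent duplicates,
# and the final awk swaps the fields so the key comes before the count.
ordered=$(cut -d" " -f1 "$tmp" | uniq -c | awk '{print $2, $1}')

printf '%s\n' "$ordered"
rm -f "$tmp"
```

If the same key can reappear later in the file, the `sort` step is still needed, otherwise each run is counted separately.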