How to count occurrences in a specific column

Hi,

I need help counting how many times each value occurs in $3 of file1.txt. So far I only know how to count one value at a time, like this:

awk '$3 ~ /aku hanya poyo/ {++c} END {print c}' FS="\t" file1.txt

But this is not practical, as I have hundreds of different values in that column that I need to count. My sample input file is as follows:

file1.txt

123   ghtd   tidak mahu   
645   pled   aku hanya poyo
944   pom    ngeh3
3351  bhg    tidak mahu
5545  polo   ngeh3
4474  klsa   tidak mahu

output

tidak mahu       3
aku hanya poyo   1
ngeh3            2

Thanks in advance.

Is TAB the delimiter in your file?

awk '{sub(" *$","");s=substr($0, index($0,$3));o[s]++;}END { for (i in o) printf("%-20s %d\n", i, o[i]);}' file1.txt

Hi bartus11,

Yes, it is tab-delimited, but each column holds string data.

Hi rdrtx1,
I tried your code, but it didn't work.

Thanks

Try:

awk -F"\t" '{a[$3]++}END{for (i in a) print i"\t"a[i]}' file
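
For anyone reading along, here is the same idea spelled out as a small shell function. It is only a sketch under some assumptions: the name count_col3 is made up for illustration, the file is assumed to be tab-delimited, trailing blanks in $3 are stripped so that "tidak mahu   " and "tidak mahu" count as one value, and the result is piped through sort because awk does not guarantee the order of a for (v in count) loop.

```shell
# count_col3 is a hypothetical helper name used only for this sketch.
# Usage: count_col3 file1.txt
count_col3() {
    awk -F'\t' '
        {
            sub(/[ \t]+$/, "", $3)   # drop trailing blanks from the third field
            count[$3]++              # tally the whole third field as one value
        }
        END {
            for (v in count) print v "\t" count[v]
        }
    ' "$1" | LC_ALL=C sort           # sort only to get a repeatable order
}
```

On the sample above this should give aku hanya poyo 1, ngeh3 2 and tidak mahu 3, one per line and tab-separated.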

Hi bartus11,

It works great, thanks. But I have another issue: after running your code, I realized that $3 can hold more than one value, separated by commas, and those values are not being counted individually. The sample looks like this:

123   ghtd   tidak mahu    
645   pled   aku hanya poyo 
944   pom    ngeh3
3351  bhg    tidak mahu 
545   polo   ngeh3 
4474  klsa   tidak mahu
1141  meh    tidak mahu, ngeh3, dtg sini
457   nah    aku hanya poyo, tidak mahu 

where the output should be

tidak mahu       5
aku hanya poyo   2
ngeh3            3
dtg sini         1

I would appreciate your kind help again. Thanks.

awk '{sub("[\t ]*$","");gsub(" *, *",",");s=substr($0,index($0,$3));c=split(s,sa,",");for (i in sa)o[sa[i]]++;}END{for(i in o)printf("%-20s %d\n",i,o[i]);}' file1.txt

Hi rdrtx1,

It is still not working. It prints all the lines of the input file, and the output looks a bit odd.

Try:

awk -F"\t" '{gsub(", ",",");sub(" *$","");n=split($3,s,",");for (i=1;i<=n;i++) a[s[i]]++}END{for (i in a) print i,a[i]}' file
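
If the one-liner is hard to follow, the comma-splitting step can be written out like this. Treat it as a sketch rather than the definitive version: count_col3_values is a made-up name, the input is assumed to be tab-delimited, and each comma-separated piece of $3 is trimmed of surrounding blanks before it is counted. The sort at the end is only there because awk does not promise any particular order for a for (v in count) loop.

```shell
# count_col3_values is a hypothetical helper name used only for this sketch.
# Usage: count_col3_values file1.txt
count_col3_values() {
    awk -F'\t' '
        {
            n = split($3, parts, ",")              # break $3 at every comma
            for (i = 1; i <= n; i++) {
                v = parts[i]
                gsub(/^[ \t]+|[ \t]+$/, "", v)     # trim blanks around the value
                if (v != "") count[v]++            # skip empty pieces
            }
        }
        END {
            for (v in count) print v "\t" count[v]
        }
    ' "$1" | LC_ALL=C sort                         # sort only to get a repeatable order
}
```

Run against the second sample, this should report aku hanya poyo 2, dtg sini 1, ngeh3 3 and tidak mahu 5.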
perl -F'\t' -alne '{$F[2]=~s/ *$//;@array=(@array,split(/, /,$F[2]));} 
END{
foreach $item(@array){$hash{$item}++;} 
foreach $key (keys %hash) {print $key."\t".$hash{$key};}
}' input_file

ngeh3   3
tidak mahu      5
dtg sini        1
aku hanya poyo  2


---------- Post updated at 05:38 PM ---------- Previous update was at 05:33 PM ----------

rdrtx1's solution is working for me! Please check once again...


Hi bartus11,

Your code worked great! Thanks so much for your kind help. You really saved my day.

Hi Msabhi and rdrtx1,

I am going to try both of your codes again. Maybe I missed something in rdrtx1's code. I will try the perl code too. Thanks so much, guys.