I'm not sure if it's called "group by", but what I want to do is this:
I have a file like this:
192.168.1.10
192.168.1.10
192.168.1.10
192.168.1.11
192.168.1.15
192.168.1.15
192.168.1.20
192.168.1.22
Then I hope to get a result like this:
192.168.1.10 : 3
192.168.1.11 : 1
192.168.1.15 : 2
192.168.1.20 : 1
192.168.1.22 : 1
The number in the second column is how many times the IP appears in the file. My current approach is:
- use sort and uniq to get how many unique records are in the file
- use grep with wc -l to count how many times each record appears
Is there a better way to do this? Any advice?
Thanks.
Something like this?
$ awk '{count[$1]++}END{for(j in count) print j":"count[j]}' file.txt
Jaduk's awk solution does not sort the output as the OP requested. This can be remedied by piping the output to sort.
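For example (a sketch combining the awk one-liner above with a plain lexicographic sort, which is sufficient here because all the last octets have two digits):

```shell
# count each IP, then sort the "ip : count" lines by IP
awk '{count[$1]++} END {for (j in count) print j " : " count[j]}' file.txt | sort
```

This prints the IPs in ascending order with their counts, matching the format in the original post.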
You can do it entirely within the shell. For example, in ksh93 the following script
#!/bin/ksh93
# create an associative array of counts
typeset -A count
while read ip
do
    (( count[$ip]++ ))
done < infile
# sort and print the associative array: repeatedly pick the key
# with the largest count, print it, and remove it
while (( 1 ))
do
    (( !${#count[@]} )) && break
    k=(${!count[@]})    # start with an arbitrary remaining key
    for j in ${!count[@]}
    do
        (( ${count[$j]} > ${count[$k]} )) && k=$j
    done
    echo "$k : ${count[$k]}"
    unset count[$k]
done
produces the following list, sorted in descending order of count:
192.168.1.10 : 3
192.168.1.15 : 2
192.168.1.20 : 1
192.168.1.11 : 1
192.168.1.22 : 1
Associative arrays are also supported in bash v4, but with slightly different syntax.
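A minimal bash (v4+) sketch of the counting part, assuming the same infile as above; here the output is left unsorted:

```shell
#!/bin/bash
# bash 4+ uses declare -A where ksh93 uses typeset -A
declare -A count
while read -r ip
do
    (( count[$ip]++ ))
done < infile
# print the counts (in arbitrary order; pipe to sort if needed)
for j in "${!count[@]}"
do
    echo "$j : ${count[$j]}"
done
```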
Here is another simple solution:
for i in $(cat file.txt)
do
    # -x matches the whole line, -F treats $i as a fixed string
    # (a bare "grep -c $i" would also count substring matches,
    # and the dots in the IP would act as regex wildcards)
    echo "$i: $(grep -cxF "$i" file.txt)"
done | sort -u
sort file|uniq -c
The output format is different, but to get the desired format you could do:
sort file|uniq -c|awk '{print $2 " : " $1}'
To sort within awk itself:
awk '{count[$1]++}END{for(j in count) print j":"count[j] |"sort -t: -k2r"}' urfile
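Note that -k2r sorts the counts as strings, which works while they are single digits. A variant (not from the original post) using a numeric reverse sort is safer once counts reach double digits:

```shell
# sort numerically (n) and in reverse (r) on the count field after the ":"
awk '{count[$1]++} END {for (j in count) print j":"count[j] | "sort -t: -k2,2nr"}' urfile
```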
I like shahhe's offering for quickest and most likely to become a one-liner, except that it needs one or two last tweaks to get the sort requested by the OP:
for i in $(<file.txt)
do
    echo "$i: $(grep -cxF "$i" file.txt)"
done | sort -u -t. -k4n