Count the repetitions of a field in a file

Hi,
Thanks for keeping such a helpful platform active and alive.
I am new to this forum and to Unix as well.
I want to know how to count the repetitions of a field in a file. Solutions in awk, sed, perl, or shell script are all welcome.

Input File------------------
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888

Output File-----------------
abc,12345,3
pqr,51223,1
mno,72121,2
stu,34567,1
aaa,12345,3
pqp,11224,1
plm,72121,2
zxy,88888,2
fgh,12345,3
jkl,88888,2

Since 12345 appears 3 times as the second field in the file, "3" is appended as the last field on every line where it occurs.
Thanks in advance for the solution.

Ace

Here is what I have so far. Of course, you'll surely get other replies that do the same thing in a simpler way :stuck_out_tongue:

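#!/bin/sh

# Tally each distinct second field; tmp holds "count value" lines.
cut -d',' -f2 file | sort | uniq -c > tmp

while IFS= read -r line; do
  key=${line##*,}   # second (last) field of the current line
  # Compare the whole value, so 1234 does not also pick up the count for 12345.
  echo "$line,$(awk '$2 == key {print $1}' key="$key" tmp)"
done < file

exit 0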

Your data file needs to be named file and sit in the same directory as the script.
I use a tmp file to hold the number of occurrences of each second-field value.

OK, another one :D

awk -F, 'NR==FNR{a[$2]++;next}{print $0 "," a[$2]}' file file
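
In case the one-liner is hard to read, here is the same two-pass idea spelled out as a rough sketch. The function name count_field2 and the array name count are just my own labels, nothing standard. The file is given twice on purpose: while NR==FNR holds, awk is still reading the first copy and only tallies the second field; on the second copy it prints each line with its tally appended.

```shell
# count_field2 FILE (hypothetical helper name)
# Prints every line of FILE followed by a comma and the number of times
# its second comma-separated field occurs in the whole file.
count_field2() {
  awk -F, '
    # First copy of the file: NR (total lines read) still equals
    # FNR (lines read in the current file), so only tally field 2.
    NR == FNR { count[$2]++; next }

    # Second copy: print the line with the tally for its field 2.
    { print $0 "," count[$2] }
  ' "$1" "$1"
}
```

It would be called as count_field2 file. Reading the file twice keeps the input order without any sorting or temporary file.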

Regards

How about the Perl below:

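use strict;
use warnings;

my (@lines, %cnt);
while (<DATA>) {
	chomp;
	my @tmp = split(",", $_);
	push @lines, [$_, $tmp[1]];   # remember each line and its second field, in input order
	$cnt{$tmp[1]}++;              # tally occurrences of the second field
}
# print each line with the tally of its second field appended
print "$_->[0],$cnt{$_->[1]}\n" for @lines;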
__DATA__
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888


Hi Tukuyomi,
Thanks for the solution, but its output deviates from the expected result and drops some of the input lines. The output was like this:
1 pqp,11224
3 aaa,12345
1 stu,34567
1 pqr,51223
2 mno,72121
2 jkl,88888

Can you please amend it if possible? :confused:


Hi Franklin,
This awk script does not produce the counts. It just prints the input back, with nothing but the "," field separator added at the end of each line. Can you please correct it?

This is what I get:

$ cat file
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888
$ awk -F, 'NR==FNR{a[$2]++;next}{print $0 "," a[$2]}' file file
abc,12345,3
pqr,51223,1
mno,72121,2
stu,34567,1
aaa,12345,3
pqp,11224,1
plm,72121,2
zxy,88888,2
fgh,12345,3
jkl,88888,2

Am I missing something?


Franklin, this is what I am getting. Since you know much more about this, you can probably tell whether I am doing something wrong. My OS is Solaris 10.
root@sunmc01>cat file
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888
root@sunmc01>awk -F, 'NR==FNR{a[$2]++;next}{print $0 "," a[$2]}' file file
abc,12345,
pqr,51223,
mno,72121,
stu,34567,
aaa,12345,
pqp,11224,
plm,72121,
zxy,88888,
fgh,12345,
jkl,88888,
abc,12345,
pqr,51223,
mno,72121,
stu,34567,
aaa,12345,
pqp,11224,
plm,72121,
zxy,88888,
fgh,12345,
jkl,88888,
root@sunmc01>
Thanks for your consistent support.:frowning:

Use nawk or /usr/xpg4/bin/awk on Solaris.
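
As far as I know, the reason is that the old /usr/bin/awk on Solaris predates FNR, so NR==FNR is never true there: nothing gets tallied and both copies of the file are printed with an empty count. If the script has to run on several systems, something like the sketch below could choose an awk at run time. pick_awk is a hypothetical helper name, and the search order is only my assumption.

```shell
# pick_awk (hypothetical helper): print the name of an awk that should
# understand FNR. Prefer the XPG4 awk, then nawk, else whatever "awk" is.
pick_awk() {
  if [ -x /usr/xpg4/bin/awk ]; then
    echo /usr/xpg4/bin/awk
  elif command -v nawk >/dev/null 2>&1; then
    echo nawk
  else
    echo awk
  fi
}

AWK=$(pick_awk)
```

After that, "$AWK" can be used in place of awk in the command above.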

Regards

Thanks a lot, Franklin, that sounds like an expert ;).
nawk worked fine and produced the expected output!
:b:

You must be on Solaris :wink:

Use nawk or /usr/xpg4/bin/awk to get Franklin's results.

EDIT: A bit late.

Same for me:

tukuyomi@dejikochan:~/test$ cat file
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888
tukuyomi@dejikochan:~/test$ ./script.sh
abc,12345,3
pqr,51223,1
mno,72121,2
stu,34567,1
aaa,12345,3
pqp,11224,1
plm,72121,2
zxy,88888,2
fgh,12345,3
jkl,88888,2

Can you tell which version of which shell you are using?
Something like

echo $SHELL
$SHELL -version
$SHELL --version

should give you that information.