How to add a row number and a total row count for each group of rows

Hi experts, I have a very large file and I need to add two columns: the first numbering each occurrence of a record within its group, and the second giving the total count for that group.

The input file:

 21 2341 A 
 21 2341 A 
 21 2341 A
 21 2341 C
 21 2341 C
 21 2341 C
 21 2341 C
 21 4567 A 
 21 4567 A 
 21 4567 C
 21 4567 C
 23 4567 A 
 23 4567 A 
 23 4567 A 
 23 4567 A 
 23 4567 C
 23 4567 C
 23 4567 C

desired output file:

 21 2341 A 1 3
 21 2341 A 2 3
 21 2341 A 3 3
 21 2341 C 1 4
 21 2341 C 2 4
 21 2341 C 3 4
 21 2341 C 4 4
 21 4567 A 1 2
 21 4567 A 2 2
 21 4567 C 1 2
 21 4567 C 2 2
 23 4567 A 1 4
 23 4567 A 2 4
 23 4567 A 3 4
 23 4567 A 4 4
 23 4567 C 1 3
 23 4567 C 2 3
 23 4567 C 3 3

thanks in advance

Here's one way to do it with Perl -

$
$
$ cat f3
21 2341 A
21 2341 A
21 2341 A
21 2341 C
21 2341 C
21 2341 C
21 2341 C
21 4567 A
21 4567 A
21 4567 C
21 4567 C
23 4567 A
23 4567 A
23 4567 A
23 4567 A
23 4567 C
23 4567 C
23 4567 C
$
$
$
$ perl -ne 'chomp; s/\s*$//; s/\s+/:/g; $fmt="%s %s %s %d %d\n";
          if (! defined $x{$_} && $.>1) {
            while (($k,$n) = each(%x)) {
              foreach $i (1..$n) {printf($fmt, split(/:/,$k), $i, $n)}
            }
            %x = (); $x{$_}++;
          } else {$x{$_}++}
          END {
            while (($k,$n) = each(%x)) {
              foreach $i (1..$n) {printf($fmt, split(/:/,$k), $i, $n)}
            }
          }' f3
21 2341 A 1 3
21 2341 A 2 3
21 2341 A 3 3
21 2341 C 1 4
21 2341 C 2 4
21 2341 C 3 4
21 2341 C 4 4
21 4567 A 1 2
21 4567 A 2 2
21 4567 C 1 2
21 4567 C 2 2
23 4567 A 1 4
23 4567 A 2 4
23 4567 A 3 4
23 4567 A 4 4
23 4567 C 1 3
23 4567 C 2 3
23 4567 C 3 3
$
$

tyler_durden

try this:

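awk '{ ++a[$0] } END { for (s in a) { for (i = 1; i <= a[s]; ++i) { print s, i, a[s] } } }' inputfile

Note: your input data is assumed to be sorted. Also, awk visits the keys in "for (s in a)" in an unspecified order, so pipe the output through sort if the order matters.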
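If the original order has to survive without an extra sort, one option is to remember each record the first time it appears. This is only a sketch: the function name number_groups and the array names are made up for illustration, and it strips trailing blanks because the sample input has them on some lines but not others. Unlike the streaming answers in this thread, it counts identical lines across the whole file, not just adjacent ones.

```shell
# number_groups: hypothetical helper name, not from the thread.
# Prints every record in first-seen order, followed by its running
# index within the group and the group total.
number_groups() {
  awk '
    { sub(/[ \t]+$/, "") }              # drop trailing blanks so "A " and "A" match
    !($0 in cnt) { order[++n] = $0 }    # remember the first appearance of each record
    { cnt[$0]++ }                       # count every occurrence
    END {
      for (k = 1; k <= n; k++)
        for (i = 1; i <= cnt[order[k]]; i++)
          print order[k], i, cnt[order[k]]
    }
  ' "$@"
}
```

Usage would be something like "number_groups inputfile". It keeps one array entry per distinct record, so memory grows with the number of groups, not the number of lines.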


Try...

gawk '{sub(/ $/,"");c=++b[$0];a[NR]=$0 FS c;d[NR]=$0;e[$0]=c}END{for(i=1;i<=NR;i++)print a[i] FS e[d[i]]}' file1

A Perl version:

while(<DATA>){
	chomp;
	$hash{$_}->{CNT}++;
	# line number of the last occurrence; used below to restore input order
	$hash{$_}->{SEQ}=$.;
}
foreach my $key(sort {$hash{$a}->{SEQ}<=>$hash{$b}->{SEQ}} keys %hash){
	print $key," ",$_," ",$hash{$key}->{CNT},"\n" foreach (1..$hash{$key}->{CNT});
}
__DATA__
21 2341 A
21 2341 A
21 2341 A
21 2341 C
21 2341 C
21 2341 C
21 2341 C
21 4567 A
21 4567 A
21 4567 C
21 4567 C
23 4567 A
23 4567 A
23 4567 A
23 4567 A
23 4567 C
23 4567 C
23 4567 C

Or, reading the file twice:

awk 'NR==FNR{a[$0]++;next}{b[$0]++;print $0,b[$0],a[$0]}' urfile urfile
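One thing to watch with the two-pass form: it uses the whole line as the key, and the sample input in the question has a trailing space on some lines and not on others, so those lines would land in different groups. A possible variant that normalizes the line first is sketched below. The function name count_two_pass and the array names are hypothetical, and it assumes the input is a regular file that can be read twice (not a pipe).

```shell
# count_two_pass: hypothetical helper name, not from the thread.
# Takes one file path and reads that file twice.
count_two_pass() {
  awk '
    { sub(/[ \t]+$/, "") }                  # normalize before using $0 as a key
    NR == FNR { total[$0]++; next }         # pass 1: collect group totals
    { print $0, ++seen[$0], total[$0] }     # pass 2: running index and total
  ' "$1" "$1"
}
```

Called as "count_two_pass urfile", it prints lines in their original order, so the input does not need to be sorted.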

Another one:

awk 'END {
  for (i = 0; ++i <= c;)
    print rec[i], i, c
  }
$0 != p {
  for (i = 0; ++i <= c;)
    print rec[i], i, c
  c = 0
  }
{
  p = $0; rec[++c] = $0
  }' infile