Perl script to find particular field and sum it

Hi,
I have a file with format

a b c d e
1 1 2 2 2
1 2 2 2 3
1 1 1 1 2
1 1 1 1 4
1 1 1 1 6

In column e, I want to find all identical values (with a Perl script) and count how many times each one occurs.
For instance, for the data above:

2 - 2 times
3 - 1 time
4 - 1 time
6 - 1 time

What I use is:

@a=<STDIN>;
foreach $i (@a){
   @element= $i;
   if (@element[4]=~/2/){
    $abc+=@element[4];
    
   }

This is not working.

Also, my other question: in the example I gave there are only a few different numbers, so I could repeat the if statement above for each one and get the result. But what if we need to do this for 5000 different words?

Thanks

 

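use strict;
use warnings;

# Open the input file (named "new" here) for reading.
open(my $fh, '<', 'new') or die "Can't open: $!";
my %hash;
my $header = <$fh>;    # read and discard the header line (a b c d e)
while (<$fh>)
{
  my @array = split(' ', $_);
  next unless @array >= 5;    # skip blank or short lines
  $hash{$array[4]}++;         # count each value seen in column e
}
close($fh);

foreach my $key (sort keys %hash) {
  print " $key => $hash{$key} times\n";
}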

Use the following code

open FH, "<inp" or die "Can't open file : $!\n";
my @data=<FH>;
my @result;
my @count;
foreach (@data )
{
    push(@result,split);
}
for ( my $i=4; $i <= $#result; $i=$i+5 )
{
    push(@count,$result[$i]);
}

my %count_hash;
shift @count; #to remove the e from the array
foreach my $word (@count)
{
    $count_hash{$word}=$count_hash{$word}+1;
}

foreach my $word (keys %count_hash)
{
    print $word," Comes for ",$count_hash{$word}," times\n";
}

The output I am getting is as follows:

6 Comes for 1 times
4 Comes for 1 times
3 Comes for 1 times
2 Comes for 2 times
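
Note that the order of these lines is not guaranteed, because Perl hashes are unordered. A minimal sketch of printing the counts in ascending numeric order, assuming all values in the column are numbers; the counts are copied from the output above, and the sub name sorted_report is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: build one report line per key, in ascending numeric order.
sub sorted_report {
    my (%counts) = @_;
    my @report;
    foreach my $word (sort { $a <=> $b } keys %counts) {
        push @report, "$word Comes for $counts{$word} times";
    }
    return @report;
}

# Counts taken from the output shown above.
my %count_hash = (6 => 1, 4 => 1, 3 => 1, 2 => 2);
print "$_\n" for sorted_report(%count_hash);
```

If the column can also hold words, a plain sort (string order) would be the safer choice, since <=> warns on non-numeric values.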

Try the following code:

open FH,"sum" or die $!; # Open the file 'sum'.
my %sum;
my @lines;
my $key;
my $val;
while(<FH>) # Read lines from the opened file.
{
    @lines=split(' ',$_);
    $sum{$lines[4]}++; # Count each value of the fifth column in the hash.
}
while(($key,$val)=each(%sum))
{
    print "$key - $val times \n"; # Print the keys and values in the hash.
}

Here 'sum' is a file which contains the following input data.

1 1 2 2 2
1 2 2 2 3
1 1 1 1 2
1 1 1 1 4
1 1 1 1 6

Try:

perl -lane '$A{$F[-1]}++ if $. > 1 && @F; END { print "$_ $A{$_} times" for sort keys %A }' file

How do you call this? The script is giving an error.

What error are you getting?

It keeps saying "can't open: no such file" when I run the script on my file,

but both files are there and both have execute permission.

use strict;
use warnings;

open FH, "<inp" or die "Can't open file : $!\n";
my @data=<FH>;
my @result;
my @count;
my $word;
foreach (@data )
{
        push(@result,split);
}
for ( my $i=4; $i <= $#result; $i=$i+5 )
{
        push(@count,$result[$i]);
}

my %count_hash;
shift @count;
foreach $word (@count)
{
        unless (defined($count_hash{$word}) )
        {
                $count_hash{$word} = 0;

        }
       $count_hash{$word}=$count_hash{$word}+1;
}

foreach my $word (keys %count_hash)
{
                print $word," Comes for ",$count_hash{$word}," times\n";
}

Use the above code.
Here, inp is the input file, and it must contain your input data.
That is why you are getting the error: your input file probably has a different name.
So correct the file name in the open call.
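
To avoid this kind of mistake, the file name can be passed on the command line instead of being hard-coded. A minimal sketch, assuming the data file has a header line and five columns as in the question; the sub name count_column and the script name count.pl are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: count the values in the fifth column of the named file.
sub count_column {
    my ($file) = @_;
    # Include the file name in the message so a wrong name is easy to spot.
    open(my $fh, '<', $file) or die "Can't open '$file': $!\n";
    my $header = <$fh>;            # skip the header line (a b c d e)
    my %count_hash;
    while (my $line = <$fh>) {
        my @fields = split ' ', $line;
        next unless @fields >= 5;  # ignore blank or short lines
        $count_hash{ $fields[4] }++;
    }
    close($fh);
    return %count_hash;
}

# Run only when a file name is given, e.g.: perl count.pl inp
if (@ARGV) {
    my %count_hash = count_column($ARGV[0]);
    print "$_ Comes for $count_hash{$_} times\n" for sort keys %count_hash;
}
```

Because the error message now names the file it tried to open, a wrong file name shows up immediately.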

Yes, thanks. I was using the wrong file extension, .pl instead of .txt.

Now one more question arises. Here we knew that we were using index 4 (the fifth column), but what would you do if you have many columns and you want the last one? In awk, I think we just put a $ sign at the end. In this script, what do we change to get the last field, no matter which column number it is?

Thanks for teaching

Simple! $#array holds the last index of the array, so we can use it to get the last column.
Example:

my @fields = split(' ', $line); # the fields of one line
print $#fields; # this will give 4 for a line with 5 fields

Indexes start from 0, so a line with 5 columns has the indexes 0 to 4.
So $fields[4], which is the same as $fields[$#fields], is the last column's value.

I understand this, but what I am asking is this: imagine you have more than 2000 columns, and you don't want to count them all from 0 to 1999. What would you do so that the script just takes the last field without specifying the index number?

print $array_name[-1]

This will give the last element in that array.

print $array_name[-2]

This will give the second-to-last element in that array.
Is this what you were expecting?
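
Putting this together with the counting script: a minimal sketch that counts the last field of each line, however many columns there are. The sample lines and the sub name count_last_field are made up for illustration, and the first line is assumed to be a header:

```perl
use strict;
use warnings;

# Hypothetical helper: count the values found in the last field of each line.
sub count_last_field {
    my @lines = @_;
    shift @lines;                       # drop the header line
    my %count_hash;
    foreach my $line (@lines) {
        my @fields = split ' ', $line;
        next unless @fields;            # skip blank lines
        # $fields[-1] is the same element as $fields[$#fields].
        $count_hash{ $fields[-1] }++;
    }
    return %count_hash;
}

# Made-up sample with a different number of columns on each line.
my @sample = ("a b c\n", "1 2\n", "1 1 1 1 2\n", "7 4\n");
my %count_hash = count_last_field(@sample);
print "$_ Comes for $count_hash{$_} times\n" for sort keys %count_hash;
```

Since each line is split on its own, this does not depend on every line having the same number of fields.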

Yes, yes, thanks for teaching all this. Thank you.


Can you please tell me why you did +5 here?

for ( my $i=4; $i <= $#result; $i=$i+5 )

Thanks

Yeah!!! Pleasure!!!

If you print the "@result" array it will have the following values.

a b c d e 1 1 2 2 2 1 2 2 2 3 1 1 1 1 2 1 1 1 1 4 1 1 1 1 6

Your actual requirement is to find the occurrence of the values in the e column

Your input data:

a b c d e
1 1 2 2 2
1 2 2 2 3
1 1 1 1 2
1 1 1 1 4
1 1 1 1 6

So in the flattened array above, every fifth value (indexes 4, 9, 14 and so on) is a value from the e column.
That is why I start at index 4 and increment by 5 each time.
Got it? :)
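
The same idea as a small stand-alone sketch: the flattened list is indexed from 0, so the e column sits at indexes 4, 9, 14 and so on. Note that this only works when every line has exactly 5 fields. The sub name every_nth is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: take elements $start, $start+$step, ... from a flat list.
sub every_nth {
    my ($start, $step, @list) = @_;
    my @picked;
    for (my $i = $start; $i <= $#list; $i += $step) {
        push @picked, $list[$i];
    }
    return @picked;
}

# The flattened @result array from the data above.
my @result = qw(a b c d e 1 1 2 2 2 1 2 2 2 3 1 1 1 1 2 1 1 1 1 4 1 1 1 1 6);
my @count = every_nth(4, 5, @result);
print "@count\n";    # prints: e 2 3 2 4 6
```

The first picked element is the header e, which is why the script calls shift @count before counting.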