Computations on a text file

Hey,

I have a text file that contains data for several device IDs. I want to compute the average, max, and 2nd max for each device ID. The .txt file looks like the one below. Please help me with these computations per device ID. My output file should contain the device ID, average, max, and 2nd max for each device ID.

***********************************************
bash-3.00$ cat temp.txt
[281207], (Mid) 100581038,(Rid) 200394032, (date) [1323664218], 2011.12.12-04.30.18, (Val) 0.0d
[281208], (Mid) 100581038,(Rid) 200394032, (date) [1323665119], 2011.12.12-04.45.19, (Val) 0.0d
[281209], (Mid) 100581038,(Rid) 200394034, (date) [1323662419], 2011.12.12-04.00.19, (Val) 0.0d
[281210], (Mid) 100581038,(Rid) 200394034, (date) [1323663318], 2011.12.12-04.15.18, (Val) 0.0d
[281211], (Mid) 100581038,(Rid) 200394034, (date) [1323664218], 2011.12.12-04.30.18, (Val) 0.0d
[281212], (Mid) 100581038,(Rid) 200394034, (date) [1323665119], 2011.12.12-04.45.19, (Val) 0.0d
[281213], (Mid) 100581038,(Rid) 200394035, (date) [1323662419], 2011.12.12-04.00.19, (Val) 0.0d
[281214], (Mid) 100581038,(Rid) 200394035, (date) [1323663318], 2011.12.12-04.15.18, (Val) 0.0d
[281215], (Mid) 100581038,(Rid) 200394035, (date) [1323664218], 2011.12.12-04.30.18, (Val) 0.0d
[281216], (Mid) 100581038,(Rid) 200394035, (date) [1323665119], 2011.12.12-04.45.19, (Val) 0.0d
[282093], (Mid) 100581049,(Rid) 200393955, (date) [1323662416], 2011.12.12-04.00.16, (Val) 0.0d
[282094], (Mid) 100581049,(Rid) 200393955, (date) [1323663317], 2011.12.12-04.15.17, (Val) 0.0d
[282095], (Mid) 100581049,(Rid) 200393955, (date) [1323664216], 2011.12.12-04.30.16, (Val) 0.0d
[282096], (Mid) 100581049,(Rid) 200393955, (date) [1323665115], 2011.12.12-04.45.15, (Val) 0.0d
[279065], (Mid) 100580981,(Rid) 200393809, (date) [1323666018], 2011.12.12-05.00.18, (Val) 0.0d
[279066], (Mid) 100580981,(Rid) 200393809, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279067], (Mid) 100580981,(Rid) 200393809, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279068], (Mid) 100580981,(Rid) 200393809, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279215], (Mid) 100580982,(Rid) 200393809, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279216], (Mid) 100580982,(Rid) 200393809, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279217], (Mid) 100580982,(Rid) 200393815, (date) [1323666018], 2011.12.12-05.00.18, (Val) 0.0d
[279218], (Mid) 100580982,(Rid) 200393815, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279219], (Mid) 100580982,(Rid) 200393815, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279220], (Mid) 100580982,(Rid) 200393815, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279221], (Mid) 100580982,(Rid) 200393826, (date) [1323666018], 2011.12.12-05.00.18, (Val) 0.0d
[279222], (Mid) 100580982,(Rid) 200393826, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279223], (Mid) 100580982,(Rid) 200393826, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279224], (Mid) 100580982,(Rid) 200393826, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279225], (Mid) 100580982,(Rid) 200393828, (date) [1323666018], 2011.12.12-05.00.18, (Val) 0.0d
[279226], (Mid) 100580982,(Rid) 200393828, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279227], (Mid) 100580982,(Rid) 200393828, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279228], (Mid) 100580982,(Rid) 200393828, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279229], (Mid) 100580982,(Rid) 200393852, (date) [1323666018], 2011.12.12-05.00.18, (Val) 0.0d
[279230], (Mid) 100580982,(Rid) 200393852, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279231], (Mid) 100580982,(Rid) 200393852, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279232], (Mid) 100580982,(Rid) 200393852, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279233], (Mid) 100580982,(Rid) 200393855, (date) [1323666018], 2011.12.12-05.00.18, (Val) 0.0d
[279354], (Mid) 100580982,(Rid) 200394082, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279355], (Mid) 100580982,(Rid) 200394082, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279356], (Mid) 100580982,(Rid) 200394082, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279230], (Mid) 100580978,(Rid) 200393852, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279231], (Mid) 100580978,(Rid) 200393852, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279232], (Mid) 100580978,(Rid) 200393852, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
[279233], (Mid) 100580989,(Rid) 200393855, (date) [1323666018], 2011.12.12-05.00.18, (Val) 0.0d
[279354], (Mid) 100580989,(Rid) 200394082, (date) [1323666918], 2011.12.12-05.15.18, (Val) 0.0d
[279355], (Mid) 100580989,(Rid) 200394082, (date) [1323667818], 2011.12.12-05.30.18, (Val) 0.0d
[279356], (Mid) 100580989,(Rid) 200394082, (date) [1323668718], 2011.12.12-05.45.18, (Val) 0.0d
***********************************************
Here 1005XXXXX is the device ID and 0.0d is the value for that device ID at that timestamp.
Now I want the average, max, and 2nd max values for each device ID.

It's a bit urgent!!

Thanks,
Mahi

Avg, max, etc. of what? Of 0.0d?
Need more info!

Provide a sample output!

--ahamed

Yeah, on 0.0d for each device ID. You don't need to worry about the value type here (assume it is an integer); I just need the code to produce the output. The output file should look like this:

#deviceID Avg Max 2ndMax
100581038 0.0 0.0 0.0

Thanks!!

@mahi_mayu069: On which values do you want to compute the avg, max, and 2nd max for each device ID? Please provide an example.

Hi Balajesuri,

The computations are to be done on the 0.0d values (assume they are integers), i.e. the last column of the file, (Val) 0.0d.

Thanks,
Mahi
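
To make the field positions concrete, here is a minimal sketch of pulling the device ID and the value out of each record. The function name extract_pairs is made up for illustration, and the sketch assumes every line has the same layout as the sample, with no trailing whitespace.

```shell
# Hypothetical helper: print "deviceID value" for each input line.
# Assumes the layout shown in temp.txt: splitting on commas and spaces
# puts the Mid in field 3 and the value (e.g. 0.0d) in the last field.
extract_pairs() {
    awk -F'[, ]+' '{
        v = $NF
        sub(/d$/, "", v)    # drop the trailing "d" from the value
        print $3, v + 0     # v + 0 forces a numeric value
    }' "$@"
}

# Usage (reads the named file, or stdin when no file is given):
#   extract_pairs temp.txt
```

For the sample data every line would come out as a device ID followed by 0, since all values are 0.0d.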

Try this.

#!/usr/bin/perl

use strict;
use warnings;
use List::Util qw(max sum);

my %d;    # device ID => list of values

while (<STDIN>)
{
    # A failed match keeps $1 and $2 from the previous line, so test the match itself.
    next unless /\(Mid\) ([0-9]+),.*\(Val\) ([0-9.-]+)d/;
    push(@{$d{$1}}, $2);
}

foreach (sort keys %d)
{
    my @vals = sort {$a <=> $b} @{$d{$_}};
    my $max  = max(@vals);
    my $max2 = (@vals > 1) ? $vals[@vals - 2] : 0;    # 0 if there is only one value
    my $avg  = sum(@vals) / @vals;
    print "$_ $avg $max $max2\n";
}

Run by:

./script.pl < temp.txt

Thanks, but I want the script to be in shell.

Thanks Again,

--Mahi

Try this...

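sed 's/,[^ ]/ /g; s/d$//' temp.txt | awk '{
v=$NF+0; total[$3]+=v; count[$3]++;
# track the two largest values per device ID (duplicates count separately)
if (count[$3]==1) {max[$3]=v}
else if (v>max[$3]) {max2[$3]=max[$3]; max[$3]=v}
else if (count[$3]==2 || v>max2[$3]) {max2[$3]=v}
} END {
for (x in total) {print x,total[x]/count[x],max[x],max2[x]+0}
}'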

Try this...

awk '{
  v=$NF+0;split($3,_,",")
  a[_[1]]+=v;b[_[1]]=b[_[1]]" "v
}
END{
  for(i in a) {
    # b[i] holds the values for device i; split() returns how many there are
    len=split(b[i],c," ");asort(c,d)
    printf("%s\t%f\t%f\t%f\n",i,a[i]/len,d[len],d[len-1])
  }
}' input_file

This needs GNU awk (gawk), because asort() is a gawk extension; nawk on Solaris does not provide it.

--ahamed
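
For awk implementations without asort(), a single-pass variant is one option. The sketch below is only an illustration: device_stats is a made-up name, the header line and one-decimal format are copied from the sample output requested earlier in the thread, and it assumes the same line layout as temp.txt. As in the scripts above, duplicate values count separately (two readings of 7 give max 7 and 2nd max 7), and a device with a single reading gets 0 as its 2nd max.

```shell
# Hypothetical helper: print "deviceID avg max 2ndmax" for each device ID.
# Uses only POSIX awk features, so it should not need gawk.
device_stats() {
    awk -F'[, ]+' '
    {
        id = $3
        v = $NF; sub(/d$/, "", v); v += 0   # strip trailing "d", force numeric
        sum[id] += v
        n[id]++
        if (n[id] == 1) {
            max1[id] = v                    # first value seen for this device
        } else if (v > max1[id]) {
            max2[id] = max1[id]             # old max becomes the 2nd max
            max1[id] = v
        } else if (n[id] == 2 || v > max2[id]) {
            max2[id] = v
        }
    }
    END {
        print "#deviceID Avg Max 2ndMax"
        for (id in sum) {
            second = (n[id] > 1) ? max2[id] : 0   # assumption: 0 when only one value
            printf "%s %.1f %.1f %.1f\n", id, sum[id] / n[id], max1[id], second
        }
    }' "$@"
}

# Usage:
#   device_stats temp.txt > output.txt
```

The order of the device lines is whatever awk's for-in loop yields, so pipe the result through sort if a fixed order matters.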