Merging Adjacent Lines Using Gawk

Hi all,

I have a text file consisting of 4 columns. What I am trying to do is check whether the value in column 2 repeats on consecutive rows, and collapse each run of repeats into one row. For example, here is a snippet of the file I am trying to analyze:

1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Sure_Thing 15.043 0.39
1 Gamble_Loss 15.496 1.236
1 Gamble_Loss 16.982 0.402
1 Gamble_Loss 17.647 0.19
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371

Here is what I am trying to do: for the conditions that repeat ("Sure_Thing" and "Gamble_Loss"), I want to collapse each run into a single line that keeps columns 1 to 3 of the first row and adds up column 4 over the repeats. So after I gawk it, I want it to look something like this:

1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.564
1 Gamble_Loss 15.496 1.828
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371
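
As a quick sanity check on the two merged rows above, the column 4 totals can be recomputed directly. This is only a sketch; the function name run_totals is made up for illustration and the numbers are copied from the snippet.

```shell
# run_totals is a hypothetical helper; the values come from the sample rows.
run_totals() {
    awk 'BEGIN {
        print "Sure_Thing", 0.174 + 0.39            # two adjacent rows
        print "Gamble_Loss", 1.236 + 0.402 + 0.19   # three adjacent rows
    }'
}
run_totals
```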

Here is the code I have used to analyze it so far, but it only works for 2 adjacent repeats; I want to generalize it for multiple repeats:

igawk '

BEGIN{

OFS=" "

prevTrial = "-";
prevTime = "0";
prevDur = "0";
}

{

if ($2 == prevTrial)
print $1, prevTrial, prevTime, prevDur+$4;
else if ($2 != prevTrial)
print $0;


prevTrial = $2;
prevTime = $3;
prevDur = $4;
}

' $*

I appreciate any input!

Is this close to what you want?

# awk '$2==t{s+=$4;next}NR>1{print x,s}{x=$1" "$2" "$3;t=$2;s=$4}END{print x,s}' infile
 
1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.564
1 Gamble_Loss 15.496 1.828
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371
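
The same idea written out as a commented script may be easier to adapt. Treat it as a sketch rather than a finished tool: the function name collapse_runs and the variable names head, cond and sum are made up for illustration, and it assumes whitespace-separated input with the condition in column 2 and the duration in column 4. It prints nothing until the first run is complete and nothing at all for empty input.

```shell
# collapse_runs is a hypothetical helper name. It reads the files given as
# arguments, or standard input when called without arguments.
collapse_runs() {
    awk '
        # Same condition as the previous row: extend the current run.
        NR > 1 && $2 == cond { sum += $4; next }

        # A new condition starts here.
        {
            if (NR > 1) print head, sum   # flush the finished run
            head = $1 " " $2 " " $3       # columns 1-3 of the first row in the run
            cond = $2
            sum  = $4
        }

        # Flush the last run, unless the input was empty.
        END { if (NR > 0) print head, sum }
    ' "$@"
}
```

Usage would be something like: collapse_runs infile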

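You need to modify it as below. Note that this version keeps the first row of each run and skips the repeats; it does not add up column 4: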

gawk '

BEGIN{

OFS=" "
prevTrial = "-";
prevTime = "0";
prevDur = "0";

}

{

if ($2 == prevTrial) { next ;}
else if ($2 != prevTrial) {print $0; prevTrial = $2 ; prevTime = $3; prevDur = $4}

}
' infile.txt
Output:

1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Gamble_Loss 15.496 1.236
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371


Even better, you can use the short version below, which likewise keeps only the first row of each run:

gawk '($2==p){ next ; }{print $0 ; p=$2 }' infile.txt
Output:
1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Gamble_Loss 15.496 1.236
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371


Thanks Tytalus, that was exactly what I was looking for.

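use strict;
use warnings;

my $val = "---";          # column 2 of the current run
my ($prefix, $suffix);    # columns 1-3 of the first row in the run, sum of column 4
while (<DATA>) {
    my @tmp = split;
    if ($val eq $tmp[1]) {
        # Same condition as the previous row: accumulate column 4.
        $suffix += $tmp[3];
    }
    else {
        # New condition: flush the finished run (nothing to flush on the first row).
        print $prefix, " ", $suffix, "\n" unless $. == 1;
        $prefix = join " ", @tmp[0 .. 2];
        $suffix = $tmp[3];
        $val    = $tmp[1];
    }
}
# Flush the last run.
print $prefix, " ", $suffix, "\n";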
__DATA__
1 Gamble_Win 14.282 0.502
1 Sure_Thing 14.858 0.174
1 Sure_Thing 15.043 0.39
1 Gamble_Loss 15.496 1.236
1 Gamble_Loss 16.982 0.402
1 Gamble_Loss 17.647 0.19
1 Gamble_Win 17.914 0.236
1 Arrow 18.203 0.371
1 Arrow 18.203 0.371