sed/awk: Delete matching words leaving only the first instance

I have an input text that looks like this (comes already sorted):

on Caturday 22 at 10:15, some event
on Caturday 22 at 10:15, some other event
on Caturday 22 at 21:30, even more events
on Funday 23 at 11:00, yet another event

I need to delete the leading words that each line shares with the line before it, so that only the first instance of each date (and of each time within a date) remains.
To clarify, I need to turn it into something like this:

on Caturday 22 at 10:15, some event
                         some other event
               at 21:30, even more events
on Funday 23 at 11:00, yet another event

So then I could format it like this to make it shorter, which is what I'm after:

on Caturday 22 at 10:15, some event; some other event; at 21:30, even more events
on Funday 23 at 11:00, yet another event

Is there a way to do something like this with sed or awk?

$ cat data
on Caturday 22 at 10:15, some event
on Caturday 22 at 10:15, some other, comma event
on Caturday 22 at 21:30, even more events
on Funday 23 at 11:00, yet another event

$ awk -F' at |, ' 'd!=$1 {if(s)print s; s=$0; d=$1; t=$2; next} t!=$2 {t=$2; s=s"; at "$2","substr($0,index($0,",")+1); next} {s=s";"substr($0,index($0,",")+1)} END {print s}' data
on Caturday 22 at 10:15, some event; some other, comma event; at 21:30, even more events
on Funday 23 at 11:00, yet another event

d=day
t=time
s=string being built for printing
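
For readers who find the one-liner dense, here is the same logic spread over several lines as a minimal sketch. The wrapper name merge_events and the helper function event() are made up for illustration and are not part of the original answer. The only intended behavioural difference is that END prints nothing when the input is empty.

```shell
# Hypothetical wrapper name; reads the sorted event lines from stdin or from files.
merge_events() {
    awk -F' at |, ' '
        # Text after the first comma, leading space included.
        function event(line) { return substr(line, index(line, ",") + 1) }

        # New date: flush the previous group and start a new one.
        d != $1 {
            if (s) print s
            s = $0; d = $1; t = $2
            next
        }
        # Same date, new time: append the time and the event text.
        t != $2 {
            t = $2
            s = s "; at " $2 "," event($0)
            next
        }
        # Same date and time: append only the event text.
        { s = s ";" event($0) }

        # Flush the last group, if there was any input at all.
        END { if (s) print s }
    ' "$@"
}
```

Calling merge_events data should give the same two lines as the one-liner above.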

Alister

---------- Post updated at 05:14 PM ---------- Previous update was at 04:40 PM ----------

A bit shorter, if not clearer :)

awk -F' at |, ' 'd!=$1 {if(s)print s; s=$0; d=$1; t=$2; next} (e=substr($0,index($0,",")+1)) && t!=$2 {t=$2; s=s"; at "$2","e; next} {s=s";"e} END {print s}' data

e=event text

---------- Post updated at 05:21 PM ---------- Previous update was at 05:14 PM ----------

If you are certain that ", " (comma-space) and " at " (space-a-t-space) sequences will not appear in the event text, then this simpler code will do:

awk -F' at |, ' 'd!=$1 {if(s)print s; s=$0; d=$1; t=$2; next} t!=$2 {t=$2; s=s"; at "$2", "$3; next} {s=s"; "$3} END {print s}' data
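
To see why that restriction matters, it can help to look at the fields awk produces with this separator. The helper below is only an illustrative sketch (show_fields is a made-up name); it prints each field in brackets.

```shell
# Hypothetical helper: print the fields awk sees for each input line,
# using the same field separator as the scripts above.
show_fields() {
    awk -F' at |, ' '{
        for (i = 1; i <= NF; i++)
            printf "[%s]%s", $i, (i < NF ? " " : "\n")
    }' "$@"
}
```

For the line "on Caturday 22 at 10:15, some other, comma event" this prints [on Caturday 22] [10:15] [some other] [comma event]. The event text is split across $3 and $4, so a script that appends only $3 would lose "comma event".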

In Perl, producing the intermediate (blanked-out) layout:

while (<>)  {
        chomp;

        # split the line into whitespace-separated words
        @word = split /\s+/, $_;

        # the day is the weekday name plus the number, e.g. "Caturday 22"
        $day = "$word[1] $word[2]";

        # compare as strings (eq), not as numbers (==)
        if ( defined $prev_day && $day eq $prev_day )  {
                if ( $word[4] eq $prev_time )  {
                        # same day and same time: blank out the day and the time
                        $word[$_] =~ s/./ /g for ( 0 .. 4 );
                }  else  {
                        # same day, new time: remember the time, blank out the day only
                        $prev_time = $word[4];
                        $word[$_] =~ s/./ /g for ( 0 .. 2 );
                }
        }  else  {
                # new day: remember the day and the time
                $prev_day  = $day;
                $prev_time = $word[4];
        }
        print "@word\n";
}

$ perl t.pl t
on Caturday 22 at 10:15, some event
                         some other event
               at 21:30, even more events
on Funday 23 at 11:00, yet another event
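
The same blanked-out layout can also be sketched in awk, which is what the question asked for. This is an illustration rather than a tested production script: the name blank_repeats is made up, and the code assumes every line contains " at " and at least one comma, as in the sample data.

```shell
# Hypothetical wrapper name; blanks out the date/time prefix a line shares
# with the previous line. Assumes each line has " at " and a comma.
blank_repeats() {
    awk '
        # Return a string of n spaces.
        function blanks(n) { return sprintf("%" n "s", "") }
        {
            c = index($0, ",")               # position of the first comma
            p = index($0, " at ")            # position of the space before "at"
            day  = substr($0, 1, p - 1)      # e.g. "on Caturday 22"
            when = substr($0, p + 1, c - p)  # e.g. "at 10:15,"
            rest = substr($0, c + 1)         # e.g. " some event"

            if (day != prev_day)        print $0
            else if (when != prev_when) print blanks(length(day) + 1) when rest
            else                        print blanks(c) rest

            prev_day = day; prev_when = when
        }
    ' "$@"
}
```

Unlike the Perl version, this works on character positions rather than on words, so it does not collapse runs of spaces inside the event text.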