Make multiple lines containing a pattern into a single line

VTAWKVT · May 22, 2008, 12:02pm

I have the following data file.

zz=aa azxc-1234 aa=aa
zz=bb azxc-1234 bb=bb
zz=cc azxc-1234 cc=cc
zz=dd azxc-2345 dd=dd
zz=ee azxc-2345 ee=ee
zz=ff azxc-3456 ff=ff
zz=gg azxc-4567 gg=gg
zz=hh azxc-4567 hh=hh
zz=ii azxc-4567 ii=ii

I want to join all lines that share the same 2nd field into a single line.
So the output will be:

zz=aa azxc-1234 aa=aa zz=bb azxc-1234 bb=bb zz=cc azxc-1234 cc=cc
zz=dd azxc-2345 dd=dd zz=ee azxc-2345 ee=ee
zz=ff azxc-3456 ff=ff
zz=gg azxc-4567 gg=gg zz=hh azxc-4567 hh=hh zz=ii azxc-4567 ii=ii

help!

joeyg · May 22, 2008, 1:38pm

My input file had customer info and auto info. My goal was to combine records where a customer had more than one auto, so only one letter would be sent. Random notes:
comptxt is the key I am comparing on (the customer info)
autotxt is the vehicle specifics
comptxts and autotxts hold the values saved from earlier records
I write lots of "|" pipes so I can easily manipulate the data later
the script must do one last write after the loop, because the loop ends when the input runs out and the final record has not been written yet
For your data, compare on cut -c1-15 the way I use comptxt, and treat cut -c17-20 the way I use autotxt.
Lastly, there might be a few other variables in that code section you can ignore as they were more for my purposes (like vehcnt).

#combine lines where key is same
echo "** extracting info from $wkfile30 to $wkfile40"
comptxts=""
autotxts=""
recno=0
vehcnt=0

while IFS= read -r zf
   do
      recno=$((recno + 1))
#grab the key name for comparison
      comptxt=$(echo "$zf" | cut -d"|" -f2)
      autotxt=$(echo "$zf" | cut -d"|" -f3)
#need to skip first time through since of course different
      if [ -n "$comptxts" ]
         then
         if [ "$comptxts" != "$comptxt" ]
            then
#format output dataline
               vehcnttxt=$(printf "%.4d" "$vehcnt")
               autotxto=$(printf "%-100s" "$autotxts")
               echo "$outtxt1""|""$vehcnttxt""|""$autotxto""|-" >>$wkfile40
               comptxts="$comptxt"
               autotxts="$autotxt"
               vehcnt=0
            else
               autotxts="$autotxts""$autotxt"
         fi
      else
            comptxts="$comptxt"
            autotxts="$autotxt"
      fi


      vehcnt=$((vehcnt+1))
      outtxt1="$zf"

   done < "$wkfile30"

#still need to output the last record !!
vehcnttxt=$(printf "%.4d" "$vehcnt")
autotxto=$(printf "%-100s" "$autotxts")
echo "$outtxt1""|""$vehcnttxt""|""$autotxto""|-" >>$wkfile40
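
The same control-break idea (remember the previous key, flush when it changes, flush once more after the loop) can be sketched for the original poster's data. This is only a minimal sketch: the function name join_by_break is made up for illustration, and it assumes the input is already grouped by the 2nd space-separated field.

```shell
# Hypothetical helper: control-break loop over input grouped by the 2nd field.
join_by_break() {
  prev_key=""
  line=""
  while IFS= read -r rec; do
    # grab the key (2nd space-separated field) for comparison
    key=$(printf '%s\n' "$rec" | cut -d' ' -f2)
    if [ -z "$line" ]; then
      line="$rec"                 # very first record: nothing to compare yet
    elif [ "$key" != "$prev_key" ]; then
      printf '%s\n' "$line"       # key changed: write the finished group
      line="$rec"
    else
      line="$line $rec"           # same key: append to the current group
    fi
    prev_key="$key"
  done
  # the loop ends before the last group is written, so write it here
  if [ -n "$line" ]; then
    printf '%s\n' "$line"
  fi
}

printf '%s\n' 'zz=aa azxc-1234 aa=aa' 'zz=bb azxc-1234 bb=bb' 'zz=dd azxc-2345 dd=dd' | join_by_break
# zz=aa azxc-1234 aa=aa zz=bb azxc-1234 bb=bb
# zz=dd azxc-2345 dd=dd
```

Calling cut once per line is slow on large files; it is kept here only to mirror the script above.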

radoulov · May 22, 2008, 4:14pm

If the input is ordered by the second field:

awk 'END{print RS}$0=_[$2]++||NR==1?$0:RS$0' ORS= input

Otherwise:

awk 'END{for(r in _)print _[r]}{_[$2]=_[$2]?_[$2] FS $0:$0}' input

Use nawk or /usr/xpg4/bin/awk on Solaris.
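
For readers who find the one-liners hard to follow, the ordered-input case can also be spelled out in longer form. This is a sketch rather than a drop-in replacement: the function name join_sorted and the variable prev are illustrative, the input is assumed to be grouped by the 2nd field, and the space between joined records is written explicitly.

```shell
# Hypothetical helper: join consecutive lines that share the same 2nd field.
# Assumes the input is already grouped (e.g. sorted) by that field.
join_sorted() {
  awk '
    NR > 1 && $2 != prev { printf "\n" }   # key changed: end the current output line
    NR > 1 && $2 == prev { printf " " }    # same key: separate records with a space
    { printf "%s", $0; prev = $2 }         # emit the record and remember its key
    END { if (NR) printf "\n" }            # terminate the last output line
  ' "$@"
}

printf '%s\n' 'zz=aa azxc-1234 aa=aa' 'zz=bb azxc-1234 bb=bb' 'zz=dd azxc-2345 dd=dd' | join_sorted
# zz=aa azxc-1234 aa=aa zz=bb azxc-1234 bb=bb
# zz=dd azxc-2345 dd=dd
```

If the input is not grouped, it could be sorted on the 2nd field first (for example with sort -k2,2) before being piped in.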

VTAWKVT · May 23, 2008, 9:30am

It worked so well!!!!

Sara-sh · August 6, 2008, 5:08pm

Can you guys please help me do the following.

I have a file with a server name followed by a list of MAC addresses (variable length), each on a separate line. I need to put them all on one line, separated by spaces:

Have this:

servername
00:03:6e:4e:4f:5c
00:09:7b:4e:4c:6d
00:04:55:5a:8d:ec
....

want this:
servername 00:03:6e:4e:4f:5c 00:09:7b:4e:4c:6d 00:04:55:5a:8d:ec ....

Can someone please help, thanks so much....
Sara

radoulov · August 6, 2008, 5:27pm

Use nawk or /usr/xpg4/bin/awk on Solaris:

awk 'END { print _ }
!/:/ && _ { print _; _ = "" }
{ _ = _ ? _ FS $0 : $0 }
' filename

Or:

perl -nle'BEGIN { $, = " " }
  print @x and undef @x if not /:/ and @x;
  push @x, $_;
  END { print @x }
' filename
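
The awk version can also be written with a named variable and plain if/else, which may be easier to adapt. A minimal sketch, assuming (as above) that any line without a ":" is a server name; join_macs and line are illustrative names.

```shell
# Hypothetical helper: put each server name and the MAC addresses that follow it on one line.
join_macs() {
  awk '
    !/:/ {                               # no ":" on the line: treat it as a server name
      if (line != "") print line         # write the previous server, if there was one
      line = $0
      next
    }
    {
      if (line == "") line = $0          # MAC address with no server name before it
      else line = line " " $0            # append the MAC address to the current line
    }
    END { if (line != "") print line }   # write the last server
  ' "$@"
}

printf '%s\n' servername 00:03:6e:4e:4f:5c 00:09:7b:4e:4c:6d 00:04:55:5a:8d:ec | join_macs
# servername 00:03:6e:4e:4f:5c 00:09:7b:4e:4c:6d 00:04:55:5a:8d:ec
```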

Sara-sh · August 6, 2008, 11:13pm

Thanks so much for your response. However, I am getting an error when running the command. Any chance you can take a look? Thanks again.

>awk 'END { print _ }
> !/ && x++ { print _; _ = "" }
> { _ = _ ? _ FS $0 : $0 }
> ' file.mac
awk: syntax error near line 2
awk: bailing out near line 2

summer_cherry · August 6, 2008, 11:20pm

awk:

awk '{
	# append each line to the entry for its second field
	a[$2] = (a[$2] == "" ? $0 : a[$2] " " $0)
}
END{
	for(i in a)
		print a[i]
}' file

perl:

open(FH,"<file");
while(<FH>){
	@arr=split(" ",$_);
	$_=~tr/\n//d;
	$hash{$arr[1]}=sprintf("%s %s",$hash{$arr[1]},$_);
}
close(FH);
for $key(keys %hash){
print $hash{$key},"\n";
}

invinzin21 · August 6, 2008, 11:33pm

$ cat file
00:03:6e:4e:4f:5c
00:09:7b:4e:4c:6d
00:04:55:5a:8d:ec

$ cat file | tr '\n' ' '
00:03:6e:4e:4f:5c 00:09:7b:4e:4c:6d 00:04:55:5a:8d:ec $

Is this what you needed?
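
Note that the tr output above ends with a trailing space and no final newline, which is why the prompt appears on the same line. One possible alternative, sketched here with an illustrative function name, is paste in serial mode, which joins the lines with a chosen delimiter and ends the result with a newline.

```shell
# Hypothetical helper: join all lines of standard input with single spaces.
join_lines() {
  paste -s -d ' ' -
}

printf '%s\n' 00:03:6e:4e:4f:5c 00:09:7b:4e:4c:6d 00:04:55:5a:8d:ec | join_lines
# 00:03:6e:4e:4f:5c 00:09:7b:4e:4c:6d 00:04:55:5a:8d:ec
```

Like the tr approach, this joins every line in the file, so it only fits a file holding a single server.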

radoulov · August 7, 2008, 3:41am

Did you try to use nawk or /usr/xpg4/bin/awk as suggested?

Sara-sh · August 7, 2008, 9:54am

Thanks so much for all your help, I appreciate it ...
Sara

SoMoney · December 4, 2008, 5:06pm

I have a HUGE file with over 1 million entries in it.
Looks something like this:

I would like a way to output only the unique variables onto a single line, like so:

USER0001|DEVICE001|VAR1|VAR2|VAR3|VAR4|VAR5|VAR6
USER0001|DEVICE002|VAR1|VAR2|VAR3|VAR4|VAR5
USER0002|DEVICE001|VAR1|VAR2|VAR3|VAR4|VAR5
USER0002|DEVICE002|VAR1|VAR2|VAR3|VAR4|VAR5
USER0003|DEVICE003|VAR1|VAR2|VAR3|VAR4|VAR5|VAR6
USER0003|DEVICE003|VAR1|VAR2|VAR6

Is this doable with awk or do I need to use perl?

vgersh99 · December 4, 2008, 5:21pm

Please don't hijack threads - start a new thread for new unrelated questions.
Forum's rules can be found here.
Yes, it is possible with either awk or perl.

vgersh99 · December 4, 2008, 5:40pm

OK, borrowing from the previous suggestion in this thread:

nawk 'BEGIN {FS="|"}END{for(r in _)print r FS _[r]}{idx=$1 FS $2;_[idx]=_[idx]?_[idx] FS $3:$3}' myFile