awk join lines based on keyword

Hello,

I need your help once again.

I have the following file:

cat file02.txt 
PATTERN XXX.YYY.ZZZ. 500 
ROW01  aaa. 300 XS 14
ROW 45 29 AS XD.FD.
PATTERN 500 ZZYN002
ROW gdf gsste 
ALT 267 fhhfe.ddgdg.
PATTERN ERE.MAY. 280
PATTERRNTH 5000 rt.rt.
ROW SO a 678
PATTERN dsjsdh.sdshb 400 80
PATTERN ssds.500. 60
ROW 3389 LAST ROW 

I'm trying to join all the lines which start with the pattern PATTERN. Also, I need to remove the last . if it occurs.

The desired results should be:

XXX.YYY.ZZZ 500 aaa 300 XS 14 45 29 AS XD.FD 
500 ZZYN002 gdf gsste  267 fhhfe.ddgdg 
ERE.MAY 280 5000 rt.rt SO a 678 
dsjsdh.sdshb 400 80 
ssds.500 60 3389 LAST ROW  

I somehow managed to join the lines, but I cannot figure out how to get rid of the word PATTERN in the output, remove the dot, and delete the first word from the lines to be joined.

The command I came up with is:

awk '/PATTERN/ && c{print c;c=""}{c=c $0" "}END{if(c) print c}' file02.txt 

which produces the following, including words and characters I don't need:

PATTERN XXX.YYY.ZZZ. 500  ROW01  aaa. 300 XS 14 ROW 45 29 AS XD.FD. 
PATTERN 500 ZZYN002 ROW gdf gsste  ALT 267 fhhfe.ddgdg. 
PATTERN ERE.MAY. 280 PATTERRNTH 5000 rt.rt. ROW SO a 678 
PATTERN dsjsdh.sdshb 400 80 
PATTERN ssds.500. 60 ROW 3389 LAST ROW  

Thanks in advance for your help.

The spacing in your output is inconsistent, and it would have helped if you had explicitly stated that you wanted to remove a trailing period, if present, from each word in the lines being joined. That said, this seems to come close to what you want:

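awk '
# A line starting with PATTERN begins a new record: print the record
# collected so far (except before the very first line) and reset it.
/^PATTERN/ {
	if(NR > 1) {
		print out
		out = ""
	}
}
# On every line, skip the first word and append the remaining words.
# ($i ~ /[.]$/) is 1 when the word ends in a period and 0 otherwise,
# so the substr() call drops at most one trailing period per word.
{	for(i = 2; i <= NF; i++)
		out = out substr($i, 1, length($i) - ($i ~ /[.]$/)) OFS
}
END {	print out
}' file02.txt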

As always, if you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
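
If the trailing separator at the end of each joined line is a problem, a variant along the following lines might help. This is only a sketch under some assumptions: the function name join_pattern_blocks and the awk helper emit() are made up for illustration, sub() is used in place of the substr() arithmetic, and a record that ends up with no words is skipped rather than printed as an empty line.

```shell
#!/bin/sh
# join_pattern_blocks is a hypothetical helper name, not part of the thread.
# It reads the named files (or standard input when no file is given).
join_pattern_blocks() {
    awk '
    # emit() is an illustrative helper: print the collected record, if any, and reset it.
    function emit() { if (out != "") print out; out = "" }

    # A line starting with PATTERN closes the previous record.
    /^PATTERN/ { emit() }

    {
        # Skip the first word of every line and append the rest.
        for (i = 2; i <= NF; i++) {
            word = $i
            sub(/[.]$/, "", word)    # drop one trailing period, if present
            out = (out == "" ? word : out OFS word)
        }
    }

    END { emit() }' "$@"
}
```

Calling join_pattern_blocks file02.txt should then print the joined lines with single spaces between words and no space at the end of each line.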

Thank you, the above script works very well.

Best Regards