I am working in ksh with a file that consists of message data of varying lengths (in lines), with the messages separated by headers.
I would like to search this file for a string and print out the whole message that contains it, including its header.
My plan of attack is to search for the strings, print the header above the match, and then print the whole message data below it.
I can't figure out the right awk command. I am building this
for my own data extraction tool. Any ideas are
greatly appreciated.
The following awk script watches for string1 and string2 as it stores each line of a message in an array and, at the end of the message, outputs the array if both strings were found. I took the lazy way and just searched the entire line for each string, which means it will also find string1 or string2 when they are embedded in a longer word. If you want to locate string1 and string2 only when they are whole words, a slight change would be needed.
A line whose first field is REPT marks the start of a new message and therefore the end of the previous one. Since the last message does not have a following REPT line to trigger end-of-message processing, I have to call printmsg at END as well, and that is why I put that code in a function. Some awk versions do not support "function"; on Solaris, for example, you might have to use /usr/xpg4/bin/awk instead. Alternatively, we can drop the function and put the printmsg logic in both places. The script prints a blank line at the end of each message, including the last one.
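#!/bin/sh
# infile and outfile must be set before this point.
awk '
# Print the stored message, then a blank line, if both strings were seen.
function printmsg() {
    if (flag1 == 1 && flag2 == 1) {
        for (l = 1; l <= lcnt; l++)
            print lines[l]
        print ""
    }
}
{
    # A REPT header ends the previous message and starts a new one.
    if ($1 == "REPT") {
        printmsg()
        split("", lines)      # clear the array
        lines[1] = $0
        lcnt = 1
        flag1 = 0
        flag2 = 0
        next
    }
    # Store the line and note whether it contains either string.
    lines[++lcnt] = $0
    if (match($0, "string1"))
        flag1 = 1
    if (match($0, "string2"))
        flag2 = 1
}
# The last message has no following REPT line, so flush it here.
END { printmsg() }' "$infile" > "$outfile"
exit 0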
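
For the whole-word case, one possible change is sketched below. It is only a sketch under some assumptions: findmsgs and hasword are names made up for this example, a "word" is taken to be a run of letters, digits, and underscores, the search words are assumed to contain no regex metacharacters, and the awk in use is assumed to support both "function" and the -v option. As in the script above, the header line itself is not searched.

```shell
#!/bin/sh
# Hypothetical helper: findmsgs WORD1 WORD2 < input
# Prints each REPT-delimited message whose body contains both words
# as whole words, followed by a blank line.
findmsgs() {
    awk -v w1="$1" -v w2="$2" '
    # True if w occurs in s with no letter, digit, or underscore on
    # either side. Padding s with spaces lets the same pattern cover
    # words at the very start or end of the line.
    function hasword(s, w) {
        return (" " s " ") ~ ("[^A-Za-z0-9_]" w "[^A-Za-z0-9_]")
    }
    function printmsg(   i) {
        if (flag1 && flag2) {
            for (i = 1; i <= lcnt; i++)
                print lines[i]
            print ""
        }
    }
    # A REPT header ends the previous message and starts a new one.
    $1 == "REPT" {
        printmsg()
        split("", lines)
        lines[1] = $0
        lcnt = 1
        flag1 = flag2 = 0
        next
    }
    {
        lines[++lcnt] = $0
        if (hasword($0, w1)) flag1 = 1
        if (hasword($0, w2)) flag2 = 1
    }
    END { printmsg() }
    '
}
```

It could then be called as: findmsgs string1 string2 < "$infile" > "$outfile". Passing the words in with -v keeps them out of the awk program text, so the same function can be reused for other search strings.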