How to parse a text file with \034 as the field delimiter and \035 as the end-of-message delimiter?

I need some tips for writing a Unix Korn shell script that parses an input text file. The file contains messages that span several lines. Each field in a message is delimited by \034, and the end of a message is delimited by \035.

The input file looks something like this:

messge1:field1\034field2\034\n
field3\034field4\034field5\034\n
field6\034field7\034\035\n
messge2:field1\034field2\034\n
field3\034field4\034field5\034\n
field6\034field7\034\035\n

I want the script to parse the input file and produce an output file like this:

messge1:field1,field2,field3,field4,field5,field6,field7\n
messge2:field1,field2,field3,field4,field5,field6,field7\n

The idea is to convert each message into one single line instead of several lines so that it becomes easier to grep or awk for patterns or fields.

Throw in some ideas.

Use Awk:

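BEGIN { FS="\034"; RS="\035"; OFS="," }
{ gsub( /\n/, "" )          # join the lines of one message
  # The newline after the final \035 yields an empty record; skip it.
  if ( ""==$0 )  next
  $1=$1                     # rebuild $0 so the fields are joined with OFS
  # If last field is empty, remove it.
  if ( ""==$NF )  NF--
  print
}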
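
For reference, here is a minimal sketch of how the same idea might be wrapped in a shell script and tried on sample data. The file names and the sample input are made up for illustration. The sketch also makes two changes that are not part of the program above: it skips the empty record left by the newline after the last \035, and it joins the fields in an explicit loop instead of using NF--, on the assumption that some older awk implementations may not rebuild the record when NF is decremented.

```shell
#!/bin/sh
# Sketch only; the syntax is also valid in ksh. File names are hypothetical.
tmpdir=${TMPDIR:-/tmp}
infile="$tmpdir/messages.$$.in"
outfile="$tmpdir/messages.$$.out"

# Sample input: \034 ends each field, \035 ends each message.
printf 'messge1:field1\034field2\034\nfield3\034field4\034field5\034\nfield6\034field7\034\035\n' >  "$infile"
printf 'messge2:field1\034field2\034\nfield3\034field4\034field5\034\nfield6\034field7\034\035\n' >> "$infile"

awk '
BEGIN { FS = "\034"; RS = "\035" }
{
    gsub(/\n/, "")              # join the physical lines of one message
    if ($0 == "") next          # skip the empty record after the last \035
    n = NF
    if ($n == "") n--           # ignore the empty field after the last \034
    line = $1
    for (i = 2; i <= n; i++)    # join the remaining fields with commas
        line = line "," $i
    print line
}' "$infile" > "$outfile"

cat "$outfile"
```

Once each message is on one line, something like grep '^messge2:' or awk -F, '{ print $3 }' on the output file should be enough to pick out messages or fields.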