SED remove line feed and add to certain area

Hi All,

I have an XML file, and the requirement is to remove the line feeds and then add a line feed after certain elements.

<?xml version="1.0" ?>
<AUDITRECORDS>
   <CARF>
      <HED>
         <VN1>20090616010622</VN1>
         <VN2>0</VN2>
         <VN3>1090</VN3>
         <VN4>CONFIG_DATA</VN4>
         <VN5>20090616010622</VN5>
         <VN6>0</VN6>
         <VN7>1090</VN7>
      </HED>
   </CARF>
   <CARF>
      <HED>
         <VN1>20090616010651</VN1>
         <VN2>0</VN2>
         <VN3>1130</VN3>
         <VN4>11LOWE</VN4>
         <VN5>20090616010651</VN5>
         <VN6>0</VN6>
         <VN7>1130</VN7>
      </HED>
   </CARF>
</AUDITRECORDS>

The required output is shown below:

<?xml version="1.0" ?>
<AUDITRECORDS>
<CARF><HED><VN1>20090616010622</VN1><VN2>0</VN2><VN3>1090</VN3><VN4>CONFIG_DATA</VN4><VN5>20090616010622</VN5><VN6>0</VN6><VN7>1090</VN7></HED></CARF>
<CARF><HED><VN1>20090616010651</VN1><VN2>0</VN2><VN3>1130</VN3><VN4>11LOWE</VN4><VN5>20090616010651</VN5><VN6>0</VN6><VN7>1130</VN7></HED></CARF>
</AUDITRECORDS>

Please advise.

Regards,
Sreejit

Use GNU awk (gawk), New awk (nawk) or POSIX awk (/usr/xpg4/bin/awk).

awk -F'[<|>]' '{ORS=($2~"xml\|AUDITRECORDS\|\/CARF")?RS:OFS}1' OFS="" file

Hi Danmero,

I am getting this error while running the command

awk: syntax error near line 1
awk: illegal statement near line 1
awk: syntax error near line 1
awk: bailing out near line 1

Please advise.

Regards,
Sreejit

---------- Post updated at 05:26 PM ---------- Previous update was at 05:23 PM ----------

Hi Danmero,

Sorry, my actual XML file looks like this:

<?xml version="1.0" ?><AUDITRECORDS>
   <CARF>
      <HED>
         <VN1>20090616010622</VN1>
         <VN2>0</VN2>
         <VN3>1090</VN3>
         <VN4>CONFIG_DATA</VN4>
         <VN5>20090616010622</VN5>
         <VN6>0</VN6>
         <VN7>1090</VN7>
      </HED>
   </CARF>
   <CARF>
      <HED>
         <VN1>20090616010651</VN1>
         <VN2>0</VN2>
         <VN3>1130</VN3>
         <VN4>11LOWE</VN4>
         <VN5>20090616010651</VN5>
         <VN6>0</VN6>
         <VN7>1130</VN7>
      </HED>
   </CARF>
</AUDITRECORDS>

Please see if you can help

Regards,
Sreejit

  1. Use a different awk; it works for me using:
       # awk --version
       awk version 20070501 (FreeBSD)
  2. Fix your second XML file or find the workaround yourself; my solution works for the original data sample.
       # cat file
       <?xml version="1.0" ?>
       <AUDITRECORDS>
       <CARF>
       <HED>
       <VN1>20090616010622</VN1>
       <VN2>0</VN2>
       <VN3>1090</VN3>
       <VN4>CONFIG_DATA</VN4>
       <VN5>20090616010622</VN5>
       <VN6>0</VN6>
       <VN7>1090</VN7>
       </HED>
       </CARF>
       <CARF>
       <HED>
       <VN1>20090616010651</VN1>
       <VN2>0</VN2>
       <VN3>1130</VN3>
       <VN4>11LOWE</VN4>
       <VN5>20090616010651</VN5>
       <VN6>0</VN6>
       <VN7>1130</VN7>
       </HED>
       </CARF>
       </AUDITRECORDS>
       # awk -F'[<|>]' '{ORS=($2 ~ "xml\|AUDITRECORDS\|\/CARF")?RS:OFS}1' OFS="" file
       <?xml version="1.0" ?>
       <AUDITRECORDS>
       <CARF><HED><VN1>20090616010622</VN1><VN2>0</VN2><VN3>1090</VN3><VN4>CONFIG_DATA</VN4><VN5>20090616010622</VN5><VN6>0</VN6><VN7>1090</VN7></HED></CARF>
       <CARF><HED><VN1>20090616010651</VN1><VN2>0</VN2><VN3>1130</VN3><VN4>11LOWE</VN4><VN5>20090616010651</VN5><VN6>0</VN6><VN7>1130</VN7></HED></CARF>
       </AUDITRECORDS>
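If it is not obvious which awk a machine offers, something like the following could be used to pick one. It is only a sketch: the candidate list is the one mentioned earlier in the thread, the variable names `AWK` and `candidate` are made up here, and the probe assumes that the errors come from a legacy awk that lacks `sub()` and the `?:` operator.

```shell
# Try each candidate and keep the first one that handles sub() and ?:.
AWK=
for candidate in gawk nawk /usr/xpg4/bin/awk awk; do
    if command -v "$candidate" >/dev/null 2>&1 &&
       [ "$(echo x | "$candidate" '{ sub(/x/, "y"); print (1 ? $0 : "n") }' 2>/dev/null)" = "y" ]; then
        AWK=$candidate
        break
    fi
done
echo "using: ${AWK:-none found}"
```

The commands in this thread can then be run as `"$AWK" -F'[<|>]' '...' file`.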

---------- Post updated at 03:21 PM ---------- Previous update was at 01:10 PM ----------

When I tried to reply to your second post, I saw the original file format. PLEASE read the Forum Rules and Guidelines and use [code] tags when you post data samples or code.

# cat f1.xml
<?xml version="1.0" ?><AUDITRECORDS>
   <CARF>
      <HED>
         <VN1>20090616010622</VN1>
         <VN2>0</VN2>
         <VN3>1090</VN3>
         <VN4>CONFIG_DATA</VN4>
         <VN5>20090616010622</VN5>
         <VN6>0</VN6>
         <VN7>1090</VN7>
      </HED>
   </CARF>
   <CARF>
      <HED>
         <VN1>20090616010651</VN1>
         <VN2>0</VN2>
         <VN3>1130</VN3>
         <VN4>11LOWE</VN4>
         <VN5>20090616010651</VN5>
         <VN6>0</VN6>
         <VN7>1130</VN7>
      </HED>
   </CARF>
</AUDITRECORDS>

# awk -F'[<|>]' '{sub(/^[ \t]+/, "");gsub("><",">\n<");ORS=($2~"xml\|AUDITRECORDS\|\/CARF")?RS:OFS}1' OFS="" f1.xml
<?xml version="1.0" ?>
<AUDITRECORDS>
<CARF><HED><VN1>20090616010622</VN1><VN2>0</VN2><VN3>1090</VN3><VN4>CONFIG_DATA</VN4><VN5>20090616010622</VN5><VN6>0</VN6><VN7>1090</VN7></HED></CARF>
<CARF><HED><VN1>20090616010651</VN1><VN2>0</VN2><VN3>1130</VN3><VN4>11LOWE</VN4><VN5>20090616010651</VN5><VN6>0</VN6><VN7>1130</VN7></HED></CARF>
</AUDITRECORDS>
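For anyone following along, the one-liner can be unrolled into a commented script. This is only a sketch under a few assumptions: the sample is a shortened copy of the data, the temporary file and the variable `flat` are names invented for the illustration, and the field separator is written `[<>]` because the `|` inside the brackets only adds a literal pipe character to the set.

```shell
# Shortened sample in the second layout (declaration and root tag on one line).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<?xml version="1.0" ?><AUDITRECORDS>
   <CARF>
      <HED>
         <VN1>20090616010622</VN1>
         <VN2>0</VN2>
      </HED>
   </CARF>
</AUDITRECORDS>
EOF

flat=$(awk -F'[<>]' '
{
    sub(/^[ \t]+/, "")      # strip leading spaces and tabs (indentation)
    gsub(/></, ">\n<")      # put tags that share a line on separate lines
    # After the edits awk re-splits the record; $2 is the first tag name.
    if ($2 ~ /xml|AUDITRECORDS|\/CARF/)
        ORS = "\n"          # end the output line after these tags
    else
        ORS = ""            # otherwise keep joining onto the same line
    print
}' "$tmp")
rm -f "$tmp"
printf '%s\n' "$flat"
```

Each `<CARF>...</CARF>` block ends up on a single line because only the XML declaration, the `AUDITRECORDS` tags and the closing `</CARF>` are followed by a newline.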

Hi Danmero,

Thanks for the help and the solution.
I think my awk version is different; I am still getting the same error.

But thanks for your solution.

Regards,
Sreejit

---------- Post updated at 12:19 PM ---------- Previous update was at 10:30 AM ----------

Hi Danmero,

I used nawk and it works. Is there any problem with using nawk?

I am not strong in Unix. I understood some parts of your command.

I may sound greedy, but if you can help, could you please tell me what -F'[<|>]' does? Does it mean that whatever is between < and > is taken as input?

sub(/^[ \t]+/, ""); means change all tabs to blanks?

gsub("><",">\n<"); means replace >< with the same characters with a newline in between?

Sorry, I didn't understand ORS=($2~"xml\|AUDITRECORDS\|\/CARF")?RS:OFS}1

Could you please explain?

But anyway, thanks a lot for your help.

Regards,
Sreejit

To keep the forums high quality for all users, please take the time to format your posts correctly.

First of all, use Code Tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type the code tags [code] and [/code] by hand.)

Second, avoid adding color or different fonts and font size to your posts. Selective use of color to highlight a single word or phrase can be useful at times, but using color, in general, makes the forums harder to read, especially bright colors like red.

Third, be careful when you cut and paste: edit any odd characters and make sure all links are working properly.

Thank You.

The UNIX and Linux Forums

Here we change the Output Record Separator (ORS): if the condition is true, ORS is set to the Record Separator RS (a newline by default); otherwise it is set to OFS (the Output Field Separator), which is declared as an empty string at the end of the command.
The trailing 1 is a pattern that is always true, so each record is printed after processing.
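Two tiny experiments may make this more concrete. They are only illustrations: the sample lines and the variable names `fields` and `joined` are made up, and `[<>]` is used as the separator because a `|` inside the brackets is just a literal pipe character.

```shell
# 1) With FS set to the bracket expression [<>], every < or > ends a field,
#    so $1 is whatever precedes the first < and $2 is the first tag name.
fields=$(printf '%s\n' '<CARF>' '<VN2>0</VN2>' '</CARF>' |
    awk -F'[<>]' '{ print NF " fields, $2=" $2 }')
printf '%s\n' "$fields"

# 2) ORS is appended after every print. Choosing it per record decides
#    whether the next record continues the line or starts a new one.
joined=$(printf '%s\n' a b END c d END |
    awk '{ ORS = ($0 == "END") ? "\n" : "" } 1')
printf '%s\n' "$joined"
```

The first command shows that `$2` is `CARF`, `VN2` and `/CARF` for the three lines, which is why the test on `$2` can tell a closing `</CARF>` apart from the other tags. The second prints `abEND` and `cdEND` on separate lines: records are glued together until the one that switches ORS back to a newline.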

A version that may be clearer for you:

awk -F'[<|>]' '{sub(/^[ \t]+/, "");gsub("><",">\n<");if($2~"xml\|AUDITRECORDS\|\/CARF"){print}else{printf "%s", $0}}' file
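One portability note on the else branch: POSIX awk expects printf to be given a format string, and using the record itself as the format would misread any % sign in the data. A small check, with a made-up sample value and a variable name (`out`) chosen only for this example:

```shell
# Print a record without a trailing newline, treating it purely as data.
out=$(printf '%s\n' '<VN4>100%d</VN4>' | awk '{ printf "%s", $0 }')
printf '%s\n' "$out"
```

With an explicit "%s" format the % in the value passes through untouched.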

Thanks a lot ... I understood it completely...

Regards,
Sreejit