Hi everyone,
I have a data.xml file that contains nothing but thousands of <data> blocks. Part of the file looks exactly like this:
<data>
Line
Line
Line
</data>
<data>
Line
Line
Line
</data>
The rest of the file simply repeats this pattern. Each data block contains several lines. I need to split the blocks so that every line ends up in its own data block, like this:
<data>
Line
</data>
<data>
Line
</data>
<data>
Line
</data>
<data>
Line
</data>
<data>
Line
</data>
I need to write a bash script that reads data.xml and performs the necessary operations. I guess the script should have a loop with a conditional statement (if...then...else).
Any help will be highly appreciated.
Thanks a lot.
You can do it in sed: read the second and third lines into the pattern space with N, N, and if it holds
open data, line, line
then change it to
open data, line, close data, open data, line
then print (P) and delete the first three lines, read another line with N, and branch back to the if.
Otherwise print (P) and delete one line, then go back to the second N and the if.
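A minimal sketch of one way to turn this idea into a runnable script, assuming GNU sed and assuming every <data> and </data> tag sits alone on its own line, as in the sample. It uses n and i instead of the exact N/P sequence outlined above, and the function name split_data_blocks is only a placeholder:

```shell
split_data_blocks() {
    sed '
/^<data>$/{
    # Print the opening tag and load the first line of the block.
    n
    # An empty block needs no splitting.
    /^<\/data>$/b
    :next
    # Print the current content line and load the following line.
    n
    # Closing tag reached: let sed print it and finish this block.
    /^<\/data>$/b
    # Another content line follows: close the block and open a new one.
    i\
</data>\
<data>
    b next
}' "$@"
}
```

It could then be called as split_data_blocks data.xml, with the result redirected to a new file.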
awk '/<data>/{ # If line contains pattern: <data>
f=1; # Set flag variable f = 1
next; # Skip current record
} /<\/data>/ { # If line contains pattern: </data> (the / is escaped as \/ because it is the regex delimiter)
f=0; # Set flag variable f = 0
next; # Skip current record
} f==1 { # If line flag variable is equal to 1 f == 1
printf "<data>\n%s\n</data>\n",$0; # Print <data>, newline, the current record ($0), newline, then </data>
}' xml_filename # Read filename: xml_filename
I'd think you would need to print "</data>\n<data>\n%s\n",$0 to close out the first block and put the second line in a new tag. Oh, OK, you are discarding the old tags and wrapping every line in a fresh pair. That works.
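For comparison, the other variant (keep the original tags and only insert a closing/opening pair between consecutive lines of a block) might look like the sketch below. It assumes the tags sit on their own lines as in the sample; the function name split_keep_tags and the counter name seen are made up for illustration:

```shell
split_keep_tags() {
    awk '
    /<data>/   { print; seen = 0; next }   # opening tag: pass through, reset the per-block counter
    /<\/data>/ { print; next }             # closing tag: pass through unchanged
    {
        # Every line after the first in a block gets a close/open pair in front of it.
        if (seen++) print "</data>\n<data>"
        print
    }' "$@"
}
```

On the sample input, split_keep_tags data.xml should give the same result as the script that discards the tags. The difference is that empty blocks are kept rather than dropped.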