Hi,
I have an xml file and I want to convert it with awk in to a csv file
Test.xml
<Worksheet ss:Name="Map1">
<Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="2" x:FullColumns="1"
x:FullRows="1" ss:DefaultColumnWidth="60">
<Row>
<Cell><Data ss:Type="String">Column1</Data></Cell>
<Cell><Data ss:Type="String">Column2</Data></Cell>
</Row>
<Row>
<Cell><Data ss:Type="String">Val1</Data></Cell>
<Cell><Data ss:Type="String">Val2</Data></Cell>
</Row>
</Table>
<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
<PageSetup>
<Header x:Margin="0.4921259845"/>
<Footer x:Margin="0.4921259845"/>
<PageMargins x:Bottom="0.984251969" x:Left="0.78740157499999996"
x:Right="0.78740157499999996" x:Top="0.984251969"/>
</PageSetup>
<Selected/>
<Panes>
<Pane>
<Number>3</Number>
<ActiveRow>2</ActiveRow>
<ActiveCol>1</ActiveCol>
</Pane>
</Panes>
<ProtectObjects>False</ProtectObjects>
<ProtectScenarios>False</ProtectScenarios>
</WorksheetOptions>
</Worksheet>
Should be
Test.csv
Column1;Column2
Val1;Val2
Any suggestions??
Thanks in advance !!
One way - but does not use awk
$ grep Cell Test.csv | sed 's/<[^>]*>//g' | sed '$!N;s/\n/;/' | sed 's/ //g'
Column1;Column2
Val1;Val2
$
thanks fpmurphy, is there are a way to solve this issue with awk ?
AWK:
awk '/<Cell>/{sub(".*<Cell><[^>]+>","",$0);sub("<.*","",$0);s=s";"$0}/<\/Row>/{sub("^;","",s);print s;s=""}' file
I don't know what is the different between this xml file but it not works:
<Table x:FullColumns='1' x:FullRows='1'>
<Row ss:AutoFitHeight='0' ss:Height='35'>
<Cell ss:MergeAcross='2' ss:StyleID='title'><Data ss:Type='String'>Export Results for Table:</Data></Cell>
<Cell ss:MergeAcross='1' ss:StyleID='titletext'><Data ss:Type='String'></Data></Cell>
<Cell ss:MergeAcross='2' ss:StyleID='titletext'><Data ss:Type='String'>Results: 4 Rows</Data></Cell>
</Row>
<Row ss:AutoFitHeight='0' ss:Height='35'>
<Cell ss:MergeAcross='2' ss:StyleID='title'><Data ss:Type='String'>Date of creation spreadsheet:</Data></Cell>
<Cell ss:MergeAcross='1' ss:StyleID='titletext'><Data ss:Type='String'>June 3, 2011, 12:01</Data></Cell>
</Row>
<Row ss:Height='12'>
<Cell ss:StyleID='columntitleleft'><Data ss:Type='String'>test1</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
<Cell ss:StyleID='columntitlecenter'><Data ss:Type='String'>test2</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
<Cell ss:StyleID='columntitlecenter'><Data ss:Type='String'>test3</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
<Cell ss:StyleID='columntitlecenter'><Data ss:Type='String'>test4</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
</Row>
awk '/<Cell>/{sub(".*<Cell><[^>]+>","",$0);sub("<.*","",$0);s=s";"$0}/<\/Row>/{sub("^;","",s);print s;s=""}' test4.xml
the result is empty
and
grep Cell test4.xml | sed 's/<[^>]*>//g' | sed '$!N;s/\n/;/' | sed 's/ //g'
ExportResultsforTable:;
Results:4Rows;Dateofcreationspreadsheet:
June3,2011,12:01;test1
test2;test3
test4
Is that posible to convert the Rows tags line by line ?
Try:
awk '/<Cell.*>/{sub(".*<Cell[^>]+><[^>]+>","",$0);sub("<.*","",$0);s=s";"$0}/<\/Row>/{sub("^;","",s);print s;s=""}' file
1 Like