awk convert xml to csv

Hi,

I have an xml file and I want to convert it with awk in to a csv file

Test.xml

 <Worksheet ss:Name="Map1">
    <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="2" x:FullColumns="1"
     x:FullRows="1" ss:DefaultColumnWidth="60">
     <Row>
      <Cell><Data ss:Type="String">Column1</Data></Cell>
      <Cell><Data ss:Type="String">Column2</Data></Cell>
     </Row>
     <Row>
      <Cell><Data ss:Type="String">Val1</Data></Cell>
      <Cell><Data ss:Type="String">Val2</Data></Cell>
     </Row>
    </Table>
    <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
     <PageSetup>
      <Header x:Margin="0.4921259845"/>
      <Footer x:Margin="0.4921259845"/>
      <PageMargins x:Bottom="0.984251969" x:Left="0.78740157499999996"
       x:Right="0.78740157499999996" x:Top="0.984251969"/>
     </PageSetup>
     <Selected/>
     <Panes>
      <Pane>
       <Number>3</Number>
       <ActiveRow>2</ActiveRow>
       <ActiveCol>1</ActiveCol>
      </Pane>
     </Panes>
     <ProtectObjects>False</ProtectObjects>
     <ProtectScenarios>False</ProtectScenarios>
    </WorksheetOptions>
   </Worksheet>
  

Should be

Test.csv

  Column1;Column2
  Val1;Val2
   

Any suggestions??

Thanks in advance !!

One way - but does not use awk

$ grep Cell Test.csv | sed 's/<[^>]*>//g' | sed '$!N;s/\n/;/' | sed 's/ //g'
Column1;Column2
Val1;Val2
$

thanks fpmurphy, is there are a way to solve this issue with awk ?

AWK:

awk '/<Cell>/{sub(".*<Cell><[^>]+>","",$0);sub("<.*","",$0);s=s";"$0}/<\/Row>/{sub("^;","",s);print s;s=""}' file

I don't know what is the different between this xml file but it not works:

<Table x:FullColumns='1' x:FullRows='1'>
<Row ss:AutoFitHeight='0' ss:Height='35'>
        <Cell ss:MergeAcross='2' ss:StyleID='title'><Data ss:Type='String'>Export Results for Table:</Data></Cell>
        <Cell ss:MergeAcross='1' ss:StyleID='titletext'><Data ss:Type='String'></Data></Cell>
        <Cell ss:MergeAcross='2' ss:StyleID='titletext'><Data ss:Type='String'>Results: 4 Rows</Data></Cell>
</Row>
<Row ss:AutoFitHeight='0' ss:Height='35'>
        <Cell ss:MergeAcross='2' ss:StyleID='title'><Data ss:Type='String'>Date of creation spreadsheet:</Data></Cell>
        <Cell ss:MergeAcross='1' ss:StyleID='titletext'><Data ss:Type='String'>June 3, 2011, 12:01</Data></Cell>
</Row>
<Row ss:Height='12'>
<Cell ss:StyleID='columntitleleft'><Data ss:Type='String'>test1</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
<Cell ss:StyleID='columntitlecenter'><Data ss:Type='String'>test2</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
<Cell ss:StyleID='columntitlecenter'><Data ss:Type='String'>test3</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
<Cell ss:StyleID='columntitlecenter'><Data ss:Type='String'>test4</Data><NamedCell ss:Name='_FilterDatabase'/></Cell>
</Row>
awk '/<Cell>/{sub(".*<Cell><[^>]+>","",$0);sub("<.*","",$0);s=s";"$0}/<\/Row>/{sub("^;","",s);print s;s=""}' test4.xml

the result is empty
and

grep Cell test4.xml | sed 's/<[^>]*>//g' | sed '$!N;s/\n/;/' | sed 's/ //g'
ExportResultsforTable:;
Results:4Rows;Dateofcreationspreadsheet:
June3,2011,12:01;test1
test2;test3
test4

Is that posible to convert the Rows tags line by line ?

Try:

awk '/<Cell.*>/{sub(".*<Cell[^>]+><[^>]+>","",$0);sub("<.*","",$0);s=s";"$0}/<\/Row>/{sub("^;","",s);print s;s=""}' file
1 Like

Great , thank you Guys !