Generating XML from a flatfile

Hi all,

I am trying to generate an XML file from a flatfile in ksh/bash (could also use perl at a pinch, but out of my depth there!).

I have found several good solutions on this very forum for cases where the header line in the file forms the XML tags. However, my flatfile is as follows:

Object,Type
Table1,Tables
Table2,Tables
Table3,Tables
View1,Views
View2,Views
Proc1,Procs
Proc2,Procs

And I want to create the following:

<Whatever>
 <Tables>
    Table1
    Table2
    Table3
 </Tables>
 <Views>
    View1
    View2
 </Views>
 <Procs>
    Proc1
    Proc2
 </Procs>
</Whatever>

So I essentially want the data to be segregated by one of the data columns in the flatfile, rather than just a more straightforward 'header becomes a tag' scenario.

All pointers much appreciated!

*Edit*

Although the data should always be in sequence of the different types, I would be interested to see if it could handle:

Object,Type
Table1,Tables
Proc1,Procs
View1,Views
Table2,Tables
Table3,Tables
View2,Views
Proc2,Procs

Thanks,
Ian

Something like this:

awk -F\, 'ant!=$NF{if(ant!=""){print "</"ant">"};print "<"$NF">\n   "$1;ant=$NF;next}{print "   "$1}END{print "</"ant">"}' infile

hey, thanks for the quick response! It works perfectly for the ordered dataset.

Would it take much to adapt it to do the following:

  1. Ignore a header line (i.e. line 1)
  2. Work with a non-ordered flatfile as per my edit above.

Even if not, this is great - would have taken me hours to come up with!

$ cat flat2xml.awk

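BEGIN { FS="," }
NR==1 { next }          # skip the header line

# T[type] counts the objects seen for each type; D[type,n] is the nth one
{       D[$2,++T[$2]]=$1        }

END {
        print "<whatever>";
        for(X in T)     # note: awk does not guarantee the order of "for (X in T)"
        {
                print "\t<" X ">";
                for(N=1; N<=T[X]; N++)  print "\t\t" D[X,N];
                print "\t</" X ">";
        }
        print "</whatever>";
}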

$ awk -f flat2xml.awk data

<whatever>
        <Procs>
                Proc1
                Proc2
        </Procs>
        <Views>
                View1
                View2
        </Views>
        <Tables>
                Table1
                Table2
                Table3
        </Tables>
</whatever>

$
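If the order of the groups matters: awk leaves the iteration order of `for (X in T)` unspecified, which is why the sample output above lists Procs first even though Tables comes first in the data. A minimal sketch of one way around that, assuming you want each type emitted in the order it first appears in the flatfile. The function name `flat2xml_ordered` and the `order`/`ntypes` variables are my own additions for illustration, not part of the script above.

```shell
# flat2xml_ordered: hypothetical wrapper name, not from the original script.
# Reads "Object,Type" lines, skips the header, groups objects by type,
# and emits the types in the order they first appear in the input.
flat2xml_ordered() {
    awk -F, '
    NR == 1 { next }                            # skip the header line
    NF < 2  { next }                            # ignore blank or malformed lines
    {
        if (!($2 in T)) order[++ntypes] = $2    # remember first-seen order of types
        D[$2, ++T[$2]] = $1                     # store the Nth object of this type
    }
    END {
        print "<whatever>"
        for (i = 1; i <= ntypes; i++) {         # walk types in first-seen order
            X = order[i]
            print "\t<" X ">"
            for (N = 1; N <= T[X]; N++) print "\t\t" D[X, N]
            print "\t</" X ">"
        }
        print "</whatever>"
    }' "$@"
}
```

Called as `flat2xml_ordered data`, this should give Tables, then Views, then Procs for the ordered sample file, and Tables, Procs, Views for the shuffled one from the edit.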

It doesn't need to handle more than 2 columns, does it?


No, it doesn't, although they have now decided the flatfile will have the two columns in the opposite order. Any suggestion on the tweak I need to make to the first script to keep the output the same when the input is Col2,Col1 instead of Col1,Col2?

Switch $1 and $2 around :)
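Spelled out, and taking "the first script" to mean the awk one-liner near the top of the thread, the swap could look roughly like the sketch below. It assumes the input is now Type,Object, that the data is still grouped by type (as the one-liner requires), and that the header line should be skipped, which the original one-liner did not do. The function name `flat2xml_swapped` and the variable `prev` are made up for the example.

```shell
# flat2xml_swapped: hypothetical wrapper name for illustration only.
# Input columns are assumed to be Type,Object (the reverse of the original file).
flat2xml_swapped() {
    awk -F, '
    NR == 1 { next }                            # skip the header line (added assumption)
    prev != $1 {                                # type changed: close old tag, open new one
        if (prev != "") print "</" prev ">"
        print "<" $1 ">"
        prev = $1
    }
    { print "   " $2 }                          # object name now lives in column 2
    END { if (prev != "") print "</" prev ">" } # close the last open tag, if any
    ' "$@"
}
```

For the flat2xml.awk version the equivalent change is the single line `D[$1,++T[$1]]=$2`.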