Parse an XML task list to create each task.xml file

I have a task definition listing XML file that contains a list of tasks, such as:

<TASKLIST
    <TASK definition="Completion date" id="Taskname1" Some other 
         <CODE name="Code12" 
               <Parameter pname="Dog" input="5.6" units="feet" etc /Parameter>
               <Parameter pname="Cat" input="cute" units="NA"      /Parameter>
          /CODE>
     /TASK>
<TASK definition="Completion date" id="Taskname2" Some other 
         <CODE name="Code3" 
          /CODE>
         <CODE name="Code2" 
           /CODE>
     /TASK>
/TASKLIST>

I need to split the task list into separate task XML files, each running from the <TASK tag to the /TASK> tag, using the TASK id as the *.xml file name.

I have written a grep command to capture all the id names, followed by a sed command and another grep command to strip everything except the id, which leaves me with a list of just the task id names. My code is not elegant, but it works.
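The poster's actual commands are not shown, so the following is only a sketch of what such a grep-and-sed step might look like. The file name tasklist.xml and the trimmed sample data are assumptions made for illustration, and the sed pattern assumes each id value sits on the same line as its <TASK tag.

```shell
# Hypothetical input file; the thread never gives the real name.
input="${TMPDIR:-/tmp}/tasklist.$$.xml"

# Small stand-in for the abbreviated task list shown above.
cat > "$input" <<'EOF'
<TASKLIST
    <TASK definition="Completion date" id="Taskname1" Some other
     /TASK>
<TASK definition="Completion date" id="Taskname2" Some other
     /TASK>
/TASKLIST>
EOF

# Keep only the TASK opening lines, then strip everything but the id value.
ids=$(grep '<TASK ' "$input" | sed 's/.* id="\([^"]*\)".*/\1/')
echo "$ids"

rm -f "$input"
```

The leading space in ` id="` keeps the pattern from matching an attribute whose name merely ends in "id".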

Now I am stuck. Please help. I am a novice at Unix scripting.

Just show us what you have got and we will work from there. Don't be shy: we dislike people relying on others to do their work for them much more than people who are not well versed in the art of scripting. The second is tolerable, the first is not.

bakunin

hi,

The Perl code below should be OK. Please delete the first and last lines of your XML file, then run it.

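use strict;
use warnings;

# Read one record at a time, each ending at the closing "TASK>".
$/ = "TASK>";
open my $in, '<', 'file' or die "Cannot open file: $!";
while (my $record = <$in>) {
    # Note: (.*) is greedy and runs to the last double quote on the line.
    if ($record =~ m/id="(.*)"/) {
        my $file = sprintf("%s.xml", $1);
        open my $out, '>', $file or die "Cannot write $file: $!";
        print $out $record;
        close $out;
    }
}
close $in;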
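For readers more at home with shell tools, roughly the same split can be sketched with awk. This is a swapped-in alternative, not the Perl approach above: it works line by line instead of changing the record separator, so the TASKLIST lines can stay in the file. The file name tasklist.xml and the scratch directory are assumptions, and the sketch assumes each id attribute is on the same line as its <TASK tag.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the abbreviated task list from the question.
cat > tasklist.xml <<'EOF'
<TASKLIST
    <TASK definition="Completion date" id="Taskname1" Some other
         <CODE name="Code12"
          /CODE>
     /TASK>
<TASK definition="Completion date" id="Taskname2" Some other
         <CODE name="Code3"
          /CODE>
     /TASK>
/TASKLIST>
EOF

awk '
# Opening TASK line: take the id value as the output file name.
/<TASK / && match($0, / id="[^"]*"/) {
    out = substr($0, RSTART + 5, RLENGTH - 6) ".xml"
}
# While inside a task, copy every line to that task file.
out != "" { print > out }
# Closing line: finish the file and wait for the next task.
/\/TASK>/ && out != "" { close(out); out = "" }
' tasklist.xml

ls
```

Because the patterns require `<TASK ` with a trailing space and `/TASK>` with the closing bracket, the TASKLIST lines are never mistaken for a task boundary.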

Thanks so much.

I have never worked with Perl before and have downloaded a tutorial to help me understand the syntax. Would you suggest a good book on it?

This code was awesome. The real input file contains a lot more information, and the task tag has about two dozen attributes. (This is an embedded software file created by another integrator; it has about 1000 tasks in it, with multiple tags underneath each.) I do software code checking, and every time the code changed this whole file was being recreated. Now all that has to be done is to change the task XML file and re-embed it. Magic.

The XML code is on an intranet machine and cannot be copied from it, so the file and the problem were abbreviated for convenience.

The only problem was that

(m/id="(.*)"/)

actually captured the whole line including other attributes.

Each task id has the same version number appended to the end. I got the correct name by changing the pattern to

(m/id="(.*)VERS5"/).
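The behaviour described here is greedy matching: `.*` runs to the last double quote on the line, not the first. A minimal sketch of the difference, using sed so it can be tried straight from the shell; the sample line and its owner and state attributes are made up for illustration. The same negated character class should work in the Perl pattern as well, written as m/id="([^"]*)"/, which would remove the need to anchor on the version string.

```shell
# Hypothetical TASK line with more attributes after the id.
line='<TASK definition="Completion date" id="Taskname1VERS5" owner="abc" state="open"'

# Greedy: \(.*\) extends to the LAST double quote on the line.
greedy=$(printf '%s\n' "$line" | sed 's/.* id="\(.*\)".*/\1/')

# Negated class: [^"]* cannot cross a double quote, so it stops at the first one.
exact=$(printf '%s\n' "$line" | sed 's/.* id="\([^"]*\)".*/\1/')

echo "greedy: $greedy"
echo "exact:  $exact"
```

With this variant the full id, version suffix included, is captured, so the files would not need renaming afterwards.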

The version suffix itself was not captured, but the match did end there. I tried
(m/s/id="(.*)VERS5"/)

but it did not work. I got what I wanted by running the Perl script and then:

# Append Vers5 to the base name of every generated task file.
for f in *.xml
do
    TASKNAME=`basename "$f" .xml`
    mv "$f" "${TASKNAME}Vers5.xml"
done

This is the first time I have worked with Sun workstations, and I don't have this scripting thing down yet, but with the help of this website I have been able to do lots of neat things that have saved time.

Thanks all, this site is fantastic.

PS: I am thinking about getting a Mac now that I am starting to understand UNIX. Any suggestions on what I should get?

Thanks again,

MissI