We would like to generate three separate files from the file above:
1) DSFilenames.txt
2) JMSModule.txt
3) Subdeployment.txt
1) DSFilenames.txt should contain only the value of <descriptor-file-name>.
In the above example, the expected output should be like below
dsfilename=jms/UMSJMSSystemResource-jms.xml
2) JMSModule.txt should contain the values of <name> and <target> under <jms-system-resource>.
In the above example, the expected output should be like below
3) Subdeployment.txt should contain the values of <name> and <target> under each <sub-deployment> tag.
In the above example, the expected output should be like below
You can do this with simple shell techniques (reading the file line by line, grep, sed/cut, output redirection), or go for something a bit more advanced such as Python's XML modules.
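For the Python route, here is a minimal sketch using the standard library's xml.etree.ElementTree. The input file is not reproduced here, so the SAMPLE string is an assumption reconstructed from the tag names and values mentioned in this thread; the function names extract and write_files are made up for illustration, and the name=/target= line format is a guess, since only the dsfilename= line was actually specified. If the real file declares an XML namespace, the tag names would have to be qualified accordingly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the structure is inferred from the tags and values
# discussed in the thread, not copied from the real input file.
SAMPLE = """\
<domain>
  <jms-system-resource>
    <name>UMSJMSSystemResource</name>
    <target>soa_server1,bam_server1</target>
    <sub-deployment>
      <name>UMSJMSServer522129776</name>
      <target>UMSJMSServer_auto_1</target>
    </sub-deployment>
    <sub-deployment>
      <name>UMSJMSServer1709690790</name>
      <target>UMSJMSServer_auto_2</target>
    </sub-deployment>
    <descriptor-file-name>jms/UMSJMSSystemResource-jms.xml</descriptor-file-name>
  </jms-system-resource>
</domain>
"""


def extract(xml_text):
    """Return the lines for each of the three output files, keyed by file name."""
    root = ET.fromstring(xml_text)
    out = {"DSFilenames.txt": [], "JMSModule.txt": [], "Subdeployment.txt": []}
    # iter() also yields the root itself, so a wrapper element is optional.
    for res in root.iter("jms-system-resource"):
        out["DSFilenames.txt"].append(
            "dsfilename=" + res.findtext("descriptor-file-name", default=""))
        # A plain tag name only matches direct children, so the <name> and
        # <target> inside <sub-deployment> are not picked up here.
        out["JMSModule.txt"].append("name=" + res.findtext("name", default=""))
        out["JMSModule.txt"].append("target=" + res.findtext("target", default=""))
        for sub in res.findall("sub-deployment"):
            out["Subdeployment.txt"].append("name=" + sub.findtext("name", default=""))
            out["Subdeployment.txt"].append("target=" + sub.findtext("target", default=""))
    return out


def write_files(result):
    """Write each list of lines to the file named by its key."""
    for filename, lines in result.items():
        with open(filename, "w") as fh:
            fh.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    write_files(extract(SAMPLE))
```

To run it against a real file, read the file's contents and pass them to extract() instead of SAMPLE. Unlike line-based matching, this does not depend on each tag sitting on its own line.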
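awk -F"[><]" '
# Fields are split on < and >, so for "<name>value</name>" the value is $3.
/<jms-system-resource>/   { a = 1 }     # inside a JMS module
/<\/jms-system-resource>/ { a = "" }
a && /<sub-deployment>/   { b = 1 }     # inside a sub-deployment
/<\/sub-deployment>/      { b = "" }
# Sub-deployment rules come first so the module rule below cannot claim them.
b && /<name>/ {
  print "SubdeploymentName =" $3
  next
}
b && /<target>/ {
  print "TargetServersName =" $3
  next
}
# The module-level <target> is deliberately not printed.
a && /<name>/ {
  print "JMSModuleName =" $3
  next
}
' InputFile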
Now, this is different from what you requested in post #1, and it is not too clear to me what is actually being asked for. Given that the .xml file has EXACTLY the structure shown, how far would this get you?
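awk '
/<jms-system-resource>/, /<\/jms-system-resource>/ {
    # Opening or closing module tag: following fields go to JMSModule.txt
    if (/<.?jms-system-resource>/) { FN = "JMSModule.txt"; next }
    # Opening or closing sub-deployment tag: switch to Subdeployment.txt
    if (/<.?sub-deployment>/)      { FN = "Subdeployment.txt"; next }
    # The descriptor line itself goes to DSFilenames.txt
    if (/<.?descriptor-file-name>/)  FN = "DSFilenames.txt"
    sub(/^[ \t]*</, "")    # strip indentation and the opening "<"
    sub(/>/, "=")          # the first ">" becomes the key/value separator
    sub(/<.*$/, "")        # drop the closing tag
    if (FN) print > FN
}
' file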
cf *.txt
---------- DSFilenames.txt: ----------
descriptor-file-name=jms/UMSJMSSystemResource-jms.xml
---------- JMSModule.txt: ----------
name=UMSJMSSystemResource
target=soa_server1,bam_server1
---------- Subdeployment.txt: ----------
name=UMSJMSServer522129776
target=UMSJMSServer_auto_1
name=UMSJMSServer1709690790
target=UMSJMSServer_auto_2