XML parsing in KSH

Hi All,

Although I was able to find past XML parsing questions, they were a little different from this one, and I had a hard time adapting those answers to this scenario.

-> May I ask if anyone knows how to extract the IP addresses of the below "servers.xml" file into an array called IP[]?
(Hopefully using KSH in Solaris)
(Note: the list can contain a varying number of server lines, sometimes zero, and will most likely be formatted on multiple lines, as below. Also note that the first two lines and the last line below are guaranteed to be there.)

<?xml version="1.0" encoding="UTF-8"?>
<serverList xmlns="http://www.w3.org/1999/xhtml">
 <server hostname="server01" ip="192.168.1.101" loc="rm1" region="NY"/>
 <server hostname="server02" ip="192.168.1.117" loc="rm1" region="NY"/>
 <server hostname="server03" ip="192.168.1.154" loc="rm1" region="NY"/>
 <server hostname="server04" ip="192.168.1.159" loc="rm2" region="NY"/>
</serverList>

Thanks in advance...
CG

As you may have found, XML parsing can be very complex.

Each case should be treated uniquely.

As for the data sample you provided, try the code below:

sed -n '/ip="/s/.*ip="\(.*\)" loc.*/\1/p' input_xml

To get the IPv4 addresses into an array:

$ set -A myarray `sed -n '/ip="/s/.*ip="\(.*\)" loc.*/\1/p' infile`
$ echo ${myarray[@]}
192.168.1.101 192.168.1.117 192.168.1.154 192.168.1.159

xmllint is a very handy tool for working with XML files. Here is a foolproof way to get the IP addresses no matter how your XML file is formatted, even if it is packed into one line.

echo "
setns a=http://www.w3.org/1999/xhtml
xpath /a:serverList/a:server[@ip]" | \
	xmllint --shell sample.xml | \
	awk '/ATTRIBUTE ip$/{getline;getline;split($0,a,"=");print a[2]}'

I blogged about xmllint some time ago because I could not find any useful examples on the internet. See this Chi Hung Chan
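Where xmllint is not installed, a rough fallback for the packed-into-one-line case is to break the input at every '>' before running sed. The following is only a sketch under assumptions: the `packed` string is a hypothetical one-line variant of the sample data, and the approach would break if an attribute value ever contained a '>' character. It is still text matching, not real XML parsing.

```shell
# Hypothetical one-line input: two of the sample server elements with no newlines.
packed='<serverList xmlns="http://www.w3.org/1999/xhtml"><server hostname="server01" ip="192.168.1.101" loc="rm1" region="NY"/><server hostname="server02" ip="192.168.1.117" loc="rm1" region="NY"/></serverList>'

# Put each tag on its own line, then print only the quoted value of the ip attribute.
packed_ips=$(printf '%s\n' "$packed" | tr '>' '\n' | sed -n 's/.* ip="\([^"]*\)".*/\1/p')

printf '%s\n' "$packed_ips"
```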


Thank you both Shell_Life & fpmurphy!

For anyone reading this in the future who needs an XML parser for a similar scenario, it turns out a perfect combination of the two suggestions above was needed. Thanks guys!

This is what works:

set -A myarray `sed -n '/ip="/s/.*ip="\(.*\)" ad.*/\1/p' /servers.xml`
echo ${myarray[@]}
192.168.1.101 192.168.1.117 192.168.1.154 192.168.1.159
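
The sed patterns in this thread depend on which attribute follows ip (loc in the sample, ad in the command above). A variant that stops at the closing quote of the ip value avoids that dependency. This is a minimal sketch, assuming one server element per line as in the sample; the temporary file and the variable names `xml`, `ips`, and `count` are made up for illustration.

```shell
# Sample data copied from the question, written to an illustrative temp file.
xml=$(mktemp)
cat > "$xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<serverList xmlns="http://www.w3.org/1999/xhtml">
 <server hostname="server01" ip="192.168.1.101" loc="rm1" region="NY"/>
 <server hostname="server02" ip="192.168.1.117" loc="rm1" region="NY"/>
 <server hostname="server03" ip="192.168.1.154" loc="rm1" region="NY"/>
 <server hostname="server04" ip="192.168.1.159" loc="rm2" region="NY"/>
</serverList>
EOF

# [^"]* matches up to the closing quote of the ip value,
# so whatever attribute comes next no longer matters.
ips=$(sed -n 's/.* ip="\([^"]*\)".*/\1/p' "$xml")

# Count the non-empty result lines; this is 0 when the list has no server lines.
count=$(printf '%s\n' "$ips" | grep -c .)

printf '%s\n' "$ips"
echo "found $count address(es)"

# In ksh the result could then be loaded into an array, e.g.: set -A IP $ips
rm -f "$xml"
```

Checking `count` before using the result covers the case where the list holds no server lines at all.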

Edit:
Chi Hung: Just before hitting submit, I noticed your post as well. Thank you! Hm, very interesting. I'll check this out now...especially since I plan to do a lot more XML parsing pretty soon! Thank you!

Sorry to contradict you but the script you came up with is not an XML parser in any shape, fashion or form. It is strictly a shell script. An XML parser is a completely different piece of software.