awk script to parse an XML tag

I have an XML tag like this:
<property name="agent" value="/var/tmp/root/eclipse" />

Is there a way to use awk to get the value from the tag above? The output should be:
/var/tmp/root/eclipse

Help will be appreciated.

Regards,
Adi

If you're using a gnu-like awk that supports a record separator pattern, this might work for you:

awk '
    /property name=/ {
        gsub( ".*value=\"", "" );
        gsub( "\".*", "" );
        print;
    }
' RS="[<>]"  input-file 

Hi Agama,

Thanks for the reply, but that does not work. The script that you provided just removes the <> from the line and displays
property name="agent" value="/var/tmp/root/eclipse" /
as output.

-Adi

If value is always the last attribute in that XML tag, you can simply do:

awk -F\" '/property name=/ { print $(NF-1); } ' xml_file
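If the tag might carry other attributes after value, a variant that looks for the value attribute by name may be safer. This is only a sketch: get_value is a made-up helper name, it assumes double-quoted attributes and one tag per line, and it would also match any attribute whose name merely ends in "value".

```shell
# Hypothetical helper: print the value attribute of each property tag on stdin.
get_value() {
    awk '
        /property name=/ {
            # match() sets RSTART and RLENGTH for the first value="..." on the line
            if (match($0, /value="[^"]*"/)) {
                # skip the 7 characters of value=" and drop the closing quote
                print substr($0, RSTART + 7, RLENGTH - 8)
            }
        }
    '
}

echo '<property name="agent" value="/var/tmp/root/eclipse" />' | get_value
```

It uses only match() and substr(), so it should not need a regex record separator.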

How were you testing it? If you were using echo and piping the output into awk, did you put double quotes round the whole string? That won't work, because the shell removes the inner double quotes before awk sees them. Use single quotes:

echo '<property name="agent" value="/var/tmp/root/eclipse" />' | awk '
    /property name=/ {
        gsub( ".*value=\"", "" );
        gsub( "\".*", "" );
        print;
    }
' RS="[<>]"

If you're testing some other way, then I'd be curious what your version of awk is. It works for me with GNU awk 4.0; the output from the above is

/var/tmp/root/eclipse
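To see the quoting effect on its own, here is a small sketch that only uses echo; the variable names dq and sq are just for illustration.

```shell
# Double quotes around the whole string: the shell pairs them up and removes
# them, so the quotes around the attribute values never reach awk.
dq=$(echo "<property name="agent" value="/var/tmp/root/eclipse" />")

# Single quotes around the whole string: the inner double quotes are kept.
sq=$(echo '<property name="agent" value="/var/tmp/root/eclipse" />')

printf '%s\n' "$dq" "$sq"
```

The first line printed has no quotes left in it, so a pattern that looks for value=" has nothing to match.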

Hi Agama,

Your solution works, but I just confirmed the XML tag is:
<property name='agent' value='/var/tmp/root/eclipse' />

It uses ' instead of " to quote the attribute values.
How would I modify your script now?

-Adi

Ah, very good, thanks.

Have a go with this:

awk '
    /property name=/ {
        gsub( ".*value=" Q, "" );
        gsub( Q ".*", "" );
        print;
    }
' Q="'" RS="[<>]" 

Embedding single quotes inside an awk programme that is itself wrapped in single quotes is tricky. There are several ways of dealing with it; I think this is the easiest. It assigns the single quote to Q, and then concatenates Q with the strings in the substitution commands where needed.
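One of the other ways, sketched here for comparison, is the octal escape \047, which awk turns into a single quote inside a string. The function name get_value_sq is made up for the example, and this version reads line by line (one tag per line assumed) rather than setting RS.

```shell
# Hypothetical helper: same substitutions as above, with \047 standing in
# for the single quote so nothing has to be passed in from the shell.
get_value_sq() {
    awk '
        /property name=/ {
            gsub( ".*value=\047", "" );   # drop everything up to the opening quote
            gsub( "\047.*", "" );         # drop the closing quote and the rest
            print;
        }
    '
}

echo "<property name='agent' value='/var/tmp/root/eclipse' />" | get_value_sq
```

The Q variable is arguably easier to read; the escape just saves the extra assignment on the command line.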
