You could also try this awk script. It can handle single-quoted strings, double-quoted strings, and unquoted strings terminated by a space or ">". It requires an equal-sign (with optional leading and trailing spaces) between keyword and its value. If the value is an empty string, it must be quoted; otherwise the value doesn't need to be quoted unless the value contains a space or ">". Single-quotes can be included in double-quoted strings and double-quotes can be included in single-quoted strings.
#!/bin/ksh
file="$1"
tag="$2"
shift 2
printf '%s\n' "$@" | awk -v tag="$tag" -v sq="'" -v dq='"' '
FNR == NR {
# Get keyword list.
list[++n] = $0
#printf("list[%d] set to %s\n", n, list[n])
}
$1 == "<" tag {
# Look for the requested keywords in this tag...
for(i = 1; i <= n; i++) {
if(match($0, "[: ]" list[i] " *= *") <= 0) {
# No match for this keyword.
print "***No match"
continue
}
val1 = RSTART + RLENGTH
if((c1 = substr($0, val1, 1)) == dq || c1 == sq) {
# We have a single-quoted string or double-quoted
# string value. Find the end of the string value.
val_len = index(substr($0, val1 + 1), c1) - 1
# Extract the string value.
val = substr($0, val1 + 1, val_len)
} else {# We have a space or ">" terminated value.
# Find the end of the value.
val_len = match(substr($0, val1), /[ >]/) - 1
val = substr($0, val1, val_len)
}
print val
}
}' - "$file" | (
while [ $# -gt 0 ]
do read -r value
printf 'tag %s keyword %s=%s\n' "$tag" "$1" "$value"
shift
done
)
Invoke it with the name of the XML file to be processed as the 1st operand, the tag on the line to be processed as the 2nd operand, and the keywords on that line whose values are to be printed as the remaining operands. One output line is printed for each keyword requested, in the same order as the keywords were given on the command line.
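To see the value-extraction rules in isolation, here is a minimal single-keyword sketch of the same approach. The function name get_attr is hypothetical (it is not part of the script above), and it assumes one tag line arrives on standard input. It relies only on awk's match(), which sets RSTART and RLENGTH, plus index() and substr().

```shell
# get_attr is a hypothetical helper introduced for this sketch.
# Usage: get_attr KEYWORD   (reads tag lines from standard input)
get_attr() {
    awk -v key="$1" -v sq="'" -v dq='"' '
    {
        # Look for the keyword, preceded by ":" or a space and followed
        # by "=" with optional surrounding spaces.
        if (match($0, "[: ]" key " *= *") <= 0) {
            print "***No match"
            next
        }
        start = RSTART + RLENGTH        # position of first value character
        c = substr($0, start, 1)
        if (c == dq || c == sq) {
            # Quoted value: it ends at the next occurrence of the same quote.
            len = index(substr($0, start + 1), c) - 1
            print substr($0, start + 1, len)
        } else {
            # Unquoted value: it ends at the next space or ">".
            len = match(substr($0, start), /[ >]/) - 1
            print substr($0, start, len)
        }
    }'
}

# Prints the value of keyword b, which is: x y
printf '%s\n' "<test a=1 b='x y' c=\"it's\">" | get_attr b
```

Because the keyword is pasted into a dynamic regular expression, a keyword containing regular expression metacharacters would need escaping; the sketch does not attempt that.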
As always, if you want to try this script on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk. (Note that nawk will not correctly process this script.)
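If the same script has to run on both kinds of systems, one possible way to avoid editing it by hand is to choose the awk at run time. This is only a sketch: the variable name AWK is made up for illustration, and it assumes that the presence of an executable /usr/xpg4/bin/awk is a good enough signal that it should be preferred.

```shell
# AWK is a hypothetical variable name introduced for this sketch.
# Prefer the XPG4 awk when it exists (as on Solaris/SunOS); otherwise
# fall back to whatever awk is found on PATH.
if [ -x /usr/xpg4/bin/awk ]
then AWK=/usr/xpg4/bin/awk
else AWK=awk
fi

# Quick check that the chosen awk runs; prints: b
printf '%s\n' 'a b' | "$AWK" '{ print $2 }'
```

The script would then call "$AWK" wherever it currently calls awk.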
If you have a file named file.xml containing:
<NotjvmEntries xmi:id="NotJavaVirtualMachine_1337159909831" verboseModeClass="true" verboseModeGarbageCollection="false" verboseModeJNI="true" initialHeapSize="2560" maximumHeapSize="5120" runHProf="true" hprofArguments="null" debugMode="true" debugArgs='-DDQ=" -Djava.compiler=NONE -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777' enericJvmArguments="-DSq=' -Dawt.headless=true -Xjit:disableIdiomRecognition -Dsun.net.inetaddr.ttl=120">
<jvmEntries xmi:id="JavaVirtualMachine_1337159909831" verboseModeClass="false" verboseModeGarbageCollection="true" verboseModeJNI="false" initialHeapSize="256" maximumHeapSize="512" runHProf="false" hprofArguments="" debugMode="false" debugArgs="-Djava.compiler=NONE -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777" enericJvmArguments="-Dawt.headless=true -Xjit:disableIdiomRecognition -Dsun.net.inetaddr.ttl=120">
xmi:id="JavaVirtualMachine_1337159909831" verboseModeClass="false" verboseModeGarbageCollection="true" verboseModeJNI="false" initialHeapSize="256" maximumHeapSize="512" runHProf="false" hprofArguments="" debugMode="false" debugArgs="-Djava.compiler=NONE -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777" enericJvmArguments="-Dawt.headless=true -Xjit:disableIdiomRecognition -Dsun.net.inetaddr.ttl=120"
<test xmi:id=NotJavaVirtualMachine_1337159909831 verboseModeClass=true verboseModeGarbageCollection=false verboseModeJNI=true initialHeapSize=2560 maximumHeapSize=5120 runHProf=true hprofArguments=null debugMode=true debugArgs='-DDQ=" -Djava.compiler=NONE -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777' genericJvmArguments="-DSq=' -Dawt.headless=true -Xjit:disableIdiomRecognition -Dsun.net.inetaddr.ttl=120">
and you have saved the above script as an executable script named tester, then the command:
tester file.xml test id verboseModeClass hprofArguments maximumHeapSize minimumHeapSize initialHeapSize debugArgs genericJvmArguments enericJvmArguments
produces the output:
tag test keyword id=NotJavaVirtualMachine_1337159909831
tag test keyword verboseModeClass=true
tag test keyword hprofArguments=null
tag test keyword maximumHeapSize=5120
tag test keyword minimumHeapSize=***No match
tag test keyword initialHeapSize=2560
tag test keyword debugArgs=-DDQ=" -Djava.compiler=NONE -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777
tag test keyword genericJvmArguments=-DSq=' -Dawt.headless=true -Xjit:disableIdiomRecognition -Dsun.net.inetaddr.ttl=120
tag test keyword enericJvmArguments=***No match
and the command:
tester file.xml jvmEntries id verboseModeClass hprofArguments maximumHeapSize minimumHeapSize initialHeapSize debugArgs genericJvmArguments enericJvmArguments
produces the output:
tag jvmEntries keyword id=JavaVirtualMachine_1337159909831
tag jvmEntries keyword verboseModeClass=false
tag jvmEntries keyword hprofArguments=
tag jvmEntries keyword maximumHeapSize=512
tag jvmEntries keyword minimumHeapSize=***No match
tag jvmEntries keyword initialHeapSize=256
tag jvmEntries keyword debugArgs=-Djava.compiler=NONE -Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=7777
tag jvmEntries keyword genericJvmArguments=***No match
tag jvmEntries keyword enericJvmArguments=-Dawt.headless=true -Xjit:disableIdiomRecognition -Dsun.net.inetaddr.ttl=120