Hi.
Gathering these suggestions together with the call to the older version of xmllint (posted by stomp), and adding xml2 to process the modified XML (posted by greet_sed):
#!/usr/bin/env bash
# @(#) s1 Demonstrate string extraction from XML file.
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
# The redefinition below turns debug output off; delete it to re-enable db.
db() { : ; }
C=$HOME/bin/context && [ -f "$C" ] && "$C" xml_grep xmlstarlet xmllint xml2
FILE=${1-data1}
E=expected-output.txt
pl " Input data file $FILE, $(wc -l <"$FILE") lines:"
cat "$FILE"
pl " Expected output:"
cat $E
pl " Results, xml_grep:"
xml_grep //PORTED_NUM --text_only $FILE
pl " Results, xmlstarlet:"
xmlstarlet sel -t -v //PORTED_NUM $FILE
pe
pl " Results, xmllint:"
xmllint $FILE --xpath '//PORTED_NUM/text()'
pe
pl " Results, xmllint:"
xmllint --shell $FILE <<<'cat //PORTED_NUM' |
perl -ne '/(?<=>)(.*)(?=<)/ and print($1)'
pe
pl " Results, xml2:"
xml2 < "$FILE" |
awk -F= '/PORTED_NUM/ { print $2 }'
exit 0
producing:
$ ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution : Debian 8.6 (jessie)
bash GNU bash 4.3.30
xml_grep /usr/bin/xml_grep version 0.9
xmlstarlet - ( /usr/bin/xmlstarlet, 2014-09-14 )
xmllint: using libxml version 20901
xml2 - ( /usr/bin/xml2, 2012-04-16 )
-----
Input data file data1, 1 lines:
<?xml version="1.0" encoding="UTF-8"?><PORT_RESPONSE><HEADER><ORIGINATOR>XMG</ORIGINATOR><DESTINATION>ENSEMBLE</DESTINATION><MESSAGE_ID>NXT107349698</MESSAGE_ID><MSGTYPE>PRI</MSGTYPE><TIMESTAMP>12232016061452</TIMESTAMP></HEADER><ADMIN><WICIS_REL_NO>5.0.0</WICIS_REL_NO><NNSP>9664</NNSP><OLSP>6529</OLSP><ONSP>6529</ONSP><REQ_NO>6664016358514349</REQ_NO><VER_ID_REQ>00</VER_ID_REQ><VER_ID_RESP>00</VER_ID_RESP><RT>C</RT><RESP_NO>652901635838480144</RESP_NO><CD_TSENT>122220160614</CD_TSENT><REP>Port Center</REP><TEL_NO_REP>000-207-8009</TEL_NO_REP><CHC></CHC><DD_T>122320160909</DD_T><NPQTY>00001</NPQTY></ADMIN><LINE_DATA><PORTED_NUM>990-799-1234</PORTED_NUM></LINE_DATA></PORT_RESPONSE>
-----
Expected output:
990-799-1234
-----
Results, xml_grep:
990-799-1234
-----
Results, xmlstarlet:
990-799-1234
-----
Results, xmllint:
990-799-1234
-----
Results, xmllint:
990-799-1234
-----
Results, xml2:
990-799-1234
Observations:
1) The code for the early xmllint and for xml2 needs additional work (perl, awk, grep, etc.) to isolate the string of interest.
2) A few of the commands (the ones followed by a pe) seem to omit the trailing newline -- not an error, just something to note.
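To illustrate both observations, here is a small sketch, not part of the script above. It assumes xml2 writes one path=value line per text node, and the flat line is typed in by hand rather than captured from a real run. The helper names flat_value and with_newline are made up for this example. The first matches on the last path component and keeps any "=" inside the value; the second normalizes output to exactly one trailing newline, which is what the pe calls in the script work around.

```shell
#!/usr/bin/env bash
# Sketch only: the flat line fed in below is hand-written to mimic xml2
# output (assumed form: /path/to/ELEMENT=value); it is not a captured run.

# Hypothetical helper: read xml2-style "path=value" lines on stdin and print
# the value of every line whose last path component equals $1.
flat_value() {
  awk -v name="$1" '
    {
      eq = index($0, "=")            # first "=" separates path from value
      if (eq == 0) next              # no value on this line
      path = substr($0, 1, eq - 1)
      n = split(path, parts, "/")
      if (parts[n] == name) print substr($0, eq + 1)
    }'
}

# Hypothetical helper: run a command and emit its output with exactly one
# trailing newline, whether or not the command printed one itself.
# Command substitution strips trailing newlines; printf adds one back.
with_newline() {
  printf '%s\n' "$("$@")"
}

printf '%s\n' '/PORT_RESPONSE/LINE_DATA/PORTED_NUM=990-799-1234' |
  flat_value PORTED_NUM
with_newline printf '%s' '990-799-1234'
```

In the script, something like with_newline xmlstarlet sel -t -v //PORTED_NUM "$FILE" could stand in for the command plus its following pe, if the helper were defined there.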
Details for xml2:
xml2 convert xml documents in a flat format (man)
Path : /usr/bin/xml2
Version : - ( /usr/bin/xml2, 2012-04-16 )
Type : ELF 64-bit LSB executable, x86-64, version 1 (SYSV ...)
Repo : Debian 8.6 (jessie)
Best wishes ... cheers, drl