mpang
March 21, 2007, 5:14am
1
Hey all, I need to retrieve something from a line, say
<TEST><A>123</A><B>456</B><ID>1111</ID><C>789</C><D>000</D></TEST><TEST><A>123</A><B>456</B><ID>2222</ID><C>789</C><D>000</D></TEST>
I need to match <ID>1111</ID>, so I want to retrieve
<ID>1111</ID><C>789</C><D>000</D></TEST>
Is this possible? Can anyone help? Thank you!
anbu23
March 21, 2007, 5:38am
2
sed "s;\(</TEST>\)\(<TEST>\);\1\\
\2;g" f | sed -n "s/.*\(<ID>1\{1,\}.*\)/\1/p"
mpang
March 21, 2007, 5:50am
3
I don't quite get what exactly that command does. Would you mind explaining it a little? Thanks!
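If you have Python and know the language, here's an alternative:
#!/usr/bin/env python3
import sys

choice = sys.argv[1]  # ID to look for, e.g. 1111
tag = "<ID>%s</ID>" % choice
with open("file") as f:
    for line in f:
        # Splitting on <TEST> yields one chunk per record.
        for li in line.split("<TEST>"):
            if tag in li:
                # Print from the matching <ID> tag to the end of the record.
                print(li[li.index(tag):].rstrip("\n"))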
output:
# ./test.py 1111
<ID>1111</ID><C>789</C><D>000</D></TEST>
anbu23
March 21, 2007, 6:10am
5
$ cat file
<TEST><A>123</A><B>456</B><ID>1111</ID><C>789</C><D>000</D></TEST><TEST><A>123</A><B>456</B><ID>2222</ID><C>789</C><D>000</D></TEST>
The first sed puts each <TEST> record on its own line:
$ sed "s;\(</TEST>\)\(<TEST>\);\1\\
> \2;g" file
<TEST><A>123</A><B>456</B><ID>1111</ID><C>789</C><D>000</D></TEST>
<TEST><A>123</A><B>456</B><ID>2222</ID><C>789</C><D>000</D></TEST>
The second sed keeps everything from <ID>1... to the end of the line.
1\{1,\} matches one or more 1s,
for example 1, 11, 111, 1111 and so on.
$ sed "s;\(</TEST>\)\(<TEST>\);\1\\
> \2;g" file | sed -n "s/.*\(<ID>1\{1,\}.*\)/\1/p"
<ID>1111</ID><C>789</C><D>000</D></TEST>
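For anyone who reads Python more easily than sed, the two steps above can be mirrored roughly like this. It is only a sketch: split_records and tail_from_id are names made up for illustration, and it assumes the input looks like the one-line sample from the first post.

```python
import re

SAMPLE = (
    "<TEST><A>123</A><B>456</B><ID>1111</ID><C>789</C><D>000</D></TEST>"
    "<TEST><A>123</A><B>456</B><ID>2222</ID><C>789</C><D>000</D></TEST>"
)

def split_records(text):
    # Step 1: put a newline between "</TEST>" and "<TEST>", like the first sed,
    # then return the resulting lines.
    return text.replace("</TEST><TEST>", "</TEST>\n<TEST>").split("\n")

def tail_from_id(record, pattern=r"<ID>1{1,}.*"):
    # Step 2: drop everything in front of the pattern, like
    # s/.*\(<ID>1\{1,\}.*\)/\1/p. None stands for "sed -n prints nothing".
    match = re.match(r".*(" + pattern + r")", record)
    return match.group(1) if match else None

for record in split_records(SAMPLE):
    tail = tail_from_id(record)
    if tail is not None:
        print(tail)  # prints <ID>1111</ID><C>789</C><D>000</D></TEST>
```

The second record produces no output because its ID does not start with 1, which is the same behaviour as the sed pipeline.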
mpang
March 21, 2007, 6:32am
6
But my ID is random, say 28654. What should I do then?
anbu23
March 21, 2007, 6:36am
7
id="28654"
sed "s;\(</TEST>\)\(<TEST>\);\1\\
\2;g" file | sed -n "s/.*\(<ID>${id}.*\)/\1/p"
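One thing to watch with <ID>${id}: the closing tag is not part of the pattern, so id="11" would also match <ID>1111</ID>. Below is a hedged Python sketch of the same lookup that includes the closing tag and escapes the ID. find_by_id is a hypothetical helper name, the sample line is the one from the first post, and it assumes each record sits on a single line.

```python
import re

LINE = (
    "<TEST><A>123</A><B>456</B><ID>1111</ID><C>789</C><D>000</D></TEST>"
    "<TEST><A>123</A><B>456</B><ID>2222</ID><C>789</C><D>000</D></TEST>"
)

def find_by_id(text, wanted_id):
    # Escape the ID so characters such as "." are taken literally, and
    # include "</ID>" so that "11" does not match "<ID>1111</ID>".
    # The non-greedy ".*?" stops at the first "</TEST>" after the ID,
    # so the records do not need to be split onto separate lines first.
    pattern = re.compile(r"<ID>" + re.escape(wanted_id) + r"</ID>.*?</TEST>")
    match = pattern.search(text)
    return match.group(0) if match else None

print(find_by_id(LINE, "2222"))   # <ID>2222</ID><C>789</C><D>000</D></TEST>
print(find_by_id(LINE, "28654"))  # None, this ID is not in the sample
```

If nothing comes back for an ID you expect to be there, the usual suspects are stray whitespace inside the tag or an ID that is simply not in the input.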
mpang
March 21, 2007, 6:58am
8
It doesn't work... nothing is returned.
anbu23
March 21, 2007, 7:09am
9
Can you check whether your input contains 28654?
b=1111
sed 's/\(.*<\/TEST>\)<TEST>\(.*<\/TEST>\)/\1\
<TEST>\2/' filename | sed -ne "/<ID>$b/s/^.*<\/B>\s*//p"
one more with ' \s '
anbu23
March 21, 2007, 7:41am
12
$ cat file
<TEST><A>123</A><B>456</B><ID>1111</ID><C>789</C><D>000</D></TEST><TEST><A>123</A><B>456</B><ID>2222</ID><C>789</C><D>000</D></TEST>
$ id="1111"
$ sed "s;\(</TEST>\)\(<TEST>\);\1\\
> \2;g" file | sed -n "s/.*\(<ID>${id}.*\)/\1/p"
<ID>1111</ID><C>789</C><D>000</D></TEST>
$ id="2222"
$ sed "s;\(</TEST>\)\(<TEST>\);\1\\
> \2;g" file | sed -n "s/.*\(<ID>${id}.*\)/\1/p"
<ID>2222</ID><C>789</C><D>000</D></TEST>
I tried it with your sample and it's working.
Can you show your input?
Sorry if I am wrong,
but are you trying the following code posted earlier?
sed "s;\(</TEST>\)\(<TEST>\);\1\\
> \2;g" file | sed -n "s/.*\(<ID>1\{1,\}.*\)/\1/p"
If so, it won't work for an ID passed in through a variable, because that pattern only matches IDs that begin with 1.
Could you please check that?
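To make the point concrete, here is a small Python check of what the hard-coded pattern accepts. It is an illustration only: hardcoded_matches is a made-up name, and Python's 1{1,} is used as the equivalent of sed's 1\{1,\}.

```python
import re

# Same idea as the sed pattern <ID>1\{1,\}: "<ID>" followed by one or more 1s.
HARDCODED = r"<ID>1{1,}"

def hardcoded_matches(id_value):
    # True when an <ID> tag holding id_value would be picked up by the pattern.
    return re.search(HARDCODED, "<ID>%s</ID>" % id_value) is not None

print(hardcoded_matches("1111"))   # True
print(hardcoded_matches("28654"))  # False, the ID does not start with 1
```

So the earlier command has to be replaced by the ${id} version before it can find an ID such as 28654.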
mpang
March 21, 2007, 7:55am
14
Thanks everyone, especially anbu23!
My input is a very big XML file with foreign-language characters, but I would assume that doesn't make much difference. The structure of the XML is just as I described. I will spend more time checking whether I missed a slash or a space in the command.
Things aren't working out exactly right yet, but I know which direction I should look in. I will read up on "sed". Thanks again!
mpang
March 21, 2007, 7:56am
15
Yes, I am. I simply copied and pasted it (in case I mistyped anything). Why wouldn't it work?