legolad
1
I have an XML file in which every line that contains the word CDATA follows this pattern:
(line number) <word1><![CDATA[something]]></>
I need only these lines edited, so that the end result is:
(line number) <word1><![CDATA[something]]></word1>
In other words, the name from the opening tag is copied into the empty closing tag.
Does anyone know how I can do this? I am having trouble with awk and sed.
P.S. This is on UNIX.
Do you want to fetch "something" from "somethingelse"?
Or do you just want to have </word1> as the last tag?
--ahamed
sk1418
3
kent$ echo '(line number) <word1><![CDATA[somethingelse]]></>'|sed -r '/word1/{s#(<\!\[CDATA\[).*(\]\]>)#\1something\2#;s#</>#<word1/>#}'
(line number) <word1><![CDATA[something]]><word1/>
legolad
4
Just word1 as the last tag. Sorry, my bad, it was a typo.
I have edited the post now to make amends.
legolad,
(line number) <word1><![CDATA[something]]></word1>
Just to be clear, do you want:
(line number) <word1><![CDATA[something]]><word1/>
(line number) <word2><![CDATA[something]]><word2/>
(line number) <word3><![CDATA[something]]><word3/>
OR
(line number) <word1><![CDATA[something]]><word1/>
(line number) <word1><![CDATA[something]]><word1/>
(line number) <word1><![CDATA[something]]><word1/>
Which one?
legolad
6
Sorry, it is this one:
(line number) <word1><![CDATA[something]]><word1/>
(line number) <word2><![CDATA[something]]><word2/>
(line number) <word3><![CDATA[something]]><word3/>
However, there are other lines in the file that I don't want edited; only the lines that contain CDATA should change.
Also, it should be
(line number) <word1><![CDATA[something]]></word1>
(line number) <word2><![CDATA[something]]></word2>
(line number) <word3><![CDATA[something]]></word3>
not
(line number) <word1><![CDATA[something]]><word1/>
(line number) <word2><![CDATA[something]]><word2/>
(line number) <word3><![CDATA[something]]><word3/>
One more question: is the tag name "word" constant, or will it change to some other tag name from line to line?
And is this what you want?
Input
(line number) <word1><![CDATA[something]]></>
(line number) <word1><![CDATA[something]]></>
(line number) <word1><![CDATA[something]]></>
(line number) <word1><![CDATA[something]]></>
Output
(line number) <word1><![CDATA[something]]></word1>
(line number) <word2><![CDATA[something]]></word2>
(line number) <word3><![CDATA[something]]></word3>
(line number) <word4><![CDATA[something]]></word4>
--ahamed
legolad
8
No, the tag name changes from line to line.
The file at the moment is
(line number) <word1><![CDATA[something]]></>
(line number) <word2><![CDATA[something]]></>
(line number) <word3><![CDATA[something]]></>
(line number) <word4><![CDATA[something]]></>
But it needs to end up as
(line number) <word1><![CDATA[something]]></word1>
(line number) <word2><![CDATA[something]]></word2>
(line number) <word3><![CDATA[something]]></word3>
(line number) <word4><![CDATA[something]]></word4>
Also, the "something" text inside the CDATA section changes on each line as well.
Try this
awk -F"<" '/CDATA/{t=length($2);x=substr($2,1,t-1);sub(/<\/>/,"</"x">")}{print}' input_file
--ahamed
On Solaris, please use nawk instead of awk!
--ahamed
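For comparison, the same edit can be sketched in sed with back-references. This is only a minimal sketch: the function name close_cdata_tags_sed is made up for illustration, and it assumes every CDATA line has exactly the shape shown in this thread (one opening tag, one CDATA section, then the empty </> tag). Lines without CDATA pass through unchanged.

```shell
# Hypothetical helper name; reads stdin and writes the edited text to stdout.
close_cdata_tags_sed() {
    # /CDATA/ restricts the substitution to lines that mention CDATA.
    # \1 captures the opening tag name, \2 the whole CDATA section.
    sed '/CDATA/s#<\([^<>]*\)>\(<!\[CDATA\[.*]]>\)</>#<\1>\2</\1>#'
}

# Small demonstration with one CDATA line and one line that must stay as-is.
printf '%s\n' \
    '(line number) <word1><![CDATA[something]]></>' \
    '(line number) <other/>' |
    close_cdata_tags_sed
```

The # delimiter is used so that the slashes in the tags do not need escaping.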
legolad
11
Worked a dream, thanks a lot for your time.
Great work, ahamed.
test$ cat new7
(line number) <word1><![CDATA[something]]></>
(line number) <word2><![CDATA[something]]></>
(line number) <word2><![CDATA[something]]></>
test$ cat new7 |awk -F"<" '/CDATA/{t=length($2);x=substr($2,1,t-1);sub(/<\/>/,"</"x">")}{print}'
(line number) <word1><![CDATA[something]]></word1>
(line number) <word2><![CDATA[something]]></word2>
(line number) <word2><![CDATA[something]]></word2>
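The awk command only prints the result; it does not change the file itself. A minimal sketch of how the file could be rewritten, assuming mktemp is available on the system: the function name fix_cdata_file is hypothetical, and the /CDATA/ guard is there because the original request was to leave all other lines alone.

```shell
# Hypothetical helper: fix_cdata_file FILE rewrites FILE with the closing tags filled in.
fix_cdata_file() {
    file=$1
    tmp=$(mktemp) || return 1   # assumes mktemp exists on this system
    # Only lines containing CDATA are edited; the trailing "1" prints every line.
    awk -F'<' '/CDATA/ {
        tag = substr($2, 1, length($2) - 1)   # "word1>" -> "word1"
        sub(/<\/>/, "</" tag ">")
    } 1' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy the contents back rather than mv, so the original file's permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

It would be called as fix_cdata_file new7. Keeping a backup copy of the file first is a sensible precaution.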