I have a file like the one below:
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><ns2:executeMDXResponse xmlns:ns2="http://webservices.quartetfs.com"><aggregates><axes><axis><name>ROWS</name><positions><position><members><member><depth>0</depth><dimensionName>AsOfDate</dimensionName><displayName>AllMember</displayName><levelName>ALL</levelName><path><items><item>AllMember</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>04-01-2012</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>04-01-2012</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>20-12-2011</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>20-12-2011</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>12-12-2011</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>12-12-2011</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>09-12-2011</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>09-12-2011</item></items></path></member></members></position></positions></axis></axes><cells><cell><formattedValue>3840769</formattedValue><ordinal>0</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">3840769</value></cell><cell><formattedValue>444930</formattedValue><ordinal>1</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">444930</value></cell><cell><formattedValue>1136654</formattedValue><ordinal>2</ordinal><value 
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1136654</value></cell><cell><formattedValue>1081680</formattedValue><ordinal>3</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1081680</value></cell><cell><formattedValue>1177505</formattedValue><ordinal>4</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1177505</value></cell></cells><slicerAxis><name>SlicerAxis</name><positions><position><members><member><depth>0</depth><dimensionName>Measures</dimensionName><displayName>contributors.COUNT</displayName><levelName>Measures</levelName><path><items><item>contributors.COUNT</item></items></path></member></members></position></positions></slicerAxis></aggregates></ns2:executeMDXResponse></soap:Body></soap:Envelope>
It is not properly formatted: everything is on one line. So if I try to search with grep or sed for a particular tag and the value inside it, the whole file is returned.
Can anyone tell me how I can search it?
I need to extract the dates between the displayName tags.
ctsgnb
January 5, 2012, 9:52am
2
$ awk -F"</?displayName>" '{for(i=1;++i<=NF;) if(length($i)==10) print $i}' yourfile.xml
04-01-2012
20-12-2011
12-12-2011
09-12-2011
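For anyone wondering why this works: with the tag itself as the field separator, the fields alternate between text outside the tags (odd fields) and text inside them (even fields). Here is a minimal sketch of the same idea that checks for a DD-MM-YYYY pattern instead of a length of 10. The function name dates_in_displayname is made up for illustration, and it assumes every displayName tag opens and closes on the same line.

```shell
# Hypothetical helper (name is illustrative): print only DD-MM-YYYY values
# found between <displayName> and </displayName>.
dates_in_displayname() {
    awk -F'</?displayName>' '{
        # Fields alternate: odd = text outside the tags, even = text inside.
        for (i = 2; i <= NF; i += 2)
            if ($i ~ /^[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$/)
                print $i
    }' "$@"
}

# Small made-up sample on stdin; only the second value looks like a date.
printf '%s\n' '<a><displayName>AllMember</displayName><displayName>04-01-2012</displayName></a>' |
    dates_in_displayname
```

It can also be given a file name, for example dates_in_displayname yourfile.xml.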
mirni
January 5, 2012, 10:23am
3
awk '/displayName/ && $2~/^[0-9][0-9]-/{print $2}' FS="[><]" RS='><' xmlFile
Hello there,
How could I get the value between the formattedValue tags, using the logic below,
which is based on the ordinal tag?
Take, for example, this part of the XML I posted at the top:
<formattedValue>1177505</formattedValue>
<ordinal>4</ordinal>
<value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1177505</value>
The ordinal value is 4, so it should look up the 4th displayName tag
(not counting the one for ordinal value 0),
and the output will be
09-12-2011 1177505
as each ordinal tag relates to a displayName tag sequentially.
So the final output should be
04-01-2012 444930 (ordinal 1 and its formattedValue)
20-12-2011 1136654 (ordinal 2 and its formattedValue)
12-12-2011 1081680 (ordinal 3 and its formattedValue)
09-12-2011 1177505 (ordinal 4 and its formattedValue)
mirni
January 6, 2012, 12:39am
5
Try this:
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[++cnt]=$2}
/^formattedValue>/{fv=$2; getline; print dt[$2],fv,$2 }
' FS="[><]" RS='><' xmlFile
It assumes that the ordinal tag comes right after the formattedValue tag. If that is not always the case, you could try this slightly more general approach:
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[c1++]=$2}
/^formattedValue>/{fv[c2++]=$2}
/^ordinal>/{o[c3++]=$2}
END{
for(i=0; i<c1; i++)
print dt[o[i]],fv[o[i]+1]
}' FS="[><]" RS='><' xmlFile
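If the cells might not be listed in ordinal order, one option is to file each value under its own ordinal when the cell closes. The following is only a sketch under some assumptions: the name pair_by_ordinal is made up, every cell is assumed to contain one formattedValue and one ordinal, and the N-th dated displayName is assumed to belong to ordinal N. It splits the input with tr instead of RS='><', because a multi-character RS is not guaranteed by POSIX awk (gawk and mawk do accept it).

```shell
# Hypothetical helper (name is illustrative): pair the N-th dated displayName
# with the cell whose <ordinal> is N. Ordinal 0 (the AllMember total) is skipped.
pair_by_ordinal() {
    cat "$@" | tr '>' '\n' | awk -F'<' '
        # After tr, element text looks like:  04-01-2012</displayName
        $2 == "/displayName" && $1 ~ /^[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$/ { dt[++n] = $1 }
        $2 == "/formattedValue" { fv = $1 }
        $2 == "/ordinal"        { ord = $1 }
        # When a cell closes, store its value under its own ordinal.
        $2 == "/cell"           { val[ord] = fv }
        END { for (k = 1; k <= n; k++) print dt[k], val[k] }
    '
}

# Made-up sample with the cells deliberately out of order.
printf '%s' '<r><displayName>AllMember</displayName><displayName>04-01-2012</displayName><displayName>20-12-2011</displayName><cell><formattedValue>30</formattedValue><ordinal>2</ordinal></cell><cell><ordinal>0</ordinal><formattedValue>50</formattedValue></cell><cell><formattedValue>20</formattedValue><ordinal>1</ordinal></cell></r>' |
    pair_by_ordinal
```

For the sample above this prints 04-01-2012 20 and then 20-12-2011 30, so each date is matched by ordinal rather than by position.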
This is amazing, mi. This is exactly what I was looking for.
Thanks again. Can you please point out where exactly I am going wrong when I add a little formatting to the awk below?
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[c1++]=$2}
/^formattedValue>/{fv[c2++]=$2}
/^ordinal>/{o[c3++]=$2}
END{
for(i=0; i<c1; i++){
cnt=split(dt[o[i]],a,"-")
for (j=cnt,j<=1;j--){ date=a[j] }
print date,fv[o[i]+1]
}date=""} ' FS="[><]" RS='><' file.txt
I want the output to be
20111209 1177505
mirni
January 6, 2012, 4:37am
7
Well, you do have a bunch of syntax errors in this line:
for (j=cnt,j<=1;j--){ date=a[j] }
The commas, semicolons, and logic in the for statement are mixed up.
Here:
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[c1++]=$2}
/^formattedValue>/{fv[c2++]=$2}
/^ordinal>/{o[c3++]=$2}
END{
for(i=0; i<c1; i++) {
split(dt[o[i]],a,"-");
print a[3] a[2] a[1],fv[o[i]+1]
}
}' FS="[><]" RS='><' xmlFile
You know, mi, I tried the same with the first one, which goes like this, and it works fine.
awk '/displayName/ && $2~/^[0-9][0-9]-/{dt[++cnt]=$2}
/^formattedValue>/{fv=$2; getline;
if ( dt[$2] == "" ){
print "==================================="
print "TotalValue", fv,$2
print "===================================" }
else
{ cnt=split( dt[$2], a, "-")
{ name=a[3]a[2]a[1] }
print name,fv,$2
}name=""}' FS="[><]" RS='><' output.txt
===================================
TotalValue 1721994 0
===================================
20120105 1141169 1
20120104 580825 2
But for the second one (which is a better way of representing it), I was thinking of splitting the date into an array and then joining the pieces in a for loop, taking the values from that array rather than joining them by hardcoded index. That is where I went wrong, but thanks anyway.
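For reference, the loop-based join described here can be sketched in isolation. This is only an illustration: reverse_date is a made-up name, and it assumes the date always has its parts separated by "-". The loop header uses semicolons, counts down while j >= 1, and appends to date instead of overwriting it.

```shell
# Hypothetical helper (name is illustrative): turn DD-MM-YYYY into YYYYMMDD
# by joining the split parts in reverse order with a for loop.
reverse_date() {
    awk -v d="$1" 'BEGIN {
        cnt = split(d, a, "-")
        date = ""                      # start empty before building the date
        for (j = cnt; j >= 1; j--)     # semicolons, counting down while j >= 1
            date = date a[j]           # append each part instead of overwriting
        print date
    }'
}

reverse_date 09-12-2011    # prints 20111209
```

Inside the END block of the earlier script, the same three lines (reset, loop, print) would go in the body of the outer for loop, so that date is emptied before each row.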
Have you ever used the xmllint command and tried XPath with it? It should be able to extract anything from XML.
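As a rough sketch of what that could look like, assuming xmllint from libxml2 is installed and supports the --xpath option: the function name lookup_by_ordinal is made up, and the XPath expressions assume the layout of the XML at the top of the thread (dated displayName elements under axis, values under cell).

```shell
# Hypothetical helper (name is illustrative): look up one cell by ordinal.
# Assumes xmllint (libxml2) with --xpath is available.
lookup_by_ordinal() {
    file=$1
    n=$2
    # displayName number n+1 under <axis>, because the first one is AllMember.
    date=$(xmllint --xpath "string((//axis//displayName)[$((n + 1))])" "$file")
    # formattedValue of the cell whose <ordinal> equals n.
    value=$(xmllint --xpath "string(//cell[ordinal=$n]/formattedValue)" "$file")
    printf '%s %s\n' "$date" "$value"
}

# Usage (file name is a placeholder):
#   lookup_by_ordinal yourfile.xml 4
```

With the XML posted at the top and ordinal 4, this should print 09-12-2011 1177505. It runs xmllint twice per lookup, so for many rows the awk versions are likely to be faster.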
Not as good a solution as the awk ones:
( cat XMLFILE|tr '>' '\012'|egrep "</formattedValue$"|cut -d "<" -f1 >FILE1 ;cat XMLFILE|tr '>' '\012'|egrep "</displayName$"|cut -d "<" -f1 >FILE2;paste FILE1 FILE2|grep -- - ;rm FILE1 FILE2 )
FYI: I tried this on a RHEL 4 machine.
$ cat display|tr '>' '\012'|egrep "</displayName$"|cut -d "<" -f1 >displayName
$ cat display|tr '>' '\012'|egrep "</formattedValue$"|cut -d "<" -f1 >formattedValue
$ paste displayName formattedValue|grep -- - ;rm displayName formattedValue
04-01-2012 444930
20-12-2011 1136654
12-12-2011 1081680
09-12-2011 1177505
---------- Post updated 01-08-12 at 12:23 AM ---------- Previous update was 01-07-12 at 11:25 PM ----------
@chakrapani: Can you please post the command?
I tried but was unable to get the required output.