Hi guys,
I have a big XML file in the format below.
Input:
<pokl>MKL=1,FN=1,GBNo=B10C</pokl>
<d>192</d>
<d>315</d>
<d>35</d>
<d>0,7,8</d>
<pokl>MKL=1,dFN=1,GBNo=B11C</pokl>
<d>162</d>
<d>315</d>
<d>35</d>
<d>0,5,6</d>
<pokl>MKL=1,dFN=1,GBNo=B12C</pokl>
<d>188</d>
<d>315</d>
<d>33</d>
<d>0,3,4</d>
<pokl>MKL=1,dFN=1,GBNo=B13C</pokl>
<d>192</d>
<d>315</d>
<d>35</d>
<d>0,1,2</d>
Output:
B10C 192;315;35;0,7,8
B11C 162;315;35;0,5,6
B12C 188;315;35;0,3,4
B13C 192;315;35;0,1,2
---------- Post updated at 08:57 PM ---------- Previous update was at 08:41 PM ----------
Got the answer.
Thanks
MasWag
2
With sed, like this (GNU sed syntax):
tr -d '\n' < input.xml | sed 's:</d><pokl>\|$:\n:g;' | sed 's/^.*GBNo=//;s:</pokl><d>: :;s:</d><d>:;:g;s:</d>::;'
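For anyone wondering what each stage of that pipeline does, here is the same idea split into commented steps. It is only a sketch: it assumes GNU sed (the `\|` alternation and `\n` in the replacement are GNU extensions), and the function name `sed_records` and the temporary sample file are made up for illustration.

```shell
# Hypothetical helper; assumes GNU sed and records laid out as in the question.
sed_records() {
    # 1. Join the whole file into one long line.
    tr -d '\n' < "$1" |
    # 2. Break that line at every record boundary and at the very end.
    sed 's:</d><pokl>\|$:\n:g' |
    # 3. Per record: drop everything up to "GBNo=", turn the first tag
    #    boundary into a space, the remaining ones into ";", and strip
    #    a leftover closing tag if there is one.
    sed 's/^.*GBNo=//; s:</pokl><d>: :; s:</d><d>:;:g; s:</d>::'
}

# First two records from the question, written to a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<pokl>MKL=1,FN=1,GBNo=B10C</pokl>
<d>192</d>
<d>315</d>
<d>35</d>
<d>0,7,8</d>
<pokl>MKL=1,dFN=1,GBNo=B11C</pokl>
<d>162</d>
<d>315</d>
<d>35</d>
<d>0,5,6</d>
EOF

out=$(sed_records "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

Because the first step loads the entire file as a single line, this approach may get slow or memory-hungry on a really big file; a line-by-line awk script avoids that.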
danmero
3
awk -F'[<>]' '/pokl/{split($3,a,"[,|=]");printf "%s ",a[6];for(i=1;i<5;i++){getline;printf "%s%s",$3,(i==4)?RS:";"}}' file
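The one-liner above is dense, so here is a spelled-out sketch of the same approach with comments. It assumes every `<pokl>` line is followed by exactly four `<d>` lines, as in the sample; the function name `pokl_records`, the scratch file, and the early exit on truncated input are additions for illustration, not part of the original command.

```shell
# Hypothetical helper: one output line per <pokl> record.
pokl_records() {
    awk -F'[<>]' '
    /pokl/ {
        # $3 is the tag body, e.g. "MKL=1,FN=1,GBNo=B10C".
        # Splitting on "," and "=" leaves the GBNo value in a[6].
        split($3, a, "[,=]")
        printf "%s ", a[6]
        # Read the four <d> lines that follow and join their values with ";".
        for (i = 1; i <= 4; i++) {
            if ((getline) <= 0) break    # stop quietly if the input is cut short
            printf "%s%s", $3, (i == 4) ? "\n" : ";"
        }
    }' "$1"
}

# First two records from the question, written to a scratch file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<pokl>MKL=1,FN=1,GBNo=B10C</pokl>
<d>192</d>
<d>315</d>
<d>35</d>
<d>0,7,8</d>
<pokl>MKL=1,dFN=1,GBNo=B11C</pokl>
<d>162</d>
<d>315</d>
<d>35</d>
<d>0,5,6</d>
EOF

out=$(pokl_records "$sample")
printf '%s\n' "$out"
# prints:
# B10C 192;315;35;0,7,8
# B11C 162;315;35;0,5,6
rm -f "$sample"
```

Keying on the `pokl` pattern means stray lines between records are skipped, at the cost of relying on `getline` inside the action.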
Don Cragun
4
Hi pareshkp,
You might also want to try a slightly simpler awk script:
awk -F'[<=>]' '{printf("%s%s",$(NF-2),(NR%5)?(NR%5==1)?" ":";":ORS)}' Input
which with your sample input produces the output:
B10C 192;315;35;0,7,8
B11C 162;315;35;0,5,6
B12C 188;315;33;0,3,4
B13C 192;315;35;0,1,2
which differs from the output you said you wanted in two places:
- there is no space character following the 8 at the end of the first line, and
- there is a 33 in the 3rd line where you said you wanted a 35.
The output shown here seems to match the sample input provided better than the output you said you wanted.
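To make the `NR%5` logic easier to follow, here is a longer sketch of the same script with the nested conditional unrolled. It assumes the file consists only of five-line records (one `<pokl>` line followed by four `<d>` lines) with no blank or extra lines; the function name `five_line_records` and the scratch file are made up for illustration.

```shell
# Hypothetical helper: relies purely on line position within each 5-line record.
five_line_records() {
    awk -F'[<=>]' '
    {
        pos = NR % 5                 # 1 = <pokl> line, 2..4 = <d> lines, 0 = last <d> line
        if (pos == 1)      sep = " " # space after the GBNo value
        else if (pos == 0) sep = ORS # end of record: finish the output line
        else               sep = ";" # between <d> values
        # With "<", "=" and ">" as separators, the wanted value is the
        # third field from the end on both kinds of line.
        printf "%s%s", $(NF - 2), sep
    }' "$1"
}

# Third record from the question, the one containing the 33.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<pokl>MKL=1,dFN=1,GBNo=B12C</pokl>
<d>188</d>
<d>315</d>
<d>33</d>
<d>0,3,4</d>
EOF

out=$(five_line_records "$sample")
printf '%s\n' "$out"
# prints: B12C 188;315;33;0,3,4
rm -f "$sample"
```

Since nothing here checks tag names, a single missing or extra line would shift every later record, so this form fits best when the layout is known to be strictly regular.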
danmero
5
Hi Don Cragun, your solution is simpler and runs 30% faster than mine. Thanks.