happyv
June 26, 2007, 12:05am
1
Hello,
I have the following XML-formatted file. For each record, I would like to take the value of the newnumber field and put it into customernumber.
For example:
<XMLFORMAT>
<customernumberR11>9</customernumberR11>
<newnumberR11>30</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>20</customernumberR11>
<newnumberR11>18</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>11</customernumberR11>
<newnumberR11>32</newnumberR11>
<customerdetailR11>
</XMLFORMT>
expected output:
<XMLFORMAT>
<customernumberR11>30</customernumberR11>
<newnumberR11>30</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>18</customernumberR11>
<newnumberR11>18</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>32</customernumberR11>
<newnumberR11>32</newnumberR11>
<customerdetailR11>
</XMLFORMT>
Many thanks. Please advise.
What have you done to attempt to solve this problem yourself?
Post your sample script, and we'll see how we can assist.
Cheers,
ZB
happyv
June 26, 2007, 1:56am
3
I have no idea how to write a script to fix this.
Normally I have to do it manually.
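Try something like this:
#!/usr/bin/perl
use strict;
use warnings;
# Slurp the whole file so the regex can match across the line break.
open( my $fh, '<', "input.txt" ) or die "Couldn't open file: $!\n";
my $data = do { local $/; <$fh> };
close( $fh );
# Copy the newnumber value ($3) into the customernumber element on the preceding line.
$data =~ s/(<customernumberR11>)[0-9]*(<\/customernumberR11>\n<newnumberR11>)([0-9]*)(<\/newnumberR11>)/$1$3$2$3$4/g;
print $data;
exit( 0 );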
Cheers
ZB
anbu23
June 26, 2007, 2:14am
5
sed "/<customernumberR11>/{N;s/\(>[0-9]*<\)\(.*\)\(>[0-9]*<\)/\3\2\3/;}" filename
You have 186 posts; by now you should know how to start writing a script, or at least attempt something.
awk '/<customernumberR11>/{ line=$0 }
/<newnumberR11>/{
current=$0
gsub("<newnumberR11>|</newnumberR11>","",$0)
gsub(/>(.*)</ , ">"$0"<",line)
print line
print current
next
}
!/<customernumberR11>/ && !/<newnumberR11>/ {print}
' "file"
happyv
June 26, 2007, 3:00am
7
Thanks, but it doesn't seem to work:
awk: syntax error near line 4
awk: illegal statement near line 4
awk: syntax error near line 5
awk: illegal statement near line 5
awk: syntax error near line 10
awk: bailing out near line 10
aigles
June 26, 2007, 3:05am
8
Try nawk instead of awk.
Another way, again with awk (or nawk):
awk -F '[<>]' '
$2=="customernumberR11" { next }
$2=="newnumberR11" { print "<customernumberR11>" $3 "</customernumberR11>"}
1
' file
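The first errors reported above fall on the two gsub lines, which suggests the default awk on that machine is an old implementation without gsub(). A minimal sketch of choosing a capable awk at run time follows; pick_awk is a hypothetical helper name, and the candidate list (nawk, gawk, awk) is an assumption about what might be installed.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the first awk on PATH that supports gsub().
pick_awk() {
    for candidate in nawk gawk awk
    do
        # Probe: an awk with gsub() turns "a" into "b"; anything else fails the test.
        if [ "$(echo a | "$candidate" '{ gsub(/a/, "b"); print }' 2>/dev/null)" = "b" ]
        then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

Usage would look like AWK=$(pick_awk), followed by "$AWK" -F '[<>]' '...' file in place of a hard-coded awk.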
happyv
June 26, 2007, 4:08am
9
Yes, I tried both nawk and awk, but the script doesn't seem to work. It moves the customernumber after the newnumber and duplicates the newnumber:
<XMLFORMAT>
<newnumberR11>30</newnumberR11>
<customernumberR11>30</customernumberR11>
<newnumberR11>30</newnumberR11>
<customerdetailR11>
aigles
June 26, 2007, 5:05am
10
The two awk solutions (ghostdog74's and mine) work fine on my AIX box.
Input file:
$ cat xml.txt
<XMLFORMAT>
<customernumberR11>9</customernumberR11>
<newnumberR11>30</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>20</customernumberR11>
<newnumberR11>18</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>11</customernumberR11>
<newnumberR11>32</newnumberR11>
<customerdetailR11>
</XMLFORMT>
$
ghostdog74's solution:
$ cat xml2.sh
awk '/<customernumberR11>/{ line=$0 }
/<newnumberR11>/{
current=$0
gsub("<newnumberR11>|</newnumberR11>","",$0)
gsub(/>(.*)</ , ">"$0"<",line)
print line
print current
next
}
!/<customernumberR11>/ && !/<newnumberR11>/ {print}
' xml.txt
$ sh xml2.sh
<XMLFORMAT>
<customernumberR11>30</customernumberR11>
<newnumberR11>30</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>18</customernumberR11>
<newnumberR11>18</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>32</customernumberR11>
<newnumberR11>32</newnumberR11>
<customerdetailR11>
</XMLFORMT>
$
My solution:
$ cat xml.sh
awk -F '[<>]' '
$2=="customernumberR11" { next }
$2=="newnumberR11" { print "<customernumberR11>" $3 "</customernumberR11>"}
1
' xml.txt
$ sh xml.sh
<XMLFORMAT>
<customernumberR11>30</customernumberR11>
<newnumberR11>30</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>18</customernumberR11>
<newnumberR11>18</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
<customernumberR11>32</customernumberR11>
<newnumberR11>32</newnumberR11>
<customerdetailR11>
</XMLFORMT>
$
happyv
June 26, 2007, 6:07am
11
Oh, very sorry.
I forgot to provide one more IMPORTANT piece of information: the customernumber and newnumber fields are not on the same line in every record. Sometimes customernumber is on row 1, sometimes on row 2, and other lines can sit between the two fields. That is what makes this difficult for me.
<XMLFORMAT>
....
...
<customernumberR11>9</customernumberR11>
<newnumberR11>30</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
....
<customernumberR11>20</customernumberR11>
...
<newnumberR11>18</newnumberR11>
<customerdetailR11>
</XMLFORMT>
<XMLFORMAT>
....
...
...
<customernumberR11>11</customernumberR11>
....
..
<newnumberR11>32</newnumberR11>
<customerdetailR11>
</XMLFORMT>
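# Buffer each record; at its closing tag, copy the newnumber value into customernumber.
rec=
while IFS= read -r line
do
case $line in
*"<newnumberR11>"*) new=${line#*>}; new=${new%%<*} ;;
esac
rec="$rec$line
"
case $line in
*"</XMLFORM"*) printf '%s' "$rec" | sed "s|\(<customernumberR11>\)[^<]*|\1$new|"; rec= ;;
esac
done < file1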
aigles
June 26, 2007, 7:56am
13
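Try this:
awk -F '[<>]' '
# Hold back both lines until each has been seen; remember the newnumber value.
$2=="customernumberR11" { customernumber = 1; next}
$2=="newnumberR11" { number = $3 ; newnumber = 1; next}
# On the next line after both, emit the pair together, so customernumber lands next to newnumber.
customernumber && newnumber {
# %s rather than %d keeps the value exactly as written (for example, leading zeros).
printf("<customernumberR11>%s</customernumberR11>\n<newnumberR11>%s</newnumberR11>\n", number, number);
customernumber = newnumber = 0;
}
1
' xml.txt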
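The script above prints the two lines together once both have been seen, so customernumber ends up next to newnumber instead of where it originally sat. If the original line positions matter, one alternative is to buffer each record and rewrite customernumber in place when the closing tag arrives. This is only a sketch: fix_customernumber is a hypothetical function name, records are assumed to end with a line containing </XMLFORM (which covers the </XMLFORMT> spelling in the sample), and the values are assumed to be plain digits.

```shell
#!/bin/sh
# Hypothetical helper: fix_customernumber FILE
# Prints FILE with each customernumberR11 value replaced by the
# newnumberR11 value from the same record; every line stays in place.
fix_customernumber() {
    awk '
    # Buffer every line of the current record.
    { rec[++n] = $0 }

    # Remember the text between the newnumberR11 tags.
    /<newnumberR11>/ {
        val = $0
        sub(/^.*<newnumberR11>/, "", val)
        sub(/<\/newnumberR11>.*$/, "", val)
        have = 1
    }

    # Closing tag: rewrite customernumber in the buffer, print, reset.
    /<\/XMLFORM/ {
        for (i = 1; i <= n; i++) {
            if (have && rec[i] ~ /<customernumberR11>/)
                sub(/>[^<]*</, ">" val "<", rec[i])
            print rec[i]
        }
        n = 0
        have = 0
    }

    # Print any trailing lines that never reached a closing tag, unchanged.
    END { for (i = 1; i <= n; i++) print rec[i] }
    ' "$1"
}
```

It could be called as fix_customernumber xml.txt > xml.fixed. Because the whole record is held in memory until its closing tag, it makes no difference whether newnumber appears before or after customernumber.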