I have an XML file that contains a tag like this:
<wd:address_line_1>1234 Street</wd:address_line_1>
I want to replace the value "1234 Street" with "Test Data". Different people have different address lines, and I want to replace them all with a fixed value to mask the file. I was trying to use sed with a regex, but I am mostly having trouble with characters like < and </. I need some help with this.
Is the address line always as you describe, so that the two "keys" are:
one at the start of the line
the next always at the end of the line, followed by a \n
-- and are the address lines always on a single line?
If so,
awk ' /^wd:address_line_/ {sub("\>.*\<", ">Test data<") }
{print} ' somefile.xml
awk: cmd. line:1: warning: escape sequence `\>' treated as plain `>'
awk: cmd. line:1: warning: escape sequence `\<' treated as plain `<'
In this case it is just a warning, but the backslashes should not be there*, so remove them. The actual issue in this case is with
/^wd:address_line_/
which should be changed to
/^<wd:address_line_/
.
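Putting both fixes together (no backslashes, and the `<` included in the pattern), a minimal sketch could look like this. It assumes each address element starts at the very beginning of a line and fits on one line; the sample data and the `wd:city` tag are made up for illustration.

```shell
# Hypothetical sample input, made up for this example.
sample='<wd:address_line_1>1234 Street</wd:address_line_1>
<wd:city>Springfield</wd:city>'

# On lines that start with the address tag, replace everything from the
# first ">" to the last "<" with the masked value; print every line.
masked=$(printf '%s\n' "$sample" |
  awk '/^<wd:address_line_/ { sub(/>.*</, ">Test data<") } { print }')

printf '%s\n' "$masked"
```

If the real file indents its elements, the `^` anchor would need to allow leading whitespace.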
--
An alternative approach would be:
awk '/^wd:address_line_/{$2="Test data"}1' RS=\< ORS=\< FS=\> OFS=\> file.xml
--
*Note: In GNU awk, \< and \> have a special meaning (left and right word boundary).
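To see why the RS/FS variant works, it may help to look at how awk splits the input when `<` is the record separator and `>` is the field separator. The sketch below only prints the fields and replaces nothing; the sample line is taken from the question, and the `tag=`/`text=` labels are just for display.

```shell
sample='<wd:address_line_1>1234 Street</wd:address_line_1>'

# Each record is the text between two "<" characters, so $1 is the tag
# name and $2 is whatever follows the ">" up to the next "<".
fields=$(printf '%s\n' "$sample" |
  awk 'NF { printf "tag=[%s] text=[%s]\n", $1, $2 }' RS=\< FS=\>)

printf '%s\n' "$fields"
```

The opening tag arrives as a record whose first field is `wd:address_line_1` and whose second field is the address text, which is why assigning to `$2` replaces the value.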
awk '/^wd:address_line_/{$2="Test data"}1' RS=\< ORS=\< FS=\> OFS=\> file.xml
I am doing the same thing for national_id, but both national_id and national_id_type are getting replaced. I need only one of them to be replaced.
Use code tags for code please.
```text
stuff
```
An attempt with sed
sed '
s|\(<wd:address_line_1\)>.*<|\1>Test data<|
s|\(<wd:national_id\)>.*<|\1>Test data2<|
'
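The sed script above reads standard input, so it needs a file or a pipe. Here is a small sketch of how it might be run, using made-up sample data. Because the `>` comes directly after the captured tag name, `national_id_type` should be left alone.

```shell
# Hypothetical sample input, made up for this example.
sample='<wd:address_line_1>1234 Street</wd:address_line_1>
<wd:national_id>123-45-6789</wd:national_id>
<wd:national_id_type>SSN</wd:national_id_type>'

# Same two substitutions as above, fed from a pipe.
masked=$(printf '%s\n' "$sample" | sed '
s|\(<wd:address_line_1\)>.*<|\1>Test data<|
s|\(<wd:national_id\)>.*<|\1>Test data2<|
')

printf '%s\n' "$masked"
```

With a real file, the same script would be run as `sed '...' file.xml > masked.xml`, where `masked.xml` is just a placeholder name.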
Try:
awk '$1=="wd:address_line_1"{$2="Test data"}1' RS=\< ORS=\< FS=\> OFS=\> file.xml
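The same exact-match idea applies to national_id. The sketch below compares the prefix regex with the `$1==` test on made-up sample data; the variable names `prefix` and `exact` are only for illustration.

```shell
# Hypothetical sample input, made up for this example.
sample='<wd:national_id>123-45-6789</wd:national_id>
<wd:national_id_type>SSN</wd:national_id_type>'

# Prefix match: /^wd:national_id/ also matches wd:national_id_type.
prefix=$(printf '%s\n' "$sample" |
  awk '/^wd:national_id/{$2="Test data2"}1' RS=\< ORS=\< FS=\> OFS=\>)

# Exact match: only the record whose first field is exactly wd:national_id.
exact=$(printf '%s\n' "$sample" |
  awk '$1=="wd:national_id"{$2="Test data2"}1' RS=\< ORS=\< FS=\> OFS=\>)

printf 'prefix match:\n%s\n\nexact match:\n%s\n' "$prefix" "$exact"
```

One thing to check with this record-splitting approach: as far as I can tell, ORS is also written after the last record, so the output may end with a stray `<` that needs trimming.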