My data is XML-ish (here is an excerpt) :-
<bag name="mybag1" version="1.0"/>
<contents id="coins"/>
<bag name="mybag2" version="1.1"/>
<contents id="clothes"/>
<contents id="shoes"/>
<bag name="mybag3" version="1.6"/>
I want to delete the line containing mybag2 and the contents lines that follow it (the number of contents lines can vary). That is, I wish to delete from the pattern mybag2 up to (but not including) the next "bag name" tag, resulting in :-
<bag name="mybag1" version="1.0"/>
<contents id="coins"/>
<bag name="mybag3" version="1.6"/>
I have tried this a few different ways with sed and awk and have yet to find a solution. Any help would be appreciated.
This is my solution:
file : awk_test
BEGIN { flag = 0}
/^<bag name="mybag2"/ { flag = 1}
/^<bag name="mybag3"/ { flag = 0}
{ if (flag == 0) { print; } }
in command line:
$ awk -f awk_test your_data_file > result
Let's try it
kholostoi:
This is my solution:
---------file : awk_test --------------
BEGIN { flag = 0}
/^<bag name="mybag2"/ { flag = 1}
/^<bag name="mybag3"/ { flag = 0}
{ if (flag == 0) { print; } }
---------------------------------------
in command line: $ awk -f awk_test your_data_file > result
Let's try it
This is what I had in mind too; the problem is that it may not handle nested cases too well.
I tried that solution :-
me@myserver $ nawk ' BEGIN { flag = 0} /^<bag name="mybag2"/ { flag = 1} /^<bag name="mybag3"/ { flag = 0} { if (flag == 0) { print; }}' test2
<bag name="mybag1" version="1.0"/>
<contents id="coins"/>
<bag name="mybag3" version="1.6"/>
It works, but the issue is that the second bag name could be any value; it's not specifically "mybag3", so the second pattern must be the more generic /^<bag name=/
So I tried that also :-
me@myserver $ nawk ' BEGIN { flag = 0} /^<bag name="mybag2"/ { flag = 1} /^<bag name=/ { flag = 0} { if (flag == 0) { print; }}' test2
<bag name="mybag1" version="1.0"/>
<contents id="coins"/>
<bag name="mybag2" version="1.1"/>
<contents id="clothes"/>
<contents id="shoes"/>
<bag name="mybag3" version="1.6"/>
It failed because the flag was being reset straight after it was set: awk runs every matching rule in order, and the more generic pattern also matched the mybag2 line.
Then I switched the pattern order and BINGO !
me@myserver $ nawk ' BEGIN { flag = 0} /^<bag name=/ { flag = 0} /^<bag name="mybag2"/ { flag = 1} { if (flag == 0) { print; }}' test2
<bag name="mybag1" version="1.0"/>
<contents id="coins"/>
<bag name="mybag3" version="1.6"/>
Thanks very much for leading me in the right direction kholostoi
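For anyone reading later, here is a commented sketch of why the rule order matters. It uses the sample data from the first post; the temporary file and the variable names (data, result, skip) are just placeholders I picked for illustration, and plain awk is assumed to be a POSIX awk (nawk on Solaris).

```shell
# Sample data from the question, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
<bag name="mybag1" version="1.0"/>
<contents id="coins"/>
<bag name="mybag2" version="1.1"/>
<contents id="clothes"/>
<contents id="shoes"/>
<bag name="mybag3" version="1.6"/>
EOF

# awk runs every matching rule, top to bottom, for each input line.
# The generic rule must come first so the specific rule can override it
# on the mybag2 line itself.
result=$(awk '
    /^<bag name=/         { skip = 0 }   # any bag line ends a skipped block
    /^<bag name="mybag2"/ { skip = 1 }   # ...unless it is the bag to delete
    !skip                 { print }      # print only while not skipping
' "$data")

printf '%s\n' "$result"
rm -f "$data"
```

With the two pattern rules swapped, the generic rule would clear the flag again on the mybag2 line, which is the failure shown earlier.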
Or:
awk '/^<bag name=/{f=0}$0~v{f=1}!f' v="mybag2" file
Use nawk or /usr/xpg4/bin/awk on Solaris.
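One caveat with $0~v: it matches the value anywhere on any line, so v="mybag2" would presumably also trigger on a bag called mybag20, or on a contents line that happens to mention mybag2. A possible tighter variant, as a sketch only: the mybag20 bag and its hats line are made-up test data, and name and skip are names I chose. It compares against the full opening of the tag with index() instead of a regex.

```shell
# Sample data from the question plus one hypothetical extra bag (mybag20)
# to show that only the exact name is removed.
data=$(mktemp)
cat > "$data" <<'EOF'
<bag name="mybag1" version="1.0"/>
<contents id="coins"/>
<bag name="mybag2" version="1.1"/>
<contents id="clothes"/>
<contents id="shoes"/>
<bag name="mybag3" version="1.6"/>
<bag name="mybag20" version="2.0"/>
<contents id="hats"/>
EOF

# index() returns 1 only when the line starts with the literal string
# <bag name="NAME" (closing quote included), so mybag20 does not match.
result=$(awk -v name="mybag2" '
    /^<bag name=/                            { skip = 0 }
    index($0, "<bag name=\"" name "\"") == 1 { skip = 1 }
    !skip
' "$data")

printf '%s\n' "$result"
rm -f "$data"
```

Because index() does a plain string comparison, characters in the name that are special in regular expressions (such as a dot) are taken literally.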
much more elegant, thanks radoulov
I will use that