Non printing option in sed is behaving oddly

Paul_Walker · July 7, 2019, 12:59pm

Hi, I'm having a problem with a sed command that I thought I was using correctly, but apparently that's not the case.

I was hoping someone here could point out what I am doing wrong.
I am using the -n (no automatic printing) option together with the p command to print only lines that match a pattern. Everything seemed to be working fine, except I noticed that some lines matching my pattern are missing from my output.

After doing a little digging I found that when a line matches the pattern and the very next line also matches, the second matching line is not output.
Can anyone see why it's behaving this way?
Below is the command I am using:

sed -n -e '/\<PreviousJobNum\>[A-Z]*[0-9][0-9]*[A-Z]*\<\/PreviousJobNum\>/{p;n;}' /MyInputFile > /MyOutPutFile

The file I am reading in is MyInputFile, the content of which is

<PreviousJobNum>93296</PreviousJobNum>
<PreviousJobNum>95879D</PreviousJobNum>

When I run the above command my output in MyOutPutFile is

<PreviousJobNum>93296</PreviousJobNum>

If I change the data slightly in the original MyInputFile so that there are blank lines between the two matching lines, like below,

<PreviousJobNum>93296</PreviousJobNum>


<PreviousJobNum>95879D</PreviousJobNum>

then my output picks up both matching lines in MyOutPutFile, as below.

<PreviousJobNum>93296</PreviousJobNum>
<PreviousJobNum>95879D</PreviousJobNum>

I suppose I could double-space my input file to get around this, but I think it would be better to understand why it's behaving this way.
If anyone is able to offer some assistance with this I would be very grateful.
Thank you very much
Paul

MadeInGermany · July 7, 2019, 1:59pm

The n command fetches the next input line into the pattern space. No sed code follows that does anything with it, so the cycle ends and the next cycle, as usual, fetches the next input line, which in this case is the line after the one n just read. The line read by n is therefore never tested against your pattern.

Note that the n command behaves differently from the next instruction in awk and perl, which jumps to the end of the input loop. The sed counterpart of that is the b command without a label.

If there is no further code then you can simply omit the n command.
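
The skipped line can be reproduced with a small shell sketch. It assumes a POSIX shell, and the variable names are only illustrative. It also uses unescaped < and > in the pattern, because GNU sed treats \< and \> as word-boundary anchors rather than literal angle brackets.

```shell
# Two adjacent matching lines, as in the original input file.
input='<PreviousJobNum>93296</PreviousJobNum>
<PreviousJobNum>95879D</PreviousJobNum>'

# Illustrative variable; the / in the closing tag is escaped because
# / is also the address delimiter.
pattern='<PreviousJobNum>[A-Z]*[0-9][0-9]*[A-Z]*<\/PreviousJobNum>'

# With n: after p prints the match, n loads the following line and the
# cycle ends, so that line is never tested against the pattern.
with_n=$(printf '%s\n' "$input" | sed -n -e "/$pattern/{p;n;}")

# Without n: every line starts its own cycle and is tested.
without_n=$(printf '%s\n' "$input" | sed -n -e "/$pattern/p")

printf 'with n:\n%s\n\nwithout n:\n%s\n' "$with_n" "$without_n"
```

With n only the first line is printed; without it both lines are printed.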

RudiC · July 7, 2019, 4:17pm

Note that you can simplify your sed script using a "back reference" on top of other sed idiosyncrasies (cf. man sed):

sed -rn  '\#<(PreviousJobNum>)[A-Z]*[0-9]+[A-Z]*</\1#p' file
<PreviousJobNum>93296</PreviousJobNum>
<PreviousJobNum>95879D</PreviousJobNum>

Paul_Walker · July 7, 2019, 8:15pm

Thank you for clearing that up

Scrutinizer · July 7, 2019, 9:35pm

To add, you can use the d command to go to the next cycle without further processing.
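
A minimal sketch of that idea, assuming a POSIX shell. The JobName line and the "other: " prefix are made up for illustration and are not from the original data. Matching lines are printed and then d starts the next cycle, so the second expression only ever sees non-matching lines.

```shell
# Sample input: two adjacent matches plus one hypothetical non-matching line.
input='<PreviousJobNum>93296</PreviousJobNum>
<PreviousJobNum>95879D</PreviousJobNum>
<JobName>example</JobName>'

# p prints the match; d deletes the pattern space and starts the next cycle
# immediately, without consuming an extra input line the way n does.
# The second -e runs only for lines that did not match.
result=$(printf '%s\n' "$input" |
  sed -n -e '/<PreviousJobNum>[A-Z]*[0-9][0-9]*[A-Z]*<\/PreviousJobNum>/{p;d;}' \
         -e 's/^/other: /p')

printf '%s\n' "$result"
```

Both PreviousJobNum lines come out unchanged, and only the third line gets the prefix.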

Paul_Walker · July 8, 2019, 10:23am

Thank you, I will try using that as well!