Binary pattern matching in UNIX

I think what I'm trying to do is pretty straightforward but I just can't find a way to do it.

I'm trying to run a double pattern match in a three column file. If the first two columns match, I need to output the third.

So in the file

AAA BBB 1
BBB CCC 5
CCC DDD 7
DDD EEE 12

If the pattern match 'BBB' && 'CCC' is run, the output is '5'. Only one line in the file could ever match the precise pattern.

So far straightforward. But if no single line matches the pattern, I need it to output 'NaN' or something to indicate that there is no pattern in the file.

So the output is always a single line, either '5' or 'NaN'.

Conceptually this seems straightforward but I can't get it to work or find an example where you can generate a simple one-line output.

Thank you for any help

In fact it is quite simple to implement. As I sense you want to do this in sed, here is how you do it:

1) If a match is found (in any line, including the first), output the third field (or whatever) and immediately exit sed (you can use the "q" command to do so).

2) For the last line, add a final rule to the script that writes "NaN" before the script ends. If the script gets there, no line matched, not even the last one.

Here is the implementation (supposing your desired match is "BBB" and "CCC"):

sed '/^BBB CCC/ {
         s/^BBB CCC//p
         q
      }
      $$ s/.*/NaN/p' /path/to/input

I hope this helps.

bakunin

Hi bakunin

Thank you so much for your quick reply.

The only thing I don't really understand is how 'q' links the statements. At the moment, when I try to implement it, I get the error:

bad flag in substitute command: 'q'

Best wishes

Matthew

Please show us the exact code you used for this and let us know what operating system and shell you're using.
There was no q flag in the substitute command in the script bakunin suggested. There was, however, a q command on the line after the substitute command.
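To illustrate the point: some sed implementations report this kind of error when "q" ends up on the same line as the substitute command without a separator, so that it is read as a flag of "s". The following is only a sketch of that situation, not necessarily what happened here; the file name sample.txt and its contents are assumptions made for the demonstration.

```shell
# Hypothetical sample data; the file name is only for illustration.
printf '%s\n' 'AAA BBB 1' 'BBB CCC 5' 'CCC DDD 7' 'DDD EEE 12' > sample.txt

# Likely to fail: "q" directly after the "p" flag is parsed as another flag of "s".
#   sed -n '/^BBB CCC/ { s/^BBB CCC //p q }' sample.txt

# Works: on a single line, each command is terminated by a semicolon,
# so "q" is a command of its own that runs after the substitution.
sed -n '/^BBB CCC/ { s/^BBB CCC //p; q; }' sample.txt   # prints 5
```

Writing each command on its own line, as in the original suggestion, has the same effect as the semicolons.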

I'm afraid two small modifications are needed to make bakunin's code run:

  • use the -n option to suppress the automatic printing of every line
  • use a single $ to address the last line
sed -n '/^BBB CCC/ {
         s/^BBB CCC//p
         q
      }
      $ s/.*/NaN/p' /path/to/input

And a little trick might save some typing: reuse the last regex in the first substitute command by leaving the pattern empty, as in s///p.
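
As a rough sketch of that shortcut, the script can be wrapped in a small shell function. The function name lookup, its argument order, and the file sample.txt are hypothetical and only serve the demonstration; the keys are assumed to contain no regex metacharacters or slashes. The address includes the trailing space so that the separator is removed along with the two keys.

```shell
# lookup KEY1 KEY2 FILE  (hypothetical helper, not part of any standard tool)
# Prints the third column of the line whose first two columns are KEY1 and KEY2,
# or NaN if no line matches.
lookup() {
    sed -n "/^$1 $2 /{
        s///p
        q
    }
    \$ s/.*/NaN/p" "$3"
}

# Hypothetical sample data for a quick check.
printf '%s\n' 'AAA BBB 1' 'BBB CCC 5' 'CCC DDD 7' 'DDD EEE 12' > sample.txt

lookup BBB CCC sample.txt   # prints 5
lookup BBB DDD sample.txt   # prints NaN
```

The empty pattern in s///p stands for the most recently used regex, here the address of the block, so the keys only have to be written once.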