Understanding pattern matching used in a grep command

I have the following code. I want to remove the --sort=num/num/... and am
using grep to exclude it as shown below:

I am having trouble understanding the purpose of the [-]- at the front of the pattern.

echo "--sort=4/5/6"  | grep -ivE '[-]-((sort|group)=[0-9]+/[0-9]+(/[0-9]+)*)$'

Now suppose I want to remove --quiet

I can do

echo "--sort=4/5/6"  | grep -ivE '[-]-(quiet)'

or

 echo "--sort=4/5/6"  | grep -ivE '(--quiet)'
 

They both seem to work fine, but I cannot figure out the difference. I think using

  echo "--sort=4/5/6"  | grep -ivE '(--quiet)'
  

will suffice.

I am also thinking of using

echo "--sort=4/5/6"  | grep -ivE '(--(sort|group)=[0-9]+/[0-9]+(/[0-9]+)*)$'

instead of using

echo "--sort=4/5/6"  | grep -ivE '[-]-((sort|group)=[0-9]+/[0-9]+(/[0-9]+)*)$'

The reason [-]- is used is to prevent grep from interpreting your pattern as a command line flag. You could also use this:

echo "--sort=4/5/6"  | grep -ivE -- '--quiet'

The -- outside of your pattern is a flag to cause grep to stop looking for command line flags.
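As a rough sketch of the choices, the three invocations below should all treat --quiet as a pattern rather than as a flag. The -e form is not mentioned above; it is the standard grep option for supplying a pattern explicitly. The two-line sample input and the variable names are made up for illustration.

```shell
# Hypothetical sample input: one line to drop, one line to keep.
input='--quiet
--sort=4/5/6'

# 1. Bracket expression: the pattern no longer starts with a dash.
out_bracket=$(printf '%s\n' "$input" | grep -ivE '[-]-quiet')

# 2. End-of-options marker: everything after -- is treated as an operand.
out_marker=$(printf '%s\n' "$input" | grep -ivE -- '--quiet')

# 3. -e takes the next argument as the pattern, even if it starts with a dash.
out_e=$(printf '%s\n' "$input" | grep -ivE -e '--quiet')

# Each variable should hold only the line that does not contain --quiet.
printf '%s\n' "$out_bracket" "$out_marker" "$out_e"
```

All three filter the input the same way; they differ only in how the pattern is kept away from grep's option parser.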

Which is the most popular?
[-]- or --

I tried to look up the [-]- or -- declaration for grep and cannot find a reference for it.

Placing -- ahead of the pattern is the de facto standard way to indicate the end of command line flags. Most commands that accept dashed command line flags recognise a double dash given as a separate argument to mean the end of the options, and treat the next token as the first positional parameter.

Using the [-]- obviously works, but makes the intent of the pattern less obvious.

I was going to quote the grep man page, but it makes no mention of this.

The thing is that if I bracket --help, grep does not run the --help command line option.

echo "--help" | grep -iE '(--help)'

But doing

echo "--help" | grep -iE '--help'

does run the help option of grep.

Using the brackets did not require use of [-]- or -- in front of the pattern.

Correct me if I am wrong on the following:

In the regular expression [-]-help, the [-] is a bracket expression containing a single character ( - in this case), and it is followed by the literal string "-help".

echo "--help" | grep -iE '--help'

grep sees '--help' as an option.

If I were putting this into a script I'd code it this way:

echo "--help" | grep -iE --    '--help'

The extra spaces aren't needed; they just set the pattern off a bit. The first pair of dashes ends grep's search for flags, and thus it treats --help as the pattern.

You are correct.

Yes, [-] is any one character from the list -
i.e. the list is only a single hyphen.

A very similar trick is also quite commonly used to stop grep from finding itself when searching ps listings:

$ ps -ef | grep "xntpd"
   root  528  1394   0   Mar 12      - 15:00 /usr/sbin/xntpd 
chubler 1882 15279   0 12:37:00 pts/20  0:00 grep xntpd
 
$ ps -ef | grep "[x]ntpd"
   root  528  1394   0   Mar 12      - 15:00 /usr/sbin/xntpd
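Why this works can be checked without ps. In the sketch below, the two lines are a made-up stand-in for a ps listing: the grep process's own command line contains the literal text [x]ntpd, which the regular expression [x]ntpd (an x followed by ntpd) does not match.

```shell
# Hypothetical stand-in for two lines of ps output.
listing='/usr/sbin/xntpd
grep [x]ntpd'

# The regex [x]ntpd matches "xntpd" but not the literal text "[x]ntpd",
# so the line describing the grep process itself is filtered out.
matches=$(printf '%s\n' "$listing" | grep '[x]ntpd')
printf '%s\n' "$matches"
```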

It's documented by POSIX in the Shell and Utilities volume, since it applies to many utilities. See the OPTIONS section of http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_04 for more info.

Of interest may be the fact that POSIX requires compliant echo implementations NOT to support the -- argument's semantics; echo must treat -- as an ordinary string to print.

This means that conformant echos cannot support any command options. Of course, in practice, most do implement some options and there's usually no way to disable the option processing. This renders the general case of echo $text problematic. If $text expands to a valid option, the result is something unintended.

This is why these days I use printf for most everything.
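A minimal sketch of the difference, assuming a variable named text (hypothetical) that happens to hold something that looks like an option. What echo prints for such a value varies between implementations, so only the printf result is relied on here.

```shell
# Hypothetical value that many echo implementations would treat as an option.
text='-n'

# With printf the format string is fixed, so $text is only ever data.
safe=$(printf '%s\n' "$text")
printf '%s\n' "$safe"
```

The same form works for piping arbitrary text into grep, for example printf '%s\n' "$text" | grep -- "$pattern", where pattern is likewise a placeholder name.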

Regards,
Alister
