String of exclusions failed.

We are in a conversion where a list of six-digit numbers needs to be excluded from an existing report. As new ones are added, we have an ever longer string of "grep -v" commands like:

  grep -v 020516 | grep -v 020522 | grep -v 030132 | \
  grep -v 030330 | grep -v 030357 | grep -v 050111 | \
  grep -v 070301 | grep -v 070319 | grep -v 070321 | \
  grep -v 090907 | grep -v 090914 | \
   ...........

Today we got "fork failed - too many processes"

The above is preceded by a grep that reads the report, drops lines without an error code, and passes the rest to the many lines of "grep -v" commands.

My question is how to fix this? Is there a better way to exclude lines than a string of "grep -v" commands?

TIA

We are using SCO Unix.

Not sure about the grep version you run, but how about

  • collecting the numbers in a file and using grep -vFf pattern-file
  • using "alternation" like grep -v "number\|number\|..."

or, just to be complete (although the way your problem sounds, I'd prefer RudiC's first method of a pattern file), using:

grep -ve "first" -ve "second" -ve "third" .....

I hope this helps.

bakunin

@bakunin: yes, equivalent to alternation. The -v is needed only once.
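
A minimal sketch of the multiple -e form with -v given only once. The exclusion codes come from the original question; the report lines and the variable names (report, filtered) are made up for illustration.

```shell
# Hypothetical sample lines standing in for the real report.
report='020516 disk error
020999 keep this line
030132 tape error
040000 keep this too'

# One grep process, three exclusion patterns; -v appears only once.
filtered=$(printf '%s\n' "$report" | grep -v -e 020516 -e 030132 -e 030330)

# Show what survived the exclusions.
printf '%s\n' "$filtered"
```

This replaces three processes in the pipeline with one, which is what matters for the "fork failed" error.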

I couldn't find "-fF" in my Unix books or "man grep", but putting up to three "-ve" options on each grep cuts the number of times I have to run it to nearly a third. It also runs faster.

Thank you, gentlemen.

I've used the command file for sed many times, but not grep.

It's not one option, it's two options, both fairly standard AFAIK.

-f to load from a file.
-F to consider it to hold fixed strings instead of regexes.

Wouldn't it be so much nicer to have a file full of

string1
string2
string3

than a mess of grep statements marginally below the system limit?
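
Here is a rough sketch of the pattern-file approach, assuming your grep supports both -F and -f. The file names (exclude.txt, report.txt), the scratch directory and the report lines are hypothetical; only the codes come from the question.

```shell
# Work in a scratch directory so nothing real is touched.
workdir="${TMPDIR:-/tmp}/exclude_demo.$$"
mkdir -p "$workdir"

# exclude.txt: one code per line, no blank lines.
cat > "$workdir/exclude.txt" <<'EOF'
020516
020522
030132
EOF

# report.txt: hypothetical sample report lines.
cat > "$workdir/report.txt" <<'EOF'
020516 disk error
020999 keep this line
030132 tape error
040000 keep this too
EOF

# -v inverts the match, -F treats each pattern as a fixed string,
# -f reads the patterns from the named file. The options are written
# separately so the file name clearly belongs to -f.
filtered=$(grep -v -F -f "$workdir/exclude.txt" "$workdir/report.txt")

printf '%s\n' "$filtered"
rm -rf "$workdir"
```

Adding a new exclusion then means appending one line to the file. Two things to watch: a blank line in the pattern file will typically match every line, and a fixed string matches anywhere on the line, so a code can also hit inside a longer number.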

Recent versions of grep have a -E option (to specify using extended regular expressions in patterns instead of the default basic regular expressions) and a -F option (to specify using fixed strings instead of regular expressions in patterns).

Older systems had an fgrep utility that behaved the way newer versions behave with grep -F, and an egrep utility that behaved the way newer versions behave with grep -E.

The -f file option and option-argument have been specified for a long time but might not have been around when SCO UNIX was in active development. On systems that have a -f option, -f file specifies using a file named file to get a list of patterns instead of (or in addition to) getting them from the command line.

Note that if you use an option that requires an option-argument, that option must be specified last if you combine a group of options in one command-line argument. The command-line:

grep -vFf file

is a request to print lines from standard input that do not match ( -v option) fixed strings ( -F option) found in the file named file ( -f file option and option-argument).
On the other hand the command-line:

grep -vfF file

is a request to print lines from the file named file (an operand on the command-line) that do not match ( -v option) patterns found in the file named F ( -fF option and option-argument).

For SCO UNIX, I would suggest that you try:

egrep -v 'number|number|...'

egrep uses EREs instead of BREs for the pattern, which allows you to use | as a separator between alternatives in a single pattern instead of specifying multiple patterns with -e options. The length of a command line the shell will read is essentially unlimited (although the arguments passed to any one command are bounded by the system's ARG_MAX limit), but if you're creating a script you may be using an editor that limits line lengths to about 2,048 bytes per line. That should be more than enough to create a pipeline that stays under the process count limit even if the SCO UNIX egrep doesn't have a -f option.
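
As a sketch of the alternation approach, the pattern can be built from a list of codes rather than typed by hand. This is written with grep -E; on a system that only has egrep, substitute egrep for grep -E. The variable names (codes, pattern, report, filtered) and the report lines are made up for illustration.

```shell
# Exclusion codes, one per line (taken from the question).
codes='020516
020522
030132'

# Join the codes with "|" to form a single ERE.
pattern=$(printf '%s\n' "$codes" | paste -s -d '|' -)

# Hypothetical sample lines standing in for the real report.
report='020516 disk error
020999 keep this line
030132 tape error
040000 keep this too'

# One process excludes every code in the list.
filtered=$(printf '%s\n' "$report" | grep -E -v "$pattern")

printf '%s\n' "$pattern"
printf '%s\n' "$filtered"
```

Because the codes are plain digits, none of them needs escaping inside the ERE.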

Perhaps your grep takes multiple patterns separated by newlines?
Then the whole pipeline can shrink to

grep -v "\
number
number
...
number"
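
A small sketch of the newline-separated form, assuming your grep accepts a pattern list split on newlines (POSIX specifies this, but an old grep may not). The report lines and variable names are hypothetical.

```shell
# Hypothetical sample lines standing in for the real report.
report='020516 disk error
020999 keep this line
030132 tape error
040000 keep this too'

# The backslash right after the opening quote joins that line with the
# next one, so the pattern list does not start with an empty pattern
# (an empty pattern would match every line).
filtered=$(printf '%s\n' "$report" | grep -v "\
020516
030132")

printf '%s\n' "$filtered"
```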