How does egrep work?

I use grep all the time and have no problem understanding it, but I am looking at a script that uses 'egrep'. Basically, we have something that uses ssh to log into a remote host, execute "df -h", and email us the result. We use that to check space on remote machines. Lately I have noticed that the 'egrep' command isn't returning anything. Someone who is no longer around wrote this, and I don't understand what the statement is doing. On one machine it returns what I would expect; on others, it returns nothing. On a working machine:

 ssh remote_machine  df -h | /usr/bin/egrep "100%|9[0-9]%|8[0-9]%|7[0-9]%|capacity"
Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d33         87G    65G    20G    77%    /opt
/dev/md/dsk/d34        7.9G   6.8G   983M    88%    /export/home
/opt/u01                87G    65G    20G    77%    /u01
/opt/u02                87G    65G    20G    77%    /u02
/opt/u01/u03            87G    65G    20G    77%    /u03
/u02/u04                87G    65G    20G    77%    /u04


 

Machine that doesn't work:

  ssh remote_host  df -h | /usr/bin/egrep "100%|9[0-9]%|8[0-9]%|7[0-9]%|capacity"
Filesystem             size   used  avail capacity  Mounted on

  

If I leave off the egrep, I do see the output of the df command, so I am successfully logging into the remote machine.

Please post output from the second df command.

I did post the output. It simply returns the df heading, with no content:

ssh remote_machine df -h | /usr/bin/egrep "100%|9[0-9]%|8[0-9]%|7[0-9]%|capacity"

Filesystem             size   used  avail capacity  Mounted on

I meant the df output without egrep filtering.
Also, I assume you realise that the filter only passes through input lines that contain specific percentages or text.

Ah, OK. No, I didn't understand that. I tried the egrep and just grepped for capacity, but didn't understand the percentages. Now that you have said that, I am guessing it is looking for lines that show 70% or above. Here is the df without the egrep:

Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d30         16G   4.6G    11G    30%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   8.1G   1.4M   8.1G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/platform/SUNW,T5140/lib/libc_psr/libc_psr_hwcap2.so.1    16G   4.6G    11G    30%    /platform/sun4v/lib/libc_psr.so.1
/platform/SUNW,T5140/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1    16G   4.6G    11G    30%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/md/dsk/d32         16G   5.8G    10G    37%    /var
swap                   8.1G   8.3M   8.1G     1%    /tmp
swap                   8.1G    48K   8.1G     1%    /var/run
/dev/md/dsk/d33         87G    47G    39G    55%    /opt
/dev/md/dsk/d34        7.9G   4.7G   3.1G    61%    /export/home
/opt/u01                87G    47G    39G    55%    /u01
/opt/u02                87G    47G    39G    55%    /u02
/opt/u01/u03            87G    47G    39G    55%    /u03
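
As a minimal sketch of why this machine returns only the heading, the same pattern can be run against a few lines copied from the two outputs above. The variable names `sample` and `matched` are made up for illustration, and `grep -E` is used as the usual equivalent of egrep; on a host where grep lacks `-E`, substituting /usr/bin/egrep as in the original script should behave the same.

```shell
# A few df -h lines copied from the outputs above: two below 70%, two above.
sample='Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d30         16G   4.6G    11G    30%    /
/dev/md/dsk/d34        7.9G   4.7G   3.1G    61%    /export/home
/dev/md/dsk/d33         87G    65G    20G    77%    /opt
/dev/md/dsk/d34        7.9G   6.8G   983M    88%    /export/home'

# Same alternation as the script: a line is kept if ANY branch matches
# anywhere in it ("100%", 90-99%, 80-89%, 70-79%, or the word "capacity").
matched=$(printf '%s\n' "$sample" | grep -E "100%|9[0-9]%|8[0-9]%|7[0-9]%|capacity")

# Prints the heading (it contains "capacity") plus the 77% and 88% lines.
printf '%s\n' "$matched"
```

The 30% and 61% lines match none of the branches, so they are dropped. On a machine where every filesystem is below 70%, only the heading is left, which is what the second host shows.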

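The "|" in egrep is an "or", so the pattern matches any line that contains 70% through 100% or the word "capacity" (the heading).
It probably makes more sense to convert to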

ssh remote_machine df -h | awk 'NR==1 {header=$0 RS} $5~/[7-9][0-9]%|100%/ {print header $0; header=""}'

awk takes the same ERE as egrep, namely "[7-9][0-9]%|100%", but here it is applied to $5 (column 5) only.
The main advantage is that the header (saved from NR==1, line 1) is printed only if at least one line matches.
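
A rough way to try the awk filter without ssh is to feed it lines copied from the two df outputs in this thread. The wrapper function `filter_df` and the variables `busy`, `quiet`, `alert` and `nothing` are hypothetical names used only for this sketch.

```shell
# Hypothetical wrapper around the awk filter so it can read any df -h text.
filter_df() {
    awk 'NR==1 {header=$0 RS} $5~/[7-9][0-9]%|100%/ {print header $0; header=""}'
}

# Lines copied from the working machine (both filesystems at 70% or above).
busy='Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d33         87G    65G    20G    77%    /opt
/dev/md/dsk/d34        7.9G   6.8G   983M    88%    /export/home'

# Lines copied from the other machine (everything below 70%).
quiet='Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d30         16G   4.6G    11G    30%    /
/dev/md/dsk/d32         16G   5.8G    10G    37%    /var
/dev/md/dsk/d33         87G    47G    39G    55%    /opt
/dev/md/dsk/d34        7.9G   4.7G   3.1G    61%    /export/home'

alert=$(printf '%s\n' "$busy" | filter_df)      # header plus the 77% and 88% lines
nothing=$(printf '%s\n' "$quiet" | filter_df)   # empty: no header, no rows

printf '%s\n' "$alert"
[ -z "$nothing" ] && echo "quiet host: nothing to report"
```

Because the output is empty when nothing is at 70% or above, a calling script could test for empty output before sending mail. Restricting the match to $5 also avoids false hits from a percentage-like string elsewhere on the line.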
