I am learning regular expressions in sed. Please help me understand the use of curly brackets in sed.

I am learning sed by following a shell scripting book, and I am having trouble understanding the grep and sed statements below.

Question 1
-----------

/opt/oracle/work/antony>cat teledir.txt
jai sharma 25853670
chanchal singhvi 9831545629
anil aggarwal 9830263298
shyam saksena 23217847
lalit chowdury 26688726


[quint2]PCID
/opt/oracle/work/antony> grep '[0-9]\{10\}' teledir.txt 
chanchal singhvi 9831545629
anil aggarwal 9830263298

To my understanding, the above command looks for digits between 0 and 9, with the number not exceeding 10 characters.

If that is correct, then the command below should give me only the 8-digit numbers, but I am getting both the 10-digit and the 8-digit numbers. It is supposed to look for the digits 0 to 9 and give the 8-digit numbers. Please correct me if I am wrong.

/opt/oracle/work/antony>  grep '[0-9]\{8\}' teledir.txt
jai sharma 25853670
chanchal singhvi 9831545629
anil aggarwal 9830263298
shyam saksena 23217847
lalit chowdury 26688726

Question 2
-----------

/opt/oracle/work/antony>  ls -l | sed -n '/^.\{2,3\}w/p'
-rw-r-----   1 oracle     dba             11 Jun  2 15:28 control.sql
-rw-r-----   1 oracle     dba             50 Mar 24 16:40 db_list
-rw-r-----   1 oracle     dba            386 Jun  3 16:34 emp.1st
-rw-r-----   1 oracle     dba              0 Jun  3 16:39 sed
-rw-r-----   1 oracle     dba            327 Jun  2 15:09 stuff.sql
-rw-r-----   1 oracle     dba            120 Jun  3 20:10 teledir.txt
drwxr-x---   2 oracle     dba             96 Jun  3 17:14 test
-rwxr-xr-x   1 oracle     dba            330 Mar 24 16:42 trex.sh

From the above command:

  1. I am not able to understand the use of the curly brackets.
  2. \{2,3\}w/p' - what does each part mean?
  3. What is the use of w and \{2,3\}?

Please explain.

Thanks

Antony Ankrose J

It will look for exactly 10 digits in a row.

These 10 digits can appear anywhere in the line, including inside a longer run of digits.

It will find 8 digits inside 10 digits, the same way it will find 'gas' inside 'gasoline'.
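One way to see this for yourself is to print only the part of the line that matched. This is a small sketch that assumes your grep supports the -o option (GNU and BSD grep do, but it is not required by POSIX); the variable name matched is just for illustration.

```shell
# Print only the text that the pattern matched, not the whole line.
# -o is an extension in GNU and BSD grep, not part of POSIX.
matched=$(printf '%s\n' 'chanchal singhvi 9831545629' 'jai sharma 25853670' |
    grep -o '[0-9]\{8\}')
printf '%s\n' "$matched"
# The 10-digit number contributes its first 8 digits (98315456);
# the 8-digit number matches in full (25853670).
```

So the 10-digit line is selected because 8 of its digits satisfy the pattern.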

To find only 8 digits you must be more specific. Perhaps

grep ' [0-9]\{8\}$' teledir.txt

to tell it there must be a space in front, and that the string must end at the end of the line.
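Here is a sketch of that idea you can paste into a shell. It recreates the sample file from the question in a temporary file (the names tmp and eight are only for illustration) and uses plain grep, where the interval braces are written with backslashes as \{8\}.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
jai sharma 25853670
chanchal singhvi 9831545629
anil aggarwal 9830263298
shyam saksena 23217847
lalit chowdury 26688726
EOF

# A space, then exactly 8 digits, then the end of the line.
eight=$(grep ' [0-9]\{8\}$' "$tmp")
printf '%s\n' "$eight"

rm -f "$tmp"
```

This should list only the three lines with 8-digit numbers, because in the 10-digit lines the 8 digits after the space are not followed by the end of the line.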


A snippet from man regexp

Matching a specified number of occurrences
    A BRE interval expression has the syntax ``\{l\}'', ``\{l,\}'',
   or ``\{l,u\}''.
    An ERE interval expression has the syntax ``{l}'', ``{l,}'', or
   ``{l,u}''.
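In practice this means plain grep and sed (BRE) want \{10\}, while grep -E (ERE) wants {10}. A minimal sketch comparing the two on one line from the question; the variable names line, bre and ere are made up for the example.

```shell
line='anil aggarwal 9830263298'

# BRE (plain grep): the braces must be escaped to act as an interval.
bre=$(printf '%s\n' "$line" | grep -c '[0-9]\{10\}')

# ERE (grep -E): the braces act as an interval without backslashes.
ere=$(printf '%s\n' "$line" | grep -E -c '[0-9]{10}')

# -c prints the number of matching lines, so both should report 1.
echo "BRE matching lines: $bre, ERE matching lines: $ere"
```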

Also, I found this site very helpful when dealing with regular expressions:
http://txt2re.com/index-python.php3

Thank you very much, I got it clarified.

Please help me understand the sed statement.

Question 2
-----------

Code:

/opt/oracle/work/antony>  ls -l | sed -n '/^.\{2,3\}w/p'
-rw-r-----   1 oracle     dba             11 Jun  2 15:28 control.sql
-rw-r-----   1 oracle     dba             50 Mar 24 16:40 db_list
-rw-r-----   1 oracle     dba            386 Jun  3 16:34 emp.1st
-rw-r-----   1 oracle     dba              0 Jun  3 16:39 sed
-rw-r-----   1 oracle     dba            327 Jun  2 15:09 stuff.sql
-rw-r-----   1 oracle     dba            120 Jun  3 20:10 teledir.txt
drwxr-x---   2 oracle     dba             96 Jun  3 17:14 test
-rwxr-xr-x   1 oracle     dba            330 Mar 24 16:42 trex.sh

From the above command:

  1. I am not able to understand the use of the curly brackets.
  2. What is \{2,3\}w meant for?

Please explain.

Starting at the beginning of the line (^), match either 2 or 3 of any character (.), followed by a "w" character.

Thanks Corona688

I still do not completely follow what you are saying.
^ Matches the beginning of lines.

. Matches any single character.

Does this mean the command matches any character at the beginning of the line, followed by any single character, and then checks for any character between 2 and 3 in the first column? I also do not understand what the w is for.

Please help me understand.

No, it matches 2 or 3 of "any character". It'd match AB, QRZ, --, or anything that's two or three characters long.

So basically, this regex means "the third or fourth character must be a w".

The letter "w" in the regex matches the letter "w" in the string.

ls -l | sed -n '/^.\{2,3\}w/p'
...
-rwxr-xr-x   1 oracle     dba            330 Mar 24 16:42 trex.sh
...
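To try this without depending on your own directory listing, here is a small sketch that feeds three made-up lines (they only imitate the start of ls -l output) through the same sed command. The variable name out is just for illustration.

```shell
# Three sample lines; only the leading permission string matters here.
out=$(printf '%s\n' \
    '-rw-r----- control.sql' \
    'drwxr-x--- test' \
    '-r--r--r-- readonly.txt' |
    sed -n '/^.\{2,3\}w/p')
printf '%s\n' "$out"
# The first two lines have a "w" as their third character, so they print.
# The last line has "-" in positions 3 and 4, so it does not.
```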

Now I got it clarified. Thanks again.