Help with regular expressions

I have a file from which I'm trying to find and delete all the phone number extensions. The input file looks like:

abc
x93825
def
13234
x52673
hello

The output should look like:
abc
def
13234
hello

Basically, delete the lines that consist of "x" followed by 5 digits. I tried x\(4)[0-9], but it doesn't seem to work. Can any regex experts help? thx.

sed -n '/^x[0-9][0-9][0-9][0-9][0-9]/!p' file
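As for why x\(4)[0-9] didn't match: a repeat count has to come after the thing it repeats, and in sed's default (basic) regex syntax it is written with escaped braces, \{5\}. A minimal sketch of the same filter using that form, assuming the extension is the whole line; strip_extensions is just a made-up helper name for illustration:

```shell
# Delete lines that are exactly "x" followed by five digits.
# \{5\} is the POSIX basic-regex interval: it repeats the preceding [0-9] five times.
# strip_extensions is a hypothetical helper name, not something from the thread.
strip_extensions() {
    sed '/^x[0-9]\{5\}$/d' "$@"
}

# Sample input taken from the question.
printf '%s\n' abc x93825 def 13234 x52673 hello | strip_extensions
```

Because of the trailing $, a line such as x123456 (six digits) would be kept here, whereas the five-bracket version above has no end anchor and would remove it too.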

bash:

while IFS= read -r line; do [[ $line =~ ^x[0-9]{5}$ ]] || echo "$line"; done <<<"abc
x93825
def
13234
x52673
hello"
abc
def
13234
hello

sed:

sed -r '/^x[0-9]{5}$/d' <<<"abc
x93825
def
13234
x52673
hello"
abc
def
13234
hello

kurumi's code worked, thank you. So I didn't try daPeach's.

Is there a way to find lines that have more than 3 capitalized letters in them?

bash:

while IFS= read -r line; do caps="${line//[^[:upper:]]/}"; (( ${#caps} > 3 )) && echo "$line"; done <<<"a B c D e F g H
aBcDeFgH
a B c D e F g h
aBcDeFgh"
a B c D e F g H
aBcDeFgH

Sed one-liner to print lines with three or more capital letters:

sed -r -n '/([A-Z][^A-Z]*){3,}/p'

Replace -r with -E if you're using a BSD system or AST's (AT&T) sed.
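Note that the question asked for more than 3 capitals, which would mean raising the lower bound to four. A possible sketch of that variant, assuming a sed that accepts -E and using [[:upper:]] rather than A-Z so the class does not depend on the locale's collation order; more_than_three_caps is a made-up helper name:

```shell
# Print lines containing at least four uppercase letters ("more than 3").
# -E enables extended regular expressions; {4,} means "four or more" repeats
# of "one uppercase letter followed by any run of non-uppercase characters".
# more_than_three_caps is a hypothetical helper name for illustration.
more_than_three_caps() {
    sed -n -E '/([[:upper:]][^[:upper:]]*){4,}/p' "$@"
}

# Sample input taken from the bash answer in this thread.
printf '%s\n' 'a B c D e F g H' 'aBcDeFgH' 'a B c D e F g h' 'aBcDeFgh' | more_than_three_caps
```

With that sample input, only the first two lines have four capitals, so only those two should be printed.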

Thanks agama. Just to learn, what does the /p at the end of the code do?

Hi.

It means print the current line when the expression matches (in this case). You need it because you're using the -n option, which turns off sed's automatic printing of every line. Without the p, you wouldn't get any output.
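A small sketch to see the difference for yourself, using two lines from the original sample as input; the variable names are only for illustration:

```shell
# Two sample lines: one that matches /^x/ and one that does not.
input=$(printf '%s\n' abc x93825)

# Default: sed prints every line itself, so "p" makes matching lines appear twice.
with_autoprint=$(printf '%s\n' "$input" | sed '/^x/p')

# With -n, automatic printing is off, so only the lines "p" acts on appear.
without_autoprint=$(printf '%s\n' "$input" | sed -n '/^x/p')

printf 'without -n:\n%s\n\nwith -n:\n%s\n' "$with_autoprint" "$without_autoprint"
```

Without -n the matching line shows up twice (once from the automatic print, once from p); with -n it shows up once and the non-matching line is gone.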
