Hi,
I've been trying to write a regex to use in egrep (in a shell script) that'll fetch the names of all the files that match a particular pattern. I expect to match the following line in a file:
But I now want to find all files in which the name is quoted with either double quotes (") or single quotes (')
i.e., the pattern to match could be:
Name = "abc"
OR
Name = 'abc'
I've tried various ways to include ' in the regex I've posted above, but I'm not able to get it right. I've tried using a backslash (\) and also tried things like [\'\"].
Any help to get this right is appreciated.
Thanks.
Thanks Franklin52,
Is there a way to do the same in grep itself? I wanted the quote matching to be part of the grep regex. Are you suggesting that I use awk instead of grep?
When you need to protect special characters on the command line, you need to use quoting. However, as you've found, if the special characters are the quote symbols themselves, you can run into trouble.
One solution for some versions of grep is to keep the pattern in a file so that it never appears on the command line. A here document can create that file from within the script, and quoting the here-document delimiter (<<'EOF') tells the shell to leave the special characters in the body alone. Once the file exists, grep can read the regular expressions from it with -f. Here is an example:
#!/usr/bin/env bash
# @(#) s1 Demonstrate isolation of quotes in file for grep.
echo
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) grep
set -o nounset
echo
FILE=${1-data1}
echo " Data file $FILE:"
cat "$FILE"
echo
echo " Results:"
cat > my-pattern <<'EOF'
[nN][aA][mM][eE] *= *['"].*['"]
EOF
grep -f my-pattern "$FILE"
exit 0
producing:
% ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution : Debian GNU/Linux 5.0
GNU bash 3.2.39
GNU grep 2.5.3
Data file data1:
name = "double-1"
Name = "double-2"
name = 'single-1'
name = none
name = 'single-2'
Results:
name = "double-1"
Name = "double-2"
name = 'single-1'
name = 'single-2'
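The script above leaves my-pattern behind in the current directory. As a minimal sketch of the same here-document technique, assuming mktemp is available on your system, the pattern can go into a temporary file that is removed when the script exits. The variable names and the sample data below are made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: same here-document technique, but with a temporary pattern file
# that is removed when the script exits. Sample data is made up.
patfile=$(mktemp)
datafile=$(mktemp)
trap 'rm -f "$patfile" "$datafile"' EXIT

printf '%s\n' 'name = "double-1"' "name = 'single-1'" 'name = none' > "$datafile"

# Quoting the delimiter ('EOF') stops the shell from interpreting the body,
# so both quote characters reach the file unchanged.
cat > "$patfile" <<'EOF'
[nN][aA][mM][eE] *= *['"].*['"]
EOF

# grep reads its regular expression from the file, not the command line.
result=$(grep -f "$patfile" "$datafile")
echo "$result"
```

Only the two quoted lines should be printed; the "name = none" line has no quote character and is skipped.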
Another method is to surround the regular expression on the command line with double quotes. Inside double quotes you may have escaped double quotes, \", and plain single quotes. A single-quoted string, however, cannot contain a single quote at all, because the backslash is not an escape character there.
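As a small sketch of that difference, the two commands below pass the same regular expression to grep. The first uses double quotes with \" for the double quote. The second uses a common workaround for single-quoted strings that is not part of the description above: close the string, add an escaped quote (\'), and reopen it. The sample data is made up for illustration:

```shell
#!/usr/bin/env bash
# Sketch: matching either quote character with the pattern on the command line.
# The sample data is made up for this illustration.
tmp=$(mktemp)
printf '%s\n' 'name = "double-1"' "name = 'single-1'" 'name = none' > "$tmp"

# Double-quoted pattern: escape the double quote, leave the single quote bare.
dq=$(grep -c "[nN]ame *= *[\"'].*[\"']" "$tmp")

# Single-quoted pattern: close the string, add an escaped quote, reopen.
sq=$(grep -c '[nN]ame *= *["'\''].*["'\'']' "$tmp")

echo "double-quoted form matched $dq lines"
echo "single-quoted form matched $sq lines"
rm -f "$tmp"
```

Both forms hand grep the identical pattern, so the counts should agree. The double-quoted form is usually easier to read.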
See man pages for details. Good luck ... cheers, drl
Modifications to your regular expressions, all searches done with egrep:
#!/usr/bin/env bash
# @(#) s2 Demonstrate isolation of quotes in file for egrep.
echo
set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) egrep
set -o nounset
echo
FILE=${1-data1}
echo " Data file $FILE:"
cat "$FILE"
echo
echo " Results with here-document solution:"
cat > my-pattern <<'EOF'
[nN][aA][mM][eE] *= *['"].*['"]
EOF
egrep -f my-pattern "$FILE"
echo
echo " Results with command-line solution 1:"
# egrep -l "(^[nN][aA][mM][eE]) *= *[\"\'] *[a-zA-Z0-9_+-]* *[\"\']$" /FILE
egrep "^[nN][aA][mM][eE] *= *[\"'] *[a-zA-Z0-9_+-]* *[\"']$" "$FILE"
echo
echo " Results with command-line solution 2:"
# egrep -l "(^[nN][aA][mM][eE]) *= *\"\|\' *[a-zA-Z0-9_+-]* *\"\|\'$" /FILE
egrep "^[nN][aA][mM][eE] *= *(\"|') *[a-zA-Z0-9_+-]* *(\"|')$" "$FILE"
exit 0
producing:
% ./s2
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution : Debian GNU/Linux 5.0
GNU bash 3.2.39
egrep GNU grep 2.5.3
Data file data1:
name = "double-1"
Name = "double-2"
name = 'single-1'
name = none
name = 'single-2'
Results with here-document solution:
name = "double-1"
Name = "double-2"
name = 'single-1'
name = 'single-2'
Results with command-line solution 1:
name = "double-1"
Name = "double-2"
name = 'single-1'
name = 'single-2'
Results with command-line solution 2:
name = "double-1"
Name = "double-2"
name = 'single-1'
name = 'single-2'
See man pages, experiment on small cases ... cheers, drl
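Since the original question asked for the names of the matching files rather than the matching lines, here is a rough sketch of command-line solution 1 with -l added. It uses grep -E, which accepts the same extended regular expressions as egrep. The directory, file names, and file contents are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch: list only the names of files containing a quoted Name line.
# Directory and file names here are hypothetical.
dir=$(mktemp -d)
printf '%s\n' 'Name = "abc"' > "$dir/a.conf"
printf '%s\n' "Name = 'abc'" > "$dir/b.conf"
printf '%s\n' 'Name = abc'   > "$dir/c.conf"

# Same regular expression as command-line solution 1.
pattern="^[nN][aA][mM][eE] *= *[\"'] *[a-zA-Z0-9_+-]* *[\"']$"

# -E selects extended regular expressions; -l prints file names only.
matches=$(cd "$dir" && grep -El "$pattern" ./*.conf)
echo "$matches"
rm -rf "$dir"
```

Only a.conf and b.conf should be listed, since c.conf holds an unquoted name.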