Grepping for hex characters - explanation?

mregine · January 26, 2011, 4:49pm

Hello,

Yesterday I was looking for a way to grep for a tab in the shell, and found this solution in several places:

grep $'[\x09]a[\x09]' # Grep for the letter 'a' between two tabs

I'm fine with most of this, but I don't understand what the $ (dollar sign) before the first quote does. It doesn't work without it, but I couldn't find any explanation in the grep man or info pages. The only mention of $ there is as a meta-character that matches the end of a line.

Can someone explain and/or point me to other documentation where I can read up on it?

Thanks!

Corona688 · January 26, 2011, 5:10pm

The $ part actually happens in the shell.

$ echo $'[\x09]a[\x09]' | hexdump -C
00000000  5b 09 5d 61 5b 09 5d 0a                           |[.]a[.].|
00000008
$

So it's not a special option to grep. It's shell quoting: the shell expands the escape sequences first and feeds the resulting characters into grep's expression raw.

That's interesting. I've occasionally used that syntax for making tabs like $'\t' but didn't know you could put whole strings in that.
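To make the difference concrete, here is a minimal sketch that compares ordinary single quotes with $'...' quoting. It assumes bash (or another shell that supports $'...'); the variable names and the three sample lines are made up for illustration. Without the $, grep receives a bracket expression containing the literal characters \, x, 0 and 9, which is why the original command stops matching tabs.

```shell
#!/usr/bin/env bash
# Assumes a shell with $'...' quoting (bash here); names and sample data are illustrative.

plain='[\x09]a[\x09]'      # single quotes: the backslash sequences stay literal
ansi=$'[\x09]a[\x09]'      # $'...': the shell turns each \x09 into a real tab

echo "plain length: ${#plain}"   # 13 characters
echo "ansi length:  ${#ansi}"    # 7 characters

# Three sample lines; only the first has 'a' between two tabs.
sample=$'x\ta\ty\nx a y\nxay'
matches=$(printf '%s\n' "$sample" | grep -c "$ansi")
echo "lines matched: $matches"
```

The length difference shows that the translation happened before grep ever ran.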

DGPickett · January 26, 2011, 5:17pm

My take is that the $ is literal, but regexes keep morphing under my feet:

Regular-Expressions.info - Regex Tutorial, Examples and Reference - Regexp Patterns

I usually just use the tab key or ctrl-V ctrl-whatever, in ' ', or $(echo a|tr 'a' '\027') (but my tr only does octal). But I use ksh, this is a bash-ism:

$ bash <<!
echo $'\x61'
!
a
$
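For shells that lack $'...', one possible workaround in the same spirit as the tr trick is to let printf produce the tab and splice it into the pattern. This is only a sketch; the variable names and sample lines are invented for the example.

```shell
# Build the tab with printf instead of $'...'; should work in any POSIX-style shell.
tab=$(printf '\t')

# Two sample lines; only the first has 'a' between two tabs.
sample=$(printf 'x\ta\ty\nx a y\n')

hits=$(printf '%s\n' "$sample" | grep -c "${tab}a${tab}")
echo "lines matched: $hits"
```

Command substitution strips only trailing newlines, so the tab survives in $tab.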

mregine · January 26, 2011, 6:06pm

Can you expand on what the "raw" means? I think I sort of understand what you mean but I'm not sure... I compared the command you gave with the same without the $ and it helps a little, but not completely...?

Corona688 · January 26, 2011, 7:26pm

Consider this:

$ echo -e "hello world\n\n"
hello world


$ echo "hello world\n\n"
hello world\n\n
$

The string is fed into echo as is, leaving \n as two characters, \ and n. When you pass -e to echo, you tell it to understand and translate that sort of escape sequence.

But if you tell echo this:

$ echo $'hello world\n\n'
hello world


$

...echo doesn't have to translate. The argument is translated before the command is run, by the shell. The characters generated by the escape sequence get fed straight into it.
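One way to see what the command actually receives is to measure the argument from inside it. The sketch below assumes bash; show_arg is a hypothetical helper written just for this demonstration.

```shell
#!/usr/bin/env bash
# show_arg is a made-up helper: it prints the length of its first argument.
show_arg() {
    printf '%d\n' "${#1}"
}

literal_len=$(show_arg 'a\nb')     # backslash and n arrive as two separate characters
expanded_len=$(show_arg $'a\nb')   # the shell already replaced \n with one newline

echo "literal: $literal_len, expanded: $expanded_len"
```

The function never interprets any escapes itself; the shorter length in the second call comes entirely from the shell.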

Scrutinizer · January 27, 2011, 8:39am

This construction can be used in bash and ksh93:
man ksh:

mregine · January 27, 2011, 8:44am

Thanks to both of you. I've learned something and I even know where to look for more.

DGPickett · January 27, 2011, 1:03pm

UNIX is not binary-shy. Command line arguments can contain any character except NUL (\0, ^@). Some are hard to type! Some are hard to get past the stty cooker, others are hard to get past the shell, so there is quoting, explicit and implicit, and there are escape characters and sequences.