Hi -
I have a file which contains high-bit Unicode characters like © etc. How can I use grep to find the lines which contain the copyright symbol ©?
I tried using
grep \x{00A9}
grep \x\{00A9\}
Thanks-
Any suggestions?
I need to use grep only.
Try this:
grep '©' filename
How would you type '©' in Unix? I am not sure whether you can type it in Unix.
In Windows I can type it using 'Alt+0169'.
'©' in Unix is:
Press Shift+Alt+0 simultaneously.
Thanks for your reply.
However, I am not able to type © in Unix.
I tried Shift+Alt+0...
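If the symbol cannot be typed at the terminal, one option is to let the shell produce it. Below is a minimal sketch, assuming the file is UTF-8 encoded, where U+00A9 is stored as the two bytes 0xC2 0xA9 (octal 302 251). The function name find_copyright is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a file that contain the copyright sign.
# Assumption: the file is UTF-8, where U+00A9 is the byte pair 0xC2 0xA9.
find_copyright() {
    # printf with octal escapes is POSIX, so nothing has to be typed by hand
    pattern=$(printf '\302\251')
    # LC_ALL=C makes grep compare raw bytes; -F treats the pattern as a fixed string
    LC_ALL=C grep -F -- "$pattern" "$1"
}
```

It would be called as find_copyright filename. If the file is in a single-byte encoding such as ISO-8859-1, the symbol is the single byte 0xA9 instead, so the pattern would be printf '\251'.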
POSIX grep does not look past a NUL character. U+00A9 is the Unicode code point for what you want. If the file is stored as UTF-16, one of the symbol's two bytes is 00, the NUL character.
grep will not do what you need. Consider writing something in C that reads short integers (2-byte integers) from the file and compares each one with 169 (0xA9). When you find 169, that is the character offset in the file where the symbol is.
You are probably better off using a Windows editor.
Found a version of grep from mkssoftware that claims to support Unicode:
grep, egrep, fgrep -- match patterns in a file
GNU grep has a -U (--binary) switch that treats files as binary. Note that it does not decode UTF-16 itself, so a UTF-16 file may still need converting first.
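For the UTF-16 case, an alternative to writing a C program is to convert the file to UTF-8 on the fly and then grep the result. This is only a sketch: it assumes the iconv utility is installed and that the file is UTF-16 little-endian unless another encoding name is passed. The function name find_copyright_utf16 is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: grep a UTF-16 file for the copyright sign (U+00A9).
# Usage: find_copyright_utf16 FILE [ENCODING]
# Assumption: ENCODING defaults to UTF-16LE, which may not match your file.
find_copyright_utf16() {
    file=$1
    enc=${2:-UTF-16LE}
    # U+00A9 encoded as UTF-8 is the byte pair 0xC2 0xA9 (octal 302 251)
    pattern=$(printf '\302\251')
    # Convert to UTF-8 so the stream has no embedded NUL bytes, then match bytes
    iconv -f "$enc" -t UTF-8 "$file" | LC_ALL=C grep -F -- "$pattern"
}
```

The matching lines are printed as UTF-8, not in the file's original encoding. Adding -n to the grep call would also print line numbers.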
Thanks Jim.
I will try to see what I can do to implement it...