I have a very large file in Unix that I would like to search for all instances of the Unicode character 0x17. I need to remove these characters because they are causing my SAX parser to throw an exception. Does anyone know how to find a Unicode character in a file?
Thank you for your assistance.
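"0x17" on its own is a single byte value, not a Unicode (UTF-16 or
UTF-32) character per se. The code point is U+0017, and how it is
stored in the file depends on the encoding.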
For those not familiar with Unicode, UTF-16 basically means that
most "characters" are stored as 2 bytes (those outside the Basic
Multilingual Plane take 4), whereas UTF-32 means every
"character" is stored as 4 bytes.
On a practical level, it means that standard ASCII characters are
accompanied by a single NUL (0x00) in UTF-16 or by 3 NULs in UTF-32.
The NULs come before the character if the data is stored Big-Endian
and after it if the data is stored Little-Endian.
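To make the byte layouts above concrete, here is a minimal Python sketch. It prints how U+0017 is encoded in each form, and then shows one possible way to strip the character from a large file once the encoding is known. The function name strip_char, its parameters, and the chunk size are made up for illustration, and the sketch assumes the file decodes cleanly in the encoding you pass in.

```python
CHAR = "\x17"  # U+0017, the character the question refers to as 0x17

# Show how the same character is laid out on disk in each encoding.
for name in ("utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    encoded = CHAR.encode(name)
    print(name.ljust(10), " ".join(f"{b:02x}" for b in encoded))


def strip_char(src_path, dst_path, encoding, char=CHAR, chunk_size=1 << 20):
    """Copy src_path to dst_path, dropping every occurrence of char.

    Hypothetical helper: reads in chunks of decoded characters so a very
    large file never has to fit in memory. Returns the number removed.
    """
    removed = 0
    # newline="" leaves line endings exactly as they are in the source.
    with open(src_path, "r", encoding=encoding, newline="") as src, \
         open(dst_path, "w", encoding=encoding, newline="") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            removed += chunk.count(char)
            dst.write(chunk.replace(char, ""))
    return removed
```

A call might look like strip_char("input.xml", "clean.xml", "utf-16-le"), where both file names are placeholders. Decoding first, rather than searching for the raw byte 0x17, avoids false matches in UTF-16 or UTF-32, where a 0x17 byte can also be one half of an unrelated character.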
Which Unicode "format" is your file using?