Convert Windows Metacharacters to Regular Text

Hi, all.

I need to take a flat file FTP'd from Windows to Unix and convert it for loading into a MySQL database without manual intervention. However, some characters are "fancified" (e.g. the curly opening and closing double quotes from Windows) and show up as codes in vi. I need to convert these to regular text in Unix. The codes are below:
<85>=...
<92>=’
<93>=“
<94>=”

Example Text in WordPad

(like a telephone, except “0” at top left

Same Text in Unix

(like a telephone, except <93>0<94> at top left

[Note the difference in the double quotes.]

The codes show up in PuTTY vi as blue, so I know it recognizes they are metacharacters, but none of the methods I've tried in Unix (sed, awk) to search for metacharacters seems to find these.

Here is a solution that uses bash arrays (easier to extend for other chars you find - like A9 for the copyright symbol).

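#!/bin/bash
# Windows-1252 byte values (hex) and their plain-text replacements
src=(85 92 93 94 A9)
dest=("..." \' \" \" "(c)")
SEDSTR=""
for ((i=0; i<${#src[@]}; i++)); do
   # Append one s/<byte>/<replacement>/g command per array entry
   SEDSTR="${SEDSTR}$(printf "s/%c/%s/g;" "$(echo -e "\\x${src[i]}")" "${dest[i]}")"
done
# LC_ALL=C makes sed treat the file as raw bytes; -i is a GNU sed option
LC_ALL=C sed -i "$SEDSTR" myfile.txt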
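
For an unattended load it can be handier to have the same substitutions as a filter that reads stdin and writes stdout, so it can sit in a pipeline in front of the MySQL import. This is only a minimal sketch, assuming GNU sed (for the \xHH escapes). The function name cp1252_to_ascii is made up for illustration, and the byte list is just the five codes mentioned in this thread:

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption, not from the thread):
# replace a few Windows-1252 punctuation bytes with plain ASCII.
# Reads stdin, writes stdout. Assumes GNU sed, which understands \xHH.
cp1252_to_ascii() {
    # LC_ALL=C so sed works on single bytes instead of UTF-8 characters
    LC_ALL=C sed \
        -e 's/\x85/.../g' \
        -e "s/\x92/'/g" \
        -e 's/\x93/"/g' \
        -e 's/\x94/"/g' \
        -e 's/\xA9/(c)/g'
    # 0x85 ellipsis, 0x92 right single quote,
    # 0x93/0x94 left/right double quote, 0xA9 copyright sign
}
```

Something like `cp1252_to_ascii < myfile.txt > myfile.clean.txt` would then leave the original file untouched, which may be preferable to editing in place if the load script is re-run.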

Why not provide some real samples and your expected output, so the question is easier to understand?

rdcwayx, if you type "(like a telephone, except "0" at top ... left" into MS Word and save it as .txt, you will have an example file. Here is an od -c dump of the file I use to test my solution:

$ od -c myfile.txt
0000000   (   l   i   k   e       a       t   e   l   e   p   h   o   n
0000020   e   ,       e   x   c   e   p   t     223   0 224       a   t
0000040       t   o   p     205       l   e   f   t  \r  \n
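
To confirm nothing was missed, you can count the bytes above 0x7F that remain in a file. A small sketch: count_non_ascii is a hypothetical helper name, and all it does is delete every ASCII byte and count what is left:

```shell
#!/bin/bash
# Hypothetical helper (name is an assumption): print the number of
# non-ASCII bytes (0x80-0xFF) in the file given as $1.
count_non_ascii() {
    # Delete octal 000-177 (all of ASCII), count the leftover bytes,
    # and strip any padding some wc implementations add.
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}
```

Run against the dump above it should report 3 (octal 223, 224 and 205); after a successful conversion it should report 0.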

Totally what I needed to know. Thanks a lot.