Issue with Keyboard or Char Encoding During Migration

hicksd8 · April 28, 2020, 7:35am

Hi All,

As Neo says, I have been spending a bit of time on this migration integrity issue.

The irritating "Thingy" (white diamond with question mark in the middle) is officially the Unicode symbol called "Replacement character". Whatever software is decoding the text inserts this as a placeholder for a character that it doesn't understand. IMHO, the issue here is simply that the migration script (or whatever process) SHOULD understand all the characters on our old site. Yes, we already have "Replacement characters" on the old site, which probably emanated from a long-ago upgrade from ASCII to Unicode, or from Unicode version x to Unicode version y. As Neo says, replacement character symbols in our old site must be ignored because there's nothing we can do about them now apart from manually editing them out as time goes on.
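For anyone who wants to see how a replacement character gets created in the first place, here is a minimal Python sketch. The sample bytes are made up for illustration (they are not taken from our database), and I'm assuming the old text was stored in something like Windows-1252 and later read as UTF-8, which is only a guess at what happened.

```python
# Hypothetical sample: bytes 0x93 and 0x94 are curly quotes in Windows-1252,
# but on their own they are not valid UTF-8.
raw = b"How to grep \x93this\x94 symbol?"

# Reading the bytes as UTF-8 with errors="replace" substitutes U+FFFD
# (the replacement character) for every byte it cannot decode.
as_utf8 = raw.decode("utf-8", errors="replace")

# Reading them with the encoding they were actually written in loses nothing.
as_cp1252 = raw.decode("cp1252")

print(as_utf8)
print(as_cp1252)
```

Once the U+FFFD has been written back to the database, the original byte is gone, which is why the existing ones on the old site can only be fixed by hand.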

However, I believe that the currently used (Discourse provided??) process is stuffed because it doesn't understand some of the perfectly correct text on our old site. It even screws up a thread title on the old site containing the replacement character symbol - look at this......

Post migration
How to grep ï¿½ symbol? - Shell Programming and Scripting - UNIX.COM Community

Pre migration
How to grep � symbol?

So the process doesn't even understand its own Unicode character set!!!!
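The garbled title looks consistent with the three UTF-8 bytes of U+FFFD (EF BF BD) being read back as Latin-1, which turns one symbol into the three characters ï, ¿ and ½. Here is a small Python sketch of that round trip. It is only my assumption about what the migration process is doing; I have not confirmed it against the actual script.

```python
# The pre-migration title, containing a genuine U+FFFD.
title = "How to grep \ufffd symbol?"

# U+FFFD is stored in UTF-8 as the three bytes EF BF BD.
utf8_bytes = title.encode("utf-8")

# Assumed fault: the bytes are read as Latin-1, one character per byte.
garbled = utf8_bytes.decode("latin-1")

# If that really is the fault, reversing the two steps recovers the title.
repaired = garbled.encode("latin-1").decode("utf-8")

print(garbled)
print(repaired == title)
```

If this is what is happening, the fix belongs in the connection or import encoding settings of the migration, not in the old data.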

So FWIW, I've come to the conclusion that trying to modify our old DB is futile, as the process will probably find something else to screw up.

Indeed, if you follow the first link I posted on this thread further back, others are having the same issue.

That's my update thus far. I'll report back again as my investigation continues.

EDIT: Replacement character symbol is U+FFFD