Remove characters and replace with space

tr -cd '\11\12\15\40-\176' < file-with-binary-chars > clean-file

This removes the special characters, but how can I replace each of them with a space instead?

Run man tr and you will find the -s switch, which stands for substitute. But now you might not have to look it up anymore.

Hi zaxxon,
Sorry, but the tr -s option is not a substitute option; it is a request to squeeze repeated occurrences of a character in the output to a single occurrence.
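
To see the squeezing behaviour for yourself, here is a minimal sketch. The sample string is made up for illustration and is not from the original question.

```shell
# -s squeezes each run of a listed character down to one occurrence.
# Nothing is substituted: the 'b's are untouched because 'b' is not in the set.
squeezed=$(printf 'aaa  bbb\n' | tr -s 'a ')
printf '%s\n' "$squeezed"    # prints: a bbb
```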

Hi essay,
Try:

tr -c '\11\12\15\40-\176' '[ *0]' < file-with-binary-chars > clean-file
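
A quick way to check what that command does is to feed it a made-up sample through printf (the sample bytes here are an assumption for illustration only). The '[ *0]' in the second string asks tr to repeat the space as often as needed to match the length of the complemented first set.

```shell
# Two control bytes (octal 001 and 002) sit between "ab" and "cd".
# Each byte outside tab, newline, CR, and printable ASCII becomes one space.
cleaned=$(printf 'ab\001\002cd\n' | tr -c '\11\12\15\40-\176' '[ *0]')
printf '[%s]\n' "$cleaned"    # prints: [ab  cd]
```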

I'm afraid it's not that easy. In UTF-8 (and other) encoded files, characters above the ASCII set are represented by more than one byte, and every single one of those bytes will be replaced by a space when running the above command. Adding the -s option, on the other hand, will squeeze any run of adjacent replaced bytes, however many non-ASCII characters they came from, into one single space, and it will do the same to runs of spaces that were already in the file.
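
A small sketch of both effects, using a made-up sample in which "ï" is written as its two UTF-8 bytes (octal 303 257) and the words are separated by two ordinary spaces:

```shell
# Without -s: the two bytes of "ï" turn into two spaces.
per_byte=$(printf 'na\303\257ve  text\n' | tr -c '\11\12\15\40-\176' '[ *0]')
# With -s: runs of spaces in the output are squeezed, including the
# two spaces that were already present in the input.
squeezed=$(printf 'na\303\257ve  text\n' | tr -cs '\11\12\15\40-\176' '[ *0]')
printf '[%s]\n' "$per_byte"    # prints: [na  ve  text]
printf '[%s]\n' "$squeezed"    # prints: [na ve text]
```

Neither variant yields exactly one space per removed character, which is why a byte-wise tr alone falls short here.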

Would this come close to what you need:

LC_ALL=C sed 's/[\xC0-\xDF]./ /g; s/[\xE0-\xEF]../ /g; s/[\xF0-\xF7].../ /g; s/[^[:print:][:space:]]/ /g' file

?
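
For a rough check of the idea, here is a sketch that assumes GNU sed (the \xHH escapes are a GNU extension) and valid UTF-8 input. The sample text is made up, and the final expression is my own variation that keeps exactly the bytes the tr command in the question keeps.

```shell
# "ï" is a 2-byte sequence (303 257) and "€" is a 3-byte sequence (342 202 254).
# Each whole sequence is replaced by a single space; any leftover byte
# outside tab, CR, and printable ASCII is then replaced as well.
result=$(printf 'na\303\257ve \342\202\254 5\n' |
  LC_ALL=C sed 's/[\xC0-\xDF]./ /g; s/[\xE0-\xEF]../ /g; s/[\xF0-\xF7].../ /g; s/[^\t\r -~]/ /g')
printf '[%s]\n' "$result"    # prints: [na ve   5]
```

The three spaces before the 5 are the original space, the one that replaced "€", and the original space after it.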