Writing umlauts to a file

Hello all,

I have a strange problem writing umlauts (such as "ä") to a file that is supposed to be ISO-8859-1 encoded.

My shell script reads a file whose encoding varies: sometimes US-ASCII, sometimes UTF-8, sometimes ISO-8859-1. I then have to replace every "{" with an "ä".
I read the file line by line, run sed on each line, and write the corrected line to a new file with echo.

When the file is finished, I can see in a hex editor that the "ä" is stored as "c3 a4", which is the UTF-8 encoding. What I need is the ISO-8859-1 encoding, a single "e4".

This is my code:

#!/bin/bash


ConvTmpFile="$1.out"
rm -f "$ConvTmpFile"
while IFS= read -r line
do
  echo "$line" | sed 's/{/\ä/g' >> "$ConvTmpFile"
done < "$1"

My env-variables are as follows:

LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8

  • Is it possible to force the output file to be written in ISO-8859-1?
  • How would you handle the differently encoded input files when reading? Should I convert them to ISO-8859-1 with "iconv" first?

CU,
API

Did you consider using the iconv tool to convert the files between all the encodings?
And, provided a program was compiled "locale-aware", you can force it to work e.g. in the C locale by setting the LC_ALL variable for just this single run:

LC_ALL=C program arg1 ... argn
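
For the script in the question, the two hints can be combined roughly as follows. This is only a sketch under a few assumptions: the function names to_latin1 and convert_file are made up for this example, any input that iconv accepts as UTF-8 (which includes plain US-ASCII) is treated as UTF-8, and anything else is assumed to be ISO-8859-1 already. The umlaut is written as the octal byte \344 (hex e4), so the result does not depend on the encoding the script itself is saved in.

```shell
#!/bin/bash
# Sketch only: to_latin1 and convert_file are hypothetical helper names.

# Print the file named by $1 as ISO-8859-1 on stdout.
# Assumption: input that is not valid UTF-8 is already ISO-8859-1.
to_latin1() {
  local tmp
  tmp=$(mktemp) || return 1
  if iconv -f UTF-8 -t ISO-8859-1 "$1" > "$tmp" 2>/dev/null; then
    cat "$tmp"      # valid UTF-8 or plain US-ASCII: use the converted copy
  else
    cat "$1"        # iconv rejected it: pass the original bytes through
  fi
  rm -f "$tmp"
}

# Replace every "{" with the single byte 0xE4 ("ä" in ISO-8859-1)
# and write the result to "$1.out", as in the original script.
convert_file() {
  # LC_ALL=C makes tr work on single bytes instead of UTF-8 characters.
  to_latin1 "$1" | LC_ALL=C tr '{' '\344' > "$1.out"
}
```

Calling convert_file "$1" at the end of the script would then take the place of the read/sed/echo loop. The result can be checked with od -An -tx1 on the .out file, where each former "{" should appear as e4.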

Thanks for the hint.

It was not the solution to my problem, but it pointed me toward solving it. The actual problem was somewhere else.

So thanks for that.

Why don't you rephrase your problem, then, so other members and searchers can understand it, and also post your solution so people can benefit from it?