Hello all,
I have a strange problem writing umlauts such as "ä" to a file that has to be ISO-8859-1 encoded.
My shell script reads a file whose encoding varies: sometimes US-ASCII, sometimes UTF-8, sometimes ISO-8859-1. I then have to replace every "{" with an "ä".
I read the file line by line and run sed on each line, then write the corrected line to a new file with echo.
When the file is finished, I can see in a hex editor that the "ä" is stored as "c3 a4", which is the UTF-8 encoding. What I need is the ISO-8859-1 encoding, a single "e4".
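To make the two byte sequences concrete, here is a small check that can be run in a terminal. It is only a sketch: it assumes iconv and od are installed, and the variable names utf8_hex and latin1_hex are made up for this illustration. The "ä" is written as octal escapes so the result does not depend on the encoding of the script file itself.

```shell
# "ä" in UTF-8 is the two bytes c3 a4 (octal \303 \244).
utf8_hex=$(printf '\303\244' | od -An -tx1 | tr -d ' \n')

# Converting those bytes to ISO-8859-1 should leave the single byte e4.
latin1_hex=$(printf '\303\244' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

echo "UTF-8: $utf8_hex  ISO-8859-1: $latin1_hex"
```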
Here is my code:
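#!/bin/bash
# Replace every "{" in the input file ($1) with "ä" and write the result to $1.out
ConvTmpFile="$1.out"
rm -f "$ConvTmpFile"
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do
    echo "$line" | sed 's/{/\ä/g' >> "$ConvTmpFile"
done < "$1"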
My environment variables are as follows:
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8
- Is it possible to force the output file to be written in ISO-8859-1?
- How would you handle reading files with different encodings? Should I convert them to ISO-8859-1 with "iconv" first?
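One approach I have been considering, as a rough sketch and not a verified fix: convert the input to ISO-8859-1 with iconv first, then let sed insert the raw byte e4 while running in the C locale so that my UTF-8 locale does not get involved. The function name convert_braces and its three parameters are invented for this example, and the caller has to know the source encoding (it defaults to UTF-8 here, which is an assumption).

```shell
#!/bin/bash
# Hypothetical helper: convert_braces INPUT OUTPUT [SOURCE_ENCODING]
convert_braces() {
    local in=$1 out=$2 from=${3:-UTF-8}   # default source encoding is an assumption
    local a_umlaut
    # Raw ISO-8859-1 byte for "ä" (0xE4), written as an octal escape so the
    # encoding of this script file does not matter.
    a_umlaut=$(printf '\344')
    # Normalise the input to ISO-8859-1, then replace "{" byte-wise.
    # LC_ALL=C makes sed treat its input and script as plain bytes.
    iconv -f "$from" -t ISO-8859-1 "$in" \
        | LC_ALL=C sed "s/{/${a_umlaut}/g" > "$out"
}
```

It would be called like: convert_braces "$1" "$1.out" UTF-8. For a file that is already ISO-8859-1, passing ISO-8859-1 as the third argument makes the iconv step a no-op, and US-ASCII input converts cleanly because ASCII is a subset of both encodings.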
CU,
API