foreign characters in flat file

Hey,

Is there any way I ...

Thanks,

Pocha

Usually there is a way.

You could use something like

echo "Hellö!" | sed 's/ö/oe/g'

Since you did not show what your file looks like, I can only guess, but the sed command above is general: just swap the ö and oe for the letters you want to change.
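If several letters need the same treatment, the single substitution above extends to a list of -e expressions. This is only a minimal sketch: it assumes the file is UTF-8, the umlaut mapping is an illustration rather than anything known about the actual file, and translit_umlauts is a hypothetical helper name.

```shell
# Hypothetical helper: one -e expression per letter to replace.
# The mapping below is an assumed example; edit it to match your data.
translit_umlauts() {
  sed -e 's/ö/oe/g' \
      -e 's/ä/ae/g' \
      -e 's/ü/ue/g' \
      -e 's/ß/ss/g'
}

# Sample input; in practice use: translit_umlauts < infile > outfile
printf 'Hellö Wörld, Müller, Straße\n' | translit_umlauts
```

Each expression is applied in order to every line, so the list can grow as more characters turn up.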

sorry

Thanks Zaxxon,

You may be familiar with
[:lower:]
[:upper:]

But some Unix flavors also support
[:alnum:] letters and digits
[:cntrl:] control characters
[:print:] printable characters, including space

Perhaps a

tr -d "[:cntrl:]"

might help?

It doesn't help, Joe. I just tried.

Does iconv work? (convert to another locale?)

I tried; it says invalid characters found.
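An "invalid characters" complaint from iconv often means the source encoding it assumed is not the one the file really uses. A sketch with the source encoding named explicitly: ISO-8859-1 is purely a guess, since the file was never shown, and to_utf8 is a hypothetical helper name.

```shell
# Convert from an explicitly named source encoding to UTF-8.
# ISO-8859-1 is an assumption; substitute the real encoding of your file.
to_utf8() {
  iconv -f ISO-8859-1 -t UTF-8
}

# Sample: byte 0xF6 (octal 366) is "ö" in ISO-8859-1.
printf 'Hell\366!\n' | to_utf8
```

Running iconv -l lists the encoding names your system knows, which helps when trying other candidates for -f.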

It is ASCII text.

A simple sed command works, as you all know, but do you have any clue how to take care of so many foreign characters? Please give me a clue.
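Before deleting anything, it may help to find out which bytes are actually outside plain ASCII. A small sketch using only tr, wc and od: count_non_ascii is a hypothetical helper name, and LC_ALL=C is set on the assumption that every byte should be treated as one character.

```shell
# Hypothetical helper: count bytes in the range 0x80-0xFF (octal 200-377),
# i.e. everything that is not 7-bit ASCII.
count_non_ascii() {
  LC_ALL=C tr -cd '\200-\377' | wc -c | tr -d ' '
}

# Sample input containing one UTF-8 "ö" (two bytes: octal 303 266).
printf 'Hell\303\266!\n' | count_non_ascii

# Dump the same bytes so the offending values are visible.
printf 'Hell\303\266!\n' | od -c
```

A count of zero would mean the file really is pure ASCII and the odd characters are something else, such as control characters.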

This might work; it should remove all characters except those listed:

sed 's/[^A-Za-z0-9 ]//g' /path/to/your/file > /path/to/newfile

EDIT: I had to add a space inside [^A-Za-z0-9 ], otherwise it deleted all spaces...
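Note that the whitelist above also drops punctuation, tabs and any delimiter characters, which can wreck a delimited flat file. A gentler sketch, assuming the goal is to keep every printable ASCII character plus tabs: strip_non_ascii is a hypothetical name, and LC_ALL=C is there so sed works byte by byte.

```shell
# Hypothetical helper: delete every byte that is neither printable ASCII
# nor a blank (space or tab). Newlines are untouched because sed works
# line by line.
strip_non_ascii() {
  LC_ALL=C sed 's/[^[:print:][:blank:]]//g'
}

# Sample pipe-delimited record with one UTF-8 "ö" (octal 303 266).
printf 'id|name\n1|Hell\303\266!\tok\n' | strip_non_ascii
```

Usage would look like strip_non_ascii < /path/to/your/file > /path/to/newfile, so the original stays intact for comparison.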

We can't tell what is in the file. If it is not a foreign language then try to remove all "weird characters".

tr -dc '[:print:]\n'  < inputfile

If that does not do it, then try a small C program:

/* file should be named badchar.c */
#include <stdio.h>

int main(void)
{
      int ch;

      /* read one byte at a time; keep only 7-bit ASCII (0-127) */
      while ((ch = fgetc(stdin)) != EOF)
         if (ch < 128)
            putchar(ch);
      return 0;
}

Your C compiler is either cc or gcc, so I write [g]cc below; you pick.

[g]cc badchar.c -o badchar

To run the program from the directory where you compiled it, do this:

./badchar < badinputfile > newfile

Ikon, I tried that syntax, but it breaks the file totally. Anyway, thanks so much for your concern. Take care.