Extended ASCII Characters keep on getting reintroduced to text files

I am working with a log file that I am trying to clean up by removing non-English ASCII characters. I am using Bash via Cygwin on Windows.

Before I start I set:

export LC_ALL=C 

I clean it up by removing all non-English ASCII characters with the following command:

grep -v $'[^\t\r -~]' filename_01.csv > filename_02.csv

I then check whether there are any non-English ASCII characters left with the following command. It returns nothing, indicating that none are left.

perl -ane '{ if(m/[[:^ascii:]]/) { print  } }' filename_02.csv

I then delete the first line with the following command:

tail -n +2 filename_02.csv > filename_03.csv

When I check filename_03.csv again for non-English characters, it returns quite a few lines with non-English ASCII characters. Why is this happening, and what am I doing wrong? They somehow got reintroduced when I ran the tail command. How is this possible?

perl -ane '{ if(m/[[:^ascii:]]/) { print  } }' filename_03.csv

Here is an example of the characters that reappeared in the file I had already cleaned, after I ran the tail command:

Gateway1,Gateway2 4dU4'E"morel

I think you'll get a better response if you post representative samples of your log file, indicating which characters you want to remove, or a sample of the result file.
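One way to produce such a sample is to dump the offending bytes as hex, since they rarely survive a copy and paste. The following is only a sketch: the sample data and the function names non_ascii_hex and non_ascii_lines are made up for illustration, and the grep pattern is the one from the question.

```shell
#!/bin/bash
# Hypothetical sample: the second line carries one Latin-1 byte, 0xE9.
sample=$(mktemp)
printf 'plain line\ncaf\351 line\n' > "$sample"

# Delete every 7-bit ASCII byte (octal 000-177); whatever survives is the
# problem data. od prints it as hex so invisible bytes become readable.
non_ascii_hex() {
    LC_ALL=C tr -d '\000-\177' < "$1" | od -An -tx1
}

# Print the numbers of the lines that contain a byte outside tab, CR and
# the printable range space..tilde (same pattern as in the question).
non_ascii_lines() {
    LC_ALL=C grep -n $'[^\t\r -~]' "$1" | LC_ALL=C cut -d: -f1
}

non_ascii_hex "$sample"
non_ascii_lines "$sample"
```

Running both functions on filename_02.csv and filename_03.csv should show whether the two files really differ in the bytes they contain.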

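One thing I noticed is that this:

[[:^ascii:]]

is more commonly written as:

[^[:ascii:]]

Perl accepts both spellings as a negated class, so this by itself should not change the result.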

Also, this:

grep -v $'[^\t\r -~]' filename_01.csv > filename_02.csv

does not just remove non-ASCII characters; it discards every line that contains any character outside [\t\r -~].
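To make the difference concrete, here is a small sketch that runs both approaches on the same input. The input data and the function names drop_lines and strip_bytes are hypothetical; the tr call is just one way to delete bytes instead of lines.

```shell
#!/bin/bash
# Hypothetical two-line input; the second line contains one byte 0xE9.
input=$(mktemp)
printf 'keep,me\ncaf\351,too\n' > "$input"

# grep -v drops the whole second line because it contains a byte
# outside tab, CR and space..tilde.
drop_lines() {
    LC_ALL=C grep -v $'[^\t\r -~]' "$1"
}

# tr -d deletes only the bytes 0x80-0xFF and keeps the rest of each line.
strip_bytes() {
    LC_ALL=C tr -d '\200-\377' < "$1"
}

drop_lines "$input"
strip_bytes "$input"
```

With grep -v the second record is lost entirely, while tr keeps it as "caf,too".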

Working from the very limited info, here is a longhand approach.
Each line in the file contains a single ' and a single ".
I am not sure whether this is just a very small part of the string, but here goes.

#!/bin/bash
# nonascii.sh
# MacBook Pro, circa August 2012, OSX 10.7.5, default bash terminal.
> /tmp/nonascii.dat
> /tmp/newascii.txt
# Create 3 lines of data as per the very limited info.
printf "%s\n" "Gateway1,Gateway2 4dU4"\'"E\"morel" >> /tmp/nonascii.dat
printf "%s\n" "Gateway1,Gateway2 4dU4"\'"E\"morel" >> /tmp/nonascii.dat
printf "%s\n" "Gateway1,Gateway2 4dU4"\'"E\"morel" >> /tmp/nonascii.dat
cat /tmp/nonascii.dat
# Now remove non-ASCII characters; the DEL character (127 decimal) is also removed here.
length=$( wc -c < /tmp/nonascii.dat )
deci_string=( $( od -tu1 -An < /tmp/nonascii.dat ) )
for n in $( seq 0 1 $((length-1)) )
do
	if [ "${deci_string[$n]}" -le "126" ]
	then
		printf '\x'$( printf "%x" "${deci_string[$n]}" )
	fi
done > /tmp/newascii.txt
# Prove extended characters have gone.
cat /tmp/newascii.txt

Results:-

Last login: Sun Jul 10 18:34:52 on ttys000
AMIGA:barrywalker~> cd Desktop/Code/Shell
AMIGA:barrywalker~/Desktop/Code/Shell> ./nonascii.sh
Gateway1,Gateway2 4dU4'E"morel
Gateway1,Gateway2 4dU4'E"morel
Gateway1,Gateway2 4dU4'E"morel
Gateway1,Gateway2 4dU4'E"morel
Gateway1,Gateway2 4dU4'E"morel
Gateway1,Gateway2 4dU4'E"morel
AMIGA:barrywalker~/Desktop/Code/Shell> _

Shooting in the dark: How about

tr -dc '[:alnum:][:punct:][:cntrl:][:space:]' <file

or

tr -dc '[:print:][:cntrl:]' <file

EDIT:
or even

tr -dc '\000-\177' <file
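Whichever variant is used, tr writes to standard output, so the result has to be redirected to a new file and can then be checked. The sketch below uses made-up sample data and temporary file names; it strips the file with the last command above and counts the bytes above 0x7F that remain.

```shell
#!/bin/bash
# Hypothetical input with a tab, one byte 0xE9 and a DEL (0x7F).
src=$(mktemp)
out=$(mktemp)
printf 'a\tb\351c\177d\n' > "$src"

# Keep only bytes 000-177; the input file itself is never overwritten.
LC_ALL=C tr -dc '\000-\177' < "$src" > "$out"

# Count the bytes above 0x7F that are still present; 0 means clean.
remaining=$(LC_ALL=C tr -d '\000-\177' < "$out" | wc -c | tr -d ' ')
echo "non-ASCII bytes left: $remaining"
```

Note that this range keeps DEL (octal 177) and the other control characters; only the bytes above 0x7F are removed.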