Remove Special Characters and Numbers From a Wordlist

I suck at this type of stuff. I have a huge wordlist and I want to get rid of everything in each word except the letters, so all numbers and special characters should go. Since this list was created using cewl, I somehow also picked up some Latin characters and would like to remove them as well. If there is a way to do this and someone gives me the command to use, could you also explain how that command works? I would love to learn how to do things like this myself.

Thanks in advance.

Try:

$ echo "This134isastrangeword33 This134isàstrangërword33@%$" | sed 's/[^[:alpha:]]//g'
ThisisastrangewordThisisàstrangërword
$ echo "This134isastrangeword33 This134isàstrangërword33@%$" | sed 's/[^a-zA-Z]//g'
ThisisastrangewordThisisstrangrword

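^ at the start of a bracket expression means negation
[^[:alpha:]] means any non-letter (what counts as a letter depends on your locale)
[^a-zA-Z] means any character that is not an ASCII letter
sed 's/[^[:alpha:]]//g' means delete every non-letter on each line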
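
To see the difference between the two bracket expressions side by side, here is a small sketch. It assumes a UTF-8 locale is installed for the first command (en_US.UTF-8 is only my guess at the name, check `locale -a`), and it forces the C locale for the second so that a-zA-Z really means the ASCII letters and nothing else.

```shell
# Sample word containing digits, symbols and two accented letters.
word='This134isàstrangërword33@%$'

# [:alpha:] follows the locale, so in a UTF-8 locale à and ë count as letters.
# en_US.UTF-8 is an assumed locale name; if it is missing, sed falls back to
# the C locale and the accented letters are removed here as well.
keep_locale_letters=$(printf '%s\n' "$word" | LC_ALL=en_US.UTF-8 sed 's/[^[:alpha:]]//g')

# In the C locale only the ASCII letters match, so à and ë are deleted too.
# This one prints: Thisisstrangrword
keep_ascii_letters=$(printf '%s\n' "$word" | LC_ALL=C sed 's/[^a-zA-Z]//g')

printf '%s\n' "$keep_locale_letters" "$keep_ascii_letters"
```

If the goal is to drop the accented characters as well, the second form is probably the one to use.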

So using this will remove anything that is not in the U.S. alphabet, if I understand you correctly?

It is probably worth adding space characters to the bracket expression so that they are kept.

My intention is to clean out everything except the letters right now. Once that is done I will be adding back numbers that I can control. What I have right now is 81 GB of words pulled from sites. I am currently removing duplicates, which I know will cut the size down.

I am a little confused but I assume I should run this

$ echo | sed 's/[^[:alpha:]]//g'

and then this

$ echo | sed 's/[^a-zA-Z]//g'

and I will end up with this
Thisisastrangeword

I know I will be back with more questions. I am at work, so if I do not get back right away to let you know how grateful I am for your input, I want to thank you now. So thank you, and correct me if I am wrong on my input.
Thanks, Thanks, Thanks!

Do a test run that processes only the first 10 lines and writes the result to a file:

sed -n '1,10s/[^[:alpha:][:blank:]]//gp' your_file > tmp_file

Open it and, if you are satisfied with the result, run it on the whole file:

sed 's/[^[:alpha:][:blank:]]//g' your_file > tmp_file

--- Post updated at 20:13 ---

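If the file is large, it is probably better to do the test run like this, so that sed quits after line 10 instead of reading the whole file: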

sed 's/[^[:alpha:][:blank:]]//g; 10q' your_file > tmp_file
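
Since you also want duplicates gone, it may be worth de-duplicating after the cleanup rather than before, because stripping digits turns entries like pass1 and pass2 into the same word. A rough sketch, assuming one word per line and that only ASCII letters should survive; sample_wordlist.txt and clean_wordlist.txt are placeholder names and the sample words are made up:

```shell
# Build a tiny sample list (placeholder data, one word per line).
printf '%s\n' 'pass1' 'pass2' '12345' 'café!' 'Hello_World' > sample_wordlist.txt

# 1) delete every byte that is not an ASCII letter (C locale)
# 2) delete lines that ended up empty
# 3) sort and drop duplicate lines
LC_ALL=C sed 's/[^a-zA-Z]//g; /^$/d' sample_wordlist.txt | LC_ALL=C sort -u > clean_wordlist.txt

# Prints HelloWorld, caf and pass, one per line.
cat clean_wordlist.txt
```

On a file of 81 GB the sort step needs a lot of temporary disk space, so it is sensible to try this on a small slice first.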

Interesting, so this

'1,10

denotes the number of lines? See, this is how I learn best. If I have different variations of code in front of me, along with what they do, I can look at the differences and that sticks with me better.

Yes, it is a large list. It has a massive amount of numbers, Latin and even Chinese. I have no idea where those came from, because I scan only U.S. sites and U.S. newspapers online, normally only one or two links deep. But I got them from somewhere.

I am always ready to take in nuggets of information as I search for my gold. Thank you all.
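
On the '1,10 question: as far as I know it is not a count of lines but an address range, meaning "from line 1 through line 10", and the command that follows it only runs on those lines. A small sketch of the three variations used above, with seq standing in for your file (seq is only there to generate numbered lines) and a shorter range so the output stays small:

```shell
# 2,4 is a line range: the s command runs only on lines 2 through 4,
# the other lines are printed unchanged.
range_sub=$(seq 1 5 | sed '2,4s/^/x/')

# With -n nothing is printed by default; the p flag prints only the
# lines where the substitution actually changed something.
range_print=$(seq 1 5 | sed -n '2,4s/^/x/p')

# 3q means "quit at line 3", so sed stops reading the rest of the input.
quit_early=$(seq 1 5 | sed '3q')

printf '%s\n---\n' "$range_sub" "$range_print" "$quit_early"
```

The first prints 1, x2, x3, x4, 5; the second prints only x2, x3, x4; the third prints 1, 2, 3. One thing to watch with the -n and p form: a line in the range that has nothing to remove is not printed at all, so the test file can end up with fewer than 10 lines.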