Removing symbols from a file, like the copyright symbol

Hi guys,

I have a text file full of funny symbols, like the copyright symbol, that get in the way when I try to use sed. For example (not sure if it will display properly here), I have a line that looks like this:

24(9):995�*1001 DOI: 10.1007/s11606-009-1053-2 ©

When I use sed to grab parts of this, the funny symbols don't match anything in a regex, not even '.'. For example, if I run

sed -n '/^.*$/p' < example.txt

then only the lines that don't contain funny symbols match. Is there a way to get rid of those symbols so that I can get at those lines?

Thanks

perl -lpe 's/[^[:print:]]+//g' list
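A possible explanation, assuming the file holds single-byte characters (for example ISO-8859-1) while the shell runs in a UTF-8 locale: those bytes are not valid UTF-8, so sed's '.' refuses to match them. Here is a rough sketch of two workarounds using only sed and tr. The sample file is made up for illustration, and the list of bytes to keep (tab, newline, printable ASCII) is just one reasonable choice.

```shell
#!/bin/sh
# Build a hypothetical sample: one plain ASCII line and one line ending in
# a lone 0xA9 byte (the copyright sign in ISO-8859-1), which is invalid UTF-8.
tmp=$(mktemp)
printf 'plain line\n' > "$tmp"
printf 'DOI: 10.1007/s11606-009-1053-2\251\n' >> "$tmp"

# Workaround 1: in the C locale sed treats input as single bytes,
# so '.' matches the odd byte as well and both lines are printed.
matched=$(LC_ALL=C sed -n '/^.*$/p' "$tmp" | wc -l | tr -d ' ')

# Workaround 2: delete every byte except tab, newline,
# and printable ASCII (space through ~).
cleaned=$(LC_ALL=C tr -cd '\11\12\40-\176' < "$tmp")

echo "lines matched by sed: $matched"
printf '%s\n' "$cleaned"
rm -f "$tmp"
```

If the symbols should be kept rather than dropped, converting the file to UTF-8 first (for instance with iconv, once you know the real source encoding) may be the better route. The tr approach throws away every non-ASCII character, including legitimate accented letters.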