I'm trying to write a script that parses my music collection and hard links files whose names my media player doesn't like to other names.
To do this I need to extract the name, remove all non-ASCII characters from it, and do a cp -l with the result.
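For reference, the goal described above could be sketched roughly like this. The function name link_ascii_name is made up for illustration, the filter uses tr rather than a bash substitution, and cp -l assumes GNU coreutils, so treat it as a starting point rather than a finished script.

```shell
# Hypothetical helper: hard link "$1" to an ASCII-only name in the same directory.
link_ascii_name() {
    local src=$1 dir base clean
    dir=$(dirname -- "$src")
    base=$(basename -- "$src")
    # Drop every byte outside the 7-bit ASCII range; LC_ALL=C makes tr work byte by byte.
    clean=$(printf '%s' "$base" | LC_ALL=C tr -cd '\000-\177')
    # Skip names that are already pure ASCII, or that would end up empty.
    [ -n "$clean" ] && [ "$clean" != "$base" ] || return 0
    # Assumption: GNU cp, where -l creates a hard link instead of copying.
    cp -l -- "$src" "$dir/$clean"
}
```

It would be called once per file, for example from a find ... -print0 | while IFS= read -r -d '' f loop. Only the last path component is cleaned here; directory names would need separate handling.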
Problem is this:
22:16:58 $ find . -wholename "*" -print
./Simon & Garfunkel - The Essential Simon & Garfunkel (2003)/CD1/15 - Simon & Garfunkel - The Dangling Conversation (Album Version).flac
./José González - In Our Nature/06 Abram.flac
./Ane Brun (2004) - A Temporary Dive [FLAC]/09 Ane Brun - Song No. 6.flac
22:18:28 $ find . -wholename "*" -print| while read line; do echo ${line//[^a-z]/};done
SimonGarfunkelTheEssentialSimonGarfunkelCDSimonGarfunkelTheDanglingConversationAlbumVersionflac
./José González - In Our Nature/06 Abram.flac
AneBrunATemporaryDiveFLACAneBrunSongNoflac
Of course I realize that those names are gibberish, but what puzzles me is why the "./José González - In Our Nature/06 Abram.flac" line is unaffected.
22:21:12 $ bash --version
GNU bash, version 4.2.10(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
My guess is that it has something to do with the accented characters (é, á) but I wouldn't know.
But they are; a hexdump and translation to Unicode gives:
U+002E FULL STOP character (.)
U+002F SOLIDUS character (/)
U+004A LATIN CAPITAL LETTER J character
U+006F LATIN SMALL LETTER O character
U+0073 LATIN SMALL LETTER S character
U+00E9 LATIN SMALL LETTER E WITH ACUTE character (é)
U+0020 SPACE character
U+0047 LATIN CAPITAL LETTER G character
U+006F LATIN SMALL LETTER O character
U+006E LATIN SMALL LETTER N character
U+007A LATIN SMALL LETTER Z character
U+00E1 LATIN SMALL LETTER A WITH ACUTE character (á)
U+006C LATIN SMALL LETTER L character
U+0065 LATIN SMALL LETTER E character
U+007A LATIN SMALL LETTER Z character
U+0020 SPACE character
U+002D HYPHEN-MINUS character (-)
U+0020 SPACE character
U+0049 LATIN CAPITAL LETTER I character
U+006E LATIN SMALL LETTER N character
U+0020 SPACE character
U+004F LATIN CAPITAL LETTER O character
U+0075 LATIN SMALL LETTER U character
U+0072 LATIN SMALL LETTER R character
U+0020 SPACE character
U+004E LATIN CAPITAL LETTER N character
U+0061 LATIN SMALL LETTER A character
U+0074 LATIN SMALL LETTER T character
U+0075 LATIN SMALL LETTER U character
U+0072 LATIN SMALL LETTER R character
U+0065 LATIN SMALL LETTER E character
U+002F SOLIDUS character (/)
U+0030 DIGIT ZERO character (0)
U+0036 DIGIT SIX character (6)
U+0020 SPACE character
U+0041 LATIN CAPITAL LETTER A character
U+0062 LATIN SMALL LETTER B character
U+0072 LATIN SMALL LETTER R character
U+0061 LATIN SMALL LETTER A character
U+006D LATIN SMALL LETTER M character
U+002E FULL STOP character (.)
U+0066 LATIN SMALL LETTER F character
U+006C LATIN SMALL LETTER L character
U+0061 LATIN SMALL LETTER A character
U+0063 LATIN SMALL LETTER C character
U+000A <control> character
And even if they weren't, wouldn't they be changed by ${line//[^a-z]/} since they are not [a-z]?
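A code point listing does not show which bytes are actually stored in the filename, and the same é can be stored as C3 A9 (UTF-8) or as a single E9 (Latin-1). A small sketch for looking at the raw bytes, using a hypothetical helper called hex_bytes and two hand-built sample strings rather than the real filenames:

```shell
# Hypothetical helper: print the bytes of a string as lowercase hex, no separators.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

utf8_hex=$(hex_bytes $'Jos\xc3\xa9')    # "José" encoded as UTF-8
latin1_hex=$(hex_bytes $'Jos\xe9')      # "José" encoded as Latin-1

echo "UTF-8:   $utf8_hex"     # prints: UTF-8:   4a6f73c3a9
echo "Latin-1: $latin1_hex"   # prints: Latin-1: 4a6f73e9
```

Running the same helper on "$line" inside the loop would show which of the two forms the filesystem really holds.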
[edit]:
And by the way, if I use sed to do the substitution it works on the José... lines too... it even removes some of them completely.
22:56:50 $ find . -iname "*" -print| while read line; do echo $(line | sed -e 's/[^a-zA-Z]//g' );done
SimonGarfunkelTheEssentialSimonGarfunkelCDSimonGarfunkelTheDanglingConversationAlbumVersionflac
AneBrunATemporaryDiveFLACAneBrunToLetMyselfGoflac
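A note on the command above: inside $( ), line is run as a command rather than expanded as the variable $line. On systems that ship the line utility, it reads one line from standard input, which in this loop is the find output, so each pass probably swallows an extra filename. That would explain the missing entries, though it is a guess from the output shown. A sketch of the loop with the variable passed to sed explicitly, using made-up sample names in place of find:

```shell
# Sample input standing in for the find output (hypothetical names).
sample=$'./A b/01 - C d.flac\n./E f/02 - G h.flac'

cleaned=$(
    printf '%s\n' "$sample" |
    while IFS= read -r line; do
        # Feed the variable's value to sed; do not run a command called "line".
        printf '%s\n' "$line" | sed -e 's/[^a-zA-Z]//g'
    done
)

printf '%s\n' "$cleaned"
# prints:
# AbCdflac
# EfGhflac
```

IFS= and -r keep read from trimming whitespace or interpreting backslashes in the names.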
STRING=$(echo " 2E 2F 4A 6F 73 C3 A9 20 47 6F 6E 7A C3 A1 6C 65 7A 20 2D 20 49 6E 20 4F 75 72 20 4E 61 74 75 72 65 2F 30 36 20 41 62 72 61 6D 2E 66 6C 61 63 0A" |
sed 's/ /\\\\x/g' | xargs echo -e)
echo "${STRING//[^a-z]/}"
osonzleznuraturebramflac
$ bash --version
GNU bash, version 4.1.7(2)-release (i686-pc-linux-gnu)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
$
It ought to substitute; I don't know why yours doesn't. Perhaps it is a bug, or an older shell with limited features.
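One thing that might be worth trying, as a guess rather than a confirmed diagnosis: if the bytes in the filename are not valid in the current locale's encoding, forcing the C locale makes bash match byte by byte, so any byte outside the listed ranges should be removed whatever the encoding. A minimal sketch, with a hypothetical helper strip_to_letters and two hand-built test strings:

```shell
# Hypothetical helper: keep only ASCII letters, matching byte by byte.
strip_to_letters() {
    local LC_ALL=C    # assumption: limits the locale change to this function
    printf '%s\n' "${1//[^a-zA-Z]/}"
}

# The same name encoded two ways (sample data, not read from disk).
utf8_name=$'./Jos\xc3\xa9 Gonz\xc3\xa1lez - In Our Nature/06 Abram.flac'
latin1_name=$'./Jos\xe9 Gonz\xe1lez - In Our Nature/06 Abram.flac'

utf8_result=$(strip_to_letters "$utf8_name")
latin1_result=$(strip_to_letters "$latin1_name")

printf '%s\n' "$utf8_result" "$latin1_result"
# prints JosGonzlezInOurNatureAbramflac twice
```

If this works where the plain ${line//[^a-z]/} does not, the locale or the filename encoding is the likely culprit.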
Corona688: Might I ask what distro/OS you're on? I have tried with several patched and unpatched versions of bash on Debian and Ubuntu and I never get the result you get.