Finding files with Unicode chars in the filename

I'm trying to check in a repository to svn, but the import is failing because some files waaaay down deep in a graphics-library folder have Unicode characters in their file names. These characters are masked by the ls command but show up when its output is piped to more:

[root@dev-www-02 emoticons]# ls -l 1914*
-rwxrwxr-x 1 apache apache 1398 Dec 9 2008 1914OdiN_Presenta_-_o_?.bmp
[root@dev-www-02 emoticons]# ls -l 1914* | more
-rwxrwxr-x 1 apache apache 1398 Dec 9 2008 1914OdiN_Presenta_-_o_�.bmp

Ideally, I'd like to search for these files and quarantine them in a directory outside the repository tree, but I'm hitting a brick wall trying to figure out the search string...

I've tried variants of grep '^[A-Za-z0-9]' but can't turn up the right combination.

tia...

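Unicode filenames contain non-ASCII bytes (values above 127). This is not perfect, but it should find most files with wacky characters. LC_ALL=C makes grep compare raw bytes, -a keeps it from treating such names as binary data, and -P (GNU grep) is needed for the \x escapes to work inside the bracket expression.

find /path/to/directory -print | LC_ALL=C grep -a -P '[^\x00-\x7F]'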
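
find can also do the matching by itself, without grep. The following is only a sketch, assuming a find whose -name patterns compare single bytes under LC_ALL=C; the function name list_non_ascii is made up for illustration. Note that the pattern flags control characters (tabs, for example) as well as bytes above 127.

```shell
#!/bin/sh
# list_non_ascii DIR (hypothetical helper): print every path under DIR whose
# last component contains at least one byte outside printable ASCII, that is,
# outside the range from space (0x20) to tilde (0x7E).
list_non_ascii() {
    # LC_ALL=C makes the bracket range compare raw byte values.
    LC_ALL=C find "$1" -name '*[! -~]*' -print
}
```

Usage would look like: list_non_ascii /path/to/directory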

The following Perl program, when run against the root directory, goes through all files and subdirectories recursively and moves files whose names contain special/non-printable characters to the /tmp directory. For this particular case, special/non-printable characters are all those except "\w" (letters, digits and underscore), "." and "-". The "/" path separator is also allowed, because the test is applied to the full path.

$
$ cat -n processfiles.pl
     1  #!/usr/bin/perl -w
     2  # Usage: perl processfiles.pl "<full_path_till_root_directory>"
     3
     4  use File::Find;
     5  @ARGV = qw(.) unless @ARGV;
     6  find sub { $x = $File::Find::name;
     7             $x=~s/[\w.\/-]//g;
     8             if ($x ne "") {
     9               print "File: ",$File::Find::name," will be quarantined.\n" if $x ne "";
    10               `mv "$File::Find::name" /tmp`;
    11  #             `zip -gmT "$ARGV[0]/badlynamedfiles" "$File::Find::name" 1>/dev/null 2>&1`;
    12               print "Done...\n================================\n";
    13             }
    14           }, @ARGV;
    15
$
$
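
The character test on line 7 of the listing above can also be written in plain shell, which is handy for checking a single name before running the whole program. This is a rough sketch, not a port of the Perl script: has_odd_chars is a made-up name, it looks at one file name rather than a full path (so "/" is not whitelisted), and it assumes ASCII-only matching like Perl's default "\w" on byte strings.

```shell
#!/bin/sh
# has_odd_chars NAME (hypothetical helper): exit 0 when NAME contains any
# character other than ASCII letters, digits, underscore, dot or hyphen;
# exit 1 when the name is clean.
has_odd_chars() (
    # The function body is a subshell, so this locale change does not leak
    # into the caller. LC_ALL=C keeps A-Z and a-z limited to ASCII.
    LC_ALL=C
    case $1 in
        *[!A-Za-z0-9_.-]*) exit 0 ;;
    esac
    exit 1
)
```

For example, has_odd_chars "with space.bmp" succeeds, while has_odd_chars "plain-file_1.bmp" fails.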

If you comment out line 10 and uncomment line 11 of processfiles.pl, the program instead uses the native zip utility to add all such files to a zip file called "badlynamedfiles.zip", which is created in the root directory. Because of zip's -m option, each file is removed once it has been added to the archive, leaving only the good ones behind.

With move (mv), the full paths of the moved files are not preserved, so when two files have the same name, the one moved last overwrites the earlier one.
With zip, the full paths are preserved in the zip archive.
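
If you want to keep using mv but avoid the overwriting, the relative path of each file can be recreated under the quarantine directory. A minimal sketch, assuming DEST is an absolute path outside the tree being scanned; quarantine_non_ascii is a made-up name, and the pattern here flags bytes outside printable ASCII rather than using the "\w" test from the Perl program.

```shell
#!/bin/sh
# quarantine_non_ascii SRC DEST (hypothetical helper): move every regular file
# under SRC whose name contains a byte outside printable ASCII into DEST,
# recreating the file's directory path relative to SRC so that same-named
# files from different directories do not overwrite each other.
# Assumption: DEST is an absolute path located outside SRC.
quarantine_non_ascii() (
    cd "$1" || exit 1
    LC_ALL=C find . -type f -name '*[! -~]*' -exec sh -c '
        dest=$1; shift
        for f in "$@"; do
            # f looks like ./d1/name, so dirname yields the relative directory
            mkdir -p "$dest/$(dirname "$f")" && mv "$f" "$dest/$f"
        done
    ' sh "$2" {} +
)
```

Usage would look like: quarantine_non_ascii /path/to/repository /tmp/quarantine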

Testing for mv:

$ 
$ cat -n processfiles.pl
     1  #!/usr/bin/perl -w
     2  # Usage: perl processfiles.pl "<full_path_till_root_directory>"
     3                                                                 
     4  use File::Find;                                                
     5  @ARGV = qw(.) unless @ARGV;                                    
     6  find sub { $x = $File::Find::name;
     7             $x=~s/[\w.\/-]//g;
     8             if ($x ne "") {
     9               print "File: ",$File::Find::name," will be quarantined.\n" if $x ne "";
    10               `mv "$File::Find::name" /tmp`;
    11  #             `zip -gmT "$ARGV[0]/badlynamedfiles" "$File::Find::name" 1>/dev/null 2>&1`;
    12               print "Done...\n================================\n";
    13             }
    14           }, @ARGV;
    15
$
$ pwd
/home/r2d2/data/unixstuff/d02
$
$ perl processfiles.pl "/home/r2d2/data/unixstuff/d02"
File: /home/r2d2/data/unixstuff/d02/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d2/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d2/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d1/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d1/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
$
$ ls -1 /tmp/*.bmp
/tmp/1914OdiN_Presenta_-_o_?.bmp
/tmp/1914OdiN_Presenta_-_o_?.bmp
/tmp/1914OdiN_Presenta_-_o_?.bmp
/tmp/1914OdiN_Presenta_-_o_?.bmp
/tmp/1914OdiN_Presenta_-_o_?.bmp
/tmp/1914OdiN_Presenta_-_o_?.bmp
$
$

Testing for zip:

$ 
$ cat -n processfiles.pl
     1  #!/usr/bin/perl -w
     2  # Usage: perl processfiles.pl "<full_path_till_root_directory>"
     3                                                                 
     4  use File::Find;                                                
     5  @ARGV = qw(.) unless @ARGV;                                    
     6  find sub { $x = $File::Find::name;                             
     7             $x=~s/[\w.\/-]//g;                                  
     8             if ($x ne "") {                                     
     9               print "File: ",$File::Find::name," will be quarantined.\n" if $x ne "";
    10  #             `mv "$File::Find::name" /tmp`;                                        
    11               `zip -gmT "$ARGV[0]/badlynamedfiles" "$File::Find::name" 1>/dev/null 2>&1`;
    12               print "Done...\n================================\n";                       
    13             }                                                                            
    14           }, @ARGV;
    15
$
$ pwd
/home/r2d2/data/unixstuff/d02
$
$ perl processfiles.pl "/home/r2d2/data/unixstuff/d02"
File: /home/r2d2/data/unixstuff/d02/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d2/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d2/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d1/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
File: /home/r2d2/data/unixstuff/d02/d1/1914OdiN_Presenta_-_o_.bmp will be quarantined.
Done...
================================
$
$ zip -T badlynamedfiles.zip
test of badlynamedfiles.zip OK
$
$ unzip -l *.zip
Archive:  badlynamedfiles.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
        0  11-20-09 21:55   home/r2d2/data/unixstuff/d02/1914OdiN_Presenta_-_o_.bmp
        0  11-20-09 21:55   home/r2d2/data/unixstuff/d02/1914OdiN_Presenta_-_o_.bmp
        0  11-20-09 21:55   home/r2d2/data/unixstuff/d02/d2/1914OdiN_Presenta_-_o_.bmp
        0  11-20-09 21:55   home/r2d2/data/unixstuff/d02/d2/1914OdiN_Presenta_-_o_.bmp
        0  11-20-09 21:55   home/r2d2/data/unixstuff/d02/d1/1914OdiN_Presenta_-_o_.bmp
        0  11-20-09 21:55   home/r2d2/data/unixstuff/d02/d1/1914OdiN_Presenta_-_o_.bmp
 --------                   -------
        0                   6 files
$
$

Hope that helps,
tyler_durden