How to manage file names with special characters

One of the common questions asked is: how do I remove/move/rename files with special (non-printable) characters in their name?

"Special" doesn't always mean the same thing. As there are more and less special characters, several solutions are presented, ranging from simple to very complicated. Usually a method for a "more special" case is also suitable for a "less special" one, but unnecessarily complicated to carry out. We will start with the easiest and least special specialities and go on from there until reaching the peak of specialdom, so to say...

In most cases ways to remove the files in question will be given, but you can adapt the method quickly (usually by simply exchanging "rm" for another command) to manage the file in a different way.

1) The most common uncommoners: space, tab and dash

The easiest class of uncommon characters is simple white space: spaces and tabs. While these are special to the shell (it splits words at them), they are easily protected from it: just quote the filename containing them:

user@host$ rm "file with spaces in its name"

In scripts you should do this routinely to add a modicum of runtime security to them.
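The following is a minimal sketch of what routine quoting looks like in a script. The variable names and the scratch directory are made up for illustration; the point is only that the quoted variable reaches "rm" as one single argument.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

name="file with spaces in its name"
touch "$name"

# Unquoted, $name would be split into six separate arguments;
# quoted, it stays one argument and exactly this file is removed.
rm "$name"

remaining=$(ls -A | wc -l)
echo "files left: $remaining"
```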

The dash ("-") has a special meaning, not to the shell itself but to almost every command, because it introduces options. Therefore rm -file (DON'T TRY THIS!) will not remove a file called "-file", but call "rm" with the options "-f" (force), "-i" (interactive), "-l" (no legal option) and "-e" (print a message after each deleted file). Use a pathname, which makes the filename part unambiguous in this case. It will not hurt to quote the filename too:

user@host$ rm "./-file"
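Most commands that follow the usual option conventions also accept "--" as an end-of-options marker, which is a second way to get the same result. A small sketch, assuming your "rm" honours "--" (the file names and the scratch directory are only examples):

```shell
# Scratch directory with two files whose names start with a dash.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch ./-file ./-other

# "--" marks the end of options; everything after it is a file name.
rm -- -file

# The pathname form works as well.
rm ./-other

left=$(ls -A | wc -l)
echo "files left: $left"
```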

2) Quite uncommon but still not rare: the unprintables

This class of characters is hard to print and usually also hard to enter: some of them simply have no visual representation, and none of them has a key of its own on the keyboard. An example is ALT-255, which looks like a space character (but isn't).
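Before picking a method it helps to see which bytes the name really contains. A minimal sketch, assuming a standard "od" is available; the file with a carriage return in its name is created here only as a test object:

```shell
# Scratch directory containing one file with a carriage return in its name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch "$(printf 'file\rname')"

# od -c prints every byte of the listing; unprintable ones
# show up as escapes (\r here) or as octal numbers.
listing=$(ls | od -c)
echo "$listing"
```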

2.a) Enter the character literally
One way to deal with file names containing these is to use the method of entering characters literally. Supposing you use Korn shell (ksh) enter:

set -o vi

to activate vi-style command line editing mode. Now you can enter even characters with a special meaning to the terminal uninterpreted by pressing "<CTRL>-<V>" first. (This works inside vi too, so you can test the mechanism there before using it directly at the command line.) Try it with <ENTER>: you will see "^M", which is the visual representation of <ENTER>. Note that "<CTRL>-<V>" is valid only for the very next character you enter; if you want to enter another special character you have to press "<CTRL>-<V>" again.

user@host$ set -o vi
user@host$ rm 'file^Mname'

Of course you have to quote the filename again in the strictest form (single quotes) to avoid having it interpreted by the shell when sending the command.
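Inside a script, where nobody is there to press "<CTRL>-<V>", the character can be generated instead of typed. A sketch under the assumption that the offending character is a carriage return (the "^M" from above) and that "printf" understands the "\r" escape, which any POSIX printf should:

```shell
# Scratch directory so the example is harmless.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build the name "file<CR>name"; \r is the carriage return shown as ^M.
name=$(printf 'file\rname')
touch "$name"

# Quoted, the variable is passed to rm untouched by the shell.
rm "$name"

left=$(ls -A | wc -l)
echo "files left: $left"
```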

2.b) Use wildcards
In some cases (not in the above mentioned "file^Mname", though) you can use wildcards to avoid having to enter the problematic character. This works only if the character is merely unprintable and has no special meaning to either the shell or the terminal. For instance, the character "<ALT>-255", which will look like a space on most terminals, can be avoided this way. Notice that in the example there is NOT a space between "e" and "n", but such a character:

user@host$ ls
file name
user@host$ echo file*name
file name
user@host$ rm file*name

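Make sure, of course, that there are no other files matching the wildcard! This is why I used "echo" before switching to "rm". Note also that the wildcard must NOT be quoted, otherwise the shell will not expand it. Always check first (with "echo" or something to this effect) to avoid loss of data.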
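The "look first, delete afterwards" habit can also be put into a small helper. This is only a sketch: the function name "remove_if_unique" is made up, and a control character (octal 001) stands in for the unprintable character, because it is easy to generate portably.

```shell
# remove_if_unique: delete only when the pattern expanded to exactly
# one existing name; otherwise refuse and touch nothing.
remove_if_unique() {
    if [ "$#" -eq 1 ] && [ -e "$1" ]; then
        rm -- "$1"
    else
        echo "refusing: $# names matched" >&2
        return 1
    fi
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
odd=$(printf 'file\001name')      # name with an unprintable character
touch "$odd" file2name            # plus an innocent bystander

# The broad pattern matches both files, so nothing is deleted.
remove_if_unique file*name && first=ok || first=refused

# A narrower pattern (any one character that is not a digit) matches only the odd file.
remove_if_unique file[!0-9]name && second=ok || second=refused

echo "broad pattern: $first, narrow pattern: $second"
```

The additional "-e" test catches the case where the pattern matches nothing at all and the shell therefore hands it over unexpanded.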

3) The Beauty of the Beasts!

There is a class of characters which simply cannot be entered properly. If you cannot enter the character, you cannot enter the file's name, and therefore all the methods described above fall short. It is still possible to reference such a file through its inode. Every file has an inode, where the meta-information about it is stored: modification date, size, ownership, etc. (The name itself is stored in the directory entry, which points to the inode.)

Every inode has a number which is unique within its file system, and it is easy to display it: use "ls" with the "-i" switch.

user@host $ ls -i
65649 file?name

We can now use this information to address the file with the "find" utility using the "-inum" switch and its "-exec" switch to invoke "rm":

user@host $ ls -i
65649 file?name
user@host $ find . -inum 65649 -exec rm {} \;

The same works with directories, but you will have to use "rm -r" instead. Be extremely careful about using this, because even though the directory has a malformed name there might be files of lasting value in it!
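If the file (or the content of the directory) is worth keeping, the same mechanism can rename instead of delete. A sketch, assuming your "find" supports "-inum" as used above; the new name "file_name" and the scratch directory are only examples, and "-xdev" is added because inode numbers are unique only within one file system:

```shell
# Scratch directory with one badly named file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch "$(printf 'file\rname')"

# The first field of "ls -i" is the inode number (only one file here).
inum=$(ls -i | awk 'NR==1 {print $1}')

# -xdev keeps find from descending into other file systems.
find . -xdev -inum "$inum" -exec mv {} ./file_name \;

ls
```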

4) World Class, the Most Special of all Specials: "/"

Finally, we are entering the realm of the absolute peak in specialdom: the slash. Because it has a special meaning to the file system itself it can't be masked, avoided or otherwise circumvented in any way. Every file managing utility using this name will fail because the file system driver will treat "na/me" as filename "me" in directory "na". There is no way around this and in fact the slash as a delimiter for directory names is built into the very kernel.
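You can watch this behaviour from the harmless side by trying to create such a name. In this sketch (run in a scratch directory, names as in the text) even the strictest quoting makes no difference, because the splitting at "/" does not happen in the shell:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Quoting does not help: the path is split at "/" below the shell,
# and there is no directory "na" yet.
if touch 'na/me' 2>/dev/null; then
    outcome="created"
else
    outcome="failed"
fi
echo "first attempt: $outcome"

# With the directory in place the very same command succeeds,
# but it creates a file "me" inside "na", not a file "na/me".
mkdir na
touch 'na/me'
ls na
```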

It might be possible to delete the directory above it, but even this (depending on your system) will probably fail, because the "unlink()" system call will still be confused when it tries to unlink the file in question.

The easiest solution is to get an ancient Mac with a pre-Unix MacOS (which has most probably produced this mess, because with Unix methods you can't get that deep into it) and use this to change the name. Unmount the NFS share before doing this and remount it only after the successful completion.

Another way, which will work most of the time, is:

If you have it, you can use "fsdb", or you can use "clri" to clear the inode of the file in question. Run an "fsck" after this and then either pull the file out of "lost+found" (not guaranteed to work, because clearing the inode destroyed it) or delete its remnants there.

What surely will work but requires the most effort on your part is:

  • identify the inode of the file
  • identify the location of the directory entry referring to this inode on the raw device (the name is stored there, not in the inode itself)
  • unmount the FS
  • using "dd" or some similar low-level tool (hex editor, ...) patch the file name in the directory entry directly to something manageable
  • do a "fsck" and mount again

How to implement this on your system varies, because file systems, their inner workings and their layout differ from implementation to implementation.

bakunin

