Unrecognized character in filename (linux)

07
2014-07
  • smc

    I have a problem accessing a file in Linux Mint. The cause appears to be unrecognized character(s) in the filename, but none of the techniques I know has let me rename it.

    Here are the details. The filename is something like:

    êà_0_4àíòè_0_7-_0_7_0_8ó_0_1à333333.mp3
    

    or at least this is how my file manager and terminal display it.

    I cannot open the file with any program I use under Linux Mint (media players, etc.).

    It cannot be renamed, moved or copied via the file manager. All of these operations produce error messages similar to these:

    (for rename):

    Error renaming file: No such file or directory.
    

    (for copy/move):

    No such file or directory.
    

    I have also tried copying it from the terminal using wildcards. The shell expands the wildcard to the right file name, but the copy fails. Here is the output:

    cp *0_7-_* 1.mp3
    cp: cannot open `êà_0_4àíòè_0_7-_0_7_0_8ó_0_1à333333.mp3' for reading: No such file or directory
    

    I have also tried the mv command:

    mv *0_7-_* 1.mp3
    mv: cannot move `êà_0_4àíòè_0_7-_0_7_0_8ó_0_1à333333.mp3' to `1.mp3': No such file or directory
    

    If I try sudo rename, I get:

    Unrecognized character \xC3; marked by <-- HERE after <-- HERE near column 1 at (eval 1) line 1.
    

    The file itself is a valid MP3 file. It can be opened by Windows Media Player under XP.

    The problem is: I have a big music library (over 100 GB) and there are a few dozen similar files with invalid characters in their names. I don't want to lose these files, and I would like to figure out how to handle such situations in future (in Linux preferably, because I don't own a PC that runs Windows).

    Any help will be appreciated.

    UPDATE: as requested by terdon, here is the output of locale:

    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=
    

    UPDATE 2: I have just checked on my friend's XP machine, and I can confirm the following findings. The original file can be played by Windows Media Player but not by Winamp. However, after accessing and renaming it through the file manager, it is played by both players.

    So I conclude that this is a problem with an unrecognized character. I am still interested in a solution under Linux, though.

  • Answers
  • Bill McCloskey

    Four things:

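    Try the complement of the usual approach: move everything else out of the folder, then remove the folder, problem file and all. Note that this deletes the file, so it only helps if you can afford to lose it.

    mkdir ../everything_else
    mv problematic/folder/path/* ../everything_else
    sudo rm -rf problematic/folder/path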
    

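    Make sure there is no ACL on the file, and remove any that may be causing problems. (The ls -le and chmod -a commands shown here are the macOS forms; on Linux the rough equivalents are getfacl and setfacl -b.)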

    $ /bin/ls -le problematic/folder/path
    total 16
    -rw-r--r--+ 1 whmcclos  staff  1918 Dec 18 09:00 README
    0: user:_spotlight inherited allow read,execute,readattr,readextattr,readsecurity
    $ chmod -a "..."
    

    Try a Perl script to smooth out OS/FS naming dependencies?

    For instance, something along the lines of this code fragment, which keeps the filename in Perl's default variable $_ so that you never have to type it:

    $ mkdir fred; cd fred; touch a b c d e f
    $ cat > try.pl
    #!/usr/bin/perl
    use strict; use warnings;
    opendir(my $dh, ".") or die "cannot open .: $!\n";
    my @files = readdir($dh);
    closedir($dh);
    foreach (@files) {
      next if /^\.{1,2}$/;                    # Skip only the "." and ".." entries
      print; print "? ";                      # Provide some level of control
      my $r = <STDIN> // ""; chomp($r);
      if ($r eq "y" or $r eq "Y") {           # Delete the file, reporting any failure
        unlink($_) or warn "cannot unlink: $!\n";
      }
    }
    ^D
    $ /usr/bin/perl try.pl
    a? 
    b? 
    c? y
    d? 
    e? 
    f? 
    try.pl? 
    $ ls
    a       b       d       e       f       try.pl
    

    [Updated to reflect the author's original intent: copy the beast to a well-behaved file(name).]

    To see if Perl can copy the misbehaved file to a well-behaved one, I'd use the Perl script below. It follows the same idea as above: let Perl access the filename "anonymously".

    (Also, the ACL should really be checked; as I recall, an ACL can even prevent root from accessing the file in the normal way.)

    Here's the Perl script:

    $ cat try3.pl
    #!/usr/bin/perl
    # A code fragment to ask to copy a displayed file to $TO; chg $TO on next line:
    use strict; use warnings;
    my $TO = "my_new_behaved_filename";  # This is the name that will be copied to
    opendir(my $dh, ".") or die "cannot open .: $!\n";
    my @files = readdir($dh);
    closedir($dh);
    foreach my $f (@files) {
      next if $f =~ /^\.{1,2}$/;         # Skip only the directory entries "." & ".."
      print "$f? ";                      # Provide some level of control; "y" or "Y"
      my $r = <STDIN> // ""; chomp($r);  #   copies the displayed filename to $TO
      if ($r eq "y" or $r eq "Y") {
        print "copying it to $TO...\n";
        # Three-argument open takes the name literally, whatever it contains:
        open(my $from, "<", $f)  or die "sorry - couldn't open it: $!\n";
        open(my $to,   ">", $TO) or die "couldn't create $TO: $!\n";
        binmode($from); binmode($to);    # MP3 data is binary
        while (read($from, my $buf, 16384)) {
          print {$to} $buf;
        }
        close($to) or die "error writing $TO: $!\n";
        close($from);
      }
    }
    

    Which you'd use as such, assuming try3.pl is the aforementioned Perl script:

    $ ./try3.pl 
    a? 
    b? 
    d? y
    copying it to my_new_behaved_filename...
    e? 
    f? 
    try.pl? 
    try3.pl? 
    

    Access the file through a symbolic or hard link and see what mileage you get.

    I'd use vi's filename [tab] expansion to try to "identify" to the shell what file you want to link to.

    I moved my ~/{.vim,.viminfo,.vimrc} aside so as to restrict things, like wildmenu input. You may want to do the same.

    Now, kick off the following command line sequence in the bogus file's containing folder:

    $ vi
    

    In vi, type precisely this sequence of characters

    !!ln -s [tab]
    

    So that key sequence was, in words: exclamation point, exclamation point, el, en, blank, dash, es, blank, tab. As soon as you hit the [tab] key, the first file of the current working directory should display after the blank character (following the "s") on vi's status line. Hit the [tab] key numerous times to cycle through to the bogus filename. When the bogus filename appears, press the space bar to add a space, and then type a new file/link-name.

    The result on vi's status line should look something like this

    !!ln -s bogus_filename new_sym_link_name
    

    Hit the return key to run the ln command and see whether it creates the link, new_sym_link_name. (vi spawns the command to replace the current, empty line of its buffer with the command's output. We don't care about that here; we only want the side effect of running a shell command with tab expansion.)

    Exit vi with :q![return], and see if you can access your file through the symbolic link.

    (You could also try a hard link by leaving off the -s in the above ln command.)
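
    The vi tab-completion trick is one way of letting a program fill in the name for you. A minimal shell sketch of the same idea, assuming a POSIX shell and GNU coreutils, is to let a glob supply the name and create a hard link to it. The scratch directory, the sample name (which deliberately contains a byte that is not valid UTF-8) and the link name clean.mp3 are all made up for illustration. If the kernel itself reports "No such file or directory" for the name, as in the question, the link will most likely fail the same way.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical awkward name: byte 0xEA (octal 352) is not valid UTF-8 on its own.
odd_name=$(printf 'bad_\352_name.mp3')
printf 'fake mp3 data\n' > "./$odd_name"

# Let the shell expand the name instead of typing it. The "./" prefix
# and "--" guard against names that start with a dash.
for f in ./bad_*_name.mp3; do
    ln -- "$f" clean.mp3      # hard link: a second name for the same inode
done

# Both names now show the same inode number in the first column.
ls -li
```

    If the link is created, the data can be read through clean.mp3 and the awkward name can be removed later.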

    Because I just noticed that you could change the filename through Windows, I'm thinking that somehow, a carriage-return linefeed sequence got into the filename, and that is confusing the various approaches.
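
    If you suspect a control character such as a carriage return in the name, it helps to look at the raw bytes rather than at what the terminal renders. A rough sketch, assuming GNU ls (for -b) and a POSIX shell; the scratch directory and the file name (song.mp3 followed by a carriage return) are invented for the demonstration:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a hypothetical file whose name ends in a carriage return (\r).
cr=$(printf '\r')
printf 'data\n' > "./song.mp3$cr"

# -b prints non-graphic characters as C-style escapes, so the stray
# byte becomes visible as \r.
ls -b

# od -c shows the exact bytes of every name in the directory.
for f in ./*; do
    printf '%s' "$f" | od -c
done

# Strip carriage returns from any name that contains one.
for f in ./*"$cr"*; do
    [ -e "$f" ] || continue                  # the glob matched nothing
    clean=$(printf '%s' "$f" | tr -d '\r')
    mv -- "$f" "$clean"
done
```

    After the loop the file is reachable under the plain name song.mp3.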

  • Duijf

    Use ls -li to get the inode number of the file in question. Make a note of the inode number (first field). I'll use 123456 as an example.

    Then, use find to remove the file:

    find . -inum 123456 -exec rm {} \;
    

    or to rename it:

    find . -inum 123456 -exec mv {} some_better_filename \;
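
    The two steps can be combined into one short script. This is a sketch assuming GNU stat (for -c %i) and GNU find; the scratch directory, the sample name and the new name renamed.mp3 are placeholders. -maxdepth 1 keeps find in the current directory, and -xdev keeps it on the current filesystem, since inode numbers are only unique within one filesystem.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical file whose name contains a byte (0xEA) that is not valid UTF-8.
odd_name=$(printf 'track_\352.mp3')
printf 'data\n' > "./$odd_name"

# Read the inode number through a glob, so the name is never typed.
# stat -c %i is the GNU form; `ls -li` shows the same number in its first field.
for f in ./track_*.mp3; do
    inode=$(stat -c %i -- "$f")
done
echo "inode: $inode"

# Rename by inode number rather than by name.
find . -maxdepth 1 -xdev -inum "$inode" -exec mv -- {} ./renamed.mp3 \;

ls -li
```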
    

  • Related Question

    linux - Rsync Character set problems
  • Nerdfest

    I'm attempting to back up a Windows box to a Linux box (Ubuntu 9.10) using rsync on the Linux box, and I get "file has vanished" errors for filenames with unusual characters. I get a similar error ("no such file or directory") if I use "cp" instead of rsync. The source is a share on an English-language Windows box.

    One of the characters is the apostrophe character.

    I've been playing around with various --iconv options but haven't been able to solve the problem. Suggestions?


  • Related Answers
  • quack quixote

    You're mounting the share from Windows on Linux, then using rsync to copy files locally. How do you mount the share?

    Windows should be storing filenames in UTF8 or UTF16, but you need to tell Linux that so it can mount the share correctly. Use a mount option like utf8/utf16, or iocharset=utf8/iocharset=utf16 in your mount command:

    mount -t cifs -o utf16,other,options,here //server/share /path/to/mount/point
                  ^^^^^^^^
                       |
                       -- if utf16 doesn't help, try iocharset=utf16
                          utf8 or iocharset=utf8 may also work
    

    Other users are indicating that UTF16 is more likely to be correct.
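
    To see why the charset setting matters, it can help to decode the same bytes under two different assumptions. The sketch below assumes the iconv command-line tool (as shipped with glibc) is installed. It also assumes that the first two characters of the name in the question at the top of this page (êà) are the bytes 0xEA 0xE0; that is a guess from how the name is displayed, not something the poster confirmed.

```shell
#!/usr/bin/env bash
# The two bytes 0xEA 0xE0 (octal 352 340), produced with printf.

# Decoded as ISO-8859-1 (Latin-1) they are accented Latin letters.
as_latin1=$(printf '\352\340' | iconv -f ISO-8859-1 -t UTF-8)

# Decoded as Windows-1251 (a Cyrillic code page) they are Cyrillic letters.
as_cp1251=$(printf '\352\340' | iconv -f CP1251 -t UTF-8)

echo "ISO-8859-1: $as_latin1"
echo "CP1251:     $as_cp1251"
```

    The first line prints êà and the second prints ка: the same bytes, read with two different character sets. A name that is stored in one charset and looked up in another can therefore be listed and still not be found.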

  • nik

    One way to get around this: for the few directories that contain filenames with special characters, zip or tar the directory, then run rsync with an exclusion for that directory so that the zip/tar file is transferred in its place.