linux - How to unzip the Portuguese file

2014-07
  • ASingh

    I have a zip file containing data in Portuguese. I unzip the file like this:

    $ unzip abc.zip
    

    It creates a file "abc.csv", but when I look at the data I get "?" instead of characters like "á". The locale settings in my shell look like this:

    $ locale
    LANG=pt_BR.UTF-8
    LC_CTYPE="pt_BR.UTF-8"
    LC_NUMERIC="pt_BR.UTF-8"
    LC_TIME="pt_BR.UTF-8"
    LC_COLLATE="pt_BR.UTF-8"
    LC_MONETARY="pt_BR.UTF-8"
    LC_MESSAGES="pt_BR.UTF-8"
    LC_PAPER="pt_BR.UTF-8"
    LC_NAME="pt_BR.UTF-8"
    LC_ADDRESS="pt_BR.UTF-8"
    LC_TELEPHONE="pt_BR.UTF-8"
    LC_MEASUREMENT="pt_BR.UTF-8"
    LC_IDENTIFICATION="pt_BR.UTF-8"
    LC_ALL=
    

    I would appreciate any help with this.

  • Answers
  • asamarin

    Most probably the culprit is your editor, which does not recognize the file's encoding (ISO-8859-1 or UTF-8). The iconv command comes in handy in these situations. Since I don't know which encoding the file was originally written in, try converting the CSV file both ways (UTF-8 -> ISO-8859-1 and ISO-8859-1 -> UTF-8) and check whether at least one of the newly created files reads correctly afterwards:

    $ iconv -f UTF-8 -t ISO-8859-1 abc.csv > abc-latin1.csv

    $ iconv -f ISO-8859-1 -t UTF-8 abc.csv > abc-utf8.csv
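
    Rather than trying both conversions by eye, the check can be scripted. The following is only a sketch: the function names is_utf8 and to_utf8 are made up for illustration, and it assumes that a file which is not valid UTF-8 is ISO-8859-1, which may not hold for your data (it could be Windows-1252 or something else). It relies on iconv exiting with a non-zero status when the input contains a byte sequence that is invalid in the source encoding.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file decodes cleanly as UTF-8.
# iconv returns a non-zero exit status on an invalid input sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Hypothetical helper: write a UTF-8 version of SRC to DEST.
# ASSUMPTION: anything that is not valid UTF-8 is treated as ISO-8859-1.
to_utf8() {
    src=$1
    dest=$2
    if is_utf8 "$src"; then
        # Already UTF-8 (plain ASCII also counts): copy unchanged.
        cp "$src" "$dest"
    else
        iconv -f ISO-8859-1 -t UTF-8 "$src" > "$dest"
    fi
}

# Usage (not run here): to_utf8 abc.csv abc-utf8.csv
```

    If the result still shows "?" where "á" should be, the bytes were probably already replaced before the file was zipped, and no conversion can bring them back.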


  • Related Question

    ubuntu - shell character encoding seems to have been altered
  • Rory Fitzpatrick

    I'm having a problem with my shell on an Ubuntu virtual machine that I ssh into. It was working fine until I output the contents of a JPEG file directly. Since then the character encoding seems to have been messed up, and I have no idea how to fix it. Characters now look like this:

    ÆsudoÅ password for rory: 
    

    I'm not sure if it's relevant, but the output of locale is:

    LANG=en_GB.UTF-8
    LC_CTYPE="en_GB.UTF-8"
    LC_NUMERIC="en_GB.UTF-8"
    LC_TIME="en_GB.UTF-8"
    LC_COLLATE="en_GB.UTF-8"
    LC_MONETARY="en_GB.UTF-8"
    LC_MESSAGES="en_GB.UTF-8"
    LC_PAPER="en_GB.UTF-8"
    LC_NAME="en_GB.UTF-8"
    LC_ADDRESS="en_GB.UTF-8"
    LC_TELEPHONE="en_GB.UTF-8"
    LC_MEASUREMENT="en_GB.UTF-8"
    LC_IDENTIFICATION="en_GB.UTF-8"
    LC_ALL=
    

    I've tried rebooting, with no effect. Any hints on how to solve this?


  • Related Answers
  • Dennis Williamson

    The problem is probably in your local terminal rather than the remote system. Try the reset command within that terminal (from any system that has it) or close and re-open it.
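
    As a rough sketch of the above, a small function run in the affected local terminal could try reset and fall back to the raw escape sequence when reset is not installed. The names emit_ris and restore_terminal are invented for this example, and the fallback assumes a VT100-compatible terminal emulator, where ESC c means "reset to initial state". Binary output such as a JPEG can contain escape sequences that switch the terminal's character set, and a full reset is one way to undo that.

```shell
#!/bin/sh
# Hypothetical helper: print the VT100 "reset to initial state" sequence (ESC c).
# ASSUMPTION: the terminal emulator honours this sequence.
emit_ris() {
    printf '\033c'
}

# Hypothetical helper: prefer the reset command, fall back to the raw sequence.
restore_terminal() {
    if command -v reset >/dev/null 2>&1; then
        reset
    else
        emit_ris
    fi
}

# Usage (not run here; call it from the garbled terminal): restore_terminal
```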