Unable to directly input unicode characters with diacritical marks into PuTTY on Windows 8

2014-07-08
  • Michał Rus

    After moving to Windows 8, I can no longer type Unicode characters such as ą, ę, ć, ń directly into a PuTTY session window using Alt+<letter> with the Polish (programmers) keyboard layout.

    • I have Window -> Translation -> Remote character set set to UTF-8.

    • Typing directly using the physical keyboard connected to the server works.

    • Strangely, pasting text containing these letters into PuTTY works, too.

    • The server is using UTF-8. Here, ąęółśćżźń is being pasted:

      m@debian:~$ echo ąęółśćżźń > x ; file x
      x: UTF-8 Unicode text
      m@debian:~$
      
    • Pressing e.g. Alt+x, which normally produces ź, yields a plain Latin z in the PuTTY window. Here, żźżźżź is being pasted:

      m@debian:~$ echo żźżźżź | md5sum
      1ff31403a1089c590ed55d42cdcd0f3e  -
      m@debian:~$
      

      Here, żźżźżź is being typed:

      m@debian:~$ echo zzzzzz | md5sum
      cd519e63e450d863e5ee02814bae016d  -
      m@debian:~$
      

      And here, a plain zzzzzz is being typed:

      m@debian:~$ echo zzzzzz | md5sum
      cd519e63e450d863e5ee02814bae016d  -
      m@debian:~$
      

      Same sum.

    • The only letter with a diacritic that can be typed is ó (which is also present in the latin1 charset).

    • The exact same executable works on Windows 7.
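    The pasted-versus-typed comparison above can also be reproduced away from the server. This is a minimal sketch in Python, assuming that `echo` appends a single newline and that the terminal sends UTF-8; the helper name `digest_like_echo` is made up for illustration.

```python
import hashlib


def digest_like_echo(text: str) -> str:
    """MD5 of text plus a trailing newline, encoded as UTF-8.

    Assumption: this mirrors `echo text | md5sum` on a UTF-8 terminal.
    """
    return hashlib.md5((text + "\n").encode("utf-8")).hexdigest()


pasted = "żźżźżź"  # what arrives when the text is pasted
typed = "zzzzzz"   # what arrives when the same keys are typed

# Each of ż and ź takes two bytes in UTF-8, so the payloads differ in length.
print(len(pasted.encode("utf-8")), len(typed.encode("utf-8")))  # 12 6

# Different bytes, hence different checksums.
print(digest_like_echo(pasted) == digest_like_echo(typed))  # False
```

    Identical sums for "typed żźżźżź" and "typed zzzzzz" therefore mean the diacritics were already gone before the bytes reached the shell.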

    My guess is that Windows 8 somehow decides that PuTTY cannot process typed (?) non-latin1 characters and replaces them on the fly with their latin1 counterparts.
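    The observation that only ó survives is consistent with this guess. The following Python sketch checks which of the nine letters can be represented in a few single-byte encodings. It assumes (this is not confirmed in the thread) that typed input passes through a legacy code page at some point; cp1252 and cp1250 are the Windows Western European and Central European code pages, and the helper name `encodable` is invented here.

```python
POLISH_LETTERS = "ąęółśćżźń"


def encodable(ch: str, codec: str) -> bool:
    """Return True if the character has a representation in the given codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for codec in ("latin-1", "cp1252", "cp1250"):
    representable = "".join(ch for ch in POLISH_LETTERS if encodable(ch, codec))
    print(f"{codec}: {representable}")

# latin-1: ó
# cp1252: ó
# cp1250: ąęółśćżźń
```

    Under that assumption, a Western code page can carry only ó, while a Central European one can carry all nine letters.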

    What can be done?

  • Answers
  • Michał Rus

    Setting "Language for non-Unicode programs" as suggested in http://superuser.com/a/497880/214569 helped.


  • Related Question

    Word doesn't convert non-Unicode characters as expected
  • Questioner

    Our users are experiencing a very discouraging issue with how MS Word (on Windows) handles non-Unicode characters. The issue is confirmed in both Word 2007 and the Word 2010 Beta on Windows XP SP3; I suspect Word 2003 behaves the same way.

    Issue:

    1. A user creates a document using a non-unicode font, entering characters to represent scientific notations. For example, he enters a Mu (µ). Note: I pasted in a unicode-compliant Mu for reference.
    2. The user opens his document and attempts to copy / paste this non-unicode character representing a Mu into a web browser for entry into our system. It pastes as an unrecognized character. This is expected.
    3. The user opens his document, selects the non-unicode character and adjusts its font to "Arial Unicode MS," saving the document. He closes / re-opens the document for good measure. Once re-opened, he copies what should be a unicode Mu and pastes it into the web browser. It is still represented as an unrecognized character.
    4. The user creates a new document, sets the font to "Arial Unicode MS" and creates a Mu. He copies this Mu into the web browser and it pastes over in Unicode, as expected.

    Conclusion:

    Word does not actually convert non-Unicode characters into Unicode characters when a Unicode font is selected, although it should. Instead, it makes a best guess for display purposes but performs no actual conversion.

    How do I overcome this problem?

    • Can I change some setting in Word to force a conversion? Preferable.
    • Is there a "cleaner" app or Word macro that will do this?
    • Other solutions?

    Additional Notes:

    • Re-typing the affected documents using unicode is not an option
    • This is not an issue in Mac OS X using the most recent version of Word. A sample case such as in (3) results in a unicode Mu being pasted into the browser.

    Please help!


  • Related Answers
  • Mark Ransom

    Try using Paste Special; there should be an option for Unicode text.

    Note that if the source document was created with a Symbol font, this won't help. Windows doesn't really know that the character is related to a specific Unicode character. The symbol fonts were created before Unicode to meet a need, and the two aren't interchangeable.
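    To illustrate the Symbol-font case: the document stores an ordinary character code (for example the letter m), and only the font makes it look like a Mu. Below is a minimal Python sketch of a lookup-based conversion. It assumes the text runs and their font have already been extracted from the document, that the font uses the standard Symbol encoding, and that the characters may be stored in the Private Use Area at U+F000 plus the font code, as is commonly reported for Word. The table is deliberately partial, and all names are hypothetical.

```python
# Partial, illustrative table: character code in the Symbol font -> Unicode.
SYMBOL_TO_UNICODE = {
    "a": "\u03b1",  # Greek small letter alpha
    "b": "\u03b2",  # Greek small letter beta
    "m": "\u03bc",  # Greek small letter mu
    "p": "\u03c0",  # Greek small letter pi
    "D": "\u0394",  # Greek capital letter delta
    "W": "\u03a9",  # Greek capital letter omega
}

# Assumption: symbol-font characters may appear as U+F000 + font code.
PUA_OFFSET = 0xF000


def symbol_run_to_unicode(run: str) -> str:
    """Convert a run of text known to be set in the Symbol font."""
    out = []
    for ch in run:
        code = ord(ch)
        # Undo the Private Use Area offset before the table lookup.
        base = chr(code - PUA_OFFSET) if 0xF020 <= code <= 0xF0FF else ch
        # Characters missing from the partial table are left untouched.
        out.append(SYMBOL_TO_UNICODE.get(base, ch))
    return "".join(out)


print(symbol_run_to_unicode("m"))       # μ
print(symbol_run_to_unicode("\uf06d"))  # μ
```

    The conversion must be applied only to runs that really use the symbol font; applied to ordinary text it would turn every m into a Mu.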

  • FarhanN

    It is a lengthy process, but I normally convert such files into images and then run those images through OCR software. That helps, but I am still looking for a better option myself.

  • Kirby

    Thanks Mark. You're right that Paste Special won't work for non-Unicode text. Do you know of a utility that will "clean" non-Unicode documents? For example, one that loads a document containing non-Unicode Mu symbols and converts them into Unicode-compliant Mus? Is there a macro or plugin for Word that will do this?

    Best regards, Kirby