unicode - Encoding Issue in Notepad and ePub Files
2014-07
Whenever I open a text file in Notepad or an EPUB file in Mobipocket Reader, all the apostrophes appear as Chinese characters. In the .epub files, many of the capital letters are also replaced by Chinese characters.
Any ideas what the cause of this is and how to fix it?
Thanks.
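One possible cause, offered as a guess rather than a diagnosis: the files contain "curly" apostrophes saved in a single-byte encoding such as Windows-1252, and the program reading them assumes a double-byte Chinese code page such as GBK. In that case the apostrophe byte and the letter after it get merged into one Chinese character. The Python sketch below reproduces the effect and rewrites a file as UTF-8 with a byte order mark, which Notepad recognises. The two encodings and the helper names `show_mojibake` and `reencode_to_utf8` are assumptions for illustration; your files may use something else.

```python
from pathlib import Path


def show_mojibake(text, written_as="cp1252", read_as="gbk"):
    """Encode text one way and decode it another, to reproduce the garbling.

    Both encoding names are assumptions; swap in whatever you suspect.
    """
    return text.encode(written_as).decode(read_as, errors="replace")


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Read src using the assumed source encoding and write dst as UTF-8 with a BOM."""
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8-sig")


if __name__ == "__main__":
    # The curly apostrophe (U+2019) is the single byte 0x92 in Windows-1252.
    print(show_mojibake("don\u2019t"))
```

If the printed result looks like what you see in Notepad, running `reencode_to_utf8("book.txt", "book-utf8.txt")` on a copy of the file (the file names are placeholders) is worth a try. If it does not, the real encodings are probably different from the ones assumed here.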
I have a bunch of book-length text files I'd really like to read on my EPUB reader (as it happens, FBReaderJ). What would be the best route to convert them?
I have access to Mac OS X and Linux (Ubuntu). I'd probably be happiest with a command line, but would settle for a GUI for batch conversion.
My criteria for success are really based upon the shortfalls I have found with Calibre:
- must do the whole book
- at least a guess of what the title/author may be. Minimum the source filename for the title.
- hygienic with files it uses - tidies up after itself (this is less important)
- doesn't try to be an all-in-one library manager (again, less important).
- is lenient in parsing special characters (e.g. < and & characters).
Happened upon this thread many moons later.
Just wanted to point out that there is a command-line tool Calibre uses for conversion. It's called (surprise, surprise) ebook-convert. See 'ebook-convert -h' or 'ebook-convert dummy.html .epub -h' for the options available when converting HTML to EPUB.
I haven't explored it though. I am most curious about --list-recipes (and whether it works); it looks like something interesting.
I'd say Calibre is for you; it works on Linux, Mac OS X, and Windows.
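For batch conversion, a small wrapper around ebook-convert might look like the Python sketch below. It is untested against any particular Calibre version and assumes ebook-convert is on your PATH. The `--title` and `--authors` options are ebook-convert metadata options; the helper names `build_command` and `convert_folder`, and the "Unknown" default author, are made up for this example. The title is simply guessed from the file name.

```python
import subprocess
from pathlib import Path


def build_command(txt_path, author="Unknown"):
    """Build the ebook-convert call for one text file.

    The title is guessed from the file name; the author default is a placeholder.
    """
    txt_path = Path(txt_path)
    epub_path = txt_path.with_suffix(".epub")
    title = txt_path.stem.replace("_", " ").strip()
    return [
        "ebook-convert", str(txt_path), str(epub_path),
        "--title", title,
        "--authors", author,
    ]


def convert_folder(folder):
    """Convert every .txt file in folder, stopping at the first failure."""
    for txt_path in sorted(Path(folder).glob("*.txt")):
        subprocess.run(build_command(txt_path), check=True)
```

Calling `convert_folder("books")` (the folder name is a placeholder) would write one .epub next to each .txt file. Whether the result does "the whole book" and copes with < and & still depends on Calibre's own text input handling.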
Input Formats: CBZ, CBR, CBC, EPUB, FB2, HTML, LIT, MOBI, ODT, PDF, PRC, PDB, PML, RB, RTF, TXT
Output Formats: EPUB, FB2, OEB, LIT, LRF, MOBI, PDB, PML, RB, PDF, TXT
On Mac OS X and Windows, I have had success with Stanza for Desktop.
It supports a good range of export formats.
More importantly, it copes very well with:
- detecting chapters in large text files.
- Unicode, including "significant" characters like < and &.
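For anyone who would rather script those two points, here is a rough Python sketch. This is not how Stanza does it; it is a naive approach that treats lines such as "Chapter 1" or "PART II" as headings and escapes < and & when producing HTML, which could then be fed to a converter. The function names and the heading pattern are assumptions for illustration.

```python
import html
import re

# Naive heading pattern: "Chapter 12", "PART IV", and so on.
# It will also match ordinary sentences that happen to start this way.
CHAPTER_RE = re.compile(r"^\s*(chapter|part)\s+(\d+|[ivxlcdm]+)\b.*$", re.IGNORECASE)


def split_chapters(text):
    """Return a list of (title, body) pairs; headings with no body are dropped."""
    chapters = []
    title, lines = "Front matter", []
    for line in text.splitlines():
        if CHAPTER_RE.match(line):
            if any(l.strip() for l in lines):
                chapters.append((title, "\n".join(lines).strip()))
            title, lines = line.strip(), []
        else:
            lines.append(line)
    if any(l.strip() for l in lines):
        chapters.append((title, "\n".join(lines).strip()))
    return chapters


def chapter_to_html(title, body):
    """Escape special characters and wrap blank-line-separated paragraphs in <p> tags."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]
    parts = ["<h2>%s</h2>" % html.escape(title)]
    parts += ["<p>%s</p>" % html.escape(p) for p in paragraphs]
    return "\n".join(parts)
```

Real books have far messier headings than this pattern expects, so treat it as a starting point only.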
There are online tools to convert to epub files.
Example of such a website here.
If you have a Mac OS X 10.6 machine, try this:
http://padilicious.com/epub/index.html
It relies on Automator.
You may want to try ODFToEPub. This is an OpenOffice extension that lets you export a document to ePub.
If you have access to a Windows system, you can try Atlantis Word Processor. It converts not only TXT but also DOC, DOCX, and ODT files to EPUB. Only a few mouse clicks are needed to convert a document to EPUB. Convenient batch conversion is also offered. You can find details here:
http://www.atlantiswordprocessor.com/en/help/index.php?page=html/ebook.htm
Or see the main page of its site (there is also a download link).
A bit off-topic:
Here's a nice website, titled EVERYTHING ABOUT READING ELECTRONIC BOOKS.
Well, not quite EVERYTHING, but very informative nevertheless :)
This being a Russian website, they're focused on some of these extraordinary Russian eBook reading programs such as ICE Book Reader Professional, CoolReader 2 (maybe not as sophisticated as ICE but free) and AlReader 2. None of these support the EPUB format though.
There's also a link to several ePub libraries, which might be of interest to you.