mysql - How can I get a mysqldump file encoded in utf-8 for psql?
2014-03
I am migrating some data from a MySQL database (v5.1.44/MyISAM/collation=latin1_swedish_ci) to a PostgreSQL database (v9.0.4, the one included in OS X Lion).
I'm using
$ mysqldump --compatible=postgresql > tmp.sql # output create/insert statements
$ psql --command='\i tmp.sql' # import to postgresql
However, the import fails with the error:
ERROR: invalid byte sequence for encoding "UTF8": 0xe97261
(The offending bytes correspond to accented letters.)
I think the issue is that the exported file is not encoded in UTF-8.
The file command reports the following for the exported file:
$ file tmp.sql
tmp.sql: Non-ISO extended-ASCII text, with very long lines
What's the easiest, scriptable way to get this file prepared in utf-8 for psql?
This does not work:
$ iconv -f ASCII -t UTF-8 tmp.sql > out.sql
iconv: tmp.sql:18:59270: cannot convert
I've found that opening the file in vim and issuing :set fenc=utf-8 does make the import run smoothly, but this has to be automated, so I need to cut out this manual step.
Try passing the character set you want to mysqldump (replace charset_name with utf8 to get UTF-8 output):
mysqldump --default-character-set=charset_name
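If the dump has already been written, re-encoding it with iconv is another scriptable option. The iconv call in the question fails because it declares the source as ASCII, while the file contains bytes such as 0xe9 that are not ASCII. The sketch below assumes the dump is ISO-8859-1 (suggested by the latin1_swedish_ci collation, but not confirmed); MySQL's latin1 is close to Windows-1252, so CP1252 may be the better source encoding for some data. The helper name to_utf8 and the sample file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: re-encode a dump file to UTF-8.
# Assumption: the input bytes are ISO-8859-1; use CP1252 instead if needed.
to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}

# Demo on a tiny stand-in for tmp.sql.
# \351 is octal for byte 0xe9, which is "é" in Latin-1, so the file holds
# the same bytes (0xe9 0x72 0x61) that psql rejected in the question.
workdir=$(mktemp -d)
printf 'op\351ra\n' > "$workdir/tmp.sql"
to_utf8 "$workdir/tmp.sql" "$workdir/out.sql"
```

After this step, out.sql can be imported with psql in place of tmp.sql.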
Possible Duplicate:
Batch-convert files for encoding or line ending
I have a bunch of text files that I'd like to convert from any given charset to UTF-8 encoding.
Are there any command line tools or Perl (or language of your choice) one liners I can use to do this en masse?
iconv converts between many character encodings. With a little bash magic, we can write:
for file in *.txt; do
    iconv -f ascii -t utf-8 "$file" -o "${file%.txt}.utf8.txt"
done
This will run iconv -f ascii -t utf-8 on every file ending in .txt, sending the recoded output to a file with the same name but ending in .utf8.txt instead of .txt.
This particular conversion would not actually change your files, because ASCII is a subset of UTF-8; it only answers your question about how to convert between encodings. For a real conversion, replace ascii with the encoding the files are actually in.
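As a variation on the loop above, here is a minimal sketch that takes the source encoding as a parameter. It assumes you already know that encoding, since iconv does not detect it. The function name convert_all and its arguments are hypothetical. Output goes through a shell redirect rather than -o, so it does not depend on that option being available.

```shell
#!/bin/sh
# Hypothetical helper: convert every .txt file in a directory to UTF-8.
#   $1 = source encoding (must be known in advance)
#   $2 = directory containing the .txt files
convert_all() {
    from=$1
    dir=$2
    for file in "$dir"/*.txt; do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$file" ] || continue
        # Skip output left over from a previous run.
        case $file in
            *.utf8.txt) continue ;;
        esac
        iconv -f "$from" -t UTF-8 "$file" > "${file%.txt}.utf8.txt"
    done
}
```

For example, convert_all ISO-8859-1 . would write a .utf8.txt copy next to each Latin-1 .txt file in the current directory.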