linux - How can I convert multiple files to UTF-8 encoding using *nix command line tools?
2014-03
Possible Duplicate:
Batch-convert files for encoding or line ending
I have a bunch of text files that I'd like to convert from any given charset to UTF-8 encoding.
Are there any command line tools or Perl (or language of your choice) one liners I can use to do this en masse?
iconv converts between many character encodings. With a little bash magic, we can write:
for file in *.txt; do
iconv -f ascii -t utf-8 "$file" -o "${file%.txt}.utf8.txt"
done
This will run iconv -f ascii -t utf-8 on every file ending in .txt, sending the recoded output to a file with the same name but ending in .utf8.txt instead of .txt.
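This particular conversion won't actually change your files (because ASCII is a subset of UTF-8), but it does answer your question about how to convert between encodings: replace ascii with the real source charset. Running iconv -l lists the charset names it supports.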
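For files that are not plain ASCII, the same loop can take the source charset as a parameter. The following is a minimal sketch, not part of the original answer: convert_to_utf8 is a hypothetical helper name, and the source charset has to be supplied by you because iconv does not detect it. It uses shell redirection instead of -o so that it does not depend on one particular iconv implementation.

```shell
#!/usr/bin/env bash
# convert_to_utf8 is a hypothetical helper name, not part of iconv.
# Usage: convert_to_utf8 SOURCE_CHARSET FILE...
convert_to_utf8() {
  local from="$1"
  shift
  local file out
  for file in "$@"; do
    # Skip output files left over from an earlier run.
    case "$file" in
      *.utf8.txt) continue ;;
    esac
    out="${file%.txt}.utf8.txt"
    # iconv exits non-zero when the input contains bytes that are not
    # valid in the source charset; remove the partial output in that case.
    if iconv -f "$from" -t utf-8 "$file" > "$out"; then
      printf 'converted %s -> %s\n' "$file" "$out"
    else
      rm -f "$out"
      printf 'skipped %s (not valid %s?)\n' "$file" "$from" >&2
    fi
  done
}

# Example: convert_to_utf8 iso-8859-1 *.txt
```

The originals are left untouched, so a wrong guess at the source charset costs nothing: delete the .utf8.txt files and try again with another name.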
Possible Duplicate:
Batch-convert files for encoding or line ending under Windows
Hey!
I have many files that are encoded in ANSI (ISO-8859-1) and I want to convert them to UTF-8.
I am converting them one by one using Notepad++, but I was wondering if there is an application that will convert them all (I have many files) quickly and easily.
Does anyone know of an app that will do this? (A free app would be great.)
Thanks
Converting Windows-1252 to UTF-8 is a perfect fit for a scripting language.
- Here is a Python and Ruby script.
- Here is a Bash script using iconv.
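The linked scripts are not reproduced here, so the following is only a rough sketch of what a Bash and iconv approach might look like, not the linked script itself. It assumes the files really are Windows-1252 and that iconv knows that charset under the name cp1252. The function name cp1252_to_utf8_inplace is hypothetical. Files that already decode cleanly as UTF-8 are skipped so that a second run does not double-encode them.

```shell
#!/usr/bin/env bash
# cp1252_to_utf8_inplace is a hypothetical helper, not the linked script.
# Usage: cp1252_to_utf8_inplace FILE...
cp1252_to_utf8_inplace() {
  local file tmp
  for file in "$@"; do
    # A file that already decodes cleanly as UTF-8 (pure ASCII included)
    # is left alone. This is a heuristic, not real charset detection.
    if iconv -f utf-8 -t utf-8 "$file" > /dev/null 2>&1; then
      continue
    fi
    tmp="$file.tmp.$$"
    # Convert into a temporary file first, then replace the original
    # only if the conversion succeeded.
    if iconv -f cp1252 -t utf-8 "$file" > "$tmp"; then
      mv "$tmp" "$file"
    else
      rm -f "$tmp"
      printf 'failed: %s\n' "$file" >&2
    fi
  done
}

# Example: cp1252_to_utf8_inplace ./*.txt
```

Because this overwrites the originals, it is worth trying on a copy of the directory first.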
You could try this SourceForge app. From the website:
Codepage Converter - Convert HTML/Text files to different encoding formats e.g. ANSI to UTF-8 or Unicode. Convert multiple files with 1 click. Works with all encodings
A bit late, but: if you saved your scripts as 'UTF-8 without BOM' and Notepad++ now opens them as ANSI, you can 'fix' this behaviour by including a string of multibyte characters somewhere in your comments, which forces Notepad++ to recognise the file's UTF-8 encoding. It's a complete hack job, but it works ;-)