Microsoft Excel 2010 - How to get the encoding of a text file using VBA
2014-07
I would like to check whether a text file is encoded as UTF-8 without a BOM or as Shift_JIS.
If the file is UTF-8 without a BOM, I convert it to Shift_JIS.
So I need a way to get the encoding of the text file first.
What is the best way to do this?
Please explain.
Edit
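' With late binding (CreateObject), ADO constants such as adTypeText are not
' defined unless a reference to the ADO library is set, so declare them here.
Const adTypeText As Long = 2
Const adSaveCreateOverWrite As Long = 2
Dim obj1 As Object, obj2 As Object
Set obj1 = CreateObject("ADODB.Stream")
Set obj2 = CreateObject("ADODB.Stream")
' Read the source file as UTF-8 text.
obj1.Type = adTypeText
obj1.Charset = "UTF-8"
obj1.Open
obj1.LoadFromFile strLogFileName
obj1.Position = 0
' Write the same text back over the file as Shift_JIS.
obj2.Type = adTypeText
obj2.Charset = "Shift_JIS"
obj2.Open
obj1.CopyTo obj2
obj2.SaveToFile strLogFileName, adSaveCreateOverWrite
obj2.Close
obj1.Close
Set obj1 = Nothing
Set obj2 = Nothing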
The code above converts a file from "UTF-8" to "Shift_JIS".
I am developing code that reads from a text file and writes into a CSV file.
The original text file is encoded as UTF-8.
When UTF-8 encoded data is written into the CSV, Japanese characters are not displayed
correctly.
Therefore, I convert the encoding of the text file before reading it.
But that code works only the first time.
When the same text file is read again, an error occurs in the writing process,
because the Japanese characters in the text file turn into garbage characters (e.g. ???)
when a file that is already "Shift_JIS" is read as "UTF-8" and converted to "Shift_JIS" again.
So I want to check whether the encoding is "UTF-8" or not before converting.
Possible Duplicate:
Batch-convert files for encoding or line ending
I have a bunch of text files that I'd like to convert from any given charset to UTF-8 encoding.
Are there any command-line tools or Perl (or language of your choice) one-liners I can use to do this en masse?
iconv converts between many character encodings. With a little bash magic we can write
for file in *.txt; do
iconv -f ascii -t utf-8 "$file" -o "${file%.txt}.utf8.txt"
done
This will run iconv -f ascii -t utf-8 on every file ending in .txt, sending the recoded file to a file with the same name but ending in .utf8.txt instead of .txt.
This would not actually change your files (because ASCII is a subset of UTF-8), but it answers your question about how to convert between encodings.
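If some of the files may already be UTF-8, converting them again from the wrong source encoding garbles them. Below is a minimal sketch of a guard. It assumes that every non-UTF-8 file uses one known encoding (SHIFT_JIS is only an example default) and that iconv exits with a non-zero status on invalid input, as GNU iconv does. The names is_utf8, to_utf8 and SRC_ENCODING are hypothetical, chosen for illustration.

```shell
# Assumed source encoding for files that are not UTF-8 (illustrative default).
# iconv cannot guess it, so it has to be supplied.
SRC_ENCODING="${SRC_ENCODING:-SHIFT_JIS}"

# Succeeds (exit status 0) when iconv can decode the whole file as UTF-8.
# Pure-ASCII files pass too, since ASCII is a subset of UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Writes a UTF-8 copy named *.utf8.txt next to the input file,
# or skips the file when it already decodes as UTF-8.
to_utf8() {
    if is_utf8 "$1"; then
        echo "skip (already UTF-8): $1"
    else
        iconv -f "$SRC_ENCODING" -t UTF-8 "$1" > "${1%.txt}.utf8.txt" &&
            echo "converted: $1"
    fi
}
```

Inside the loop above, to_utf8 "$file" would take the place of the direct iconv call. The check is reliable in one direction only: a file that fails it is certainly not UTF-8, but a short file in another encoding could in rare cases happen to decode as UTF-8.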