
UTF-8 to ANSI /file

Posted: 29 Oct 2012 10:04
by itsarnie
Hi,

I have around 2000 files in UTF-8 mode. I have to save all of them in ANSI mode.

Please let me know if I can do it in a batch, at one go.

Regards,
Arnie.

Re: UTF-8 to ANSI /file

Posted: 29 Oct 2012 10:44
by Boombox
I think you can just use:

Code:

type inputfilename > outputfilename.txt


This would do one of them anyhow!


We would need a FOR loop to do them all...

Can we see the contents of one of these UTF-8 files please? Or a snippet at least?

Re: UTF-8 to ANSI /file

Posted: 29 Oct 2012 10:49
by foxidrive
Some interesting comments here. Applied to batch, the answer would seem to be that you need a dedicated command-line tool to convert them.

http://stackoverflow.com/questions/8298 ... -ansi-in-c

Re: UTF-8 to ANSI /file

Posted: 29 Oct 2012 12:00
by Boombox

Code:

cmd.exe /a /c TYPE c:\Utf8.txt > c:\Ansi.txt


The /a switch forces ANSI... Doesn't it?


But if Foxi says otherwise....

Re: UTF-8 to ANSI /file

Posted: 29 Oct 2012 12:53
by Liviu
itsarnie wrote:I have around 2000 files in UTF-8 mode. I have to save all of them in ANSI mode.

Converting UTF-8 encoded text to any one codepage such as ANSI or OEM is "lossy" - characters not present in the target codepage will be either remapped (sometimes in surprising ways), or lost for good.
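To make the "lossy" part concrete, here is a minimal sketch in Python rather than batch, since it is easy to run and check. It assumes the ANSI codepage is 1252; the function names `preview_ansi` and `missing_chars` are made up for illustration. Note that Python's codec substitutes "?" for anything it cannot represent, whereas Windows tools may remap some characters instead, so the exact output can differ from what TYPE produces.

```python
def preview_ansi(text, codepage="cp1252"):
    """Return the text as it reads after a round trip through the codepage.

    Characters the codepage cannot represent come back as "?".
    """
    return text.encode(codepage, errors="replace").decode(codepage)


def missing_chars(text, codepage="cp1252"):
    """List, in order of first appearance, the characters the codepage lacks."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing
```

For example, `preview_ansi("Zürich €5 → Ω")` should come back as `Zürich €5 ? ?`: the umlaut and the euro sign exist in codepage 1252, while the arrow and the omega do not, and `missing_chars` reports exactly those two.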

You can (losslessly) convert UTF-8 to UTF-16, see for example http://www.dostips.com/forum/viewtopic.php?p=16399#p16399.

You can also convert UTF-8 to the default ANSI codepage, as long as you don't mind the loss of information, using a variation of http://www.dostips.com/forum/viewtopic.php?p=16399#p16399. EDIT - see P.S. below for the corrected code.

Code:

chcp 65001 >nul & cmd /a /c type utf8.txt >ansi.txt

Liviu

EDIT - P.S. On a second look, I don't think it can be done (reliably) with just a one-liner. The following, however, should always work. Note that it assumes the ANSI codepage is 1252; change that as necessary. The code uses "utf8to16.cmd", which would be the snippet in my previously linked post.

Code:

call utf8to16 utf8.txt utf16.txt
chcp 1252>nul
type utf16.txt >ansi.txt
del utf16.txt
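
For the full set of 2000 files, the same idea could be looped over a folder. As a hedged alternative outside batch, here is a small Python sketch. It is not the utf8to16.cmd approach above; it simply decodes each file as UTF-8 and re-encodes it. The function name `convert_folder`, the `*.txt` mask and the separate output folder are all assumptions for illustration, and the codepage is again assumed to be 1252.

```python
from pathlib import Path


def convert_folder(src_dir, dst_dir, codepage="cp1252", pattern="*.txt"):
    """Re-encode every matching UTF-8 file in src_dir and write it to dst_dir.

    Returns the number of files converted. The originals are left untouched.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    converted = 0
    # sorted() collects the file list up front, before any output is written.
    for path in sorted(src.glob(pattern)):
        # "utf-8-sig" also accepts files that start with a UTF-8 byte order mark.
        text = path.read_bytes().decode("utf-8-sig")
        # errors="replace" turns characters missing from the codepage into "?".
        data = text.encode(codepage, errors="replace")
        # Bytes are written as-is, so existing line endings are preserved.
        (dst / path.name).write_bytes(data)
        converted += 1
    return converted
```

Writing to a separate folder keeps the UTF-8 originals around, which matters because the conversion cannot be undone once characters have been replaced.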

Re: UTF-8 to ANSI /file

Posted: 29 Oct 2012 15:30
by foxidrive
Boombox wrote:But if Foxi says otherwise....


I know very little about codepages and Unicode. I assumed there might be some tool that is more thorough.

Liviu's post agrees in essence with the Stack Overflow information. To paraphrase: the conversion is not possible without loss of information or surprising results.