How to create a UTF-8 file (instead of UCS-2 Little Endian)?

#1 Post by pstein » 28 Nov 2018 05:10

When I issue the following command from a DOS batch file:

echo foobar >myfile.txt

then myfile.txt is always created as a UCS-2 Little Endian file.

How can I tell the batch execution to create a UTF-8 file instead?

Peter

#2 Post by aGerman » 28 Nov 2018 11:34

pstein wrote (28 Nov 2018 05:10):
"then myfile.txt will always be created as UCS-2 Little Endian based file"

Seriously? I can't believe that. It's definitely not the default behavior.

pstein wrote (28 Nov 2018 05:10):
"How can I tell the DOS batch execution to create an UTF-8 file instead?"

Windows has some UTF-8 support, although it is rather poor.

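Code:

@echo off
rem Parse the script using the code page it was saved in (here: 1252).
>nul chcp 1252
set "x=åéøü"
rem Switch to UTF-8 (65001) so that the redirected output is written as UTF-8.
>nul chcp 65001
>"utf8test.txt" echo %x%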
... where 1252 is the code page in which I saved the script, and 65001 is the code page for UTF-8. The output is the UTF-8-encoded representation of the string. Note that the file does not begin with the UTF-8 byte order mark (BOM).

Steffen
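
To check which bytes actually ended up in the file, one option is a hex dump with CERTUTIL. This is only a sketch: it assumes the script above has already created utf8test.txt in the current directory, and utf8test.hex is a scratch file name made up for this illustration.

```batch
@echo off
rem Sketch: hex-dump utf8test.txt to see which bytes were written.
rem Assumes utf8test.txt already exists in the current directory.
rem utf8test.hex is a hypothetical scratch file; -f overwrites it if present.
certutil -f -encodehex "utf8test.txt" "utf8test.hex" >nul
type "utf8test.hex"
del "utf8test.hex"
```

For the string åéøü written as UTF-8, the dump should show the bytes C3 A5 C3 A9 C3 B8 C3 BC followed by 0D 0A for the line break. A UTF-8 BOM would appear as EF BB BF at the very start of the file, while a UCS-2 Little Endian file would show a 00 byte after each of these characters.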

#3 Post by Squashman » 28 Nov 2018 13:38

I thought we had a thread going on the forum about creating a UTF-8 file with a BOM. I can't find it.

#4 Post by aGerman » 28 Nov 2018 15:32

I'm absolutely sure we have several threads with examples of how to write the BOM. Anyone can search the forum for UTF-8.
Never mind.

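Code:

@echo off
>nul chcp 1252
set "x=åéøü"
rem Write the UTF-8 BOM without a line break. The bytes EF BB BF are the
rem characters ï»¿ in code page 1252, so the script must be saved as 1252.
<nul >"utf8test.txt" set /p "=ï»¿"

rem Append the text as UTF-8.
>nul chcp 65001
>>"utf8test.txt" echo %x%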
This still depends on the encoding the editor used to save the script. Other options would have been to go via UTF-7 or to use CERTUTIL. It is also no big deal to convert the file afterwards to any encoding you want with JREPL.BAT or CONVERTCP.exe. Both tools can be found in this forum, too.

Steffen
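
The CERTUTIL route mentioned above could look roughly like the following. This is a sketch and not necessarily what Steffen had in mind: the BOM is produced by decoding a hex string, so that part no longer depends on how the editor saved the script (the line that sets x still does). bom.hex is a hypothetical scratch file name.

```batch
@echo off
rem The script is still assumed to be saved in code page 1252 because of the next lines.
>nul chcp 1252
set "x=åéøü"

rem Write the three BOM bytes EF BB BF by decoding a hex string with CERTUTIL.
rem bom.hex is a hypothetical scratch file; -f overwrites an existing utf8test.txt.
>"bom.hex" echo EF BB BF
>nul certutil -f -decodehex "bom.hex" "utf8test.txt"
del "bom.hex"

rem Append the text as UTF-8.
>nul chcp 65001
>>"utf8test.txt" echo %x%
```

The redirection is placed in front of each command so that no stray trailing space ends up in the hex file or in the output file.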
