How to read a file in UTF-8 and remove a BOM in batch script?

PiotrMP006
Posts: 29
Joined: 08 Sep 2017 06:10

#1 Post by PiotrMP006 » 04 Oct 2021 07:11

Hi,

How can I read a UTF-8 file and remove its BOM in a batch script?

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#2 Post by aGerman » 04 Oct 2021 08:33

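The BOM is just a single multibyte character (U+FEFF, encoded in UTF-8 as the three bytes EF BB BF). You can consume it with three PAUSE commands, one per byte, before the rest of the stream is passed on. Note that this drops the first three bytes unconditionally, so only use it on files that really start with a BOM.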

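@echo off
:: Pipe suffix for direct use: three PAUSE commands swallow the 3 BOM bytes, then FINDSTR passes every remaining line through
set "rmvBomRedir=|((pause&pause&pause)>nul&findstr "^^")"
:: Same suffix with the special characters escaped, for use inside a FOR /F command
set "rmvBomFor=^|((pause^&pause^&pause^)^>nul^&findstr "^^"^)"

:: Switch the active code page to UTF-8
>nul chcp 65001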

:: redirect to console
type "utf8.txt" %rmvBomRedir%

:: redirect to a pipe
(type "utf8.txt" %rmvBomRedir%)|find /v ""

:: redirect to a file
>"utf8noBom.txt" (type "utf8.txt" %rmvBomRedir%)

:: process in a FOR /F loop
for /f "delims=" %%i in ('type "utf8.txt" %rmvBomFor%') do echo %%i

pause

Steffen
Last edited by aGerman on 05 Oct 2021 01:54, edited 1 time in total.
Reason: updated code

aGerman

#3 Post by aGerman » 05 Oct 2021 01:54

Forget what I told you about one PAUSE being enough; you need three, one for each byte of the BOM. I have updated the code above.
