Script to detect type (encoding) of files

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#16 Post by aGerman » 18 Oct 2021 11:15

IIRC at least UTF-16 is converted by the TYPE command.

Steffen

#17 Post by aGerman » 19 Oct 2021 13:29

I updated the code once again. It's a little more robust now because it explicitly treats bytes 0x00..0x07 and 0xF8..0xFF as invalid. Besides that, I changed the errorlevel logic to return more information:
errorlevel >= 2 -- UTF-8 with multibyte sequences
errorlevel == 1 -- All ASCII. This is valid UTF-8 as long as it doesn't represent UTF-7.
errorlevel == 0 -- Anything else, including ANSI codepages, UTF-16, or binary data.

However, that's all more or less a bit-twiddling hack. Some people love it, some hate it :lol:
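
For readers who want to see the three-way classification spelled out, here is a minimal sketch in Python. It is not the batch code from this thread; it only mirrors the errorlevel scheme described above. The function name `classify` is hypothetical, and it returns exactly 2 for the multibyte case, whereas the batch script may return any value >= 2.

```python
def classify(data: bytes) -> int:
    """Return 0, 1 or 2, mirroring the errorlevel scheme described above.

    Hypothetical helper for illustration; not the thread's batch script.
    """
    # Bytes 0x00..0x07 and 0xF8..0xFF are treated as invalid outright.
    if any(b <= 0x07 or b >= 0xF8 for b in data):
        return 0
    try:
        # Strict decoding rejects malformed or truncated UTF-8 sequences.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return 0
    # No byte >= 0x80 means the content is plain ASCII.
    if all(b < 0x80 for b in data):
        return 1
    # Valid UTF-8 containing at least one multibyte sequence.
    return 2


if __name__ == "__main__":
    import sys

    # Usage: python classify.py <file>; the exit code plays the role of errorlevel.
    with open(sys.argv[1], "rb") as f:
        sys.exit(classify(f.read()))
```

Like the batch version, this cannot tell ASCII apart from UTF-7, because UTF-7 text consists of ASCII bytes only.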

Steffen

aschipfl
Posts: 9
Joined: 13 Feb 2019 03:33

#18 Post by aschipfl » 15 Jan 2022 08:40

Hello,
several years ago I wrote a script that can determine whether or not a file is ASCII/ANSI-encoded:
https://stackoverflow.com/a/43147510
