
Re: Script to detect type (encoding) of files

Posted: 18 Oct 2021 11:15
by aGerman
IIRC at least UTF-16 is converted by the TYPE command.

Steffen

Re: Script to detect type (encoding) of files

Posted: 19 Oct 2021 13:29
by aGerman
I updated the code once again. It's a little more robust now because it explicitly treats bytes 0x00..0x07 and 0xF8..0xFF as invalid. Besides that, I changed the errorlevel logic to return more information.
errorlevel >= 2 -- UTF-8 with multibyte sequences
errorlevel == 1 -- All ASCII. This is valid UTF-8 as long as it doesn't represent UTF-7.
errorlevel == 0 -- Anything else, including ANSI codepages, UTF-16, or binary data.
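
For readers who want to see the three-way classification above in isolation, here is a minimal Python sketch. It is not the batch code from this thread; the function name `classify` and the fixed return value 2 (the post says ">= 2") are assumptions made for illustration, and the UTF-8 validity check is delegated to Python's strict decoder rather than done bit by bit.

```python
def classify(data: bytes) -> int:
    """Mimic the errorlevel scheme described above (hypothetical helper).

    2 -- valid UTF-8 containing multibyte sequences
    1 -- all ASCII
    0 -- anything else (ANSI codepages, UTF-16, binary data, ...)
    """
    # Bytes 0x00..0x07 and 0xF8..0xFF are explicitly treated as invalid.
    if any(b <= 0x07 or b >= 0xF8 for b in data):
        return 0
    # Python's strict decoder rejects malformed UTF-8 sequences.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return 0
    # No byte with the high bit set means plain ASCII.
    if all(b < 0x80 for b in data):
        return 1
    return 2


if __name__ == "__main__":
    import sys

    # Usage: python classify.py somefile.txt
    with open(sys.argv[1], "rb") as f:
        sys.exit(classify(f.read()))
```

Like the errorlevel scheme, this cannot tell ASCII apart from UTF-7, and an empty input is reported as ASCII.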

However, that's all more or less a bit-twiddling hack. Some people love it, some hate it :lol:

Steffen

Re: Script to detect type (encoding) of files

Posted: 15 Jan 2022 08:40
by aschipfl
Hello,
several years ago I wrote a script that can determine whether or not a file is ASCII/ANSI-encoded:
https://stackoverflow.com/a/43147510