About codepage

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Barnack
Posts: 3
Joined: 08 Nov 2015 06:42

About codepage

#1 Post by Barnack » 08 Nov 2015 06:50

Hello guys, I'm new.
I'll say right away that I'm not new to batch, but my knowledge is limited to the essential commands (the DOS ones, chcp, choice menus, echo, color, mode, etc.).
My first question here is about code pages:
After going mad trying to draw box characters (example: "╦"), I found this website, "http://unicode-table.com/en/", which shows the entire Unicode table. But when I tried putting other characters in my .bat files, like "֍ ֎ ༼ ༽ ༺ ༻ ᐅ ᐊ † ⚕ ⚔", I noticed they don't work. With a little searching I discovered that the "chcp 65001" I was using enables 8-bit Unicode, while the characters on the linked website are in 16-bit Unicode.
Can I add 16-bit Unicode to batch output?

P.S. I'm talking about batch files for the Windows prompt, not for DOS.

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: About codepage

#2 Post by Liviu » 09 Nov 2015 11:10

Barnack wrote: I've discovered that the "chcp 65001" I was using enables 8-bit Unicode, while the characters on the linked website are in 16-bit Unicode.

Technically, UTF-8 and UTF-16 are both Unicode encodings. Also note that running batch files under "chcp 65001" does not work at all under Windows XP and earlier, and has limitations under Windows 7 and later (for example, piping doesn't work).
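
To make the point about encodings concrete, here is a minimal sketch in JScript (not from the thread) that shows how one code point, U+2566 from the first post, is stored in each encoding. The helper names utf8Bytes and hexList are made up for this illustration, and the sketch only covers code points up to U+FFFF outside the surrogate range.

```javascript
// Hypothetical helper (not part of JScript): UTF-8 bytes of one code point
// in the Basic Multilingual Plane (U+0000..U+FFFF, surrogates excluded).
function utf8Bytes(cp) {
   if (cp < 0x80)  return [cp];
   if (cp < 0x800) return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)];
   return [0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)];
}

// Format a list of numbers as hexadecimal, e.g. [226,149,166] -> "E2 95 A6"
function hexList(nums) {
   var out = [];
   for (var i = 0; i < nums.length; i++) out.push(nums[i].toString(16).toUpperCase());
   return out.join(" ");
}

var cp = 0x2566;   // the box-drawing character from the first post
var line = "U+2566  UTF-8: " + hexList(utf8Bytes(cp)) + "  UTF-16: " + hexList([cp]);

// Print only when run under Windows Script Host (cscript)
if (typeof WScript !== "undefined") WScript.Stdout.WriteLine(line);
```

Saved as, say, enc.js (a hypothetical name) and run with "cscript //nologo enc.js", it should print "U+2566  UTF-8: E2 95 A6  UTF-16: 2566": the same character takes three bytes in one encoding and a single 16-bit unit in the other.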

Barnack wrote: Can I add 16-bit Unicode to batch output?

See http://www.dostips.com/forum/viewtopic.php?f=3&t=5516 and http://www.dostips.com/forum/viewtopic.php?f=3&t=5358 for some starting points, then search this board for chcp 65001 for more.

Liviu

Aacini
Expert
Posts: 1885
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: About codepage

#3 Post by Aacini » 09 Nov 2015 22:33

An Off-Topic answer? (perhaps not... :wink: )

JScript is a programming language preinstalled in all Windows versions from XP on that is more powerful than Batch files. We use JScript now and then because it is very easy to write a Batch-JScript hybrid script that has the .bat extension but includes a JScript code section via a very simple trick.

Unlike Batch files, JScript handles strings of Unicode characters; this means that it is possible to write Unicode characters directly in a JScript program and have them displayed correctly on the screen. For example:

Code: Select all

WScript.Stdout.WriteLine(
    "blend     ‹αß©∂€› \n" +
    "latin     àáâāăąǻ \n" +
    "greek     αβγδεζη \n" +
    "cyrillic  абвгдеж \n" +
    "arrows    ←↑→↓↔↕↨ \n" +
    "drawing   ▌◄▲○▼►▐ \n" +
    "currency  ¢£¤¥₣₤€ \n" +
    "math      ±×∂∆∏∑− \n" +
    "punct     «¡¿©®§† \n" +
    "misc      ¼⅛¹♠♣♥♦ \n" +
    "frames    ┌┬┐├┼┤└┴┘─│"
);

Copy this code and paste it into a file with the .js extension ("test.js", for example), then execute it by just entering its name at the command prompt. If that does not work, use this line:

Code: Select all

cscript //nologo test.js

Note the following points about this program:

  • When you save the file, you must select Save As > Encoding: Unicode; otherwise all the special Unicode characters will be replaced by plain ASCII ones.
  • The previous point means that this code cannot be used in a Batch-JScript hybrid script, because Batch files cannot directly contain Unicode characters. This can be worked around by replacing the literal Unicode characters with their code-point numbers and using JScript's "String.fromCharCode(code)" function to convert the numbers to characters.
  • This program should show the right characters on the screen regardless of the active code page. This means that it should also work in Windows XP. However...
  • The important one: this program will show only the characters that are defined in the font currently used by the cmd.exe command prompt window; Unicode characters that are not defined in that font will be displayed as small squares. By default there are just two TrueType fonts that can be used in cmd.exe: Consolas and Lucida Console; you may review the characters defined in these fonts via the Character Map Windows accessory.
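
The code-point workaround from the second point can be sketched as follows. This is only an illustration: the helper names toCodes and fromCodes are hypothetical, not part of JScript. The idea is to obtain the numbers once (for example from a file saved as Unicode) and then use only the numbers, which are plain ASCII, in the hybrid script.

```javascript
// Hypothetical helpers; the names toCodes/fromCodes are not part of JScript.

// Return the UTF-16 code units of a string as an array of numbers
function toCodes(str) {
   var codes = [];
   for (var i = 0; i < str.length; i++) codes.push(str.charCodeAt(i));
   return codes;
}

// Rebuild a string from such an array, one String.fromCharCode call per number
function fromCodes(codes) {
   var result = "";
   for (var i = 0; i < codes.length; i++) result += String.fromCharCode(codes[i]);
   return result;
}

// Top edge of a small frame, written with code-point numbers only
var topEdge = fromCodes([9484, 9472, 9472, 9488]);

// Print only when run under Windows Script Host (cscript)
if (typeof WScript !== "undefined") {
   WScript.Stdout.WriteLine(topEdge);
   WScript.Stdout.WriteLine(toCodes(topEdge).join(","));
}
```

Because the source contains no character above 127, it can be saved in any encoding and pasted into the JScript section of a .bat file.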

Below is a Batch-JScript hybrid script that uses the JScript section to show a frame on the screen made of single lines. Of course, any other Unicode character may be displayed using the same method.

Code: Select all

@if (@CodeSection == @Batch) @then


@echo off
echo/
echo Request JScript section to show a frame of 40 cols X 5 rows, placed at column 10:
echo/
CScript //nologo //E:JScript "%~F0" 40 5  10
echo/
echo End of example
goto :EOF


@end

// JScript section

// Initialize the code-points of the line-drawing characters
var LeftTop   =9484,  Top   =9516,  RightTop   =9488,
    Left      =9500,  Center=9532,  Right      =9508,
    LeftBottom=9492,  Bottom=9524,  RightBottom=9496,
    Hor=9472,         Ver=9474;

var arg = WScript.Arguments, margin = "", horiz = "", inside = "",
    width = parseInt(arg(0))-2, height = parseInt(arg(1))-2;

// If a third parameter was given: indicate the margin from left side
if ( arg.Length >= 3 ) for ( var i = 1; i <= arg(2); i++ ) margin += " ";

for ( i = 1; i <= width; i++ ) {
   horiz  += String.fromCharCode(Hor);
   inside += " ";
}

WScript.Stdout.WriteLine(margin+String.fromCharCode(LeftTop)+horiz+String.fromCharCode(RightTop));
for ( i = 1; i <= height; i++ ) {
   WScript.Stdout.WriteLine(margin+String.fromCharCode(Ver)+inside+String.fromCharCode(Ver));
}
WScript.Stdout.WriteLine(margin+String.fromCharCode(LeftBottom)+horiz+String.fromCharCode(RightBottom));

JScript is simple! Isn't it? 8)

Antonio
Last edited by Aacini on 17 Nov 2015 20:30, edited 1 time in total.

Barnack
Posts: 3
Joined: 08 Nov 2015 06:42

Re: About codepage

#4 Post by Barnack » 10 Nov 2015 12:50

Thanks guys; since I don't have much time I haven't tested your solutions yet, but I have some other questions.
Anyway, I need it to work only on Win7+ computers, so the limitations are no problem.

@Aacini
Won't I lose performance by calling JS inside a batch file to draw things?

@Liviu
I haven't read everything in your links, but they seem to refer only to UTF-8, not UTF-16, which is what I want to achieve.

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: About codepage

#5 Post by Liviu » 10 Nov 2015 20:29

Barnack wrote: I've not read all in your links but it seems they're referring only to utf-8, not 16, that is what i want to achieve.

Not so, check $chrW.cmd in viewtopic.php?p=33810#p33810 (which is part of the 1st link I posted).

If you want to hardcode Unicode strings in the batch file itself, then your .cmd must be saved as UTF-8 indeed (see my 1st post in the 2nd link). However, virtually all editors nowadays (including notepad in Win7) have the option to save arbitrary Unicode text in UTF-8 encoding, so I don't see why that'd be an issue.
Last edited by Liviu on 10 Nov 2015 23:15, edited 1 time in total.

Jer
Posts: 177
Joined: 23 Nov 2014 17:13
Location: California USA

Re: About codepage

#6 Post by Jer » 10 Nov 2015 22:32

Here's a version that puts text in the empty box; the code is inspired by Aacini's.
It's my first attempt with JScript.
The batch file misbehaves if an ampersand is included in the arguments.

Code: Select all

@if (@CodeSection == @Batch) @then


@echo off
echo/
echo Request JScript section to show a frame wide enough
echo to contain a message X 5 rows, placed at column 10:
echo/

if "%1"=="" (Set "userMsg=Hello World!") Else Set "userMsg=%*"

setlocal

Set "userMsg=%userMsg:~0,80%"

echo(
CScript //nologo //E:JScript "%~F0" 5  10 "%userMsg%"
echo/
echo End of example
goto :EOF
endlocal

@end

// JScript section
// Initialize the code-points of the line-drawing characters
var LeftTop   =9484,  Top   =9516,  RightTop   =9488,
    Left      =9500,  Center=9532,  Right      =9508,
    LeftBottom=9492,  Bottom=9524,  RightBottom=9496,
    Hor=9472,         Ver=9474;

var arg = WScript.Arguments, msg = arg(2),
    msgLen = msg.length,
    boxDim = msgLen + 4;   // message plus one space and one border on each side

var margin = "", horiz = "", inside = "",
    width = msgLen+2, height = parseInt(arg(0))-2;

// Second argument: the margin from the left side
// (the message is always passed as third argument, so this test always succeeds)
if ( arg.Length >= 3 ) for ( var i = 1; i <= arg(1); i++ ) margin += " ";

for ( i = 1; i <= width; i++ ) {
   horiz  += String.fromCharCode(Hor);
   inside += " ";
}

midHeight = Math.round(height / 2);
msgPad = ((boxDim - msgLen) / 2);

for (i = 1; i <= msgPad; i++) {
   msg = " " + msg + " ";
}

msg = msg.substr(1, (boxDim - 2));

WScript.Stdout.WriteLine(margin+String.fromCharCode(LeftTop)+horiz+String.fromCharCode(RightTop));
for ( i = 1; i <= height; i++ ) {
   if (i == midHeight) {
      WScript.Stdout.WriteLine(margin+String.fromCharCode(Ver)+msg+String.fromCharCode(Ver));
   }
   else {
      WScript.Stdout.WriteLine(margin+String.fromCharCode(Ver)+inside+String.fromCharCode(Ver));
   }
}
WScript.Stdout.WriteLine(margin+String.fromCharCode(LeftBottom)+horiz+String.fromCharCode(RightBottom));

Aacini
Expert
Posts: 1885
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: About codepage

#7 Post by Aacini » 11 Nov 2015 12:39

Barnack wrote:@Aacini
Won't i loose in performance calling a js inside a batch to draw things?

In general: no. The cscript "compiler" is pretty fast, and JScript code executes much faster than Batch code. If the JScript code section is large and needs to be executed many times (i.e. in a loop), then all the data could be generated in the Batch section and the JScript section invoked just once. In general, it is faster to execute Batch-JScript hybrid code than pure Batch code that does the same task (and it is faster still to execute pure JScript code).
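
A minimal sketch of the "invoke it just once" idea, under the assumption that every message can be passed as a separate command-line argument. The function names boxLines and drawAll are hypothetical, chosen only for this illustration.

```javascript
// Hypothetical helper (not from the thread): build the three lines of a
// single-line box around one message, using code-point escapes only.
function boxLines(text) {
   var horiz = "";
   for (var i = 0; i < text.length; i++) horiz += "\u2500";
   return [ "\u250c" + horiz + "\u2510",
            "\u2502" + text  + "\u2502",
            "\u2514" + horiz + "\u2518" ];
}

// Collect the output for every message and return it as one string
function drawAll(messages) {
   var out = [];
   for (var i = 0; i < messages.length; i++) out = out.concat(boxLines(messages[i]));
   return out.join("\r\n");
}

// Under cscript: one box per command-line argument, one process in total
if (typeof WScript !== "undefined") {
   var msgs = [];
   for (var i = 0; i < WScript.Arguments.length; i++) msgs.push(WScript.Arguments(i));
   WScript.Stdout.WriteLine(drawAll(msgs));
}
```

In a hybrid script this would be called once after the Batch section has collected its data, for example CScript //nologo //E:JScript "%~F0" "first" "second", instead of starting cscript once per message.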


Jer wrote:Here's putting text in the empty box, code inspired by Aacini.
First attempt with jscript.

Wow! What pretty code in "your" first JScript program! :P This is possible because JScript syntax is pretty straightforward...


The program below is an example of how to use several basic JScript features.

Code: Select all

@if (@CodeSection == @Batch) @then


@echo off
setlocal EnableDelayedExpansion
cls
echo/
echo Show a phrase inside a box with different formats using JScript code:
set "str=The quick brown fox jumps over the lazy dog"
set "double="
for /L %%i in (1,1,5) do (
   echo/
   CScript //nologo //E:JScript "%~F0" str %%i !double! /C
   if not defined double (set double=/D) else set "double="
)
echo/
echo End of example
goto :EOF


@end

// JScript section

/*
Show a multi-line message inside a box

CScript //nologo //E:JScript "%~F0" messageVar [lines] [/C[:cols]] [/D]
    messageVar  Environment variable that contains the message
    lines       Show the message in this number of lines *or less* (default: 1)
    /C[:cols]   Center the box in this number of columns (default: 80)
    /D          Use double-lines for box frame (default: single-lines)
*/


// Function spaces: input = number of spaces (may be zero), output = spaces string
function spaces ( num ) {
   var result = "";
   for ( var i = 1; i <= num; i++ ) result += " ";
   return (result);
}


// Initialize the code-points of the line-drawing characters in LeftTop,RightTop,LeftBot,RightBot,Hor,Ver order
var frame  = "\u250c\u2510\u2514\u2518\u2500\u2502",   // Default frame: single lines
    double = "\u2554\u2557\u255a\u255d\u2550\u2551",
    LeftTop=0, RightTop=1, LeftBot=2, RightBot=3, Hor=4, Ver=5;   // Handy constants

// Get the message from the *environment variable* whose name is given in first argument
var arg      = WScript.Arguments,                       // Predefined object to get the arguments
    WshShell = WScript.CreateObject("WScript.Shell"),   // Standard object to access several Shell features
    EnvVars  = WshShell.Environment("Process"),         // ... like the environment variables (there are 4 types)
    msg      = EnvVars(arg(0));

var lines = 1, cols = 0;

// Get the number of lines from second argument
if ( arg.length > 1 ) lines = parseInt(arg(1));

// If the "named parameter" /D was given: use double-lines in box frame
if ( arg.Named.Exists("D") ) frame = double;

// If the "named parameter" /C was given, initialize the value of "cols": /C
//                                        or get its value if it is also included: /C:cols
if ( arg.Named.Exists("C") ) {
   cols = 80;
   if ( arg.Named.Item("C") != undefined ) cols = parseInt(arg.Named.Item("C"));
}


// Get the *minimum* length of each line
var minLen  = Math.floor(msg.length/lines), maxLen = 0,
    msgLine = new Array();   // Empty array, *any* element may be assigned later

// Split the message in several lines at the first space after minLen
lines = 0;
while ( msg != "" ) {
   var nextWordPos = msg.indexOf(" ",minLen);
   if ( nextWordPos >= 0 ) {
      msgLine[++lines] = msg.substr(0,nextWordPos);
      msg = msg.substr(nextWordPos+1);
   } else {
      msgLine[++lines] = msg;
      msg = "";
   }
   if ( msgLine[lines].length > maxLen ) maxLen = msgLine[lines].length;
}


var margin = "";
if ( cols > 0 ) margin = spaces(Math.floor((cols-maxLen-2)/2));

var horiz = "";
for ( var i = 1; i <= maxLen; i++ ) horiz += frame.charAt(Hor);

WScript.Stdout.WriteLine(margin+frame.charAt(LeftTop)+horiz+frame.charAt(RightTop));
for ( i = 1; i <= lines; i++ ) {
   WScript.Stdout.WriteLine(margin+frame.charAt(Ver)+msgLine[i]+spaces(maxLen-msgLine[i].length)+frame.charAt(Ver));
}
WScript.Stdout.WriteLine(margin+frame.charAt(LeftBot)+horiz+frame.charAt(RightBot));

Output:

Code: Select all

Show a phrase inside a box with different formats using JScript code:

                 ┌───────────────────────────────────────────┐
                 │The quick brown fox jumps over the lazy dog│
                 └───────────────────────────────────────────┘

                          ╔═════════════════════════╗
                          ║The quick brown fox jumps║
                          ║over the lazy dog        ║
                          ╚═════════════════════════╝

                               ┌───────────────┐
                               │The quick brown│
                               │fox jumps over │
                               │the lazy dog   │
                               └───────────────┘

                               ╔═══════════════╗
                               ║The quick brown║
                               ║fox jumps over ║
                               ║the lazy dog   ║
                               ╚═══════════════╝

                                  ┌──────────┐
                                  │The quick │
                                  │brown fox │
                                  │jumps over│
                                  │the lazy  │
                                  │dog       │
                                  └──────────┘

End of example

The pure JScript program below shows the code-point numbers of Unicode characters:

Code: Select all

str = "┌┐└┘─│ ╔╗╚╝═║";

for ( var i = 0; i < str.length; i++ ) {
   WScript.Stdout.WriteLine( '"'+str.charAt(i)+'" = '+
                             "\\u"+("000"+str.charCodeAt(i).toString(16)).slice(-4)
   );
}

Antonio

Aacini
Expert
Posts: 1885
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: About codepage

#8 Post by Aacini » 13 Nov 2015 12:34

I wrote a fun program that makes good use of some of the Unicode characters defined in the Consolas font. Here it is:

Code: Select all

@if (@CodeSection == @Batch) @then


@echo off
setlocal

rem UpsideDown.bat: Show a string upside down, requires "Consolas" font
rem Antonio Perez Ayala

:loop
   echo/
   set "line="
   set /P "line=Enter a line: "
   if not defined line goto :EOF
   set /P "=Upside-down:  " < NUL
   CScript //nologo //E:JScript "%~F0" "%line%"
goto loop


@end


// JScript section

// Define the map from normal characters to upside-down ones **IN "Consolas" FONT!**

var upside = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,.!?",
    down   = "\u0244\u0071\u0186\u0070\u018e\u2132\u0494\u0048\u0049\u017f\u029e\u02e5\u0057" +  // "ABCDEFGHIJKLM"
             "\u004e\u004f\u0500\u00d2\u1d1a\u0053\u2534\u041f\u039b\u004d\u0058\u019b\u005a" +  // "NOPQRSTUVWXYZ"
             "\u0250\u0071\u0254\u0070\u01dd\u025f\u1d77\u0265\u1d09\u027e\u029e\u01ae\u026f" +  // "abcdefghijklm"
             "\u0075\u006f\u0064\u0062\u0279\u0073\u0287\u006e\u028c\u028d\u0078\u028e\u007a" +  // "nopqrstuvwxyz"
             "\u2018\u0387\u00a1\u00bf";                                                         // ",.!?"

var input = WScript.Arguments(0), output = "";

for ( var i = input.length-1; i >= 0; i-- ) {
   var pos = upside.indexOf(input.charAt(i));
   output += (pos >= 0) ? down.charAt(pos) : input.charAt(i) ;
}

WScript.Stdout.WriteLine(output);

Output example:

Code: Select all

Enter a line: It is a funny thing, the whole line appears upside-down!
Upside-down:  ¡uʍop-ǝpᴉsdn sɹɐǝddɐ ǝuᴉƮ ǝƮoɥʍ ǝɥʇ ‘ᵷuᴉɥʇ ʎuunɟ ɐ sᴉ ʇI

Enter a line: My name is Antonio Perez Ayala, a.k.a. Aacini.
Upside-down:  ·ᴉuᴉɔɐɄ ·ɐ·ʞ·ɐ ‘ɐƮɐʎɄ zǝɹǝԀ oᴉuoʇuɄ sᴉ ǝɯɐu ʎW

Tested in Windows 8.1. Remember that you must select the Consolas font in the cmd.exe window for this program to show the proper output; however, I don't know whether the Consolas font included in other Windows versions defines the same characters used by this program (although common sense says "yes").

Antonio
Last edited by Aacini on 17 Nov 2015 20:34, edited 1 time in total.

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: About codepage

#9 Post by Squashman » 13 Nov 2015 12:47

Works on Windows 7 using Consolas font

Barnack
Posts: 3
Joined: 08 Nov 2015 06:42

Re: About codepage

#10 Post by Barnack » 17 Nov 2015 11:34

So basically I have to start learning java and switch to it when needed.

penpen
Expert
Posts: 1991
Joined: 23 Jun 2013 06:15
Location: Germany

Re: About codepage

#11 Post by penpen » 17 Nov 2015 17:50

No, you could also do it all in batch:
You have to read all the linked information (and files) Liviu gave you in his first post.

For example you could write this "upsideDown.bat" (which needs my "testcon.bat", Liviu's "$cpChars.cmd", and of course the right console font set in your shell):

Code: Select all

@echo off
if not "%~1" == "UNICODE" (
   cmd /U /E:ON /V:ON /C ^""%~f0" "UNICODE"^"
   goto :eof
)
setlocal enableExtensions enableDelayedExpansion
:: Define the map from normal characters to upside-down ones **IN "Consolas" FONT!**
:: Author           : Antonio Perez Ayala (aka Aacini www.dostips.com)
:: Batch port author: Ulf Schneider       (aka penpen www.dostips.com)

set "codepage=850"
for /F "tokens=2 delims=:." %%a in ('chcp') do for %%b in (%%~a) do set "codepage=%%~b"
set "UTF8_CP=65001"
set "down="
set "upsideH=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,.^!?000000000000000011111111111111112222222222222222333333333"
set "upsideL=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,.^!?0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF012345678"
set /A "charCount=0x37"
set "down=!down!"
set "down=!down!\u0244\u0071\u0186\u0070\u018e\u2132\u0494\u0048\u0049\u017f\u029e\u02e5\u0057"   &REM // "ABCDEFGHIJKLM"
set "down=!down!\u004e\u004f\u0500\u00d2\u1d1a\u0053\u2534\u041f\u039b\u004d\u0058\u019b\u005a"   &REM // "NOPQRSTUVWXYZ"
set "down=!down!\u0250\u0071\u0254\u0070\u01dd\u025f\u1d77\u0265\u1d09\u027e\u029e\u01ae\u026f"   &REM // "abcdefghijklm"
set "down=!down!\u0075\u006f\u0064\u0062\u0279\u0073\u0287\u006e\u028c\u028d\u0078\u028e\u007a"   &REM // "nopqrstuvwxyz"
set "down=!down!\u2018\u0387\u00a1\u00bf"                                                         &REM // ",.!?"
(
   >nul chcp 65001
   >"temp.dat" call testcon %down:\u= %
   <"temp.dat" set /P "down="
   del "temp.dat"
   >nul chcp %codepage%
)
setlocal disableDelayedExpansion

call :upsideDown "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz,.!?"
echo(
call :upsideDown "It is a funny thing, the whole line appears upside-down!"

endlocal
goto :eof


:upsideDown
:: lazy implementation: for/L loop over 8192 chars; no check for %~1
: %~1   Text to display upside down
: %~2   [optional] variable to store
set "text= %~1"
set "udText= "
set "c= "
set "i=0"
:loop
   set /A "i+=1"
   call set "c=%%text:~%i%,1%%"
   if not defined c goto :break
   for /F "tokens=2 delims=%c%" %%c in ("#%upsideH%") do set "h=%%~c"
   for /F "tokens=2 delims=%c%" %%c in ("#%upsideL%") do set "l=%%~c"
   call set "h=%%h:~%charCount%,1%%"
   call set "l=%%l:~%charCount%,1%%"
   if "0x" == "0x%h%%l%" ( set "udText=%c%%udText%"
   ) else call set "udText=%%down:~0x%h%%l%,1%%%%udText%%"
   goto :loop
:break
echo(text  : "%text:~1%"
echo(udtext: "%udtext:~1%"
goto :eof

Using Unicode characters in the console just takes a little more work, as it is "not well supported" in batch.
So JScript is simpler to use.


penpen
