«special…character»

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

#1 Post by Ed Dyreen » 30 Jul 2012 05:59

'
Why isn't this special character in my codepage 850?

Code:

…    …    …    horizontal ellipsis
I'm trying to identify which codepage the title bar uses (the very same character seems to produce different results).

Code:

>chcp
Actieve codetabel: 850

>title title.

>echo text.

Code:

title…  « imagine this is the titlebar
text.
I would like my codepage to match, so that it produces:

Code:

title…  « imagine this is the titlebar
text…
When I try it from a batch file, I get a totally different result :shock:

Code:

Actieve codetabel: 850
titleà  « imagine this is the titlebar
textà


Thanks,

#2 Post by foxidrive » 30 Jul 2012 06:59

Code page differences and foreign character sets give me a headache. :)

Sorry, that's no help... just saying that I avoid dealing with them. ;)

#3 Post by Liviu » 30 Jul 2012 09:13

The cmd window and title bar are fully Unicode, which is why setting the title at the prompt works. However, the batch file itself is 8-bit extended ASCII, and when it is executed the hardcoded strings are translated to Unicode based on the active codepage. In your case that is cp850, which indeed does not contain the ellipsis.

You probably get an 'à' instead because your editor saved the text file using codepage 1252 encoding (typical of GUI editors), where '…' is char 0x85. But when the batch file runs, the active codepage is 850, where char 0x85 happens to be 'à' (see http://msdn.microsoft.com/en-us/goglobal/cc305145, http://msdn.microsoft.com/en-us/goglobal/cc305160.aspx).
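
To make the mismatch concrete, here is a minimal sketch in Python. Python is not part of this thread; it is used only because its standard `cp1252` and `cp850` codecs make the byte mapping easy to inspect. The variable names are made up for the illustration.

```python
# Byte 0x85 means different things depending on the codepage used to decode it.
raw = b"\x85"

as_ansi = raw.decode("cp1252")  # how a typical GUI editor interprets the byte
as_oem = raw.decode("cp850")    # how the byte is read while cp850 is active

# !a prints an ASCII-safe representation, so this runs on any console.
print(f"0x85 in cp1252: {as_ansi!a} (U+{ord(as_ansi):04X})")
print(f"0x85 in cp850:  {as_oem!a} (U+{ord(as_oem):04X})")

# cp850 has no slot for the ellipsis (U+2026), so a strict encode fails.
try:
    "\u2026".encode("cp850")
    ellipsis_in_cp850 = True
except UnicodeEncodeError:
    ellipsis_in_cp850 = False

print("ellipsis representable in cp850:", ellipsis_in_cp850)
```

The same byte comes back as the ellipsis under cp1252 and as 'à' under cp850, which matches the "titleà" result reported above.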

The right and safe way to get any character in the title would be to set an environment variable with the unicode string you want to use, then reference that variable from the batch file.

A hack'ish and more limited workaround which could work for ellipsis specifically would be to manually "remap" the codes so that they end up where you want them. For example, the following sets the title to "« … »".

Code:

chcp 1252
title ½ à ╗
chcp 850
The three characters above must be 0xAB, 0x85, 0xBB in the .cmd file on disk.

Liviu

#4 Post by aGerman » 30 Jul 2012 10:06

We have discussed this a few times before. Since German has some unusual characters (ÄÖÜäöüß), I'm fairly familiar with the topic :wink: Your editor probably saves the batch file in your default ANSI codepage (aka ACP, which should be 1252 in your case), while CMD works with your default OEM codepage (aka OEMCP, which should be 850 for you).
There is a simple trick using variables to get it to work:

Code:

@echo off &setlocal
>nul chcp 1252
set "x=…"
>nul chcp 850
title title%x%

Unfortunately it seems this character is not supported inside of the cmd window while it works perfectly for a lot of other special characters.

Regards
aGerman

#5 Post by Ed Dyreen » 30 Jul 2012 13:00

'
Liviu,

Your explanation is of great value to me. Unfortunately your hack'ish example doesn't work here.

Code:

>chcp 1252 &title title ½ à ╗ &chcp 850
Actieve codetabel: 1252
Actieve codetabel: 850

>
Gives title ½ à {square}


aGerman,

I was hoping you'd read this, and you did. The example that sets the variable works sweet.


Thanks :D
I've bookmarked this page and should print it out on paper or save it as HTML, because it's that valuable to me.
I have a feeling I'll have a few more questions soon...

#6 Post by aGerman » 30 Jul 2012 13:56

Just an example of how I usually work with it:

Code:

@echo off &setlocal

setlocal&for /f "tokens=2 delims=:" %%a in ('chcp') do (set /a oemcp=%%~na&chcp 1252>nul)
for /f "tokens=1-7" %%a in ('echo Ä Ö Ü ä ö ü ß^&chcp %oemcp%^>nul') do (
set au=%%a&set ou=%%b&set uu=%%c&set al=%%d&set ol=%%e&set ul=%%f&set sz=%%g)
(endlocal&set Ä=%au%&set Ö=%ou%&set Ü=%uu%&set ä=%al%&set ö=%ol%&set ü=%ul%&set ß=%sz%)

:: German example:
echo %Ä%nderungen der Ma%ß%einheiten f%ü%hren m%ö%glicherweise zu
echo verf%ä%lschten Ergebnissen in der %Ü%bertragsgleichnung.
echo(
pause


Regards
aGerman

#7 Post by Liviu » 30 Jul 2012 15:23

aGerman wrote: Unfortunately it seems this character is not supported inside of the cmd window while it works perfectly for a lot of other special characters.

Difference, I think, is that the other characters in your example do exist in both ANSI and OEM codepages (even if they may have different numeric values between them) which allows unicode<->oem<->ansi conversions to work "losslessly". The "…" character does not exist in the OEM codepage at all.
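
A quick way to check which characters exist in both codepages is sketched below in Python (an outside illustration, not something used in this thread; `survives_oem` is a hypothetical helper name). Python's strict encoder raises an error where a character is missing. Windows itself may substitute a replacement character instead, so this only shows where the gap is.

```python
def survives_oem(ch: str) -> bool:
    """Return True if ch can be encoded in cp850 and decoded back unchanged."""
    try:
        return ch.encode("cp850").decode("cp850") == ch
    except UnicodeEncodeError:
        return False


# The German characters from the example above, plus the ellipsis.
for ch in "ÄÖÜäöüß…":
    ansi_hex = ch.encode("cp1252").hex().upper()
    status = "ok" if survives_oem(ch) else "lost"
    print(f"{ch!a}: cp1252 byte 0x{ansi_hex}, OEM round trip {status}")
```

The umlauts and ß have different byte values in cp1252 and cp850 but are present in both, so they round-trip. The ellipsis has a cp1252 byte (0x85) but no cp850 equivalent.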

Ed Dyreen wrote: Unfortunately your hack'ish example doesn't work here.

Code:

>chcp 1252 &title title ½ à ╗ &chcp 850
Actieve codetabel: 1252
Actieve codetabel: 850
Gives title ½ à {square}

A few things to check... First off, if you run the above at the cmd prompt, you'll get exactly what you typed, in this case "½ à ╗". This is because the command line itself is unicode aware, so it takes the string as entered without performing any codepage conversions. My snippet was meant to be included in a batch file.

Then, you need to leave "chcp 1252" on its own separate line. This is because the parser works one line at a time, and the translation is based on the codepage in effect when the (entire) line is parsed. In other words, the "title" command is parsed and built before "chcp 1252" has a chance to execute.

Lastly, doublecheck and make sure that the characters in the batch file you saved are indeed 0xAB, 0x85, 0xBB. How you get them there depends on the editor you use, and sometimes on "magical" codepage conversions copy/paste does on its own.
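
If the editor can't be trusted, one way to sidestep it is to write the file byte by byte from a script. The sketch below uses Python 3.8+ purely as an illustration; the file name `title_test.cmd` is hypothetical, and the three-line content mirrors the chcp/title/chcp snippet from earlier in the thread.

```python
from pathlib import Path

# Hypothetical file name, chosen only for this example.
script = Path("title_test.cmd")

# 0xAB, 0x85, 0xBB are the cp1252 bytes for the three title characters,
# separated by spaces; each command sits on its own CRLF-terminated line.
payload = b"chcp 1252\r\ntitle \xab \x85 \xbb\r\nchcp 850\r\n"
script.write_bytes(payload)

# Read the file back and show the exact bytes of the title line.
title_line = script.read_bytes().split(b"\r\n")[1]
print(title_line.hex(" "))
```

Because the bytes are written directly, no editor or clipboard conversion can change them on the way to disk.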

Liviu

#8 Post by aGerman » 30 Jul 2012 18:16

Liviu wrote:
aGerman wrote: Unfortunately it seems this character is not supported inside of the cmd window while it works perfectly for a lot of other special characters.

Difference, I think, is that the other characters in your example do exist in both ANSI and OEM codepages (even if they may have different numeric values between them) which allows unicode<->oem<->ansi conversions to work "losslessly". The "…" character does not exist in the OEM codepage at all.

Yeah, probably it works in the title because it was encoded to unicode internally.

Liviu wrote:
Ed Dyreen wrote: Unfortunately your hack'ish example doesn't work here.

Code:

>chcp 1252 &title title ½ à ╗ &chcp 850
Actieve codetabel: 1252
Actieve codetabel: 850
Gives title ½ à {square}

[ ... ]

Lastly, doublecheck and make sure that the characters in the batch file you saved are indeed 0xAB, 0x85, 0xBB. How you get them there depends on the editor you use, and sometimes on "magical" codepage conversions copy/paste does on its own.

When I used my Editor I had to write ...

Code:

chcp 1252
echo «…»
chcp 850

... because it displays and saves the file in ANSI character set.
The point is that it only works for the title (apart from '…' for which we already know it can't be displayed in the cmd window).

Code:

chcp 1252
echo «»
chcp 850

... displays ...
Aktive Codepage: 1252.
½╗
Aktive Codepage: 850.

(Yes, I checked the characters with my hex editor; the content is AB BB for both characters.)
Still, I remember some discussions with Americans here and in other forums. They stated that it would work this way and that assigning variables (as I did) was not necessary. I can't confirm that for my German Windows (neither XP nor Win7).

Regards
aGerman

#9 Post by Liviu » 30 Jul 2012 21:04

aGerman wrote: The point is that it only works for the title (apart from '…' for which we already know it can't be displayed in the cmd window).

Sorry, no, and don't trust your eyes ;-) It does work both for the title and the console itself, including the '…' ellipsis.

What makes discussing Unicode issues even more complicated than it needs to be is the hidden dependency on one's console settings (codepage, font) and editor habits (default codepage, and "automatic" copy/paste adjustments).

This snippet

Code:

C:\tmp>chcp 1252
Active code page: 1252

C:\tmp>type ellipsis.cmd
@echo off
setlocal
for /f "tokens=2 delims=:" %%a in ('chcp') do set /a oemcp=%%a

chcp 1252 >nul
echo «…»
chcp %oemcp% >nul
outputs what looks like this with the (default) console raster font

Code:

C:\tmp>ellipsis
½à╗
but gives this with the console set to use a multi-codepage font like Lucida Console

Code:

C:\tmp>ellipsis
«…»

Regardless of the font (and what's displayed in the console), the output is byte-for-byte the same, as can be verified by redirecting it with >somefile.txt.
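
For anyone without a hex editor at hand, a redirected file can be inspected with a few lines of Python (an outside illustration, Python 3.8+; `describe` is a hypothetical helper). The sample bytes written below are an assumption standing in for real redirected output, not data captured from a console.

```python
from pathlib import Path


def describe(path: Path) -> dict:
    """Return a file's bytes as hex plus their rendering in two codepages."""
    data = path.read_bytes().rstrip(b"\r\n")
    return {
        "hex": data.hex(" ").upper(),
        "cp1252": data.decode("cp1252"),
        "cp850": data.decode("cp850"),
    }


# Stand-in for the redirected output; these bytes are assumed, not captured.
sample = Path("somefile.txt")
sample.write_bytes(b"\xab\x85\xbb\r\n")

info = describe(sample)
print(info["hex"])
print(ascii(info["cp1252"]), ascii(info["cp850"]))
```

The hex column stays the same whichever codepage is used to render the bytes, which is the point being made here: only the interpretation changes, not the data.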

Liviu
