Saving a filename with a unicode character

MKANET
Posts: 160
Joined: 31 Mar 2012 21:31

#1 Post by MKANET » 06 Apr 2012 23:19

I've been searching and searching on google for examples, but can't figure out how to do it.

All I want to do is create an empty text file with the following name (as a placeholder).

something like this:

echo > ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ %location% ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒


I don't need to see this character ▒ in a command window, which is what most of the examples I've found are about. All I know is that I might have to use the chcp command. I have no idea which code page to use for it, or how to set up the batch file so that my echo command works.

Below is everything I could find about that special character (a unicode character lookup).
http://www.fileformat.info/info/unicode/char/2592/index.htm

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: Saving a filename with a unicode character

#2 Post by Liviu » 07 Apr 2012 00:48

MKANET wrote:All I want to do is create an empty text file

FWIW any redirected 'echo' will create a file at least 2 bytes long (a CR+LF empty line) i.e. not an empty file proper.

MKANET wrote:something like this:

echo > ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ %location% ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

I don't need to see this character ▒ in a command window

Then, where exactly do you need to see it? The U+2592 character is in the "block elements" group, which many GUI oriented fonts don't cover. In XP for example, using the default Tahoma font, you'd see empty rectangles in place of "▒".

MKANET wrote:I have no idea what code page to use for it; or, how to even set up the batch file for my echo command to work.

You can use any codepage that includes the given character, for example 437 and 850 both do. You need to create/edit the batch file in a codepage-aware editor, and make sure it's saved under the same codepage the cmd prompt will run it under. However, given that you don't care what it looks like in the console, and the high chance it may look wrong in a GUI like Win Explorer, I am not sure why you'd want to do that.

Liviu
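
As a rough cross-check of the point that code pages 437 and 850 both include U+2592, the Python sketch below asks Python's bundled codecs to encode the character. Python is not used anywhere in this thread, and the helper name `byte_in_codepage` is made up for illustration; the sketch only inspects code page tables and does not touch cmd.

```python
from typing import Optional

SHADE = "\u2592"  # MEDIUM SHADE, the character from the question


def byte_in_codepage(char: str, codepage: str) -> Optional[bytes]:
    """Return the encoded byte(s) for char in codepage, or None if unmapped."""
    try:
        return char.encode(codepage)
    except UnicodeEncodeError:
        return None


# Two OEM code pages named in the thread, plus the ANSI code page for contrast.
for cp in ("cp437", "cp850", "cp1252"):
    print(cp, byte_in_codepage(SHADE, cp))
```

In Python's tables both OEM code pages place the character at byte 0xB1, while code page 1252 has no slot for it.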

#3 Post by MKANET » 07 Apr 2012 01:15

I don't mind whether the file is 2 bytes long or completely empty, as long as it gets created by my batch file. This file (when created manually) is easily viewable in Windows Explorer, and that's the only place I need to see it. I would like an actual working example, as I have already tried looking up codepages and could not figure out how to make it work. My google searches turn up a lot of talk about displaying unicode characters, but no specific examples that are even close to what I'm trying to do.

#4 Post by Liviu » 07 Apr 2012 01:42

MKANET wrote:I would like an actual working example; as, I have already tried going down the route of looking up codepages and trying to figure out how to make it work.

Paste the following

Code: Select all

@echo.>"▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ something ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒"
into an empty Notepad document, choose Save As, select Encoding = Unicode, and name it, say, C:\tmp\mkanet.txt.

Open a cmd prompt at C:\tmp, execute the following

Code: Select all

C:\tmp>chcp
Active code page: 437

C:\tmp>type mkanet.txt
@echo.>"▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ something ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒"

C:\tmp>type mkanet.txt >mkanet.cmd

C:\tmp>type mkanet.cmd
@echo.>"▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ something ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒"

C:\tmp>mkanet

C:\tmp>dir /b *.
▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ something ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

C:\tmp>
Note that the codepage returned by 'chcp' (437 above) is not essential, but if the 'type' lines are not identical between mkanet.txt and mkanet.cmd, then the console is set to a codepage which does not include your character.

Otherwise, the last 'dir' shows that the file has been created successfully.

MKANET wrote:I see a lot of talk about displaying unicode characters when doing google searches; but not any specific examples that are even close to what I'm trying to do.
It is easy to display any unicode characters at the prompt, just save them to a (unicode encoded) .txt file, and 'type' the file at a cmd prompt set to use a font which covers the desired ranges (generally not the default raster font).

It is however not possible to hardcode arbitrary unicode pathnames in a batch file, since those are 8-bit codepage-encoded text. The lucky break for "▒" is that it's covered by many of the OEM codepages.

Liviu
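
The effect of `type mkanet.txt >mkanet.cmd` can be approximated outside cmd. The Python sketch below is only an analogy: it assumes that Notepad's "Unicode" encoding means UTF-16LE with a byte order mark and that the console code page is 437, and it shortens the file name for readability. It does not call cmd or `type`.

```python
# One line of batch text, with U+2592 on both sides of the name.
line = '@echo.>"\u2592\u2592\u2592 something \u2592\u2592\u2592"\r\n'

# Assumed Notepad "Unicode" output: a BOM followed by UTF-16LE code units.
as_notepad_unicode = b"\xff\xfe" + line.encode("utf-16-le")

# Assumed result of the redirected 'type' under code page 437: the BOM is
# consumed and every character becomes a single OEM byte.
decoded = as_notepad_unicode.decode("utf-16")  # the utf-16 codec strips the BOM
as_cp437_cmd = decoded.encode("cp437")

print(len(as_notepad_unicode), "bytes as Unicode text")
print(len(as_cp437_cmd), "bytes as CP-437 batch file")
print(as_cp437_cmd)
```

A character that the target code page lacks would make the `encode` step raise an error here; at the prompt the corresponding symptom is the mismatch between the two `type` outputs described above.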

#5 Post by MKANET » 07 Apr 2012 10:31

I probably should have said this originally.

The problem I was running into (before I started this thread) is that when I put the line below (or anything that echoes unicode characters) in a batch file saved as 'Unicode' in Notepad.exe, I get the result below, which obviously doesn't work. It works from the command line, just not from a batch file. I am certain I am saving the batch file as a Unicode (not ANSI) text file using Notepad. What trick do I need for this to work? I'm guessing you might have thought it goes without saying.

Code: Select all

@echo.>"▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒ something ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒"



Result:
C:\tmp>■
'■' is not recognized as an internal or external command,
operable program or batch file.

#6 Post by Liviu » 07 Apr 2012 11:34

MKANET wrote:What trick do I need to do for this to work? I'm guessing you might have thought it goes without saying.

I am guessing you did not try the code I posted. Please do. Hint: unicode batch files do not work, which is precisely why the "type mkanet.txt >mkanet.cmd" is needed, which converts the unicode .txt to a CP-437 .cmd.

Liviu
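
One plausible reading of the '■' error in post #5 is that cmd treats the first two bytes of the Unicode batch file, the UTF-16 byte order mark FF FE, as ordinary code page characters. The Python sketch below checks what those two bytes look like under code page 437 (assumed here because it is the code page shown in the earlier transcript); it illustrates the byte mapping only, not cmd's actual parsing.

```python
BOM_UTF16_LE = b"\xff\xfe"

# Decode the BOM bytes the way an 8-bit OEM reader would see them.
seen_by_cp437 = BOM_UTF16_LE.decode("cp437")

for ch in seen_by_cp437:
    print(f"U+{ord(ch):04X}", repr(ch))
```

Under that mapping byte FF is a no-break space and byte FE is the black square U+25A0, which matches the character in the error message.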

#7 Post by MKANET » 07 Apr 2012 11:48

I did try; that's how I know it works from the command line. It is no use to me to do it manually, as I can already create the file in Windows Explorer. I need a way to automate this using a batch file, if at all possible.

#8 Post by MKANET » 07 Apr 2012 11:50

Never mind, I'm just being dense. Now I understand. I'll try it as soon as I get back to the computer.

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#9 Post by aGerman » 07 Apr 2012 14:38

Batch has virtually no unicode support. With some tricks you can save unicode text in a file ...

Code: Select all

@echo off

:: change the code page to ANSI
>nul chcp 1252

:: Byte Order Mark
set "BOM=ÿþ"

:: Wide character U+2592 written as the bytes 0x92 (’) and 0x25 (%) in little-endian order
set "WCHAR=’%%"

:: start the line with the BOM and 15 times ’%
>"123.txt" set /p "=%BOM%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%"<nul

:: convert normal ASCII to Unicode using CMD /U and append it to the line
cmd /u /c >>"123.txt" set /p "=location"<nul

:: append 15 times ’%
>>"123.txt" set /p "=%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%%WCHAR%"<nul

:: convert the line break to Unicode and append it
cmd /u /c >>"123.txt" echo(

... but I never found a way to create unicode file names :(

Regards
aGerman
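
The byte arithmetic behind this trick can be checked independently of cmd. The Python sketch below assumes, as the script does, that the batch file is read under code page 1252; the variable names are illustrative and the sketch does not reproduce what `set /p` writes.

```python
BOM_TEXT = "\u00ff\u00fe"  # the characters ÿþ from the script
WCHAR_TEXT = "\u2019%"     # the characters ’ and % from the script

# Under code page 1252 each of these characters is exactly one byte.
bom_bytes = BOM_TEXT.encode("cp1252")
wchar_bytes = WCHAR_TEXT.encode("cp1252")

# Read the two-byte pair back as one UTF-16 little-endian code unit.
decoded = wchar_bytes.decode("utf-16-le")

print(bom_bytes.hex(), wchar_bytes.hex(), f"U+{ord(decoded):04X}")
```

The pair 92 25, read low byte first, is the code unit 0x2592, and FF FE is the UTF-16LE byte order mark.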

#10 Post by Liviu » 07 Apr 2012 15:34

aGerman wrote:Batch has virtually no unicode support.

I'd phrase it more like "batch has half-baked and very finicky unicode support" ;-)

Does support:
- unicode input in the interactive console, for example pasting unicode text at the prompt;
- unicode console output, both at the default prompt, and "cmd /u";
- environment and loop path variables are unicode;
- internal commands (dir, copy etc) take unicode command lines.

Does not support:
- batch files encoded as unicode text;
- unicode input other than the interactive console, for example redirection from file, or "for /f" reading of a unicode encoded text file;
- piping unicode is inconsistent, for example "more" doesn't take unicode input even in a "cmd /u" console.

aGerman wrote:I never found a way to create unicode file names :(

That's correct in the sense of not being able to hardcode unicode names in a batch file. And I know that's what you meant, but just to nitpick anyway ;-) it's still possible to create unicode filenames if the name itself comes from elsewhere, for example from an environment variable, or the name of a file or path that already exists on disk.

Liviu

P.S. The title of this thread is slightly misleading. Issue here was not about unicode characters at large, but rather about "special" characters missing from certain codepages. My code "happens" to work with "▒", but would not work with, for example, CJK (Asian) characters.

#11 Post by MKANET » 07 Apr 2012 16:24

Thanks everyone for being so patient with me. Liviu, I am able to include my special character in the new batch file I made via the method you showed me. It works very well.

However, as you said earlier, this method may not work with other special characters. I am hoping I can use the below special characters in the below sample code:
▶ = U+25B6
◀ = U+25C0
ᐊ = U+140A
ᐅ = U+1405

Code: Select all

if exist notes.txt set errors=ᐊ Here ᐅ
if not exist notes.txt set errors=◀ NOT HERE ▶
echo > Is Notes.txt here %errors%.txt


I am able to create this file manually with Windows Explorer. Is this a circumstance where I need to use the chcp command?

Could I trouble one of you for a working example that makes something like the lines above work in a batch file?

Thanks so much,
MKANET

#12 Post by Liviu » 07 Apr 2012 18:06

MKANET wrote:However, as you said earlier, this method may not work with other special characters. I am hoping I can use the below special characters in the below sample code:
▶ = U+25B6
◀ = U+25C0
ᐊ = U+140A
ᐅ = U+1405

I don't think that's possible. The method will work for any characters that are "mapped" into one and the same codepage. I don't know of any such codepage that would cover all of the above.

You can check yourself, starting for example at http://msdn.microsoft.com/en-us/goglobal/bb964653 and looking at each of the listed codepages. If you find one which has all the special characters you want, then you can reuse the same method, just insert a "chcp XYZ" at the top, where XYZ is the codepage identifier (in place of the default 437 used in my example).

However, if you don't find such a codepage, then what you are after is simply not possible.

Liviu

P.S. Off-topic, and odd... There are a couple of control characters visually close to the U+25B6/25C0 "pointing triangles", namely the U+25BA/25C4 "pointing pointers" which are traditionally assigned to control characters 0x10/0x11 (see for example http://en.wikipedia.org/wiki/Code_page_ ... 31_and_127).

The cmd parser seems to generally accept them, except in redirections

Code: Select all

C:\tmp>echo ^Q etc ^P
◄ etc ►

C:\tmp>echo >"^Q etc ^P"
The filename, directory name, or volume label syntax is incorrect.
(where ^Q, ^P is what cmd shows for the control codes entered as Ctrl-Q, Ctrl-P).
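
The code page search suggested earlier in this post can be partly automated. The Python sketch below is a rough aid, not an authoritative list: `CANDIDATE_CODEPAGES` is a hand-picked sample of codecs that ship with Python, the function name `codepages_covering` is made up, and Python's tables may differ in small details from what a given Windows version does.

```python
CANDIDATE_CODEPAGES = [
    "cp437", "cp850", "cp852", "cp866",
    "cp1250", "cp1251", "cp1252", "cp1253", "cp1254",
    "cp1255", "cp1256", "cp1257", "cp1258",
    "cp932", "cp936", "cp949", "cp950",
]


def codepages_covering(text):
    """Return the candidate code pages that can encode every character of text."""
    covering = []
    for cp in CANDIDATE_CODEPAGES:
        try:
            text.encode(cp)
        except UnicodeEncodeError:
            continue
        covering.append(cp)
    return covering


WANTED = {
    "U+25B6": "\u25b6",
    "U+25C0": "\u25c0",
    "U+140A": "\u140a",
    "U+1405": "\u1405",
}

for label, ch in WANTED.items():
    print(label, codepages_covering(ch))
print("all four together:", codepages_covering("".join(WANTED.values())))
```

An empty list for the combined string means that none of the sampled code pages covers all four characters, which is the situation in which the method from post #4 cannot be reused.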

#13 Post by MKANET » 07 Apr 2012 18:48

Is there any way to change the codepage just when the special character is needed, then switch back to the default codepage for the rest of the characters? If so, how would I do it when trying to create a filename using echo redirection? I'm not sure what command/syntax or codepage to use for:

▶ = U+25B6
◀ = U+25C0
ᐊ = U+140A
ᐅ = U+1405

As for using the other visually similar characters, that would be a great alternative, except that I need redirection to be able to create the file.

Edit: Maybe I'm not understanding, don't those special characters belong to a codepage that I can change to?


I found a batch file by searching on google which switches between codepages. I just don't know what code pages those special characters belong to.

Maybe I've missed something very fundamental; and, this wouldn't apply in my case. I can't find a website where I can just paste a special character; and, get the code page for it.

Code: Select all

@echo off
(
chcp 65001
rem Do my UTF work
chcp 850
)
echo This works

chcp 65001 & type myFile & chcp 850
echo This also works

#14 Post by MKANET » 07 Apr 2012 19:47

Maybe I can create the file without a redirect? If there's a way to use copy con in a batch file, that might work. I've been searching on google since my last post for some kind of solution.

#15 Post by aGerman » 07 Apr 2012 19:56

Forget about unsupported characters.
Perhaps these are sufficient for you:

Code: Select all

@echo off &setlocal

:: save the current code page
for /f "tokens=2 delims=:" %%i in ('chcp') do set /a oemcp=%%~ni

:: ANSI
>nul chcp 1252

set "x=‹ Here ›.txt"
set "y=« NOT HERE ».txt"
>"%x%" type nul
>"%y%" type nul

:: back to the original (OEM) code page
>nul chcp %oemcp%
pause

Regards
aGerman
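
A quick way to see why this script switches to code page 1252 first is to check where its substitute characters are mapped. The Python sketch below assumes the batch file itself is saved in code page 1252 and compares that against code page 437 as an example OEM code page; the helper name `encodable` is illustrative.

```python
ANGLE_MARKS = "\u2039\u203a\u00ab\u00bb"  # the characters ‹ › « » from the script


def encodable(char, codepage):
    """Report whether char has a mapping in the given code page."""
    try:
        char.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


ansi_bytes = ANGLE_MARKS.encode("cp1252")
print(ansi_bytes.hex())

for ch in ANGLE_MARKS:
    print(f"U+{ord(ch):04X}",
          "cp1252:", encodable(ch, "cp1252"),
          "cp437:", encodable(ch, "cp437"))
```

In Python's tables all four characters are single bytes in code page 1252, while code page 437 has the double angle marks but not the single ones, which is consistent with the script changing the code page before it uses the names.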
