Saving a filename with a unicode character

Re: Saving a filename with a unicode character

#31 Post by aGerman » 08 Apr 2012 14:34

@MKANET
Small workaround. Change test.bat as follows and double click it (callbat.vbs must be available).

Code: Select all

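@echo off &setlocal
rem Not set yet: relaunch through callbat.vbs, which presumably defines
rem the variables and calls this batch again, then quit this instance.
if not defined emptytriangleleft (
  start "" "callbat.vbs"
  goto :eof
)

rem Build the two file names and create them as empty files.
set "x=%emptytriangleleft% Here %emptytriangleright%.txt"
set "y=%fulltriangleleft% NOT HERE %fulltriangleright%.txt"
>"%x%" type nul
>"%y%" type nul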

You will lose any arguments you passed to test.bat.



@Ed
These characters can't be assigned to a variable directly in batch code because they are not supported. Your example also doesn't work for me.

Regards
aGerman

#32 Post by Liviu » 08 Apr 2012 16:14

Guess I must be guilty of mixing the control-character complication into the already confusing unicode matters ;-)

Ed Dyreen wrote: aGerman, why does it work with you and not for me :cry:

Short answer, at the risk of repeating myself: do _not_ use 'dir /b' in "for /f" loops anywhere near unicode.

Long answer: there are two characters which echo as "◄" in the console. One is the proper U+25C4 "black left-pointing pointer", per the unicode definition. The second is U+0011 (Ctrl-Q), which "happens" to display the same glyph by DOS/OEM legacy. The two characters are not the same; for example, U+25C4 is valid in pathnames while U+0011 is not.

A plain "for" loop returns the correct (unicode) filename from disk, which uses U+25C4. A "for /f" loop with 'dir /b' returns the remapped U+0011 character instead. However good that may look on screen, the two don't match, so the "if exist" test fails to find the file. Step by step below, assuming the Special◄.CMD file already exists...

Code: Select all

C:\tmp>chcp
Active code page: 437

C:\tmp>dir /b special*.cmd
Special◄.CMD

C:\tmp>for %a in (special*.cmd) do @if exist %a (echo %a) else echo ???
Special◄.CMD

C:\tmp>for /f %a in ('dir /b special*.cmd') do @if exist %a (echo %a) else echo ???
???

Now, to prove the 2nd point about the difference between U+25C4 vs. U+0011, run the following...

Code: Select all

C:\tmp>for %a in (special*.cmd) do @set u25c4=%a

C:\tmp>set u25c4
u25c4=Special◄.CMD

C:\tmp>for /f %a in ('dir /b special*.cmd') do @set u0011=%a

C:\tmp>set u0011
u0011=Special◄.CMD

C:\tmp>cmd /u /c echo %u25c4:~7,1%>u25c4.txt

C:\tmp>cmd /u /c echo %u0011:~7,1%>u0011.txt

Then open the two .txt files in a hex viewer, and you'll notice that the first one carries character 0x25C4 (= bytes C4 25) while the second one has 0x0011 instead.
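
If no hex viewer is at hand, the byte values mentioned here can be cross-checked outside of cmd. What follows is a minimal Python sketch (not batch, and not part of the original experiment); the helper name `utf16le_hex` is made up for illustration. It prints the UTF-16LE bytes one would expect `cmd /u` to write for each of the two characters.

```python
import unicodedata


def utf16le_hex(ch):
    """Return the UTF-16LE bytes of a string as space-separated hex."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-16-le"))


if __name__ == "__main__":
    for ch in ("\u25c4", "\u0011"):
        # Control characters have no Unicode name, hence the fallback value.
        name = unicodedata.name(ch, "<control>")
        print(f"U+{ord(ch):04X}  {utf16le_hex(ch)}  {name}")
```

U+25C4 comes out as `C4 25` and U+0011 as `11 00`. In the actual .txt files each character would be followed by the line ending that `echo` appends.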

Liviu

#33 Post by aGerman » 08 Apr 2012 17:14

A new possibility opens up in that case, Liviu.

You could create a file by hand (only once) called
ᐊ ᐅ ◀ ▶.#

With that FOR technique you can read the file name and use it to assign the variables:

Code: Select all

@echo off &setlocal
rem Pick up the four characters from the name of the hand-made *.# file.
for %%F in (*.#) do (
  for /f "tokens=1-4" %%A in ("%%~nF") do (
    set "emptytriangleleft=%%A"
    set "emptytriangleright=%%B"
    set "fulltriangleleft=%%C"
    set "fulltriangleright=%%D"
  )
)

rem Build the two file names and create them as empty files.
set "x=%emptytriangleleft% Here %emptytriangleright%.txt"
set "y=%fulltriangleleft% NOT HERE %fulltriangleright%.txt"
>"%x%" type nul
>"%y%" type nul


Regards
aGerman

#34 Post by Liviu » 08 Apr 2012 18:22

aGerman wrote: With that FOR technique you can read the file name and use it to assign the variables:

Yes, indeed. Pushing the idea to the limit, a set of 512 files with 128-character names each would cover the entire UTF-16 range, so with some clever indexing one could get any unicode character off a filename and into a variable.
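
The arithmetic behind that estimate is 512 × 128 = 65536, the number of 16-bit code units. One possible indexing scheme is sketched below in Python (not batch). The names `CHARS_PER_NAME`, `FILE_COUNT` and `locate` are hypothetical, and the layout, with file k holding code units 128·k through 128·k+127 in order, is an assumption; the post does not specify one. Some code units (NUL, and characters such as `\ / : * ? " < > |`) are not allowed in Windows file names, so a real set of files would have to work around them.

```python
CHARS_PER_NAME = 128   # characters stored in each file name (from the post)
FILE_COUNT = 512       # number of files (from the post)


def locate(code_unit):
    """Map a 16-bit code unit to (file index, offset within the name).

    Assumes file k holds code units 128*k .. 128*k + 127 in order,
    which is a hypothetical layout chosen for this sketch.
    """
    if not 0 <= code_unit < FILE_COUNT * CHARS_PER_NAME:
        raise ValueError("not a 16-bit code unit")
    return divmod(code_unit, CHARS_PER_NAME)


if __name__ == "__main__":
    print(FILE_COUNT * CHARS_PER_NAME)  # total code units covered
    print(locate(0x25C4))               # position of the left-pointing pointer
```

In batch, the offset could then feed a substring expansion of the kind used earlier in the thread (`%u25c4:~7,1%`).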

Liviu

#35 Post by MKANET » 08 Apr 2012 20:43

aGerman, this is THE SOLUTION!

It works very well and doesn't require VBScript. This is a fantastic way to assign almost any character to filenames via batch file! The problem is now completely resolved as far as I'm concerned. I'm very glad none of us gave up too quickly.

#36 Post by MKANET » 12 Apr 2012 12:53

aGerman or anyone,

I currently have a text file (unicode.txt) which was saved from notepad.exe in unicode format. Its contents, including the special unicode characters, are preserved and completely readable in notepad.

I would like to do the following (or something similar):

I need to append the contents of another, normal ASCII text file (normal.txt) to unicode.txt while keeping it a true unicode text file, so that its contents stay completely readable and the unicode characters fully intact.

If I try the command below, it destroys the contents of the file:

Code: Select all

type normal.txt >> unicode.txt


The result should be as if I had copied and pasted the entire normal.txt file into unicode.txt in notepad.exe and saved it as unicode.

I REALLY hope I explained this correctly!

Thanks a million as always for your expertise.

#37 Post by MKANET » 12 Apr 2012 13:06

Never mind, I finally found the command through a Google search:

Code: Select all

cmd /u /c type normal.txt >> unicode.txt
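
Why the plain `type` breaks the file: Notepad's "Unicode" format is UTF-16LE with a byte order mark, so every character occupies two bytes, whereas `type normal.txt` appends one byte per character. The `/u` switch makes cmd write the output of internal commands to a file or pipe as Unicode. The same append can be sketched in Python (not batch), assuming normal.txt is plain ASCII; the function name `append_ascii_as_utf16le` is hypothetical.

```python
def append_ascii_as_utf16le(unicode_path, ascii_path):
    """Append an ASCII file to a UTF-16LE file without breaking its encoding."""
    # Binary mode keeps line endings exactly as they are in the source file.
    with open(ascii_path, "rb") as src:
        text = src.read().decode("ascii")
    # "utf-16-le" writes no byte order mark, which is what we want here:
    # the existing file is assumed to start with one already.
    with open(unicode_path, "ab") as dst:
        dst.write(text.encode("utf-16-le"))
```

A call such as `append_ascii_as_utf16le("unicode.txt", "normal.txt")` would correspond to the command above. If normal.txt contained non-ASCII bytes, the `decode("ascii")` step would raise an error instead of silently corrupting the output.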
