"universal" %DATE% parser

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

#1 Post by dbenham » 10 May 2011 21:43

There are many situations where %DATE% must be parsed into numeric year, month, and day components. But the format of %DATE% is highly variable depending on locality and personal preference. It is fairly easy to develop custom code for each individual situation. But it would be great to have a "universal" function that could parse any "normal" %DATE% format.

I believe there are existing robust methods that rely on reading the registry. But many employees cannot access the registry on their workplace computers, so that is not a good solution.

The DosTips :Unique function gives a good start to a general solution. It can handle formats with the year, month, and day in any order. But it depends on the DATE command prompt using the standard English date format tokens yy, mm and dd. In the "md and move Help experts" thread, aGerman pointed out to me how the tokens can vary by language. For example, German uses JJ, MM and TT instead.
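
To see which tokens a given machine uses, the format hint can be pulled out of the DATE prompt on its own. This is a minimal sketch that reuses the `echo:|date` pattern from the function below; it assumes the prompt prints the hint in parentheses on its second line, as it does on English systems (for example mm-dd-yy). The empty line piped into DATE leaves the system date unchanged.

```batch
@echo off
rem Print the parenthesized format hint from the second line of the DATE prompt.
rem Assumption: the hint sits in parentheses on line 2, as on English systems.
for /f "skip=1 tokens=2 delims=()" %%h in ('"echo:|date"') do echo Native date tokens: %%h
```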

Below is my attempt at a "universal" date parser that does not require access to the registry. It currently supports English and German (or any other language that happens to use the same date tokens). It should be trivial to add support for additional languages by adding the appropriate token translation maps. I'm only able to test the English translation, but I think the German should work as well.

I'd like to hear from anyone who can contribute additional translation maps for the function.

I'm also keenly interested if anyone sees a problem with the code, has optimizations to contribute, or perhaps has an entirely different algorithm.

Of course any code can be broken if someone sets the date format to something out of the ordinary. But I'm happy as long as the "universal" function supports "normal" formats.

Code: Select all

:parseDate yyVar mmVar ddVar
::
:: Parses %DATE% into numeric year, month, and date components
::
::   yyVar = name of variable to hold numeric year value
::   mmVar = name of variable to hold numeric month value
::   ddVar = name of variable to hold numeric day value

  setlocal enableDelayedExpansion
 
  :: Parse the native date format tokens output by the DATE command
  for /f "skip=1 tokens=2-4 delims=(-)" %%a in ('"echo:|date"') do (
 
    rem Parse the numeric date components of %DATE%
    for /f "tokens=1-3 delims=/.- " %%A in ("%date:* =%") do (
 
      rem Set variables named after native date tokens to appropriate numeric date component values
      set %%a=%%A
      set %%b=%%B
      set %%c=%%C
 
      rem Create string of sorted native date format tokens
      if %%a lss %%b if %%b lss %%c set "tokens=%%a %%b %%c"
      if %%a lss %%c if %%c lss %%b set "tokens=%%a %%c %%b"
      if %%b lss %%a if %%a lss %%c set "tokens=%%b %%a %%c"
      if %%b lss %%c if %%c lss %%a set "tokens=%%b %%c %%a"
      if %%c lss %%a if %%a lss %%b set "tokens=%%c %%a %%b"
      if %%c lss %%b if %%b lss %%a set "tokens=%%c %%b %%a"
    )
  )
  :: We now have variables with the correct values, but the code doesn't know
  :: the name of the variables! We must translate the unknown native variable
  :: names into known English names
 
  :: Initialize the translation of native tokens to English tokens
  set translate=
  set T_yy=
  set T_mm=
  set T_dd=
 
  :: Determine the correct token translation map
  :: Each map in the for list should contain the space delimited English tokens
  :: on the left followed by a semicolon followed by the space delimited native
  :: tokens on the right. The native tokens should be sorted alphabetically
  :: and the English tokens should be in the matching logical order.
  :: Currently supports English and German. Additional translation maps
  :: should be added for additional language support.
  for %%t in (
    "dd mm yy;dd mm yy"
    "yy mm dd;JJ MM TT"
  ) do (
    set test=%%~t
    if "!test:*;=!"=="%tokens%" set translate=%%~t
  )
 
  :: Parse the token translation map into variables used for translation
  for /f "tokens=1-6 delims=; " %%a in ("%translate%") do (
    set T_%%a=%%d
    set T_%%b=%%e
    set T_%%c=%%f
  )
 
  :: Transfer the values from native date variables to English date variables
  set E_yy=!%T_yy%!
  set E_mm=!%T_mm%!
  set E_dd=!%T_dd%!
 
  :: Eliminate leading zeros so that numbers aren't interpreted as octal
  set /a "E_yy=10000%E_yy% %%10000, E_mm=100%E_mm% %%100, E_dd=100%E_dd% %%100"
 
  :: Return the date values
  (endlocal
    set %~1=%E_yy%
    set %~2=%E_mm%
    set %~3=%E_dd%
  )
exit /b
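
The leading-zero step near the end of the function can be tried in isolation. This is a minimal sketch with hard-coded sample values (an assumption, standing in for what %DATE% would supply). `set /a` reads a number with a leading 0 as octal, so 08 and 09 would be rejected; prefixing extra digits and taking the remainder avoids that.

```batch
@echo off
setlocal
rem Sample zero-padded components (hypothetical values, not read from the system)
set "yy=2011"
set "mm=08"
set "dd=09"
rem 100002011 mod 10000 = 2011, 10008 mod 100 = 8, 10009 mod 100 = 9
set /a "yy=10000%yy% %% 10000, mm=100%mm% %% 100, dd=100%dd% %% 100"
rem Prints: 2011 8 9
echo %yy% %mm% %dd%
```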


Dave Benham

amel27
Expert
Posts: 177
Joined: 04 Jun 2010 20:05
Location: Russia

Re: "universal" %DATE% parser

#2 Post by amel27 » 11 May 2011 01:54

interesting topic, dbenham
another way of sorting... but in reverse order:

Code: Select all

set "tokens="
for %%x in ("%%a %%b %%c" "%%b %%a %%c" "%%c %%a %%b"
            "%%a %%c %%b" "%%b %%c %%a" "%%c %%b %%a"
) do if "!tokens!" lss "%%~x" set "tokens=%%~x"

Russian map (in your order): :)

Code: Select all

yy dd mm;гг дд мм

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

#3 Post by aGerman » 11 May 2011 10:25

It works for me as expected :wink:
It's a really good alternative if you don't have access to the registry.

Regards
aGerman

#4 Post by dbenham » 11 May 2011 15:15

Thanks for the good news aGerman!

amel27 - Good idea for sorting!

Your idea is easily adapted for an ascending sort:

Code: Select all

set "tokens=%%c %%b %%a"
for %%x in ("%%a %%b %%c" "%%b %%a %%c" "%%c %%a %%b"
            "%%a %%c %%b" "%%b %%c %%a"
) do if "!tokens!" gtr "%%~x" set "tokens=%%~x"
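
The ascending variant can be run on its own with fixed tokens. This is a minimal sketch in which the variables a, b and c are hypothetical stand-ins for the loop variables of the full function. The seed is stored without surrounding quotes, so that only the comparison itself adds them.

```batch
@echo off
setlocal enableDelayedExpansion
rem Hypothetical sample tokens standing in for the loop variables a, b and c
set "a=mm"
set "b=dd"
set "c=yy"
rem Seed with one permutation, then keep whichever permutation compares lowest
set "tokens=%c% %b% %a%"
for %%x in ("%a% %b% %c%" "%b% %a% %c%" "%c% %a% %b%"
            "%a% %c% %b%" "%b% %c% %a%"
) do if "!tokens!" gtr "%%~x" set "tokens=%%~x"
rem Prints: dd mm yy
echo !tokens!
```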


I'm worried that localized character collation sequences may have the potential to break the algorithm.

I think the following sort algorithm at least partially avoids potential collation problems:

Code: Select all

set "tokens="
for /f %%x in ('^(for %%t in ^(%%a %%b %%c^) do @echo %%t^)^|sort /L C') do set "tokens=!tokens! %%x"
set "tokens=!tokens:~1!"

The above should always work as long as characters less than ASCII 128 collate the same regardless of locality. Does anyone know if collation of characters less than ASCII 128 can vary?

I think "Sorting tokens within a string" is an interesting topic that deserves its own thread, so I started one.


amel27 - How did you post those (Cyrillic?) characters? And more importantly, how do I get them in my code? I've never tried to deal with languages other than English before, so I'm pretty ignorant of the ins and outs of internationalization.

I'm still hoping one function can handle all languages supported by Windows, given the correct set of translation maps. But I won't be surprised if we run into problems.


Dave Benham

#5 Post by amel27 » 12 May 2011 02:28

dbenham wrote:amel27 - How did you post those (Cyrillic?) characters?
Simple copy/paste from DOS, but my IE encoding is UTF-8. So I can copy/paste text from IE to Notepad, save the file as Unicode (UTF-16LE), and then convert this text file to the appropriate OEM code page with a BAT:

Code: Select all

@echo off
for /f "tokens=2 delims=:" %%a in ('chcp') do set/a "ch=%%a"
>nul chcp 858
type "cp858_16le.txt">cp858.txt
>nul chcp %ch%

In this sample I convert the Unicode text from your sample post to OEM 858. For my Russian tokens you should use OEM code page 866 for the conversion.

#6 Post by dbenham » 12 May 2011 16:29

Thanks amel27

So a binary hex dump of the sorted Russian space delimited tokens would be:
A3 A3 20 A4 A4 20 AC AC

Did I get that right?

So if I were to send you a binary copy of the code below and you installed it on your Russian machine, the 3rd map would appear in Russian and everything would work for you, correct?

But because this forum doesn't support attachments, you are unable to get the binary image directly. So you copy and paste the code into a Unicode document. You then temporarily set your code page to OEM 858 (Multilingual Latin 1 + Euro) which corresponds to what my English text editor produced. (I'm taking your word on this, I have no idea. My text editor display does not match my DOS setting of OEM 437) You then type the Unicode file, redirecting the output to a file, which effectively converts the Unicode into OEM 858 and you have your binary image. You reset your code page back to OEM 866 Russian and everything is ready to go. Did I get this right?


For clarity I think it would be good if the hex dump and the supported language(s) are appended to each translation map as comments. This requires some minor changes to the code.

If anyone else would like to contribute an additional language translation map to this function, please include a hex dump of your sorted native date format tokens. I also need the English translation in the same sort order.


I've given some more thought to the localized character collation sequence. As long as the sequence is consistent for a given locale, I don't think it should matter.

I do have some concern for Asian (or other) languages that use multi-byte characters. Not all byte combinations are valid, and some translation maps might have invalid sequences for a given language. I'm curious how tolerant Windows "DOS" would be if it ran across such a situation.

Here is the updated code with the Russian map and an ascending version of amel27's sort technique.

Code: Select all

:parseDate yyVar mmVar ddVar
::
:: Parses %DATE% into numeric year, month, and date components
::
::   yyVar = name of variable to hold numeric year value
::   mmVar = name of variable to hold numeric month value
::   ddVar = name of variable to hold numeric day value

  setlocal enableDelayedExpansion
 
  :: Parse the native date format tokens output by the DATE command
  for /f "skip=1 tokens=2-4 delims=(-)" %%a in ('"echo:|date"') do (
 
    rem Parse the numeric date components of %DATE%
    for /f "tokens=1-3 delims=/.- " %%A in ("%date:* =%") do (

      rem Set variables named after native date tokens to appropriate numeric date component values
      set %%a=%%A
      set %%b=%%B
      set %%c=%%C
 
      rem Create a string of sorted native date format tokens
      set "tokens=%%c %%b %%a"
      for %%x in ("%%a %%b %%c" "%%b %%a %%c" "%%c %%a %%b"
                  "%%a %%c %%b" "%%b %%c %%a"
      ) do if "!tokens!" gtr "%%~x" set "tokens=%%~x"
    )
  )
  :: We now have variables with the correct values along with a list of the
  :: variable names. But the code doesn't know how to interpret the names!
  :: We must translate the native names into English names.

  :: Initialize the translation of native date format tokens to English tokens
  set translate=
  set T_yy=
  set T_mm=
  set T_dd=

  :: Determine the correct token translation map.
  ::
  :: Each map in the for list should contain the space delimited English tokens
  :: on the left followed by a comma followed by the space delimited native
  :: tokens on the right. The native tokens should be sorted alphabetically
  :: and the English tokens should be in the matching logical order. A comment
  :: should be appended starting with a semicolon followed by a hexdump of each
  :: map's native binary representation as well as the language(s) it supports.
  :: A map with codes greater than 0x7F may not appear correct unless viewed on
  :: a machine that supports that language.
  ::
  for %%t in (
    "dd mm yy,dd mm yy; 64 64 20 6D 6D 20 79 79 - English"
    "yy mm dd,JJ MM TT; 4A 4A 20 4D 4D 20 54 54 - German"
    "yy dd mm,££ ¤¤ ¬¬; A3 A3 20 A4 A4 20 AC AC - Russian"
  ) do (
    for /f "delims=;" %%m in (%%t) do set map=%%m
    if "!map:*,=!"=="%tokens%" set translate=!map!
  )

  :: Parse the token translation map into variables used for translation
  for /f "tokens=1-6 delims=, " %%a in ("%translate%") do (
    set T_%%a=%%d
    set T_%%b=%%e
    set T_%%c=%%f
  )
 
  :: Transfer the values from native date variables to English date variables
  set E_yy=!%T_yy%!
  set E_mm=!%T_mm%!
  set E_dd=!%T_dd%!
 
  :: Eliminate leading zeros so that numbers aren't interpreted as octal
  set /a "E_yy=10000%E_yy% %%10000, E_mm=100%E_mm% %%100, E_dd=100%E_dd% %%100"
 
  :: Return the date values in year, month, day order
  (endlocal
    set %~1=%E_yy%
    set %~2=%E_mm%
    set %~3=%E_dd%
  )
exit /b


Dave Benham

#7 Post by amel27 » 12 May 2011 18:59

dbenham wrote:So a binary hex dump of the sorted Russian space delimited tokens would be:
A3 A3 20 A4 A4 20 AC AC

Did I get that right?
Yes, that is the correct dump. :)

dbenham wrote:So you copy and paste the code into a Unicode document. You then temporarily set your code page to OEM 858 (Multilingual Latin 1 + Euro) which corresponds to what my English text editor produced. (I'm taking your word on this, I have no idea. My text editor display does not match my DOS setting of OEM 437) You then type the Unicode file, redirecting the output to a file, which effectively converts the Unicode into OEM 858 and you have your binary image. You reset your code page back to OEM 866 Russian and everything is ready to go. Did I get this right?
In fact, we should store multilanguage text in Unicode (UTF-16LE preferred), but before using it as BAT code, we can convert it to the actual OEM code page. Windows automatically recodes a Unicode file depending on the localized settings.

dbenham wrote:Here is the updated code with the Russian map and an ascending version of amel27's sort technique.
No, this Unicode representation is incorrect, probably because you did not copy from a Unicode source. The binary dump of this representation differs from the original: 3F 3F 20 FD FD 20 BF BF

#8 Post by dbenham » 12 May 2011 21:03

amel27 wrote:In fact, we should store multilanguage text in Unicode (UTF-16LE preferred), but before using it as BAT code, we can convert it to the actual OEM code page. Windows automatically recodes a Unicode file depending on the localized settings.

Ah, we may have a philosophical difference of opinion here. :wink:
I realize that a batch file is an interpreted text file, and Unicode is the best way (only reliable standard way) to store text with any arbitrary combination of languages. BUT, my original intent was to have a functioning batch file that could be passed back and forth and work properly with "any" language. As you pointed out, a batch file in Unicode cannot be executed. If we follow your suggestion, then when the first person converts the Unicode to his/her OEM character set, all the unsupported characters are irreversibly translated into question marks "?". The batch file is no longer universal in that it cannot be passed on to the next person that requires the corrupted translation maps.

The problem is even more acute if you consider the :asc, :chr, :str2hex, :hex2str functions. These require a string that represents all byte values from 0x01 - 0xFF. The Unicode representation will change from language to language, but the binary representation of the functioning batch file will not.

In my mind the problem is solved by treating the batch file not as text but as binary. The meaning of each character (byte) greater than 0x7F changes between languages, but the "universal" function(s) continue(s) to work as intended. In theory anyway, I'm still not sure what happens with multi-byte languages. I acknowledge that my strategy introduces its own set of issues, probably some that I'm not aware of. An obvious one is how to post a binary representation of a batch file on this site.

I don't think there is a perfect answer to this debate. I'd love input from more people on their opinion of the best way to proceed. Does the community at large think that treating a batch file as binary is a bad idea? Or is it worth pursuing?

With regard to my last posted code:
amel27 wrote:No, this Unicode representation is incorrect, probably because you did not copy from a Unicode source. The binary dump of this representation differs from the original: 3F 3F 20 FD FD 20 BF BF

Well, as per my earlier discussion, I was not attempting to post Unicode. Can you tell me if the steps I outlined in my prior post work for you?
Addendum added 2011-05-13
Oops! I misunderstood what you wrote. I see that you already did try it and OEM 858 is obviously incorrect.

Try these maps using OEM 437. With correct manipulation using DOS OEM/Unicode conversion I am able to reconstruct the Russian, so I think it should work for you as well.

Code: Select all

    "dd mm yy,dd mm yy; 64 64 20 6D 6D 20 79 79 - English"
    "yy mm dd,JJ MM TT; 4A 4A 20 4D 4D 20 54 54 - German"
    "yy dd mm,úú ññ ¼¼; A3 A3 20 A4 A4 20 AC AC - Russian"


In the meantime I will try to figure out what character set my editor is using for display.

Thanks amel27 for your valued help and input

Dave Benham

#9 Post by dbenham » 13 May 2011 15:05

Eureka! :D

I've determined that my most recent post of the full code was using Windows 1252 character set (CHCP 1252).

I was able to reconstruct the Russian characters via the following steps:

1) Copy and paste the posted code into Notepad and save it as Unicode.
2) From the command line, set the code page to 1252 (CHCP 1252).
3) TYPE the Unicode file with redirection into a 1252 text file.

At this point the file should be the binary image that I intended.

4) cmd /U
5) Set the code page to 866 (CHCP 866).
6) TYPE the 1252 file with redirection into a Russian Unicode text file.

The 3rd map should now be in Russian!
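
The steps above can be sketched as a short script. This is only an illustration under assumptions: the three file names are hypothetical, the source file is assumed to have been saved from Notepad as "Unicode" (UTF-16LE with a byte order mark), and the code page is saved and restored with the same `chcp` parsing that amel27 posted earlier, which expects the number to be the last thing on the line.

```batch
@echo off
setlocal
rem Hypothetical file names (assumptions, not from the thread)
set "src=posted_unicode.txt"
set "img=parseDate_1252.bat"
set "rus=parseDate_russian_unicode.txt"

rem Remember the active code page so that it can be restored at the end
for /f "tokens=2 delims=:" %%a in ('chcp') do set /a "cp=%%a"

rem Steps 2-3: TYPE converts the Unicode file to the active code page (1252)
chcp 1252 >nul
type %src% >%img%

rem Steps 4-6: under cmd /U the redirected output of TYPE is Unicode, so the
rem bytes are read as code page 866 and written out as Unicode text
cmd /u /c "chcp 866 >nul & type %img% >%rus%"

chcp %cp% >nul
```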

Dave Benham

orange_batch
Expert
Posts: 442
Joined: 01 Aug 2010 17:13
Location: Canadian Pacific

#10 Post by orange_batch » 14 May 2011 02:03

I studied the relationship between Unicode and the command prompt extensively and came up with some ingenious solutions for certain problems. They don't apply in this case though. Just know that there are different types of Unicode that behave differently under the command prompt; for example, I don't know if Asian languages would work with this. In any case it seems you found the solution for most Unicode; it would usually involve chcp. By the way, chcp 65001 is UTF-8.

Also, as an alternative, there must be a way to retrieve standardized date/time information through WMI via WMIC or simple VBScript. It might require certain privileges though, I don't know.
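
A minimal sketch of the WMIC route, assuming `wmic` is installed and the caller is allowed to query WMI (it is deprecated and may be missing on recent Windows versions). The LocalDateTime property of the operating system class is returned as yyyymmddHHMMSS followed by fractional seconds and a UTC offset, independent of the regional date format.

```batch
@echo off
setlocal
rem Assumption: wmic is available and permitted for the current user
set "ldt="
for /f "tokens=2 delims==" %%i in ('wmic os get LocalDateTime /value 2^>nul') do set "ldt=%%i"
if not defined ldt (
  echo WMIC did not return a date
  exit /b 1
)
set "yy=%ldt:~0,4%"
rem Prefix a 1 and take the remainder so that 08 and 09 are not read as octal
set /a "mm=1%ldt:~4,2% %% 100, dd=1%ldt:~6,2% %% 100"
echo year=%yy% month=%mm% day=%dd%
```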
Last edited by orange_batch on 14 May 2011 05:31, edited 1 time in total.

#11 Post by amel27 » 14 May 2011 02:35

dbenham wrote:But because this forum doesn't support attachments, you are unable to get the binary image directly.
On the other hand, we can reproduce the binary data via script (WSH for example).
This BAT regenerates itself into a new BAT file from this post. :)

Code: Select all

@set @x=0 /*
@echo off
set "MAP="& set "CRLF=\x0D\x0A"

:: ---------------------------------------------------------
set "MAP=%MAP%    "dd mm yy,\x64\x64\x20\x6D\x6D\x20\x79\x79; English " %CRLF%"
set "MAP=%MAP%    "yy mm dd,\x4A\x4A\x20\x4D\x4D\x20\x54\x54; German  " %CRLF%"
set "MAP=%MAP%    "yy dd mm,\xA3\xA3\x20\xA4\xA4\x20\xAC\xAC; Russian " %CRLF%"
:: ---------------------------------------------------------

(echo/:parseDate yyVar mmVar ddVar
 echo/::
 echo/:: Parses %%DATE%% into numeric year, month, and date components
 echo/::
 echo/::   yyVar = name of variable to hold numeric year value
 echo/::   mmVar = name of variable to hold numeric month value
 echo/::   ddVar = name of variable to hold numeric day value
 echo/
 echo/  setlocal enableDelayedExpansion
 echo/ 
 echo/  :: Parse the native date format tokens output by the DATE command
 echo/  for /f "skip=1 tokens=2-4 delims=(-)" %%%%a in ('"echo:|date"'^) do (
 echo/ 
 echo/    rem Parse the numeric date components of %%DATE%%
 echo/    for /f "tokens=1-3 delims=/.- " %%%%A in ("%%date:* =%%"^) do (
 echo/
 echo/      rem Set variables named after native date tokens to appropriate numeric date component values
 echo/      set %%%%a=%%%%A
 echo/      set %%%%b=%%%%B
 echo/      set %%%%c=%%%%C
 echo/ 
 echo/      rem Create a string of sorted native date format tokens
 echo/      set "tokens=%%%%c %%%%b %%%%a"
 echo/      for %%%%x in ("%%%%a %%%%b %%%%c" "%%%%b %%%%a %%%%c" "%%%%c %%%%a %%%%b"
 echo/                  "%%%%a %%%%c %%%%b" "%%%%b %%%%c %%%%a"
 echo/      ^) do if "!tokens!" gtr "%%%%~x" set "tokens=%%%%~x"
 echo/    ^)
 echo/  ^)
 echo/  :: We now have variables with the correct values along with a list of the
 echo/  :: variable names. But the code doesn't know how to interpret the names!
 echo/  :: We must translate the native names into English names.
 echo/
 echo/  :: Initialize the translation of native date format tokens to English tokens
 echo/  set translate=
 echo/  set T_yy=
 echo/  set T_mm=
 echo/  set T_dd=
 echo/
 echo/  :: Determine the correct token translation map.
 echo/  ::
 echo/  :: Each map in the for list should contain the space delimited English tokens
 echo/  :: on the left followed by a comma followed by the space delimited native
 echo/  :: tokens on the right. The native tokens should be sorted alphabetically
 echo/  :: and the English tokens should be in the matching logical order. A comment
 echo/  :: should be appended starting with a semicolon followed by a hexdump of each
 echo/  :: map's native binary representation as well as the language(s^) it supports.
 echo/  :: A map with codes greater than 0x7F may not appear correct unless viewed on
 echo/  :: a machine that supports that language.
 echo/  ::
 echo/  for %%%%t in (
)>"%~n0.tmp"

cscript/nologo /e:jscript "%~f0" "%~n0.tmp" "%MAP:"=\x22%"

(echo/  ^) do (
 echo/    for /f "delims=;" %%%%m in (%%%%t^) do set map=%%%%m
 echo/    if "!map:*,=!"=="%%tokens%%" set translate=!map!
 echo/  ^)
 echo/
 echo/  :: Parse the token translation map into variables used for translation
 echo/  for /f "tokens=1-6 delims=, " %%%%a in ("%%translate%%"^) do (
 echo/    set T_%%%%a=%%%%d
 echo/    set T_%%%%b=%%%%e
 echo/    set T_%%%%c=%%%%f
 echo/  ^)
 echo/
 echo/  :: Transfer the values from native date variables to English date variables
 echo/  set E_yy=!%%T_yy%%!
 echo/  set E_mm=!%%T_mm%%!
 echo/  set E_dd=!%%T_dd%%!
 echo/ 
 echo/  :: Eliminate leading zeros so that numbers aren't interpreted as octal
 echo/  set /a "E_yy=10000%%E_yy%% %%%%10000, E_mm=100%%E_mm%% %%%%100, E_dd=100%%E_dd%% %%%%100"
 echo/ 
 echo/  :: Return the date values in year, month, day order
 echo/  (endlocal
 echo/    set %%~1=%%E_yy%%
 echo/    set %%~2=%%E_mm%%
 echo/    set %%~3=%%E_dd%%
 echo/  ^)
 echo/exit /b
)>>"%~n0.tmp"

del/f "%~nx0"&& ren "%~n0.tmp" "%~nx0"

@exit */
inp=new ActiveXObject("ADODB.Stream");
inp.Type=2;
inp.Open();
inp.WriteText(eval("'"+WScript.Arguments(1)+"'"));
out=new ActiveXObject("ADODB.Stream");
out.Type=2;
out.Charset='ISO-8859-1';
out.Open();
out.LoadFromFile(WScript.Arguments(0));
inp.Position=0;
out.Position=out.Size;
inp.CopyTo(out);
inp.Close();
out.SaveToFile(WScript.Arguments(0),2);
out.Close();

#12 Post by amel27 » 14 May 2011 20:48

dbenham wrote:I've determined that my most recent post of the full code was using Windows 1252 character set (CHCP 1252).
Hm.. 1252 is ANSI (1-byte), but if I switch my browser to any ANSI/DOS font, Russian chars in your posts take 2 bytes... :?

#13 Post by dbenham » 16 May 2011 14:20

amel27 wrote:Hm.. 1252 is ANSI (1-byte), but if I switch my browser to any ANSI/DOS font, Russian chars in your posts take 2 bytes...

I assume everything works when your browser is set to UTF-8? If so, then this reminds me of an old joke.

Patient - "Doctor, I hope you can help me. It hurts when I do this!"
Doctor - "Then don't do that."

But I see your point. This is another possible point of failure for the transfer of the correct code.

I like the basic idea of your code re-generator. But it takes a fair amount of manipulation to prepare the file. I've refined the technique below to something I think is much easier to administer. It should be easily adaptable to any file. The only things that need to change are the definition of the output file near the top and the text between the BEGIN and END markers.

I also had an 'aha' moment when I realized I can completely avoid the need to sort the native tokens. The new translation technique simply requires a map with the 3 space delimited native tokens in year, month, day order. The code is significantly shorter and simpler, and now I know definitively I don't have to worry about local character collation sequences. What really surprises me is that the new code is actually 10% slower to execute on my machine than the prior version. Regardless, I like the simplicity of the new version, and it is still plenty fast.
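
The matching idea can be seen in isolation with simulated input. In this minimal sketch the three `$N_` variables are hard-coded to what the DATE prompt parsing might produce on a German system (hypothetical sample values). Only the map whose three tokens all name defined variables is applied, so no sorting is needed.

```batch
@echo off
setlocal enableDelayedExpansion
rem Simulated native variables for a German system (hypothetical sample values)
set "$N_TT=16"
set "$N_MM=05"
set "$N_JJ=2011"
rem Each map lists the native tokens in year, month, day order
for %%m in ("yy mm dd" "JJ MM TT") do (
  for /f "tokens=1-3" %%a in (%%m) do (
    rem Only the correct map will have all 3 variables defined
    if defined $N_%%a if defined $N_%%b if defined $N_%%c (
      set "yy=!$N_%%a!"
      set "mm=!$N_%%b!"
      set "dd=!$N_%%c!"
    )
  )
)
rem Prints: yy=2011 mm=05 dd=16
echo yy=!yy! mm=!mm! dd=!dd!
```

Variable names are not case sensitive, so the English map does see `$N_MM` as defined, but it is still rejected because `$N_yy` and `$N_dd` do not exist.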

So here is the "refined" function, packaged in a variant of amel27's code re-generator. Please note that this code re-generator operates differently than amel27's: it is hard coded to create a file named "parseDate.bat", and the original code re-generator file remains untouched.

makeParseDate.bat

Code: Select all

@set @junk=0 /* The 1st 3 lines should not be changed
@echo off & set "@junk="
setlocal

:: define the name of the output file
set file="parseDate.bat"

if exist %file% del %file%
set "output="
for /f "skip=2 tokens=1,* delims=[]" %%a in ('find /v /n "" %~f0') do (
  if defined output (
    if "%%b"=="END" (set "output=") else (set "ln=%%b")&call :procLn
  ) else if "%%b"=="BEGIN" set output=true
)
exit /b
:procLn
  setlocal enableDelayedExpansion
  if "!ln:~0,4!" neq "HEX:" (echo:!ln!>>!file!)&exit /b
  set "ln=!ln:~4!"
  call cscript/nologo /e:jscript "%~f0" !file! "!ln:"=\x22!\x0D\x0A"
exit /b

** BEGIN marks the beginning of the intended file
**
** END marks the end of the intended file
**
** HEX: marks a line containing hex that must be processed by the jscript.
**      Only lines containing characters greater than 0x7F need be prepared
**      as HEX (also lines containing 0x09)
********************************************************************************
BEGIN
:parseDate yyVar mmVar ddVar
::
:: Parses %DATE% into numeric year, month, and date components
::
::   yyVar = name of variable to hold numeric year value
::   mmVar = name of variable to hold numeric month value
::   ddVar = name of variable to hold numeric day value

  setlocal enableDelayedExpansion
  for /f "delims==" %%v in ('2^>nul set $') do set "%%v="

  :: Parse the native date format tokens output by the DATE command
  for /f "skip=1 tokens=2-4 delims=(-)" %%a in ('"echo:|date"') do (
    rem Parse the numeric date components of %DATE%
    for /f "tokens=1-3 delims=/.- " %%A in ("%date:* =%") do (
      rem Set variables named after native date tokens to appropriate numeric date component values
      set $N_%%a=%%A
      set $N_%%b=%%B
      set $N_%%c=%%C
    )
  )
 
  :: Translate the native variables with unknown meaning into English variables
  :: with known meaning.
  ::
  :: Each map in the for list should contain the space delimited native date
  :: format tokens in Year Month Day order. Optional Documentation should
  :: follow the map consisting of the hex dump of the native tokens, and the
  :: language(s) the map supports.
  ::
  :: A map with codes greater than 0x7F may not appear correct unless viewed on
  :: a machine that supports that language.
  ::
  for %%m in (
    "yy mm dd - 79 79 20 6D 6D 20 64 64 - English"
    "JJ MM TT - 4A 4A 20 4D 4D 20 54 54 - German"
HEX:    "\xA3\xA3 \xAC\xAC \xA4\xA4 - A3 A3 20 AC AC 20 A4 A4 - Russian"
  ) do (
    for /f "tokens=1-3 delims= " %%a in (%%m) do (
      rem Only the correct map will have all 3 variables defined
      if defined $N_%%a if defined $N_%%b if defined $N_%%c (
        set $E_yy=!$N_%%a!
        set $E_mm=!$N_%%b!
        set $E_dd=!$N_%%c!
      )
    )
  )
 
  :: Eliminate leading zeros so that numbers aren't interpreted as octal
  set /a "$E_yy=10000%$E_yy% %%10000, $E_mm=100%$E_mm% %%100, $E_dd=100%$E_dd% %%100"
 
  :: Return the date values in year, month, day order
  (endlocal
    set %~1=%$E_yy%
    set %~2=%$E_mm%
    set %~3=%$E_dd%
  )
exit /b
END
 
******* The remainder is the jscript used to insert the HEX ******/
inp=new ActiveXObject("ADODB.Stream");
inp.Type=2;
inp.Open();
inp.WriteText(eval("'"+WScript.Arguments(1)+"'"));
out=new ActiveXObject("ADODB.Stream");
out.Type=2;
out.Charset='ISO-8859-1';
out.Open();
out.LoadFromFile(WScript.Arguments(0));
inp.Position=0;
out.Position=out.Size;
inp.CopyTo(out);
inp.Close();
out.SaveToFile(WScript.Arguments(0),2);
out.Close();


Dave Benham

#14 Post by aGerman » 16 May 2011 15:41

Sorry guys, but now I'm totally confused. On the one hand you use JScript injection; on the other hand nobody had the idea to use it for getting the date directly :? Why not :?:

Code: Select all

@set @junk=0 /* The 1st 3 lines should not be changed
@echo off & set "@junk="
setlocal

for /f "tokens=1-3" %%a in ('cscript //nologo //e:jscript "%~f0"') do (
  set /a yy=%%a, mm=%%b, dd=%%c
)
echo year  %yy%
echo month %mm%
echo day   %dd%
pause
goto :eof
*/

var d = new Date();
WScript.Echo(d.getFullYear() + ' ' + (d.getMonth() + 1) + ' ' + d.getDate());


Regards
aGerman

#15 Post by dbenham » 16 May 2011 16:21

aGerman wrote:Sorry guys, but now I'm totally confused. On the one hand you use JScript injection; on the other hand nobody had the idea to use it for getting the date directly. Why not?

Orange_batch nearly got there - he suggested VBScript or WMI, which I imagine could work as well.

But the intent was for a pure batch solution. Unfortunately the solution contains extended ASCII characters that can be problematic when posted on this forum, especially when the poster and the reader are using different character sets. The JScript-injected file is simply a method to transmit a binary image of the pure batch solution over this forum. If we could simply post binary attachments then it would be a non-issue.

BTW - I like your JScript solution :D
But if one were to go the JScript route, what happens if there are multiple injections in the same file? Is there a way to direct a specific set of lines to cscript, without using a temporary or external file?

Dave Benham
