The BatchLineParser tutorial ( jeb, dBenham )

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)
Contact:

The BatchLineParser tutorial ( jeb, dBenham )

#1 Post by Ed Dyreen » 28 Jul 2012 08:42

Amen to jeb :D http://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts
Jan Erik wrote:The BatchLineParser:

A line of code in a batch file has multiple phases (on the command line the expansion is different!).

The process starts with phase 1

Phase/order
1) Phase(Percent):

  • A double %% is replaced by a single %
  • Expansion of argument variables (%1, %2, etc.)
  • Expansion of %var%; if var does not exist, it is replaced with nothing
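
A minimal sketch of the three phase 1 rules above, assuming it is saved as a batch file and run with default settings. The label :demo, the argument first and the variable name undefinedVar are made up for the illustration. The comments avoid literal percent signs on purpose, because phase 1 also processes REM lines.

```batch
@echo off
setlocal
rem Make sure the illustrative variable really is undefined.
set "undefinedVar="
call :demo first
exit /b

:demo
rem A doubled percent sign collapses to a single one.
echo 100%% done
rem The first argument passed by CALL is substituted here.
echo arg=%1
rem An undefined variable expands to nothing inside a batch file.
echo [%undefinedVar%]
exit /b
```

Run as is, this should print 100% done, then arg=first, then [] on three separate lines.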

1.5) Remove all <CR> (CarriageReturn 0x0d) from the line

2) Phase(Special chars, "<LF>^&|<>()"): Look at each character

  • If it is a quote ("), toggle the quote flag; while the quote flag is active, the other special characters (^&|<> and parentheses) are ignored
  • If it is a caret (^), the next character has no special meaning and the caret itself is removed. If the caret is the last character of the line, the next line is appended, and the first character of the next line is always handled as an escaped character.
    • <LF> stops the parsing immediately, but not with a caret in front
  • Parentheses increment/decrement the parenthesis counter; the parenthesis itself is removed, except for a closing one when the counter is 0
  • If the line end is reached and the parenthesis counter is > 0, the next line is appended (starting again with phase 1)
  • In this phase the primary token list is built; token delimiters are <space>,;=
  • In this phase REM, IF and FOR are detected, for the special handling of them.
  • If the first token is "rem", only two tokens are processed, important for the multiline caret
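
A small sketch of the quote, caret and line-continuation rules from phase 2, assuming a plain batch file with default settings. The strings are arbitrary, and the REM lines deliberately contain no special characters.

```batch
@echo off
rem Inside quotes the ampersand is ordinary text.
echo "fish & chips"
rem Outside quotes a caret escapes it and the caret itself disappears.
echo fish ^& chips
rem A caret at the very end of a line appends the next line.
echo one ^
two
```

This should print "fish & chips" (with the quotes), then fish & chips, then one two.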

3) Phase(echo): If "echo is on" print the result of phase 1 and 2
  • For-loop-blocks are echoed multiple times, first time in the context of the for-loop, with unexpanded for-loop-vars
  • For each iteration, the block is echoed with expanded for-loop-vars

---- These two phases do not really follow each other directly, but it makes no difference
4) Phase(for-loop-vars expansion): Expansion of %%a and so on

5) Phase(Exclamation mark): Only if delayed expansion is on, look at each character

  • If it is a caret (^) the next character has no special meaning, the caret itself is removed
  • If it is an exclamation mark, search for the next exclamation mark (carets are not observed anymore), expand to the content of the variable
  • If no exclamation mark is found in this phase, the result is discarded, the result of phase 4 is used instead (important for the carets)
    Important: At this phase quotes and other special characters are ignored
  • Expanding vars at this stage is "safe", because special characters are not detected anymore (even <CR> or <LF>)
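
To illustrate why expansion in phase 5 is "safe", here is a short sketch, assuming delayed expansion is enabled with setlocal. The variable name text and its value are invented for the example.

```batch
@echo off
setlocal EnableDelayedExpansion
set "text=a & b | c"
rem Delayed expansion runs after the special characters were parsed,
rem so the ampersand and the pipe in the value stay plain text.
echo !text!
```

This should print a & b | c. Using percent expansion on the same variable in an unquoted ECHO would instead expose the ampersand and the pipe to phase 2, so the line would be split into separate commands.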

6) Phase(call/caret doubling): Only if the cmd token is CALL

  • If the first token is "call", start with phase 1 again, but stop after phase 2; delayed expansion is not processed a second time here
  • Remove the first CALL, so multiple CALLs can be stacked
  • Double all carets (normal carets seem to stay unchanged, because in phase 2 they are reduced to one again, but inside quotes they are effectively doubled)
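
A hedged sketch of both CALL effects described above: the second pass through phase 1 and the caret doubling inside quotes. The variable names inner and name are made up; the REM lines avoid percent signs and carets on purpose.

```batch
@echo off
setlocal
set "inner=value"
set "name=inner"
rem The first pass leaves a percent-wrapped variable name behind.
rem CALL runs phase 1 again and expands it.
call echo %%%name%%%
rem CALL doubles carets, and inside quotes phase 2 does not reduce them again.
call echo "^"
```

The first CALL should print value, the second one "^^" (with the quotes).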

7) Phase(Execute): The command is executed

  • Different tokens are used here, depending on the internal command executed
  • In case of a set "name=content", the complete content from the first equal sign to the last quote of the line is used as the content token; if there is no quote after the equal sign, the rest of the line is used.
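
The SET rule can be tried with a sketch like the following, assuming command extensions are enabled (the default). The variable name var and the values are arbitrary.

```batch
@echo off
setlocal
rem Everything after the last quote is dropped from the value.
set "var=one" two
echo [%var%]
rem Quotes inside the value are kept, because only the last quote ends it.
set "var=say "hi" now"
echo [%var%]
```

This should print [one] and then [say "hi" now].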

CmdLineParser:

Works like the BatchLine-Parser, but:
  • Goto/call a label isn't allowed

Phase1(Percent):

  • %var% will be replaced by the content of var; if the var isn't defined, the expression will be unchanged
  • No special handling of %%: the second percent could be the beginning of a var. After set var=content, %%var%% expands to %content%

Phase5(exclamation mark): only if "DelayedExpansion" is enabled

  • !var! will be replaced by the content of var; if the var isn't defined, the expression will be unchanged

for-loop-command-block

e.g. for /F "usebackq" %%a IN (command block) DO echo %%a

The command block is parsed twice. In the first pass either the BatchLineParser (the loop is inside a batch file) or the CmdLineParser (the loop is on the command line) is active; in the second pass the CmdLineParser is always active. In the second pass, DelayedExpansion is active only if it is enabled with the registry key.

The second run is like calling the line with cmd /c

Variables set inside the command block are therefore not persistent.
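
A minimal sketch of that last point, assuming a batch file with default settings. The variable name var and the text done are invented for the example. The ampersand is escaped so that it survives the first parse and reaches the child cmd instance.

```batch
@echo off
setlocal
set "var=before"
rem The command block runs in a separate cmd instance, like cmd /c.
for /F "delims=" %%a in ('set "var=inside" ^& echo done') do echo %%a
rem The assignment made inside the block is lost in this batch file.
echo %var%
```

This should print done, followed by before.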

Hope it helps Jan Erik
Amen to dbenham :D http://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/7970912#7970912
dbenham wrote:1) (Percent) Starting from left, scan each character for %. If found then

  • 1.1 (escape %)
    If followed by another % then
    Replace %% with single % and continue scan
  • 1.2 (expand argument)
    • Else if followed by * and command extensions are enabled then
      Replace %* with the text of all command line arguments
    • Else if followed by <digit> then
      Replace %<digit> with argument value (replace with nothing if undefined) and continue scan
    • Else if followed by ~ and command extensions are enabled then
      • If followed by optional valid list of argument modifiers followed by required <digit> then
        Replace %~[modifiers]<digit> with modified argument value (replace with nothing if not defined or if specified $PATH: modifier is not defined) and continue scan.
        Note: modifiers are case insensitive and can appear multiple times in any order, except $PATH: modifier can only appear once and must be the last modifier before the <digit>
      • Else invalid modified argument syntax raises fatal error: batch processing aborts!
  • 1.3 (expand variable)
    • Else if command extensions are disabled then
      Look at next string of characters, breaking before % or <LF>, and call them VAR (may be an empty list)
      • If next character is % then
        Replace %VAR% with value of VAR (replace with nothing if VAR not defined) and continue scan
      • Else goto 1.4
    • Else if command extensions are enabled then
      Look at next string of characters, breaking before % : or <LF>, and call them VAR (may be an empty list). If VAR breaks before : and the subsequent character is % then include : as the last character in VAR and break before %.
      • If next character is % then
        Replace %VAR% with value of VAR (replace with nothing if VAR not defined) and continue scan
      • Else if next character is : then
        • If VAR is undefined then
          Remove %VAR: and continue scan.
        • Else if next character is ~ then
          • If next string of characters matches pattern of [integer][,[integer]]% then
            Replace %VAR:~[integer][,[integer]]% with substring of value of VAR (possibly resulting in empty string) and continue scan.
          • Else goto 1.4
        • Else if followed by = or *= then
          Invalid variable search and replace syntax raises fatal error: batch processing aborts!
        • Else if next string of characters matches pattern of [*]search=[replace]% then
          Replace %VAR:[*]search=[replace]% with value of VAR after performing search and replace (possibly resulting in empty string) and continue scan
        • Else goto 1.4
  • 1.4 (strip %)
    Else remove % and continue with scan

The above helps explain why this batch

Code: Select all

@echo off
setlocal enableDelayedExpansion
set "1var=varA"
set "~f1var=varB"
call :test "arg1"
exit /b 
::
:test "arg1"
echo %%1var%% = %1var%
echo ^^^!1var^^^! = !1var!
echo --------
echo %%~f1var%% = %~f1var%
echo ^^^!~f1var^^^! = !~f1var!
exit /b
Gives these results:

Code: Select all

%1var% = "arg1"var
!1var! = varA
--------
%~f1var% = P:\arg1var
!~f1var! = varB
Note 1 - Phase 1 occurs prior to the recognition of REM statements. This is very important because it means even a remark can generate a fatal error if it has invalid argument expansion syntax or invalid variable search and replace syntax!

Code: Select all

@echo off
rem %~x This generates a fatal argument expansion error
echo this line is never reached
Note 2 - Another interesting consequence of the % parsing rules: Variables containing : in the name can be defined, but they cannot be expanded unless command extensions are disabled. There is one exception - a variable name containing a single colon at the end can be expanded while command extensions are enabled. However, you cannot perform substring or search and replace operations on variable names ending with a colon. The batch file below (courtesy of jeb) demonstrates this behavior

Code: Select all

@echo off
setlocal
set var=content
set var:=Special
set var::=double colon
set var:~0,2=tricky
set var::~0,2=unfortunate
echo %var%
echo %var:%
echo %var::%
echo %var:~0,2%
echo %var::~0,2%
echo Now with DisableExtensions
setlocal DisableExtensions
echo %var%
echo %var:%
echo %var::%
echo %var:~0,2%
echo %var::~0,2%
Note 3 - An interesting outcome of the order of the parsing rules that jeb lays out in his post: When performing search and replace with normal expansion, special characters should NOT be escaped (though they may be quoted). But when performing search and replace with delayed expansion, special characters MUST be escaped (unless they are quoted).

Code: Select all

@echo off
setlocal enableDelayedExpansion
set "var=this & that"
echo %var:&=and%
echo "%var:&=and%"
echo !var:^&=and!
echo "!var:&=and!"
http://stackoverflow.com/questions/7882395/why-is-delayed-expansion-in-a-batch-file-not-working-in-this-case
http://www.dostips.com/forum/viewtopic.php?f=3&t=2500

Tags: BatchLineParser, CmdLineParser, batch parser, commandline parser, expansion phase, parsing rules, special characters

Ocalabob
Posts: 79
Joined: 24 Dec 2010 12:16
Location: Micanopy Florida

Re: The BatchLineParser ( jeb )

#2 Post by Ocalabob » 28 Jul 2012 12:34

Greetings Ed,

A hearty handshake and a pat on the back for that post plus a tip of the fedora to Jeb for a job well done!

Thank you both for the education.
Later.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: The BatchLineParser ( jeb )

#3 Post by dbenham » 28 Jul 2012 16:06

Hi Ed.

I'm glad you posted this, but I would like a bit of credit. :mrgreen: I wrote the 2nd half of the post that deals with the details of the % expansion.

Unfortunately that portion of the post looks pretty bad on this site because it relies a great deal on indent level to help show the logic, but indents don't work here :(

It makes much more sense when viewed on the SO site.

Edit - All my points above have been addressed perfectly by Ed :D

There have been many advancements in the DOS world since the original reference info and tutorials were posted on DOS Tips. Many of the advancements have taken place right here in the forums.

Do you or aGerman have any influence with the owner to make the site a bit more up to date :?: I might be interested in contributing material to update the reference and/or tutorials. It would be great if we could collaboratively update the site.


Dave Benham
Last edited by dbenham on 28 Jul 2012 21:08, edited 1 time in total.

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)
Contact:

Re: The BatchLineParser ( jeb, dBenham )

#4 Post by Ed Dyreen » 28 Jul 2012 16:59

dbenham wrote:Hi Ed.

I'm glad you posted this, but I would like a bit of credit. :mrgreen: I wrote the 2nd half of the post that deals with the details of the % expansion.
Oops, didn't notice it was you who wrote the 2nd part, corrected it.
dbenham wrote:Unfortunately that portion of the post looks pretty bad on this site because it relies a great deal on indent level to help show the logic, but indents don't work here :(

It makes much more sense when viewed on the SO site.
I finally understand the semantic web and the importance of tags for searching but stackOverflow still confuses me.
I copied it here so that I could bookmark it.
I've been fighting with the [list=][/list] tags but can't get them to work, I think they are broken ?!
dbenham wrote:Do you or aGerman have any influence with the owner to make the site a bit more up to date :?: I might be interested in contributing material to update the reference and/or tutorials. It would be great if we could collaboratively update the site.
Maybe you pm admin, I'm sure he will be very interested :)
admin wrote:I'm still thinking of reformatting some of the forum content and moving it to a main page, your stuff is definitely a candidate. Wish I had more time.
admin wrote:But we could reformat interesting content in order to add it to the main page. Almost all Dostips content is driven from a MySQL table. Each "Tip" looks like this:

Code: Select all

INSERT INTO `Tips` (`TAG`, `ALIAS`, `TITLE`, `CATEGORIES`, `SHORT`, `LONG`, `CODE`, `OUTPUT`) VALUES 
( 'Snippets.TrimRightSubst', '_Toc135152727'
, 'Trim Right'
, ',StringManipulation,'
, 'Trim spaces from the end of a string via substitution.'
, 'Trimming spaces at the end of a variable seems a little tricky.  The following example shows how to use the string substitution feature to trim <b>up to 31 spaces</b> from the end of a string.  It assumes that the string to be trimmed never contains two hash "##" characters in a row.'
,  'set str=15 Trailing Spaces to truncate               &rem
    echo."%str%"
    set str=%str%##
    set str=%str:                ##=##%
    set str=%str:        ##=##%
    set str=%str:    ##=##%
    set str=%str:  ##=##%
    set str=%str: ##=##%
    set str=%str:##=%
    echo."%str%"'
,  '"15 Trailing Spaces to truncate               "
    "15 Trailing Spaces to truncate"'
);
Tips can be grouped by the Category and then shown together on a single page. Groups can then be named and added to the DosTips index.

Problem is someone would have to re-format the interesting content.
Looks like you are pretty busy - you are a student :)

Let me know what you think. Thanks!
I don't have time now, and I'm not going to hypothesize on things that are yet to happen, I'll contact you through pm maybe even this year. :)

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: The BatchLineParser ( jeb, dBenham )

#5 Post by dbenham » 28 Jul 2012 18:11

The list feature isn't broken :D

Code: Select all

[list=1]
[*] [b]Outer level text[/b]

Here we have an indented paragraph of information asfa  asdfa;fk a;lsfkj ;asl fkj;asl fkj;als fkj;asl fkj;asl fj;asl fkj;aslfkj;aslfkj a;slfkj ;aslfkj ;asl fj;asldfkj;aslfkj;aslfkja;sldk f;as dklf;a ldksf ja;ldfks j;aldks fj;aldks fj;aldks fj;al dksfj;aldks fj;aldks jf;al dfkj;al dksf;aldks jf;aldks j;aldfks j;al dfksj;aldks fj;aldks f;aldks ja;ldksjf ;aldks fj;al fkja;l fj;adfklj a;

[list=a]
[*] Sub point
[list]
[*] Sub-sub point 1
[*] Sub-sub point 2
[/list]


[*] Another sub point
[/list]

[b][*] More outer level text[/b]

And another paragraph; ;als a ;lsa dkjf;la ;al dfkj;a la;l dkjsf;al dfkjs;al dfkjs;al dkjsf;al fj;a dfkj;a dfk;adkls fa;ldsk;la ;sldk f;aldks fj;alsdkfj ;aldks ja;ldksjf ;alsdk fj;al kj;al ksd; ;la k;sl d; as;ldk ;a s;dl a;dfs ;adkf a;lsd ;alsd a;dls f;la ds;a dsa ;s d;a s df; a;dksjf ;aldksj f;aldfk j;adk fja;dfk jaf[list=a]

[*] Sub point
[list]
[*] Sub-sub point 1
[*] Sub-sub point 2
[/list]


[*] Another sub point
[/list]
[/list]

  1. Outer level text

    Here we have an indented paragraph of information asfa asdfa;fk a;lsfkj ;asl fkj;asl fkj;als fkj;asl fkj;asl fj;asl fkj;aslfkj;aslfkj a;slfkj ;aslfkj ;asl fj;asldfkj;aslfkj;aslfkja;sldk f;as dklf;a ldksf ja;ldfks j;aldks fj;aldks fj;aldks fj;al dksfj;aldks fj;aldks jf;al dfkj;al dksf;aldks jf;aldks j;aldfks j;al dfksj;aldks fj;aldks f;aldks ja;ldksjf ;aldks fj;al fkja;l fj;adfklj a;

    1. Sub point
      • Sub-sub point 1
      • Sub-sub point 2

    2. Another sub point

  2. More outer level text

    And another paragraph; ;als a ;lsa dkjf;la ;al dfkj;a la;l dkjsf;al dfkjs;al dfkjs;al dkjsf;al fj;a dfkj;a dfk;adkls fa;ldsk;la ;sldk f;aldks fj;alsdkfj ;aldks ja;ldksjf ;alsdk fj;al kj;al ksd; ;la k;sl d; as;ldk ;a s;dl a;dfs ;adkf a;lsd ;alsd a;dls f;la ds;a dsa ;s d;a s df; a;dksjf ;aldksj f;aldfk j;adk fja;dfk jaf

    1. Sub point
      • Sub-sub point 1
      • Sub-sub point 2

    2. Another sub point


Dave Benham

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: The BatchLineParser ( jeb, dBenham )

#6 Post by Liviu » 28 Jul 2012 20:55

Thanks for sharing, everybody.

Ed Dyreen wrote:Tags: BatchLineParser, CmdLineParser, batch parser, commandline parser, expansion phase, parsing rules, special characters

That smells of a homegrown google cheater ;-) and there is nothing wrong with that, though I must say the site search works pretty well already.

Still, one problem with google-based searches, perhaps more apparent here than at large, is that it's hard to search on terms involving special characters. Say, for example, that I remember seeing a post which contained 9^>". I don't seem to be able to find it using the site search, though of course it's Dave's viewtopic.php?p=18477#p18477 from yesterday.

Maybe the admins would consider creating either a sticky post, or perhaps a "best of forum" page, with direct links to some of the "classics" posted/discussed/referenced here. I have a few bookmarks, pretty sure others do, too, but this would be better managed by an admin since individual posts tend to scroll down into oblivion, then there is that 2-URL limit.

Just a thought,
Liviu

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)
Contact:

Re: The BatchLineParser ( jeb, dBenham )

#7 Post by Ed Dyreen » 28 Jul 2012 21:33

Liviu wrote:Say, for example, that I remember seeing a post which contained 9^>". I don't seem to be able to find it using the site search, though of course it's Dave's viewtopic.php?p=18477#p18477 from yesterday.
You won't even find 'Get a unique base lock name for this particular instantiation.'
Content in code tags cannot be searched :(

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: The BatchLineParser ( jeb, dBenham )

#8 Post by Liviu » 28 Jul 2012 21:50

Interesting, thanks. That's adding to my point, if anything.

P.S. FWIW now that you've pasted the previous phrase outside code tags it is found by google, both global and site-bound. However Dave's next line comment

Code: Select all

Incorporate a timestamp from WMIC if possible
is still off-bounds.
