eol=; tokens=*

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Sponge Belly
Posts: 216
Joined: 01 Oct 2012 13:32
Location: Ireland

eol=; tokens=*

#1 Post by Sponge Belly » 01 Jan 2015 17:11

Happy 2015! :)

Has anyone else noticed this strange behaviour?

Code: Select all

@echo off & setlocal enableextensions disabledelayedexpansion

for /f "delims=:" %%N in (
'findstr /bln __DATA__ "%~dpnx0"
') do set lino=%%N

for /f "usebackq skip=%lino% eol=; tokens=*" %%A in (
"%~dpnx0") do echo([%%A]

endlocal & goto :eof

__DATA__
straightforward line 1
; commented-out line 2
  indented text line 3
  ; indented comment line 4


It seems eol is applied after delims and tokens. Change "tokens=*" to "delims=" and run the program again to see what I mean. Just one more thing to be mindful of when you’re processing lines of text with a for /f loop! ;)

BFN!

- SB

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: eol=; tokens=*

#2 Post by foxidrive » 01 Jan 2015 22:41

Does this show what you were trying to illustrate?

It's not clear to me what you're trying to say.

Code: Select all

@echo off & setlocal enableextensions disabledelayedexpansion

for /f "delims=:" %%N in (
'findstr /bln __DATA__ "%~dpnx0"
') do set lino=%%N

echo %lino%

echo ========
for /f "usebackq skip=%lino% eol=; tokens=*" %%A in ("%~dpnx0") do echo([%%A]
echo A
for /f "usebackq skip=%lino% eol=; delims=" %%A in ("%~dpnx0") do echo([%%A]
echo ========
for /f "usebackq skip=%lino% eol=: tokens=*" %%A in ("%~dpnx0") do echo([%%A]
echo C
for /f "usebackq skip=%lino% eol=: delims=" %%A in ("%~dpnx0") do echo([%%A]
echo ========
pause
endlocal & goto :eof

__DATA__
straightforward line 1
; commented-out line 2
  indented text line 3
  ; indented comment line 4



Code: Select all

21
========
[straightforward line 1]
[indented text line 3]
A
[straightforward line 1]
[  indented text line 3]
[  ; indented comment line 4]
========
[straightforward line 1]
[; commented-out line 2]
[indented text line 3]
[; indented comment line 4]
C
[straightforward line 1]
[; commented-out line 2]
[  indented text line 3]
[  ; indented comment line 4]
========
Press any key to continue . . .

Sponge Belly
Posts: 216
Joined: 01 Oct 2012 13:32
Location: Ireland

Re: eol=; tokens=*

#3 Post by Sponge Belly » 02 Jan 2015 04:15

Hi Foxi,

Sorry for not being clearer. The difference in the output from your first two examples is, as you inferred, what I was trying to draw attention to. When "delims=" is in effect, only lines that begin with a semi-colon are ignored. But when it’s "tokens=*", lines where the first non-whitespace character is a semi-colon are passed over as well.

That’s all! ;)

- SB

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: eol=; tokens=*

#4 Post by dbenham » 02 Jan 2015 06:59

The behavior is strictly controlled by DELIMS and EOL - it has nothing to do with the TOKENS value.

When FOR /F processes a line, it first breaks the line into tokens as per DELIMS, and then skips the line if the first character of the first token is the EOL character. This explains why the indented line is skipped when using the default DELIMS value of <space><tab>. It also explains the known behavior that setting EOL to one of the DELIMS characters effectively disables EOL - any EOL character at the start will have been consumed by DELIMS processing by the time the EOL check is made.

Only after all of the above occurs are the appropriate token values assigned to FOR variables as per TOKENS. So TOKENS has no impact on whether a line is skipped due to EOL.

I do find the indented case interesting - I was not fully aware of the order of operations until you showed that case - thanks :D
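
A minimal sketch of the two consequences described above, using in-line strings instead of a file. The sample strings are made up for illustration, and the comments only restate what the outputs posted earlier in this thread suggest; this is not documented behaviour.

```batch
@echo off
rem Default DELIMS (space and tab) strip the indent first, so the line
rem then starts with ";" and is skipped, so nothing is echoed.
for /f "tokens=*" %%A in ("  ; indented comment") do echo([%%A]

rem With DELIMS disabled the indent survives, the first character is
rem a space rather than ";", and the line is kept.
for /f "delims=" %%A in ("  ; indented comment") do echo([%%A]

rem EOL set to a DELIMS character: the leading "#" is consumed as a
rem delimiter before the EOL check is made, so the line is kept.
for /f "eol=# delims=#" %%A in ("#kept") do echo([%%A]
```

Save it as a batch file and run it; only the second and third loops should echo anything.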


Dave Benham

npocmaka_
Posts: 512
Joined: 24 Jun 2013 17:10
Location: Bulgaria

Re: eol=; tokens=*

#5 Post by npocmaka_ » 02 Jan 2015 07:07

:!: I tried to work out the priority of the FOR /F options here: http://ss64.org/viewtopic.php?id=1799 , but it seems I should rethink what I thought I knew :-)


So the priority seems to be:

usebackq>skip>tokens=*>delims>eol>tokens=[ something that is not an asterisk ] :?:

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: eol=; tokens=*

#6 Post by foxidrive » 02 Jan 2015 07:26

dbenham wrote:When FOR /F processes a line, it first breaks the line into tokens as per DELIMS, and then skips the line if the first character of the first token is the EOL character. This explains why the indented line is skipped when using the default DELIMS value of <space><tab>.


The output under the A line in my screenshot shows that the indented line is not skipped; only the line starting with the default EOL is.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: eol=; tokens=*

#7 Post by dbenham » 02 Jan 2015 09:29

foxidrive wrote:
dbenham wrote:When FOR /F processes a line, it first breaks the line into tokens as per DELIMS, and then skips the line if the first character of the first token is the EOL character. This explains why the indented line is skipped when using the default DELIMS value of <space><tab>.


The output under the A line in my screenshot shows that the indented line is not skipped; only the line starting with the default EOL is.

Yes - this is mostly consistent with the rules as I had stated them. When using the default DELIMS of <space><tab> (TOKENS value is insignificant), the DELIMS strips the leading spaces, so the first remaining character in the first token is ";", which matches the default EOL, so the line is skipped. When DELIMS is disabled, the leading spaces are preserved, so the first remaining character of the first token does not match EOL, so it is not skipped.

I did make one mistake - The DELIMS token splitting must not be done to completion before TOKENS selection, otherwise * would not work properly. The token selection must be interleaved with the delimiter parsing.

npocmaka_ wrote:So the priority seems to be:

usebackq>skip>tokens=*>delims>eol>tokens=[ something that is not an asterisk ] :?:

Not quite. The tokens=* belongs at the end along with all other tokens values.

Here are my revised FOR /F processing rules:

Code: Select all

1 - apply USEBACKQ to IN() clause to determine the appropriate content source
2 - Line Loop - while not end of content {
    3 - If SKIP not defined or line number > SKIP value then {
        4 - Remove leading delimiters as per DELIMS
        5 - If first remaining character does not match EOL then {
            6 - Token Loop - while not end of TOKENS {
                7 - If next specified token is not *, then split at next set of DELIMS characters
                8 - If current token number matches the next token specified by TOKENS, then assign FOR variable and advance the TOKENS list
            }
        }
    }
}
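
A small sketch of rules 7 and 8 in action. The sample string is made up for illustration, and the comment describes what the rules above predict rather than anything Microsoft documents.

```batch
@echo off
rem Token 1 is split off at the first run of delimiters. The * then
rem takes the rest of the line without further splitting, so the
rem double space between "two" and "three" stays in the second variable.
for /f "tokens=1,*" %%A in ("one   two  three") do echo([%%A][%%B]
```

If the splitting were done to completion before token selection, the second variable could not keep its internal spaces.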


Dave Benham

Yury
Posts: 115
Joined: 28 Dec 2013 07:54

Re: eol=; tokens=*

#8 Post by Yury » 02 Jan 2015 16:54

"Eol" must be applied to the entire string, regardless of the declared delimiters, so the problem is here:

Code: Select all

@(for /f "eol=# tokens=1-3 delims=#" %%i in ("#111#222#333#") do @echo %%i %%j %%k -- this shouldn't be!& pause>nul)|| (echo This should be!& pause>nul)
@(for /f "eol=1 tokens=1-3 delims=#" %%i in ("#111#222#333#") do @echo %%i %%j %%k -- this should be!& pause>nul)|| (echo This shouldn't be!& pause>nul)


:o .

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: eol=; tokens=*

#9 Post by dbenham » 03 Jan 2015 09:30

Yury wrote:"Eol" must be applied to the entire string, regardless of the declared delimiters

Why on earth would you say that, when your own experiment disproves your assumption :?: :?

CMD.EXE is so poorly documented - usually a lack of documentation, but sometimes the documentation is simply wrong. I've never seen the EOL feature properly documented in any official MS docs.

The proposed FOR /F rules I laid out fully predict the observed behavior.

But... I never realized FOR /F treated the absence of an un-skipped line as an error condition :!: :shock: 8)
Thanks for your examples - This could be useful :D

As long as at least one line is not skipped, then there is no error:

Code: Select all

C:\test>(for /f %a in ('echo hello^&echo(     ') do @echo %a) && echo PASS || echo FAIL
hello
PASS

But if all lines are skipped, FOR /F raises an error:

Code: Select all

C:\test>(for /f %a in ("") do @echo %a) && echo PASS || echo FAIL
FAIL

C:\test>(for /f %a in ("    ") do @echo %a) && echo PASS || echo FAIL
FAIL

C:\test>(for /f %a in (";hello") do @echo %a) && echo PASS || echo FAIL
FAIL

C:\test>(for /f "tokens=2" %a in ("hello") do @echo %a) && echo PASS || echo FAIL
FAIL

C:\test>(for /f "skip=1" %a in ("hello") do @echo %a) && echo PASS || echo FAIL
FAIL

For these next examples, remember that (CALL) without a space raises an error, and (CALL ) with a space clears any error.

An empty FOR /F loop does not automatically set ERRORLEVEL to non-zero, even though an error is raised:

Code: Select all

C:\test>(call )

C:\test>for /f "tokens=2" %a in ("") do @echo %a

C:\test>echo %errorlevel%
0

But ERRORLEVEL is set if || is activated:

Code: Select all

C:\test>(call )

C:\test>(for /f "tokens=2" %a in ("") do @echo %a) || rem

C:\test>ECHO %errorlevel%
1

You must be careful, because an error condition can exist upon completion of FOR /F if the last executed DO command raised an error:

Code: Select all

C:\test>(for /f %a in ("hello") do @echo %a & (call) ) && echo PASS || echo FAIL
hello
FAIL

So if you want to test if FOR /F had at least one non-skipped line, then your last DO command should clear any error:

Code: Select all

C:\test>(for /f %a in ("hello") do @echo %a & (call) & (call ) ) && echo PASS || echo FAIL
hello
PASS


Dave Benham

Sponge Belly
Posts: 216
Joined: 01 Oct 2012 13:32
Location: Ireland

Re: eol=; tokens=*

#10 Post by Sponge Belly » 10 Jan 2015 06:47

Hello Again! :)

Here’s some more for /f weirdness:

Code: Select all

@echo off & setlocal enableextensions disabledelayedexpansion

if not exist "hash10k.txt" (
for /l %%I in (1 1 1000) do <nul set /p "=##########" >>"hash10k.txt"
<nul set /p "=# " >>"hash10k.txt"
)

if not exist "xll.txt" (
<nul set /p "=first second " >"xll.txt"
type "hash10k.txt" >>"xll.txt"
<nul set /p "=fourth " >>"xll.txt"
type "hash10k.txt" >>"xll.txt"
>>"xll.txt" echo(sixth
)

set "nth=1"
:loop
(for /f "usebackq tokens=%nth%" %%A in ("xll.txt") do set /a nth+=1
) && goto loop || (set /a nth-=1 & goto break)
:break
set nth

endlocal & goto :eof


Test file xll.txt contains six tokens. Tokens 3 and 5 are 10,001 hashes long. The for /f loop processes the extremely long line token by token. It doesn’t throw an error when it gets to the 3rd or 5th token and increments the nth variable without complaint. Only when it looks for the 7th token and finds none does it execute the || clause.

This magic is fragile, however, and falls apart if you attempt to access the %%A loop variable. Even so, an interesting way to find the number of tokens in a string.

BFN!

- SB

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: eol=; tokens=*

#11 Post by Squashman » 10 Jan 2015 09:12

Sponge Belly wrote:Even so, an interesting way to find the number of tokens in a string.

FOR /F will treat consecutive delimiters as a single delimiter.
A[tab]B[tab][tab]D would only be 3 tokens.
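
A quick sketch of the same point, with a comma standing in for the tab so the delimiters are visible. The sample string is made up for illustration.

```batch
@echo off
rem The doubled comma is treated as a single delimiter, so "D" lands in
rem the third variable and the fourth is left empty.
for /f "tokens=1-4 delims=," %%A in ("A,B,,D") do echo([%%A][%%B][%%C][%%D]
```

The empty field between the two commas is not counted as a token.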

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: eol=; tokens=*

#12 Post by dbenham » 10 Jan 2015 09:50

@Sponge Belly - Ooh, nice one :D

I vaguely remember a jeb post that said FOR /F could read past 8191 characters, but I never saw an example. Your code provides a nice example of that feature.

Squashman wrote:FOR /F will treat consecutive delimiters as a single delimiter.
A[tab]B[tab][tab]D would only be 3 tokens.

Sponge Belly's point is still valid, as long as you understand the definition of a FOR /F token.


Dave Benham

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: eol=; tokens=*

#13 Post by Squashman » 10 Jan 2015 10:01

dbenham wrote:Sponge Belly's point is still valid, as long as you understand the definition of a FOR /F token.

I am not understanding. Are you saying my comment is wrong and his code will work regardless of the number of repeat delimiters and we don't have to use your PARSECSV.bat file to properly parse the number of correct tokens?

jeb
Expert
Posts: 1041
Joined: 30 Aug 2007 08:05
Location: Germany, Bochum

Re: eol=; tokens=*

#14 Post by jeb » 10 Jan 2015 14:00

dbenham wrote:I vaguely remember a jeb post that said FOR /F could read past 8191 characters, but I never saw an example. Your code provides a nice example of that feature.

Yes, I played with ultra long lines a long time ago.

I tested it again with this code

Code: Select all

@echo off
setlocal EnableDelayedExpansion
set "line=X"
for /L %%n in (1 1 10) do set "line=!line:~0,500!!line:~0,500!"

(
   for /L %%n in (1 1 100) do (
      set /p ".=!line!" < nul
   )
   echo  param2 param3
) > long.txt

for /F "tokens=1,2,3" %%A in (long.txt) DO (
   echo %%B %%C
)


It creates one line in which param1 is 100,000 "X" characters, followed by param2 and param3, with a space as the delimiter.
Parameters 2 and 3 work as expected, as you can see.
There seems to be no limit on the length of a single parameter; the code also works with 1 MB and even 10 MB.

Btw, I never found a way to access the ultra-long parameter itself.

All modifiers seem to fail, even when I build a ULP (ultra-long parameter) of the form "C:\XXXXXXX....100k...XXXX.txt".
%%~dA, %%~xA or %%~zA results in a block failure, so the complete block isn't executed.

jeb

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: eol=; tokens=*

#15 Post by foxidrive » 10 Jan 2015 20:31

Sponge Belly wrote:Hello Again! :)

Here’s some more for /f weirdness:

This magic is fragile, however, and falls apart if you attempt to access the %%A loop variable. Even so, an interesting way to find the number of tokens in a string.

BFN!

- SB


That's most interesting SB!
I'm wondering what task you were doing to happen to stumble across that :)

The only input I can make is a rearrangement of your code below which is just a bit simpler - and the file was created so quickly here that I didn't use the if exist...

Code: Select all

@echo off

(
<nul set /p "=first second "
for /l %%I in (1 1 1000) do <nul set /p "=##########"
<nul set /p "=# fourth "
for /l %%I in (1 1 1000) do <nul set /p "=##########"
echo(# sixth
)>"xll.txt"


set "nth=1"
:loop
for /f "usebackq tokens=%nth%" %%A in ("xll.txt") do set /a nth+=1 & goto loop
set /a nth-=1
set nth

pause

