vagaries of for /f loops

Sponge Belly
Posts: 223
Joined: 01 Oct 2012 13:32
Location: Ireland

#1 Post by Sponge Belly » 01 Aug 2013 08:22

Dear DosTips,

I came across something puzzling while researching an alternative method of trimming whitespace (spaces and/or tabs) from the head and tail of a string. To trim any whitespace on the left of a string, simply do this:

Code:

for /f "tokens=*" %%a in ("%string%") do set "string=%%a"


It’s as simple as that… or is it? I have to do more testing, but I suspect the usual poison characters and unbalanced quotes may cause problems. But that’s another topic.

What’s puzzling me is this slightly different code taken from this 2008 alt.msdos.batch thread:

Code:

for /f "tokens=*" %%a in ('echo(%string% ') do set "string=%%a"
set "string=%string:~0,-1%"


Not only does it trim any whitespace from the left and right of the string, it also collapses any internal run of multiple spaces or tabs into a single space.

I don’t understand how a small change in the for /f loop’s in (…) clause can have such a dramatic effect on the output. Can anyone shed some light on this for me?

Thanks! :-)

- SB

penpen
Expert
Posts: 1996
Joined: 23 Jun 2013 06:15
Location: Germany

Re: vagaries of for /f loops

#2 Post by penpen » 01 Aug 2013 13:44

Code:

for /f "tokens=*" %%a in ("%string%") do set "string=%%a"
This for loop parses a string, using space and tab as delimiters, until it finds the start of the first token.
Then parsing stops and the rest of the line is passed as variable %%a to the command executed after do.
So it removes all leading spaces (and tabs) from the string.
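
To make the left-trim behavior concrete, here is a minimal sketch as a batch file. The test value of string is made up for illustration, and the brackets are only there to make the remaining whitespace visible:

```batch
@echo off
setlocal
rem Hypothetical test value: three leading and two trailing spaces.
set "string=   hello world  "
rem tokens=* skips the leading delimiters and passes the rest of the line as %%a.
for /f "tokens=*" %%a in ("%string%") do set "string=%%a"
rem Expected to print [hello world  ]: the leading spaces are gone, the trailing ones remain.
echo [%string%]
endlocal
```

Note that this only covers a plain value. Poison characters and unbalanced quotes in string are not handled here.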

Code:

for /f "tokens=*" %%a in ('echo(%string% ') do set "string=%%a"
This for loop parses the output of the echo command into the variable %%a, again using tokens=* and the default delimiters as above.
But this for loop has to build the command line to be executed and normalize it prior to execution.
This normalization removes all command delimiters and puts a single space after each command-line token.
So you get this result.

Btw: The second space in the echo command is not needed.
You could also append other command-line delimiters such as , and ;

Code:

for /f "tokens=*" %%a in ('echo(%string%') do set "string=%%a"
for /f "tokens=*" %%a in ('echo(%string%,,,,,;;;;;') do set "string=%%a"
The result should be the same.
The delimiters in the string variable are removed too, so commas and semicolons are deleted.

penpen

Edit: Corrected the name of the ; characters... (semicolon... not a dot...) :oops:
Last edited by penpen on 10 Aug 2013 15:14, edited 1 time in total.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: vagaries of for /f loops

#3 Post by dbenham » 01 Aug 2013 15:20

Good explanation penpen.

The added space at the end is needed to make sure there is always a space to trim off at the end when the substring operation is done.
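
As a rough illustration of the substring step on its own (the value of string is hypothetical and ends in exactly one known space):

```batch
@echo off
setlocal
rem Hypothetical value that ends in exactly one space.
set "string=hello "
rem %string:~0,-1% keeps everything except the last character.
set "string=%string:~0,-1%"
rem Prints [hello]
echo [%string%]
endlocal
```

Without a guaranteed trailing space, the same substring operation would cut off the last real character of the string instead.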

One important thing to add: the code converts all consecutive spaces (or other token delimiters) into a single space, including multiple spaces in the middle. So the technique really is not good for left and right trimming. For example, "<sp><sp><sp>Hello<sp><sp><sp>world<sp><sp><sp>" becomes "<sp>Hello<sp>world<sp>"


Dave Benham

Sponge Belly
Posts: 223
Joined: 01 Oct 2012 13:32
Location: Ireland

Re: vagaries of for /f loops

#4 Post by Sponge Belly » 10 Aug 2013 13:40

Thanks Penpen and Dave! ;-)

In the second example, multiple parameter separators (, ; = <SP> and <TAB>) are indeed replaced with a single space… but not dots. Note that if you quote the string, any leading whitespace and the opening quote itself will be stripped away and the rest of the string will be displayed literally—except that the closing quote is displayed too, which is probably not what you want.

Try this from the command line:

Code:

>set "str=  alpha,  bravo=  charlie;  delta.  echo!  foxtrot&  "
>@for /f "tokens=*" %a in ('echo( "%str%" ') do @echo([%~a]
[  alpha,  bravo=  charlie;  delta.  echo!  foxtrot&  " ]


But wait! :shock:

It just keeps getting weirder. I just tried this:

Code:

>@for /f "tokens=*" %a in ('"  echo(  %str%  "') do @echo([%~a]
[alpha,  bravo=  charlie;  delta.  echo!  foxtrot]


And the string is nicely trimmed at both ends… apart from the missing & after foxtrot.

Hmm… anything after the & will be executed as a command. I’ll play around with this for a while and let you know what I come up with…

Back again! Nah, couldn’t do anything useful with it. :-(

The final character of the last token has to be an ampersand for the “magic” to happen, but if there are any other poison characters in the string, you’re in trouble.

That’s all I’ve got for now!

- SB

penpen
Expert
Posts: 1996
Joined: 23 Jun 2013 06:15
Location: Germany

Re: vagaries of for /f loops

#5 Post by penpen » 10 Aug 2013 15:12

Sponge Belly wrote: In the second example, multiple parameter separators (, ; = <SP> and <TAB>) are indeed replaced with a single space… but not dots.

Sorry, I meant the semicolons... I had written them in the example code, but got it wrong in the text... :oops:

Sponge Belly wrote: Note that if you quote the string, any leading whitespace and the opening quote itself will be stripped away and the rest of the string will be displayed literally—except that the closing quote is displayed too, which is probably not what you want.

Try this from the command line:

Code:

>set "str=  alpha,  bravo=  charlie;  delta.  echo!  foxtrot&  "
>@for /f "tokens=*" %a in ('echo( "%str%" ') do @echo([%~a]
[  alpha,  bravo=  charlie;  delta.  echo!  foxtrot&  " ]

But this is not surprising behavior: the first space is stripped off by the tokenizer, and %a is set to "[ ][ ]alpha,[ ][ ]bravo=[ ][ ]charlie;[ ][ ]delta.[ ][ ]echo![ ][ ]foxtrot&[ ][ ]"[ ] (where [ ] is a normal space character).
You are using %~a, which is %a with the double quotes removed from its first and last characters. Since only the first character is a double quote (the last character is a [ ]), only the first double quote is removed and the last one stays.

Sponge Belly wrote: It just keeps getting weirder. I just tried this:

Code:

>@for /f "tokens=*" %a in ('"  echo(  %str%  "') do @echo([%~a]
[alpha,  bravo=  charlie;  delta.  echo!  foxtrot]

And the string is nicely trimmed at both ends… apart from the missing & after foxtrot.

Hmm… anything after the & will be executed as a command. I’ll play around with this for a while and let you know what I come up with.

The string is not trimmed properly in this case: the apparent "right trimming" is caused by the position of the unescaped &, which cuts the string off at exactly that position.
Just add some spaces between "foxtrot" and the "&" and you will see that it is not trimmed.

And why is this weird? It's just the use of an unescaped & in the first echo command.
You may escape this special character prior to the for loop, to display it properly:

Code:

>set "str=%str:&=^&%"

penpen
