split string into substrings based on delimiter

Posted: 02 May 2015 09:17
by Sponge Belly
Hi All! :)

The syntax for extracting everything after the first occurrence of a substring is well known:

Code:

set "tail=%str:*x=%"


And there’s a kludgy way to get the start of a string up to the first occurrence of the substring:

Code:

set "head=%str:x=" & rem."%"
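
For anyone following along, here is a minimal sketch of both forms side by side. The sample string and the - delimiter are assumptions picked for illustration, not part of the original examples:

```batch
@echo off
setlocal DisableDelayedExpansion

rem Sample value and delimiter are assumptions for illustration
set "str=alpha-beta-gamma"

rem Everything after the first delimiter
set "tail=%str:*-=%"

rem Everything before the first delimiter: the rest of the line lands behind REM
set "head=%str:-=" & rem."%"

echo head=[%head%] tail=[%tail%]
```

This should print head=[alpha] tail=[beta-gamma].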


I was messing around with the latter when I stumbled across the following:

Code:

@echo off & setlocal enableextensions disabledelayedexpansion

set "x=monotonous"
set "x1=%x:o=" & set "x2=%"
set x

endlocal & goto :eof


Var x1 contains m, and x2 ends up with us: everything after the last occurrence of the substring, in other words. 8)

All the usual caveats apply, of course. The substring is case-insensitive, but the replacement string isn’t. Quotes must be doubled. Percent signs, tildes, asterisks and equal signs must be encoded. And it only works with %-variables.

But there’s more. Run my little snippet again with echo on. The x2 var is set four times, each time with the contents of the substring between the previous occurrence of the letter o and the next one. :shock:
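
To make the repeated assignment easier to see, the sketch below writes out by hand what the line should look like once the percent expansion has replaced each of the four o's in monotonous with the injected text. It is a reconstruction for illustration, not captured parser output:

```batch
@echo off
setlocal DisableDelayedExpansion

rem The split line as it looks after percent expansion, written out by hand
set "x1=m" & set "x2=n" & set "x2=t" & set "x2=n" & set "x2=us"

rem Lists x1 and x2; only the last assignment to x2 survives
set x
```

Running it should list x1=m and x2=us, the same end state as the one-liner.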

BFN!

- SB

Re: from last occurrence to end of string

Posted: 02 May 2015 09:32
by Aacini
I like it! :D

This reminds me of the good old times, when interesting Batch discoveries were frequently made...

EDIT: THIS WORKS! :shock:

Code:

@echo off
setlocal EnableDelayedExpansion

set i=1
set "x=monotonous"
set "x!i!=%x:o=" & set /A i+=1 & set "x!i!=%"
set x

Output:

Code:

x=monotonous
x1=m
x2=n
x3=t
x4=n
x5=us
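
Since i is left holding the number of pieces, the results can be walked like an array afterwards. A minimal sketch, assuming the same sample string; the FOR /L loop and the echo format are additions for illustration:

```batch
@echo off
setlocal EnableDelayedExpansion

set i=1
set "x=monotonous"
set "x!i!=%x:o=" & set /A i+=1 & set "x!i!=%"

rem i now holds the number of pieces, so FOR /L can visit x1 through x5
for /L %%n in (1,1,%i%) do echo piece %%n: [!x%%n!]
```

This should print five lines, from piece 1: [m] to piece 5: [us].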


SB: Perhaps you should change the topic title to "Split string in all substrings separated by a delimiter!" 8)

Antonio

Re: split string into substrings based on delimiter

Posted: 02 May 2015 13:25
by Sponge Belly
Hi Aacini,

Clever use of delayed expansion and set /a! 8)

Your method of storing all the intermediary results was so obvious when I read the example… so why didn’t I think of it myself? :cry:

Anyways, I changed the subject line as you suggested.

Laters!

- SB

Re: split string into substrings based on delimiter

Posted: 03 May 2015 07:13
by aGerman
Great find :) I wouldn't have believed we could take advantage of the internal iterations that happen during text replacement.

Regards
aGerman

Re: split string into substrings based on delimiter

Posted: 03 May 2015 18:27
by carlos
Great discovery. Thanks for sharing it.
How was it found?

Re: split string into substrings based on delimiter

Posted: 04 May 2015 01:45
by npocmaka_
nice!

Re: split string into substrings based on delimiter

Posted: 04 May 2015 01:46
by Aacini
The modification below makes it possible to "replace each substring with a series of different strings":

Code:

@echo off
setlocal EnableDelayedExpansion

set p=%%
set r1=ONE
set r2=TWO
set r3=THREE
set r4=FOUR

set i=0
set "x=monotonous"
set "x2=%x:o=" & set /A i+=1 & call set "x2=!x2!!p!r!i!!p!%"
set x

At the end, x2 contains mONEnTWOtTHREEnFOURus. 8)
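
The sketch below performs one round of the injected commands by hand, to show how delayed expansion and CALL appear to cooperate: !p!r!i!!p! first becomes %r1%, and CALL then expands that a second time. Only r1 is defined here, and the starting value of x2 is set manually; both are simplifications for illustration:

```batch
@echo off
setlocal EnableDelayedExpansion

set p=%%
set r1=ONE
set i=0

rem State after the first piece of monotonous has been stored
set "x2=m"

rem One round of the injected commands, written out by hand
set /A i+=1
call set "x2=!x2!!p!r!i!!p!n"

echo !x2!
```

This should print mONEn, the value x2 holds after the first o has been processed.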

Antonio

Re: split string into substrings based on delimiter

Posted: 04 May 2015 01:52
by npocmaka_
carlos wrote:Great discovery. Thanks for sharing it.
How was it found?



The &rem trick is comparatively old. I think I saw it first here - viewtopic.php?t=194 and here viewtopic.php?f=3&t=381

Re: split string into substrings based on delimiter

Posted: 04 May 2015 04:05
by jeb
npocmaka_ wrote:The &rem trick is comparatively old. I think I saw it first here - viewtopic.php?t=194 and here viewtopic.php?f=3&t=381

I think it's much older, but until now I had only seen the &REM variant.

But the trick of Aacini to use different replace strings is really cool. :o

The only drawback of the command injection is that it's really tricky to make it bulletproof against quotes, linefeeds and carriage returns.

Re: split string into substrings based on delimiter

Posted: 04 May 2015 11:01
by Aacini
Another one! :mrgreen:

Code:

@echo off
setlocal EnableDelayedExpansion

set "x=<one>1</one>,<two>2</two>,<three>3</three>,<four>4</four>"

set "a=%x%,"
set "b=%a:,=" & (if "!b:<two>=!" neq "!b!" set "c=!b!") & set "b=%"
for /F "tokens=2 delims=><" %%a in ("%c%") do set "xTwo=%%a"
set x

At end: xTwo=2 8)

EDIT: The modification below gets all fields from the line:

Code:

@echo off
setlocal EnableDelayedExpansion

set "x=<one>1</one>,<two>2</two>,<three>3</three>,<four>4</four>"

set q="
set p=%%
set "a=%x%,"
set "b=%a:,=" & set "b=!b:~1!" & set "b=!b:>==!" & call set !q!x!p!b:^</=!q!!p! & set "b=%"
set x

Output:

Code:

x=<one>1</one>,<two>2</two>,<three>3</three>,<four>4</four>
xfour=4
xone=1
xthree=3
xtwo=2

I tried to insert the "& rem." command of the original trick in place of the "</" string in order to eliminate the undesired part after it, but I didn't find a way to make it work. However, just enclosing the desired part of the value in quotes was enough, although this method will fail if the undesired part contains special characters.

Note that the "call set !q!x!p!b:^</=!q!!p!" part is a nested replacement that is executed for each of the substrings produced by the original replacement. The method therefore comprises three stages:

  1. The original string is split into several parts via the first %expansion%.
  2. Each part is processed using delayed expansion !variables! to assemble the final expression. This makes it possible to insert quotes and other special characters in places that the original %expansion% cannot handle.
  3. The final expression in each part is evaluated via the nested CALL command.

This means that this method may be used instead of a FOR command in certain cases, when the processing of each part is not too complex.
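
The three stages can be followed one at a time by doing by hand what the one-liner does for a single piece. In this sketch the first piece is assigned to b directly, standing in for stage 1; the stage 2 and stage 3 lines are copied from the code above:

```batch
@echo off
setlocal EnableDelayedExpansion

set q="
set p=%%

rem Stage 1: the split leaves one piece in b
set "b=<one>1</one>"

rem Stage 2: delayed expansion reshapes the piece
set "b=!b:~1!"
set "b=!b:>==!"

rem Stage 3: CALL evaluates the assembled expression
call set !q!x!p!b:^</=!q!!p!

set xone
```

After stage 2, b should hold one=1</one=, and the final line should print xone=1.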

Antonio

Re: split string into substrings based on delimiter

Posted: 04 May 2015 13:48
by Aacini
I like the following one! :P

"Replace a list of comma-separated subscripts by their corresponding array elements"

Code:

@echo off
setlocal EnableDelayedExpansion

set p=%%
set r1=ONE
set r2=TWO
set r3=THREE
set r4=FOUR
set r5=FIVE

set "x=3,1,5,4"
set "x2=%%r%x:,=!p!," & call set "x2=!x2!!p!r%%%"
set x

Output:

Code:

x=3,1,5,4
x2=THREE,ONE,FIVE,FOUR

Antonio

Re: split string into substrings based on delimiter

Posted: 12 May 2015 13:52
by Sponge Belly
Hi Antonio,

Your last example was amazing. Too bad I can’t understand it! :lol:

Anyways, I was wondering if it’s possible to change the value of the original string from inside the loop caused by the string split operation… because if it is, we could append to the string what was just taken off, the string would never grow shorter, and the loop would go on indefinitely.

Just thinking out loud. ;)

- SB

Re: split string into substrings based on delimiter

Posted: 13 May 2015 01:11
by Ed Dyreen
Sponge Belly wrote:Hi Antonio,

Your last example was amazing. Too bad I can’t understand it! :lol:

Code:

@echo off
setlocal EnableDelayedExpansion

set p=%%
set r1=ONE
set r2=TWO
set r3=THREE
set r4=FOUR
set r5=FIVE

set "x=3,1,5,4"
set "x2=%%r%x:,=!p!," & call set "x2=!x2!!p!r%%%"
set x

This is still the &REM trick, but with REM replaced by a different command. During the first percent expansion, each , inside the x variable is replaced by the string !p!," & call set "x2=!x2!!p!r so you get:
set "x2=%r3!p!," & call set "x2=!x2!!p!r1!p!," & call set "x2=!x2!!p!r5!p!," & call set "x2=!x2!!p!r4%"
When the first command is executed, the delayed (exclamation mark) expansion takes place, so !p! expands into %:
set "x2=%r3%,"
The second command starts with CALL, so call set "x2=!x2!!p!r1!p!," first expands to set "x2=%r3%,%r1%,"

CALL then performs another round of percent expansion, the set command executes, and the result is assigned to x2:
set "x2=THREE,ONE,"
and so on.

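This also answers your 2nd question: what you call a loop is a fixed series of commands. The whole line is expanded once, before any of it runs, so changing the original string from inside it cannot add further commands.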
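
A small sketch of that point, with a sample string and a counter n that are assumptions for illustration. The injected commands append to x every time they run, yet they run only once, because the line was already split on the single comma x held when it was parsed:

```batch
@echo off
setlocal EnableDelayedExpansion

set "x=a,b"
set n=0

rem The injected commands append to x and count how often they run
set "y=%x:,=" & set "x=!x!,c" & set /A n+=1 & set "y=%"

echo x=!x! n=!n! y=!y!
```

This should print x=a,b,c n=1 y=b: x has grown, but the number of commands stayed fixed.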

dosItHelp? :wink:

Re: split string into substrings based on delimiter

Posted: 22 May 2015 10:10
by Sponge Belly
Hi Ed, :)

Thanks for the explanation. Still can’t quite wrap my head around Aacini’s code, though. Will keep trying.

In the meantime, I’ve rekindled my obsession with finding the best way to trim leading and trailing whitespace from a string:

Code:

@echo off & setlocal enableextensions disabledelayedexpansion
set ^"str= ^^^"    ^^^&^^    ^"^^^&^"^& !^^!^^^^! %%   %%OS%%    ^"
for /f delims^=^ eol^= %%A in ('
cmd /von /c echo(^^!str:^^^"^=^^^"^^^"^^!^| more /t1
') do set "x= %%A "

set /a i=j=0 & set "k="
set "x=%x: =" & (if defined x if not defined k set /a k=i) & (if defined x set /a j=i) & set /a i+=1 & set "x=%"
set "x="
set /a pos=k-1,len=j-i+1
if %len% lss 0 (set "len=,%len%") else set "len="

setlocal enabledelayedexpansion
for /f delims^=^ eol^= %%A in ("!str:~%pos%%len%!") do (
endlocal & set "xstr=%%A" & echo([%%A])
set xstr

endlocal & goto :eof


Quotes must be doubled. Any tabs are turned into spaces by more /t1. The cmd /von is necessary to avoid %-variable expansion. A space is added to either end of the resultant string, which is stored in var x. This is to ensure that x is undefined the first and last time the string is split.

Sorry about the overlong line, btw. If anyone can help me optimise the… whatayacallit… statements between the opening and closing percent signs that are executed every time the string is split, please get in touch.

Anyways, i is the number of times the string is split, j is the value of i the last time x was defined, and k is the value of i the first time x was defined. The amount of whitespace to be trimmed from both ends of the string can be inferred from these values.
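
To see the counters at work, here is a cut-down sketch that leaves out the quote and tab handling and uses a plain sample string with two leading and two trailing spaces (an assumption for illustration). The split line and the arithmetic are taken from the code above:

```batch
@echo off
setlocal DisableDelayedExpansion

rem Sample string: two leading and two trailing spaces
set "str=  ab c  "
set "x= %str% "

set /a i=j=0 & set "k="
set "x=%x: =" & (if defined x if not defined k set /a k=i) & (if defined x set /a j=i) & set /a i+=1 & set "x=%"
set /a pos=k-1,len=j-i+1

echo i=%i% j=%j% k=%k% pos=%pos% len=%len%
```

This should print i=7 j=4 k=3 pos=2 len=-2, so the trimmed text is the substring that starts at offset 2 and stops 2 characters before the end.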

Interesting approach, but I don’t know how practical it is. :|

- SB

Re: split string into substrings based on delimiter

Posted: 22 May 2015 11:20
by Aacini
I LIKE THIS! :mrgreen:

"Trim leading and trailing whitespace from a string"

EDIT 2015/05/23 - I slightly modified the code, exchanging the initialization of the "x2" and "word" variables; this detail makes it possible to eliminate the inserted space at the beginning of the string (and makes the code more coherent).

Code:

@echo off
setlocal EnableDelayedExpansion

set "x=     String with spaces     "
set "x=%x% "
set "x2="
set "word=%x: =" & (if "!word!" neq "" set "x2=!x2! !word!") & set "word=%" & set "x2=!x2:~1!"

echo "%x:~0,-1%"
echo "%x2%"
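
One thing worth checking before relying on this: because the string is rebuilt word by word, runs of spaces inside it are collapsed to a single space as well, not just trimmed at the ends. A minimal sketch with a sample string chosen for illustration:

```batch
@echo off
setlocal EnableDelayedExpansion

rem Sample string with three spaces between a and b
set "x=  a   b  "
set "x=%x% "
set "x2="
set "word=%x: =" & (if "!word!" neq "" set "x2=!x2! !word!") & set "word=%" & set "x2=!x2:~1!"

echo [%x2%]
```

This should print [a b], with the inner gap reduced to one space.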

Antonio