split string into substrings based on delimiter

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)

Re: split string into substrings based on delimiter

#16 Post by Ed Dyreen » 23 May 2015 05:20

Sponge Belly wrote:In the meantime, I’ve rekindled my obsession with finding the best way to trim leading and trailing whitespace from a string
The best way is probably one that can trim not only whitespace from a string but any character.

Code: Select all

@echo off &setlocal enableDelayedExpansion

set "_= examplaString "
set "s= "


if defined _ (
       set/Ac=4096&for /l %%i in (1,1,12) do set s=!s!!s!
       for /l %%i in (1,1,13) do if defined _ for %%? in (!c!) do (
              if !_:~-%%?!==!s:~-%%?! set _=!_:~0,-%%?!
              if !_:~0^,%%?!==!s:~-%%?! set _=!_:~%%?!
              set/Ac/=2
       )
)
echo(&<nul set/P=  _: '!_!'
echo(


pause
exit

Code: Select all

  _: 'examplaString'
Press any key to continue . . .
The variable 's' holds a space here, but it can be set to any character, which makes the routine more general.
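
To make the "any character" point concrete, here is a minimal sketch that strips a different character (a hyphen) from both ends. It is not Ed Dyreen's halving routine: it removes one character per pass with a plain GOTO loop, which is slower on long runs but easier to follow. The names str and ch and the two labels are made up for this illustration, and like most delayed-expansion code it will mangle strings that contain exclamation marks.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Illustrative names: str is the text to trim, ch is the single character to strip
set "str=---example---"
set "ch=-"

:trimLeft
rem Drop the first character while it matches ch
if defined str if "!str:~0,1!"=="!ch!" (
    set "str=!str:~1!"
    goto trimLeft
)

:trimRight
rem Drop the last character while it matches ch
if defined str if "!str:~-1!"=="!ch!" (
    set "str=!str:~0,-1!"
    goto trimRight
)

echo [!str!]
```

With the sample value this should print [example]. Changing ch to a space gives an ordinary whitespace trim.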

Aacini
Expert
Posts: 1910
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: split string into substrings based on delimiter

#17 Post by Aacini » 23 May 2015 13:00

I realized that my previous method to trim leading and trailing spaces also reduces several spaces between words to just one. The new method below correctly preserves multiple spaces between words:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set "x=     String   with   spaces     "

set "x=%x% "
set "i=0"
set "j="
set "w=%x: =" & (if not defined w (if not defined j (set /A i+=1) else set /A j+=1) else set j=1) & set "w=%"
set "x2=!x:~%i%,-%j%!"

echo "%x:~0,-1%"
echo "%x2%"
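
For readers puzzling over the substitution line in the code above: percent expansion happens before the line is split into commands, so every space in x is replaced by the text between the first equals sign and the final percent sign. That text closes one SET, runs the commands in the middle, and opens the next SET. Below is a stripped-down sketch of the same idea that only numbers and prints each word. The names n and w and the sample string are illustrative, and the string must not contain quotes, ampersands, or other poison characters.

```batch
@echo off
setlocal EnableDelayedExpansion

set "x=one two three"
rem Append a space so the last word is also followed by a delimiter
set "x=%x% "
set "n=0"

rem After percent expansion the next line becomes a chain of the form
rem   set w to one, count and echo, set w to two, count and echo, and so on
rem   ending with a final set that leaves w empty
set "w=%x: =" & set /A "n+=1" & echo Token !n!: [!w!] & set "w=%"
```

This should print one numbered line per word, from Token 1: [one] to Token 3: [three]. The delayed !n! and !w! references matter: they are evaluated as each command in the chain runs, not when the line is first read.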

NOTE: I just discovered that when the starting position in substring extraction is omitted, like in "%VAR:~,end%", the starting position is assumed to be 0. I don't know if this point has been mentioned before...
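
A quick way to check this on your own Windows version (the variable name and sample value are arbitrary). If the observation holds, the first two lines of output are identical:

```batch
@echo off
setlocal
set "var=abcdefgh"

rem Explicit start position 0, length 3
echo %var:~0,3%
rem Start position omitted
echo %var:~,3%
rem Start position omitted with a negative end, which drops the last two characters
echo %var:~,-2%
```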

Antonio

Ed Dyreen

Re: split string into substrings based on delimiter

#18 Post by Ed Dyreen » 23 May 2015 14:42

Aacini wrote:NOTE: I just discovered that when the starting position in substring extraction is omitted, like in "%VAR:~,end%", the starting position is assumed to be 0. I don't know if this point has been mentioned before...

Antonio
Verified on XP.

Aacini

Re: split string into substrings based on delimiter

#19 Post by Aacini » 23 May 2015 22:13

"Get all positions where a substring appears in a larger string (NOT case sensitive)"

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set "x=Here, There and Everywhere"
echo %x%

rem k = Length of the substring, "er" in this case
set /A "k=2, i=-k"
set "x2="
set "w=%x:er=" & call :strLen w j & set /A i+=j+k & set "x2=!x2!,!i!" & set "w=%"
if defined x2 set "x2=%x2:~1%"
echo Substring "er" at positions: %x2%

rem k = Length of the substring, "here" in this case
set /A "k=4, i=-k"
set "x2="
set "w=%x:here=" & call :strLen w j & set /A i+=j+k & set "x2=!x2!,!i!" & set "w=%"
if defined x2 set "x2=%x2:~1%"
echo Substring "here" at positions: %x2%

goto :EOF


:strLen var len=
set "str=0!%1!"
set "%2=0"
for /L %%a in (8,-1,0) do (
   set /A "newLen=%2+(1<<%%a)"
   for %%b in (!newLen!) do if "!str:~%%b,1!" neq "" set "%2=%%b"
)
exit /B

Output:

Code: Select all

Here, There and Everywhere
Substring "er" at positions: 1,8,18,23
Substring "here" at positions: 0,7,22

Antonio

Sponge Belly
Posts: 231
Joined: 01 Oct 2012 13:32
Location: Ireland

Re: split string into substrings based on delimiter

#20 Post by Sponge Belly » 22 Nov 2015 11:49

Hello Again! :)

Below is my revised code for trimming whitespace from the start and end of a string:

Code: Select all

@echo off & setlocal enableextensions disabledelayedexpansion
:: nasty str full of poison chars
set ^"str= ^^^"  ^^^&^^   ^&^"^&    %%os%%    !random! ^"
:: double all quotes
set "x=%str:"=""%"
:: turn tabs to spaces and ensure str ends with space
for /f delims^=^ eol^= %%A in ('
cmd /von /c "echo(^!x^!"^| more /t1
') do set "x=%%A "

:: i is offset from start of str and j is offset from end
set /a i=0 & set "j="
:: thanks to Aacini for this magic incantation
set "x=%x: =" & (if not defined x (if not defined j (set /a i+=1) else set /a j+=1) else set "j=1") & set "x=%"

setlocal enabledelayedexpansion
:: add space to end of orig str
set "str=!str! "
:: trim str, pass over endlocal barrier, and echo on screen
for /f delims^=^ eol^= %%A in ("!str:~%i%,-%j%!") do (
endlocal & set "strx=%%A" & echo([%%A])

::dump vars
set str

endlocal & goto :eof


And as a bonus to anyone still reading at this point, here’s my take on how to extract the final word/token from a string:

Code: Select all

@echo off & setlocal enableextensions disabledelayedexpansion
:: nasty str full of poison chars
set ^"str=  ^^^"    ^^^&^^   ^&^"^&   %%os%%  ^^^^!random^^^^!^"
:: double all quotes
set "strq=%str:"=""%"
:: turn tabs to spaces and ensure str ends with space
for /f tokens^=*^ eol^= %%A in ('
cmd /v:on /c "echo(^!strq^!"^| more /t1
') do set "strq=%%A "

:: counts words/tokens in str
set "strq=%strq: =" & (if defined strq set /a nth+=1) & set "strq=%"

:: store nth token of str in var and echo on screen
for /f tokens^=%nth%^ eol^= %%A in ('
cmd /v:on /c "echo(^!str^!"
') do (set "strn=%%A" & echo([%%A])

:: dump vars
set str
set nth

endlocal & goto :eof


Thanks to everyone who contributed to this topic. Honorable mentions go to Aacini and Ed Dyreen.

BFN!

- SB

CirothUngol
Posts: 46
Joined: 13 Sep 2017 18:37

Re: split string into substrings based on delimiter

#21 Post by CirothUngol » 16 Sep 2017 12:20

Aacini wrote: I realized that my previous method to trim leading and trailing spaces also reduces several spaces between words to just one.
... and that's why I love it! I had previously always used this construct to do the same thing:

Code: Select all

FOR %%A IN ("!string: =" "!") DO IF "%%~A" NEQ "" SET "result=!result! %%~A"
SET "result=!result:~1!"
I suspect that the string expansion will be far faster than the FOR loop. Thanks again for another useful technique to solve this common problem.

Note: I realize that this thread is nearly 2 years old so I hope there are no rules regarding the resurrection of dead topics.

ShadowThief
Expert
Posts: 1166
Joined: 06 Sep 2013 21:28
Location: Virginia, United States

Re: split string into substrings based on delimiter

#22 Post by ShadowThief » 16 Sep 2017 21:44

CirothUngol wrote:I hope there are no rules regarding the resurrection of dead topics.

I never understood forums that had those rules. If old topics weren't meant to be posted in, they would be automatically locked after some period of inactivity.

Squashman
Expert
Posts: 4485
Joined: 23 Dec 2011 13:59

Re: split string into substrings based on delimiter

#23 Post by Squashman » 23 Oct 2017 20:05

Could we use this same type of string substitution to get the first character of each word assigned to a final output variable?

Code: Select all

set "x=split string into substrings"


And the output for the final var would be:

Code: Select all

ssis

Aacini

Re: split string into substrings based on delimiter

#24 Post by Aacini » 23 Oct 2017 20:33

Squashman wrote:
23 Oct 2017 20:05
Could we use this same type of string substitution to get the first character of each word assigned to a final output variable?

Code: Select all

set "x=split string into substrings"
And the output for the final var would be:

Code: Select all

ssis

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set "x=split string into substrings"

set "x=%x% "
set "x2="
set "word=%x: =" & set "x2=!x2!!word:~0,1!" & set "word=%" 

set x
Antonio

thefeduke
Posts: 211
Joined: 05 Apr 2015 13:06
Location: MA South Shore, USA

Re: split string into substrings based on delimiter

#25 Post by thefeduke » 26 Oct 2017 20:57

Aacini wrote: "Get all positions where a substring appears in a larger string (NOT case sensitive)"

Code: Select all

. . .
set "w=%x:er=" & call :strLen w j & set /A i+=j+k & set "x2=!x2!,!i!" & set "w=%"
. . .
set "w=%x:here=" & call :strLen w j & set /A i+=j+k & set "x2=!x2!,!i!" & set "w=%"
...

Output:

Code: Select all

Here, There and Everywhere
Substring "er" at positions: 1,8,18,23
Substring "here" at positions: 0,7,22

Antonio
I thought that this magnificent substitution method would be more useful if the delimiter string were not hard-coded but substituted from an input argument.

The post is a couple of years old, but here's how I modified that script:

Code: Select all

@echo off
:StrPos delim String
setlocal EnableDelayedExpansion

Rem Get the delimiter string and its length
set "v=%~1"
If ".%v%" EQU "." (Echo.& Echo.No Substring& Exit /B)
call :strLen v k

Rem Insert the delimiter string into a copy of this batch script
    Set "Check=%~dp0~StrPos~%~x0"
Rem Delayed expansion phrases are destroyed but never executed in copy
    If not exist "%Check%" (
        For /F "tokens=*" %%A in ('Type %~f0') do (
            Set "line=%%~A"
            Set "line=!line:<delim>=%v%!"
            >>"%Check%" Echo.!line!
        )
Rem     Call the copy of this batch script and return to self
        call         "%Check%" %*
        Del          "%Check%"
        exit /B
    )

set "x=%~2"
If ".%x%" EQU "." (Echo.& Echo.No String& Exit /B)
call :strLen  x  y

rem k = Length of the substring
set /A "k=k, i=-k"
set "x2="
Rem This critical statement can now contain a pseudo substitution in the copy
set "w=%x:<delim>=" & call :strLen w j & set /A i+=j+k & set "x2=^!x2^!,^!i^!" & set "w=%"
@echo off
if defined x2 set "x2=%x2:~1%"

echo.%x2%
echo Substring "<delim>" at positions: %x2% of '%x%'

goto :EOF

Rem Delayed expansion phrases are escaped for proper execution in the copy
:strLen var len=
set "str=0^!%1^!"
set "%2=0"
for /L %%a in (8,-1,0) do (
   set /A "newLen=%2+(1<<%%a)"
   for %%b in (^!newLen^!) do if "^!str:~%%b,1^!" neq "" set "%2=%%b"
)
exit /B

Example wrote:Call StrPos \ %Temp%
2,8,15,23,29
Substring "\" at positions: 2,8,15,23,29 of 'C:\Users\Zyltch\AppData\Local\Temp'


Call StrPos " - " "Window Title - Call StrPos"
12
Substring " - " at positions: 12 of 'Window Title - Call StrPos'
The delimiter string is the first argument, but ends up hard-coded in the copied script.

John A.

Aacini

Re: split string into substrings based on delimiter

#26 Post by Aacini » 30 Oct 2017 18:59

thefeduke wrote:
Aacini wrote: "Get all positions where a substring appears in a larger string (NOT case sensitive)"

Code: Select all

. . .
set "w=%x:er=" & call :strLen w j & set /A i+=j+k & set "x2=!x2!,!i!" & set "w=%"
. . .
set "w=%x:here=" & call :strLen w j & set /A i+=j+k & set "x2=!x2!,!i!" & set "w=%"
...

Output:

Code: Select all

Here, There and Everywhere
Substring "er" at positions: 1,8,18,23
Substring "here" at positions: 0,7,22

Antonio


I thought that this magnificent substitution method would be more useful if the delimiter string were not hard-coded but substituted from an input argument.

The post is a couple of years old, but here's how I modified that script:

Code: Select all

. . .

John A.



You may get the same result in a simpler way:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set "x=Here, There and Everywhere"

call :StrPos "er" "%x%"
call :StrPos "here" "%x%"

goto :EOF


:StrPos Delim String
setlocal EnableDelayedExpansion

set "x=%~2"
if not defined x echo/& echo No string& exit /B
set "v=%~1"
if not defined v echo/& echo No delimiter& exit /B
call :strLen v k=
set /A i=-k
set "x2="
set "w=!x:%~1=" ^& call :strLen w j ^& set /A i+=j+k ^& set "x2=¡x2¡,¡i¡" ^& set "w=!"
set "w=%w:¡=!%"
if defined x2 set "x2=%x2:~1%"
echo %~2
echo Substring "%~1" at positions: %x2%
exit /B


:strLen var len=
setlocal EnableDelayedExpansion
set "str=0!%1!"
set "%2=0"
for /L %%a in (8,-1,0) do (
   set /A "newLen=%2+(1<<%%a)"
   for %%b in (!newLen!) do if "!str:~%%b,1!" neq "" set "%2=%%b"
)
for %%a in (!%2!) do endlocal & set "%2=%%a"
exit /B

The only drawback of this method is that you must choose an unused character/string to stand in for the exclamation mark. In this code I chose the inverted exclamation mark "¡" used in Spanish.

Antonio

thefeduke

Re: split string into substrings based on delimiter

#27 Post by thefeduke » 02 Nov 2017 00:28

Aacini wrote:You may get the same result in a simpler way:
. . .
The only drawback of this method is that you must choose an unused character/string to stand in for the exclamation mark. In this code I chose the inverted exclamation mark "¡" used in Spanish.
Thank you, Antonio! Not only simpler, but more fitting in the use of that magical substitution. It also completely solved some difficulties in applying my technique in multiple or nested calls. This is much more useful!

I altered the output slightly:

Code: Select all

. . . 
if defined x2 set "x2=%x2:~1%"
echo.%x2%
echo Substring "%~1" at positions: %x2%
echo %~2
exit /B
. . .
This allows the first record of file-directed output to be used to set a variable:

Code: Select all

    call StrPos " - " "%WinTitle%">%temp%\TitlePos.txt
    set /p tpos=<%temp%\TitlePos.txt
where the format of the variable tpos is: n[,n...] or undefined.
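
SET /P reads only the first line of the redirected file, which is why moving the bare position list to the top of the output makes this work. Here is a self-contained sketch of that behaviour, with two ECHO commands standing in for the StrPos output. The file name TitlePosDemo.txt is invented for the demo.

```batch
@echo off
setlocal
set "tmpFile=%temp%\TitlePosDemo.txt"

rem Stand-in for StrPos output: position list first, readable text second
(
    echo 12
    echo Substring found at positions: 12
) >"%tmpFile%"

rem Clear tpos first so it stays undefined when the first line is empty
set "tpos="
set /p "tpos=" <"%tmpFile%"
del "%tmpFile%"

if defined tpos (echo tpos=%tpos%) else (echo tpos is undefined)
```

Clearing tpos beforehand matters because SET /P leaves the variable untouched when it reads an empty line, so a stale value from an earlier call would otherwise survive.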

John A.

thefeduke

Re: split string into substrings based on delimiter

#28 Post by thefeduke » 23 Nov 2017 01:09

thefeduke wrote:...
This is much more useful!

I altered the output slightly:
. . .
I am posting the script with those changes in its entirety, with a little added functionality. I added an x3 variable to the magic incantation to save the last position found and used that to set the return code (with a value of -1 if not found). Edited 12/23/2017 (correction): the return code is actually the negative of the length of the search string if not found. So, here is StrPos.bat:

Code: Select all

@echo off
:StrPos Delim String
Rem Start http://www.dostips.com/forum/viewtopic.php?p=54609#p54609
setlocal EnableDelayedExpansion

set /A "rc=-1"
set "x=%~2"
if not defined x echo/& echo No string& exit /B %rc% 
set "v=%~1"
if not defined v echo/& echo No delimiter& exit /B %rc% 
call :strLen v k=
set /A i=-k
set "x2="
set "x3="
set "w=!x:%~1=" ^& call :strLen w j ^& set /A i+=j+k ^& set "x2=¡x2¡,¡i¡" ^& set "w=!"                       
set "w=%w:¡=!%"
if defined x2 (
    set "x2=%x2:~1%"
    set /A "rc=%i%"
) 
echo.%x2%
echo Substring "%~1" at positions: %x2%
echo Last position: %i%
echo %~2
exit /B %rc% 


:strLen var len=
setlocal EnableDelayedExpansion
set "str=0!%1!"
set "%2=0"
for /L %%a in (8,-1,0) do (
   set /A "newLen=%2+(1<<%%a)"
   for %%b in (!newLen!) do if "!str:~%%b,1!" neq "" set "%2=%%b"
)
for %%a in (!%2!) do endlocal & set "%2=%%a"
exit /B
This kind of return code seems easy enough to manage, but is it good form? Would two 16-bit numbers be better, something like 0x00000004 for success with position 4 and 0x00010000 for failure?
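
If the two-field idea were adopted, a caller could unpack the value with SET /A. This is a sketch under that assumption, with the status in the high 16 bits and the position in the low 16 bits. The layout and the names rc, status and pos are hypothetical, not something StrPos.bat does.

```batch
@echo off
setlocal

rem Hypothetical success value: status 0, position 4
set /A "rc=0x00000004"
call :unpack

rem Hypothetical failure value: status 1, no position
set /A "rc=0x00010000"
call :unpack

goto :EOF

:unpack
rem High 16 bits hold the status, low 16 bits hold the position
set /A "status=rc >> 16, pos=rc & 0xFFFF"
echo rc=%rc% status=%status% pos=%pos%
exit /B
```

The first call should report status 0 with position 4, the second status 1 with position 0. One cost of this layout is that positions above 65535 no longer fit, although the string lengths handled in this thread are far below that.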

Anyway, as an interesting side effect one could use

Code: Select all

StrPos 5 0123456789
to set ERRORLEVEL to 5.
Edited 11/23/2017: x3 eliminated to use i directly, as suggested by Antonio.
Edited 12/23/2017: Usage comment and return code clarification (correction).
Edit: Usage limitation. Because of the nature of the substitutions, only the first in a set of overlapping source strings is found in the target string. For example,

Code: Select all

StrPos "aa123aa" "bbbaa123aa123aaccc"
will not find the second occurrence of "aa123aa".
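
The limitation comes from the %var:search=replace% substitution itself, which resumes scanning after the end of each match. A small sketch that makes this visible by replacing every match with a # marker (the marker is an arbitrary choice):

```batch
@echo off
setlocal
set "x=bbbaa123aa123aaccc"

rem The first match uses up the two a characters that the second match would need
echo %x:aa123aa=#%
```

Only one # appears in the output, because the trailing "aa" of the first match is consumed and the text that remains no longer contains the search string.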

John A
Last edited by thefeduke on 23 Dec 2017 13:27, edited 2 times in total.

Aacini

Re: split string into substrings based on delimiter

#29 Post by Aacini » 23 Nov 2017 12:21

thefeduke wrote: I altered the output slightly:

I am posting the script with those changes in its entirety, with just a little more added function. I added an x3 variable to the magic incantation to save the last position found and used that to set the return code (with a value of -1 if not found).
You do not need an additional x3 variable to return the last position found; just use the value of the i variable. If you eliminate the x3 variable, the resulting expression will be shorter, so a larger string can be processed.

Antonio

thefeduke

Re: split string into substrings based on delimiter

#30 Post by thefeduke » 23 Nov 2017 19:46

Aacini wrote: If you eliminate the x3 variable, the resulting expression will be shorter, so a larger string can be processed.
Done. Thank you for noticing.

John A.
