How to replace "=","*", ":" in a variable

Moderator: DosItHelp

Raistlan
Posts: 1
Joined: 20 Feb 2014 10:23

Re: How to replace "=","*", ":" in a variable

#31 Post by Raistlan » 20 Feb 2014 10:44

I was investigating this in order to solve a related "=" issue, and that issue gave me a tool to solve this one:

When command line arguments are tokenized into %1, %2, etc., "=", ";" and "," are treated as token delimiters, just like spaces, and are not preserved. If I take a string and change its spaces, commas and semicolons into other, distinctive strings, I can then pass the string to a call and know that an equals sign sat between each pair of tokens that the call receives.

The below example does these replacements:
"," -> "#-comma-#"
";" -> "#-semicolon-#"
"=" -> "#-equal-#"

Code: Select all

@if DEFINED _echo @( echo on ) else @( echo off )

call :replaceSpecialCharacters %*
@echo Special characters substituted out: %__test_Out%
set __test_Out=%__test_Out:#-comma-#=,%
set __test_Out=%__test_Out:#-semicolon-#=;%
set __test_Out=%__test_Out:#-equal-#==%
@echo Special characters substituted back in: %__test_Out%
goto :EOF



:replaceSpecialCharacters
    @REM when arguments are tokenized [%1, %2, etc.], "=", "," and ";" are treated
    @REM as a space and lost. %* isn't modified, so we use it to preserve the
    @REM exact command line arguments when we have it tokenized.
    @REM Returns: __test_Out, which is the passed in argument string with these
    @REM substitutions:
    @REM "," -> "#-comma-#"
    @REM ";" -> "#-semicolon-#"
    @REM "=" -> "#-equal-#"
    set __test_In=%*
    set __test_In=%__test_In: =#-space-#%
    set __test_In=%__test_In:,=#-comma-#%
    set __test_In=%__test_In:;=#-semicolon-#%
    call :replaceEquals %__test_In%
    set __test_Out=%__test_Out:#-space-#= %
goto :EOF



:replaceEquals
    set __test_Out=
    :replaceLoop
        set __test_Arg=%1
        if NOT DEFINED __test_Arg goto :endReplaceLoop

        if DEFINED __test_Out (
            set __test_Out=%__test_Out%#-equal-#%__test_Arg%
        ) else (
            set __test_Out=%__test_Arg%
        )
        shift
        goto :replaceLoop
    :endReplaceLoop
goto :EOF


Code: Select all

c:\>test.cmd 1 foo,bar=baz;fip 3
Special characters substituted out: 1 foo#-comma-#bar#-equal-#baz#-semicolon-#fip 3
Special characters substituted back in: 1 foo,bar=baz;fip 3
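
The tokenization behavior that this approach relies on can be checked in isolation. This is only a minimal sketch: the :show label and the sample argument are made up for illustration and are not part of the script above.

```batch
@echo off
rem Hypothetical demo: the :show label is not part of the script above.
call :show a=b,c;d
exit /b

:show
rem The equals sign, comma and semicolon act as delimiters when the
rem arguments are split into positional parameters, but the "all
rem arguments" parameter still holds the string as it was typed.
echo 1=[%1] 2=[%2] 3=[%3] 4=[%4]
echo all=[%*]
exit /b
```

This should print 1=[a] 2=[b] 3=[c] 4=[d] followed by all=[a=b,c;d], which is why the routine works from %* and only uses the tokenized parameters to locate the "=" positions.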

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: How to replace "=","*", ":" in a variable

#32 Post by dbenham » 05 Apr 2014 08:38

This is a bit of a cheat since I'm using hybrid JScript/batch technology, but I've posted a powerful regex search and replace utility for environment variables called REPLVAR.BAT.

The utility uses a variant of jeb's safe return technique, so it works if delayed expansion is enabled or disabled.

From what I can tell, there is no substitution that cannot be done using this utility, except for the obvious limitation of no NULL (0x00) characters, and line lengths are limited somewhat due to the temporary encoding of % " \n \r and sometimes ! and ^ characters. The temporary encoding is necessary for the safe return technique.

It is both more robust and faster than any pure batch solution that I am aware of.

Here is a trivial demonstration of usage:

Code: Select all

@echo off
setlocal enableDelayedExpansion
set "input=1 + 1 = 3!"
call replVar input output "=" "<>" L
echo(!output!
--OUTPUT--

Code: Select all

1 + 1 <> 3!


Dave Benham

jfl
Posts: 226
Joined: 26 Oct 2012 06:40
Location: Saint Hilaire du Touvet, France

Re: How to replace "=","*", ":" in a variable

#33 Post by jfl » 14 Nov 2016 15:06

Following the work on another post, here are three routines that address this problem, with various advantages and drawbacks.

The first one is based on the observation that even though it's not possible to do %VAR:*=replacement% to replace asterisks, it IS possible to do %VAR:**=% to remove everything up to and including the first asterisk. Then, using strlen on the tail, it is possible to process the string and replace all asterisks. It is generalized to replace any chosen character, although replacing * is its main purpose.

Note that in this routine, as in all the following ones, the replacement string can contain the replaced character. This allows doing things like escaping or quoting a tricky character.

Code: Select all

:# Replace characters
:# Advantage: Works with CHAR ':' '*'
:# Advantage: The input string can contain LF and '"' characters
:# Advantage: Faster than the third routine, because it calls :strlen only once per loop.
:# Drawback: Does not work with CHAR '='.
:ReplaceChars STRVAR CHAR REPLACEMENT RETVAR
setlocal EnableDelayedExpansion
set "STRING=!%~1!"
set "REPL=%~3"
set "RESULT="
if defined STRING (
  call :strlen STRING SLEN      &:# SLEN = Full string length
:ReplaceChars.again
  set "TAIL=!STRING:*%~2=!"     &:# Split STRING into HEAD CHAR TAIL
  call :strlen TAIL TLEN        &:# TLEN = Tail length
  if !TLEN!==!SLEN! (   :# No more C chars
    set "RESULT=!RESULT!!TAIL!"
  ) else (              :# Reached one char
    set /a "HLEN=SLEN-TLEN-1"   &:# HLEN = Head length
    for %%h in (!HLEN!) do set "RESULT=!RESULT!!STRING:~0,%%h!!REPL!"
    if defined TAIL (   :# Then there might be more chars in the tail
      set "STRING=!TAIL!"       &:# Repeat the same operation for the tail.
      set "SLEN=!TLEN!"
      goto :ReplaceChars.again
    )
  )
)
endlocal & set "%~4=%RESULT%"
exit /b 0
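
The head/tail split that :ReplaceChars builds on can be tried on its own. A minimal sketch, with a sample value chosen purely for illustration:

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sample value chosen for illustration only.
set "STRING=one*two*three"
rem A leading asterisk in the search string means "everything up to and
rem including the first match", so searching for two asterisks strips the
rem head and the first literal asterisk.
set "TAIL=!STRING:**=!"
echo TAIL=[!TAIL!]
```

This should print TAIL=[two*three]. The routine then derives the head length from the difference between the full length and the tail length, and repeats the same split on the tail.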


The next two use the technique published by npocmaka_ in the above post.

This second routine may be useful when you know that you have a single tricky character to replace.

Code: Select all

:# Replace delimiter sets.
:# Advantage: Simple and fast; Works with CHAR '=' ':' '*'
:# Drawback: Multiple consecutive CHARs are replaced by a single REPL string.
:# Drawback: Does not work on strings containing LF or '!' characters.
:ReplaceDelimSets STRVAR CHAR REPLACEMENT RETVAR
setlocal EnableDelayedExpansion
set "STRING=[!%~1!]"    &:# Make sure the string does not begin or end with delims
set "REPL=%~3"
set "RESULT="
:ReplaceDelimSets.loop
for /f "delims=%~2 tokens=1*" %%s in ("!STRING!") do (
  set "RESULT=!RESULT!%%s"
  set "TAIL=%%t"
  if defined TAIL (      :# Then there might be more chars to replace in the tail
    set "RESULT=!RESULT!!REPL!"
    set "STRING=!TAIL!" &:# Repeat the same operation for the tail.
    goto :ReplaceDelimSets.loop
  )
)
endlocal & set "%~4=%RESULT:~1,-1%"
exit /b


This third one is only a slightly simplified (and slightly faster) adaptation of npocmaka_'s routine.

Code: Select all

:# Replace delimiters.
:# Advantage: Works with CHAR '=' ':' '*'
:# Drawback: Does not work on strings containing LF or '"' characters.
:ReplaceDelims STRVAR CHAR REPLACEMENT RETVAR
setlocal DisableDelayedExpansion
call set "STRING=[%%%~1%%]"     &:# Make sure the string does not begin or end with delims
set "REPL=%~3"
set "RESULT="
call :strlen STRING SLEN        &:# SLEN = Full string length
:ReplaceDelims.loop
for /f "delims=%~2 tokens=1*" %%s in ("%STRING%") do (
  set "HEAD=%%s"
  set "TAIL=%%t"
)
set "RESULT=%RESULT%%HEAD%"
call :strlen HEAD HLEN  &:# HLEN = Head length
call :strlen TAIL TLEN  &:# TLEN = Tail length
set /a "N=SLEN-HLEN-TLEN"       &:# Number of delimiters in between
setlocal EnableDelayedExpansion
for /l %%n in (1,1,%N%) do set "RESULT=!RESULT!!REPL!"
endlocal & set "RESULT=%RESULT%"
if defined TAIL (        :# Then there might be more chars to replace in the tail
  set "STRING=%TAIL%"   &:# Repeat the same operation for the tail.
  set "SLEN=%TLEN%"
  goto :ReplaceDelims.loop
)
endlocal & set "%~4=%RESULT:~1,-1%"
exit /b


Finally, the first and third routines use :strlen. Here's my optimized version. It's 5% faster than the :strlen currently published in the forum library, because it avoids the initial string copy.

Code: Select all

:strlen STRVAR RETVAR
setlocal EnableDelayedExpansion
set "len=0"
if defined %~1 for /l %%b in (12,-1,0) do (
  set /a "i=(len|(1<<%%b))-1"
  for %%i in (!i!) do if not "!%~1:~%%i!"=="" set /a "len=%%i+1"
)
endlocal & if "%~2" neq "" set "%~2=%len%"
exit /b
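
For reference, a call to :strlen might look like the sketch below. The variable names text and n are assumptions for the example; the routine itself is copied unchanged from above so that the script runs on its own.

```batch
@echo off
setlocal
rem The names "text" and "n" are made up for this example.
set "text=Hello, world"
call :strlen text n
echo Length: %n%
exit /b

:strlen STRVAR RETVAR
setlocal EnableDelayedExpansion
set "len=0"
rem Binary search over bits 12 down to 0: a bit is kept in len whenever a
rem character still exists at the offset that the candidate length implies.
if defined %~1 for /l %%b in (12,-1,0) do (
  set /a "i=(len|(1<<%%b))-1"
  for %%i in (!i!) do if not "!%~1:~%%i!"=="" set /a "len=%%i+1"
)
endlocal & if "%~2" neq "" set "%~2=%len%"
exit /b
```

This should print Length: 12. With 13 bits tested, the routine can report lengths up to 8191 characters.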

batchfan
Posts: 1
Joined: 20 Nov 2016 04:20

Re: How to replace "=","*", ":" in a variable

#34 Post by batchfan » 20 Nov 2016 10:05

Hi,

The second and third functions don't work with strings containing a semicolon ';', which isn't mentioned in your function comments.

I also had to quote characters when calling these functions for it to work.

Code: Select all

call :ReplaceDelimSets input_var "=" "," outout_var
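
The need for quoting can be seen with a small test. In this sketch the :show label is hypothetical and simply prints its first two parameters with the surrounding quotes removed:

```batch
@echo off
rem Hypothetical demo: the :show label only prints its first two parameters.
call :show = ,
call :show "=" ","
exit /b

:show
echo 1=[%~1] 2=[%~2]
exit /b
```

The first call should print 1=[] 2=[], because the unquoted "=" and "," are consumed as argument delimiters. The second should print 1=[=] 2=[,], since the quotes protect the characters and the tilde modifier strips the quotes again inside the routine.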

jfl
Posts: 226
Joined: 26 Oct 2012 06:40
Location: Saint Hilaire du Touvet, France

Re: How to replace "=","*", ":" in a variable

#35 Post by jfl » 05 Dec 2021 12:08

5 years later, working on a routine for URL-encoding strings, I again faced the problem of replacing = characters in strings.
Which brought me back to this thread.

I reused the brilliant routine posted long ago by @amel5 in comment #5, but I quickly noticed several issues with that code:
- It lost the end of strings that did not end with an = character
- It failed miserably if other $_XXX variables existed beforehand
- The known issue when ! or LF characters are present
- It's limited to 99 = signs

The first issue is fixed by appending !$b! to the !$v! result, and then trimming the trailing | that was appended to the input at the beginning.
It took me a while, and several complex intermediate solutions, before understanding that the fix was that simple.
(Appending one trailing | character like this is necessary to avoid error messages when the string ends with an =.)
I'm surprised that this severe bug remained unnoticed by the author, or anybody who read that thread since then!

The second issue is straightforward to fix, with a simple loop in the beginning to delete all such variables.

I've not attempted to fix the third issue with the ! or LF characters, because there are no ! in the strings I need to URL-encode now.
The obvious fix would be to encode all ^ " ! LF beforehand to some code sequence TBD, then decode them in the end.
Performance would suffer though, another reason for _not_ doing it now... :-)
... Unless someone knows a simple and fast trick for detecting the presence of ! characters in a string?

The 99 = limitation was probably not something I would encounter in real life, but I don't like arbitrary limitations, and this looked simple!
I first tried to fix it by increasing the number of loops to 8192 (The max Batch string size?), and putting the exit /b _inside_ the inner for loop.
But surprisingly, even for a string with a single =, and hence only two loops, the routine took almost twice as long. (251ms vs. 127ms).
I don't understand why, but this is clearly not the right way to go.
Then I tried removing the outside for /L loop altogether, and replacing it with a conditional goto back to the beginning. The %%i counter is replaced by a $i variable. This worked, and the execution time for the same test was back down to 157ms... Better than the first try, but still not as good as the original code.
And with a large number of = signs in the input string, the original code advantage kept growing.
OK, I'll live with the 99 = limitation. :-(

Then there were a few things I think were needlessly complex in the initial version:
  • There was a pair of (parentheses) around the whole routine except the final return. Why? Things work just as well without it.
  • Also the code to return the result across the endlocal barrier looked overly complex.
    As far as I can tell, the much simpler version I used is strictly equivalent to the initial one.
    I suspect that the initial version was the remainder of an abandoned attempt to fix the 99 = limitation exactly as I did at first. (In which case it is indeed necessary to use the "for %%v in '!variable!' ..." trick.)
Finally one small improvement:
I added dbenham's = presence test, from comment #26, during the initialization phase. In the absence of = signs, it returns immediately.
This significantly lowers the routine duration in the absence of = signs, from 89ms to 46ms.
If = signs are present, the increase is barely measurable. (Less than 1ms)
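
As far as I understand it, the presence test works as sketched below. The variable s, the sample values and the :check label are made up for illustration, and the sketch assumes strings without quotes or ! characters.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sample values and the :check label are made up for illustration.
set "s=abc"
call :check
set "s=a=b"
call :check
exit /b

:check
rem The percent expansion pastes the value of s into the delayed
rem substitution twice. Without an equals sign in s, this replaces s by
rem itself and the comparison holds. With an equals sign in s, the first
rem pasted equals sign ends the search string early, so the result differs.
if "!s:%s%=%s%!" equ "!s!" (echo [!s!] has no equals sign) else (echo [!s!] contains an equals sign)
exit /b
```

This should print [abc] has no equals sign, then [a=b] contains an equals sign. The test has to sit on a line that is parsed after s has been set, because the percent expansion happens when the line is read.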

With all that, the = replacement routine becomes:

Code: Select all

:# Replace all = characters in a string
:# Limitations:
:# - Max 99 = characters
:# - The string must not contain ! or LF characters
:ReplaceEquals %1=STRING_VARNAME %2=REPLACEMENT_VARNAME 
if not defined %1 exit /b &:# Avoid issues with empty strings
setlocal EnableDelayedExpansion
for /F "delims==" %%v in ('set $_ 2^>NUL') do set "%%v=" &:# Clear existing $_XXX variables
:# $_=input value  $f=Termination flag  $v=output value  $r=replacement value
set "$_=!%~1!|" & set "$f=1" & set "$v=" & set "$r=!%~2!"
if /i "!$_:%$_%=%$_%!" equ "!$_!" endlocal & exit /b 0	&:# No = sign in $_. Return now to save time
for /L %%i in (0,1,99) do if defined $f (
  for /F "delims==" %%a in ('set $_') do (
    set "$a=%%a" & set "$b=!%%a!" &:# $a=$_variable name  $b=its value=all that followed the first =
    set "%%a=" & set "$_!$b!" 2>NUL || set "$f="
    if %%i gtr 0 set "$v=!$v!!$a:~2!!$r!"
  )
)
set "$v=!$v!!$b:~0,-1!" &:# The complete result, with the tail | removed in the end
endlocal & set "%~1=%$v%" & exit /b
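
A call might look like the following sketch. The variable names text and repl and the _EQ_ replacement are assumptions for the example; the routine is copied unchanged from above so that the script runs on its own. Judging from its last line, the result is written back into the first variable.

```batch
@echo off
setlocal
rem The names "text" and "repl" and the _EQ_ marker are made up for this example.
set "text=a=b=c"
set "repl=_EQ_"
call :ReplaceEquals text repl
echo %text%
exit /b

:ReplaceEquals %1=STRING_VARNAME %2=REPLACEMENT_VARNAME
if not defined %1 exit /b &:# Avoid issues with empty strings
setlocal EnableDelayedExpansion
for /F "delims==" %%v in ('set $_ 2^>NUL') do set "%%v=" &:# Clear existing $_XXX variables
:# $_=input value  $f=Termination flag  $v=output value  $r=replacement value
set "$_=!%~1!|" & set "$f=1" & set "$v=" & set "$r=!%~2!"
if /i "!$_:%$_%=%$_%!" equ "!$_!" endlocal & exit /b 0 &:# No = sign in $_. Return now to save time
for /L %%i in (0,1,99) do if defined $f (
  for /F "delims==" %%a in ('set $_') do (
    set "$a=%%a" & set "$b=!%%a!" &:# $a=$_variable name  $b=its value=all that followed the first =
    set "%%a=" & set "$_!$b!" 2>NUL || set "$f="
    if %%i gtr 0 set "$v=!$v!!$a:~2!!$r!"
  )
)
set "$v=!$v!!$b:~0,-1!" &:# The complete result, with the tail | removed in the end
endlocal & set "%~1=%$v%" & exit /b
```

With these inputs the script should print a_EQ_b_EQ_c. Both arguments are variable names, not values, so the replacement text has to be stored in a variable first.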
