InStr, ReplStr: Case sensitive search and replace routines

#1 Post by dbenham » 23 Jun 2011 23:44

Windows batch processing has fast and convenient string search and replace using %var:search=replace%. But it has significant limitations:
  • The search is always case insensitive. There is no native way to perform a case sensitive search and replace.
  • There is no good way to replace =, * or :
  • We can't replace ! if delayed expansion is enabled
  • We can't replace % if delayed expansion is disabled
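
The first limitation is easy to see in a short test. In this minimal sketch the variable name and sample text are made up for illustration:

```batch
@echo off
setlocal
set "str=Hello hello HELLO"
rem The native syntax ignores case, so all three words are replaced.
echo %str:hello=X%
```

This prints X X X, even though only the middle word matches the search text exactly.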

There is also no native way to identify the location of a substring within a string. Efficient methods for doing this have been posted on this site, but they rely on search and replace, so they share the limitations listed above.

I have developed both macro and function libraries that perform both of these tasks: locating a substring and replacing it. The implementation is far from elegant, as it relies on brute force, character by character parsing of the string. But the routines are extremely flexible and powerful. There are no limitations on the characters that can be searched for and/or replaced. The search is case sensitive by default, or it can be case insensitive. The routines can be called with delayed expansion enabled or disabled.

The key routine is InStr. It can find the Nth occurrence of a substring within the target, counting from the beginning or from the end, and return its position (0 being the first character). Or it can find all occurrences and return the positions as a space delimited string.
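
The brute force scan can be illustrated in a few lines. This is only a simplified sketch, not the library code: it finds the first match only, the lengths are hard-coded instead of being computed, and all names and values are chosen for illustration.

```batch
@echo off
setlocal enableDelayedExpansion
set "str=abcABCabc"
set "search=ABC"
rem Lengths are hard-coded to keep the sketch short.
set /a "strLen=9, searchLen=3, last=strLen-searchLen"
set "found="
for /l %%o in (0,1,%last%) do if not defined found (
  rem IF without /I compares case sensitively.
  if "!str:~%%o,%searchLen%!"=="!search!" set "found=%%o"
)
echo First case sensitive match at offset !found!
```

The loop extracts a candidate substring at every offset and compares it with the search text, which is why any character can take part in the comparison. Here it reports offset 3, skipping the lower case abc at offset 0.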

Once we have InStr, it is a simple matter to write a ReplStr routine that uses the position information along with standard substring operations to perform the replacement. I can envision InStr becoming the basis of many other useful functions/macros.
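
The replace step reduces to stitching substrings together around a known position. In this minimal sketch the offset and length are assumed to have come from a prior search, and the sample values are invented for illustration:

```batch
@echo off
setlocal enableDelayedExpansion
set "str=a=b"
set "repl=:"
rem Assume a search has already reported "=" (length 1) at offset 1.
set /a "pos=1, len=1, after=pos+len"
rem Result = text before the match + replacement + text after the match.
set "result=!str:~0,%pos%!!repl!!str:~%after%!"
echo !result!
```

This prints a:b. Because no %var:search=replace% expression is involved, replacing a character such as = poses no problem.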

The macro version is contained in the macroLib_SearchStr.bat file. It is self documenting. It requires the following libraries, which can be found in my Batch "macros" with arguments - Major Update post:
  • macroLib_Base.bat
  • macroLib_Return.bat
  • macroLib_String.bat
  • callMacro.bat (only needed if you don't want to use the %macro_call% syntax)

The embedded help is difficult to read within the source code. It is best to load the library (run the batch file) and then use my margs and mhelp DOSKEY macros to read the help.

For those who are uncomfortable using macros, I have also included a pure function implementation at the end of this post. The function version has no dependencies, but it is older and, I believe, less robust. It is also only partially documented. I can't remember all of the differences between the macros and the older functions.

macroLib_SearchStr.bat

Code:

@echo off
:: File = macroLib_SearchStr.bat
:: Dependencies: macroLib_Base.bat, macroLib_Return.bat, macroLib_String.bat
:: This batch file will fail if called while delayed expansion is enabled.
::
:: This library defines macros that involve searching for a string within
:: a string.
::
:: The library is designed to be installed in a directory in your PATH.
:: Any batch file that requires it can include it by simply placing the
:: following line of code at the top before any SETLOCAL:
::
::   IF NOT DEFINED macro\load.MacroLib_SearchStr CALL macroLib_SearchStr
::
:: In this way the library becomes resident in your command shell environment
:: where it is available to any batch file that may need it. The IF condition
:: prevents unnecessary reloads of the same library.
::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: Conditionally load dependencies (normally residing somewhere in the PATH)
if not defined macro\load.macroLib_String call macroLib_String

set macro\args.InStr=  CaseOption  TargetVar  SearchVar  OccurenceVal  [RtnVar]
set macro\help.InStr=  CaseOption  TargetVar  SearchVar  OccurenceVal  [RtnVar]%\n%
  %\n%
  Computes the position of the Nth occurrence or all occurrences of a search%\n%
  string within a target string.%\n%
  %\n%
  CaseOption must have one of the following two values:%\n%
    I (or i) = case insensitive%\n%
    S (or s) = case sensitive%\n%
  %\n%
  The target string is contained within variable TargetVar.%\n%
  %\n%
  The search string is contained within variable SearchVar.%\n%
  %\n%
  The Nth occurrence is specified by the OccurenceVal. OccurenceVal may%\n%
  be specified using any expression supported by SET /A. A positive%\n%
  OccurenceVal indicates the search starts from the beginning. A negative%\n%
  OccurenceVal indicates the search starts from the end. An OccurenceVal%\n%
  of 0 directs instr to return all matches as a space delimited string%\n%
  of positions in increasing order.%\n%
  %\n%
  The resulting position(s) is always reported relative to the beginning%\n%
  of the targetStr with 0 being the first character.%\n%
  %\n%
  The result is an empty string if an error occurs%\n%
  %\n%
  The result is returned in variable RtnVar%\n%
  or the result is echoed if RtnVar is not specified%\n%
  %\n%
  The ERRORLEVEL is set as follows:%\n%
    0 - Success%\n%
    1 - The Nth occurrence of SearchStr was not found%\n%
    2 - A required argument was missing or invalid%xLF%
%macro_BeginDef%
set macro.InStr=do (%\n%
  setlocal enableDelayedExpansion%\n%
  set "macro.instr.err="%\n%
  if "%%~d"=="" (set macro.instr.err=2) else (%\n%
    set "macro.instr.targetStr=^!%%~b^!"%\n%
    set "macro.instr.searchStr=^!%%~c^!"%\n%
    if not defined macro.instr.targetStr set macro.instr.err=2%\n%
    if not defined macro.instr.searchStr set macro.instr.err=2%\n%
    if /i "%%~a" neq "I" if /i "%%~a" neq "S" set macro.instr.err=2%\n%
    set /a "occ=(%%~d)" 2^^^>nul%\n%
    if errorlevel 1 set macro.instr.err=2%\n%
  )%\n%
  if not defined macro.instr.err (%\n%
    !macro_call! ("macro.instr.targetStr targetLen") !macro.StrLen!%\n%
    !macro_call! ("macro.instr.searchStr searchLen") !macro.StrLen!%\n%
    if ^^^!searchLen^^^! gtr ^^^!targetLen^^^! set macro.instr.err=1%\n%
  )%\n%
  if not defined macro.instr.err (%\n%
    if ^^^!occ^^^! geq 0 (%\n%
      set /a "beg=0, step=1, end=targetLen-searchLen"%\n%
    ) else (%\n%
      set /a "beg=targetLen-searchLen, step=-1, end=0"%\n%
    )%\n%
    if ^^^!occ^^^! neq 0 (set /a "occStep=step") else set /a "occStep=0"%\n%
    set "off="%\n%
    set "done=0"%\n%
    set /a skip=0%\n%
    for %%l in (^^^!searchLen^^^!) do for /l %%o in (^^^!beg^^^!,^^^!step^^^!,^^^!end^^^!) do if ^^^!done^^^! equ 0 (%\n%
      if ^^^!skip^^^! equ 0 (%\n%
        set "match="%\n%
        if /i %%~a==S if "^!macro.instr.targetStr:~%%o,%%l^!"=="^!macro.instr.searchStr^!" set match=1%\n%
        if /i %%~a==I if /i "^!macro.instr.targetStr:~%%o,%%l^!"=="^!macro.instr.searchStr^!" set match=1%\n%
        if defined match (%\n%
          set /a occ-=occStep%\n%
          if ^^^!occ^^^! equ 0 (%\n%
            set "off=^!off^! %%o"%\n%
            set /a done=occStep%\n%
          )%\n%
          set /a skip=searchLen-1%\n%
        )%\n%
      ) else set /a skip-=1%\n%
    )%\n%
  )%\n%
  if not defined macro.instr.err if not defined off set macro.instr.err=1%\n%
  if defined macro.instr.err (set rtn=) else (%\n%
    set "rtn=^!off:~1^!"%\n%
    set macro.instr.err=0%\n%
  )%\n%
  !macro_call! ("^!macro.instr.err^!") !macro.SetErr!%\n%
  for /f "delims=" %%v in (""^^^!rtn^^^!"") do (%\n%
    endlocal%\n%
    if "%%~e" neq "" (set "%%~e=%%~v") else echo(%%~v%\n%
  )%\n%
)
%macro_Call% ("macro.instr") %macro.EndDef%
%macro_EndAnyRtn%

set macro\args.ReplStr=  CaseOption  TargetVar  SearchVar  ReplaceVar  OccurenceVal  [RtnVar]
set macro\help.ReplStr=  CaseOption  TargetVar  SearchVar  ReplaceVar  OccurenceVal  [RtnVar]%\n%
  %\n%
  Replaces the Nth occurrence or all occurrences of a search string found%\n%
  within a target string with a replacement string.%\n%
  %\n%
  The return value may contain any combination of characters supported by DOS%\n%
  except 0x0A ^<Line Feed^> or 0x0D ^<Carriage Return^>.%\n%
  %\n%
  CaseOption must have one of the following two values:%\n%
    I (or i) = case insensitive search%\n%
    S (or s) = case sensitive search%\n%
  %\n%
  The target string is contained within variable TargetVar.%\n%
  %\n%
  The search string is contained within variable SearchVar.%\n%
  %\n%
  The replacement string is contained within variable ReplaceVar.%\n%
  An empty replacement string may be specified by an undefined variable%\n%
  or by "".%\n%
  %\n%
  The Nth occurrence is specified by the OccurenceVal. OccurenceVal may%\n%
  be specified using any expression supported by SET /A. A positive%\n%
  OccurenceVal indicates the search starts from the beginning. A negative%\n%
  OccurenceVal indicates the search starts from the end. An OccurenceVal%\n%
  of 0 directs ReplStr to replace all occurrences.%\n%
  %\n%
  The result is returned in variable RtnVar%\n%
  or the result is echoed if RtnVar is not specified%\n%
  %\n%
  The ERRORLEVEL is set as follows:%\n%
    0 - Success%\n%
    2 - A required argument was missing or invalid%xLF%
%macro_BeginDef%
set macro.ReplStr= do (%\n%
  !macro_InitRtn!%\n%
  setlocal enableDelayedExpansion%\n%
  if "%%~e"=="" (set macro.replstr.err=2) else (%\n%
    set "macro.replstr.str=^!%%~b^!"%\n%
    !macro_Call! ("%%c macro.replstr.searchLen") !macro.StrLen!%\n%
    !macro_Call! ("%%a %%b %%c %%e macro.replstr.found") !macro.InStr!%\n%
    set "macro.replStr.err=^!errorlevel^!"%\n%
  )%\n%
  if ^^^!macro.replstr.err^^^! lss 2 (%\n%
    set macro.replstr.err=0%\n%
    set "repl=^!%%~d^!"%\n%
    set "rtnVar=%%~f"%\n%
    set "rtn="%\n%
    set beg=0%\n%
    for %%f in (^^^!macro.replStr.found^^^!) do (%\n%
      set /a len=%%f-beg%\n%
      for /f "tokens=1,2" %%a in ("^!beg^! ^!len^!") do set "rtn=^!rtn^!^!macro.replstr.str:~%%a,%%b^!^!repl^!"%\n%
      set /a beg=%%f+macro.replstr.searchLen%\n%
    )%\n%
    for %%a in (^^^!beg^^^!) do set "rtn=^!rtn^!^!macro.replstr.str:~%%a^!"%\n%
  ) else (%\n%
    set "rtn="%\n%
    set "rtnVar="%\n%
  )%\n%
  !macro_Call! ("macro.replstr.err 1 rtn ^!rtnVar^!") !macro.Rtn1!%\n%
)
%macro_Call% ("macro.ReplStr") %macro.EndDef%
%macro_EndAnyRtn%

set macro\args.AnyReplStr=  CaseOption  TargetVar  SearchVar  ReplaceVar  OccurenceVal  [RtnVar]
set macro\help.AnyReplStr=  CaseOption  TargetVar  SearchVar  ReplaceVar  OccurenceVal  [RtnVar]%\n%
  %\n%
  Replaces the Nth occurrence or all occurrences of a search string found%\n%
  within a target string with a replacement string.%\n%
  %\n%
  The return value may contain any combination of characters supported by DOS%\n%
  including 0x0A ^<Line Feed^> and 0x0D ^<Carriage Return^>.%\n%
  %\n%
  %%macro_EndAnyRtn%% must follow a call to this macro, and it cannot share%\n%
  a code block with the call.%\n%
  %\n%
  CaseOption must have one of the following two values:%\n%
    I (or i) = case insensitive search%\n%
    S (or s) = case sensitive search%\n%
  %\n%
  The target string is contained within variable TargetVar.%\n%
  %\n%
  The search string is contained within variable SearchVar.%\n%
  %\n%
  The replacement string is contained within variable ReplaceVar.%\n%
  An empty replacement string may be specified by an undefined variable%\n%
  or by "".%\n%
  %\n%
  The Nth occurrence is specified by the OccurenceVal. OccurenceVal may%\n%
  be specified using any expression supported by SET /A. A positive%\n%
  OccurenceVal indicates the search starts from the beginning. A negative%\n%
  OccurenceVal indicates the search starts from the end. An OccurenceVal%\n%
  of 0 directs AnyReplStr to replace all occurrences.%\n%
  %\n%
  The result is returned in variable RtnVar%\n%
  or the result is echoed if RtnVar is not specified%\n%
  %\n%
  The ERRORLEVEL is set as follows:%\n%
    0 - Success%\n%
    2 - A required argument was missing or invalid%xLF%
%macro_BeginDef%
set macro.AnyReplStr= do (%\n%
  !macro_InitRtn!%\n%
  setlocal enableDelayedExpansion%\n%
  if "%%~e"=="" (set macro.AnyReplStr.err=2) else (%\n%
    set "macro.AnyReplStr.str=^!%%~b^!"%\n%
    !macro_Call! ("%%c macro.AnyReplStr.searchLen") !macro.StrLen!%\n%
    !macro_Call! ("%%a %%b %%c %%e macro.AnyReplStr.found") !macro.InStr!%\n%
    set "macro.AnyReplStr.err=^!errorlevel^!"%\n%
  )%\n%
  if ^^^!macro.AnyReplStr.err^^^! lss 2 (%\n%
    set macro.AnyReplStr.err=0%\n%
    set "repl=^!%%~d^!"%\n%
    set "rtnVar=%%~f"%\n%
    set "rtn="%\n%
    set beg=0%\n%
    for %%f in (^^^!macro.AnyReplStr.found^^^!) do (%\n%
      set /a len=%%f-beg%\n%
      for /f "tokens=1,2" %%a in ("^!beg^! ^!len^!") do set "rtn=^!rtn^!^!macro.AnyReplStr.str:~%%a,%%b^!^!repl^!"%\n%
      set /a beg=%%f+macro.AnyReplStr.searchLen%\n%
    )%\n%
    for %%a in (^^^!beg^^^!) do set "rtn=^!rtn^!^!macro.AnyReplStr.str:~%%a^!"%\n%
  ) else (%\n%
    set "rtn="%\n%
    set "rtnVar="%\n%
  )%\n%
  !macro_Call! ("macro.AnyReplStr.err 1 rtn ^!rtnVar^!") !macro.AnyRtn1!%\n%
)
%macro_Call% ("macro.AnyReplStr") %macro.EndDef%
%macro_EndAnyRtn%

::----------------------------------------------------
:: Mark that this library has been loaded.

set macro\load.%~n0=1



Older function versions of the routines:

Code:

@echo off
call :%*
exit /b

:replStr [/I] TargetVar SearchVar ReplaceVar OccuranceVal [RtnVar]
  %InitFcnRtn%
  setlocal enableDelayedExpansion
  if /i "%~1"=="/I" (
    set "replStr.opt=/I"
    shift /1
  ) else set "replStr.opt="
  call :instr %replStr.opt% %1 %2 %4 replStr.found
  set replStr.err=%errorlevel%
  if %replStr.err% lss 2 (
    set replStr.err=0
    set "replStr.str=!%~1!"
    call :strLen %2 replStr.searchLen
    set "repl=!%~3!"
    set "rtnVar=%~5"
    set "rtn="
    set beg=0
    for %%f in (!replStr.found!) do (
      set /a len=%%f-beg
      for /f "tokens=1,2" %%a in ("!beg! !len!") do set "rtn=!rtn!!replStr.str:~%%a,%%b!!repl!"
      set /a beg=%%f+replStr.searchLen
    )
    for %%a in (!beg!) do set "rtn=!rtn!!replStr.str:~%%a!"
  ) else (
    set "rtn="
    set "rtnVar="
  )
  ( endlocal
    if "%~5" neq "" (set "%~5=%rtn%") else echo(%rtn%
    exit /b %replStr.err%
  )
exit /b

:instr [/I] TargetVar SearchVar OccuranceVal [RtnVar]
::
:: Computes the position of the Nth occurrence of a search string within
:: a target string.
::
:: The case insensitive /I option directs instr to perform a case
:: insensitive search.
::
:: The target string is contained within variable TargetVar.
::
:: The search string is contained within variable SearchVar.
::
:: The Nth occurrence is specified by the OccuranceVal. OccuranceVal may
:: be specified using any expression supported by SET /A. A positive
:: OccuranceVal indicates the search starts from the beginning. A negative
:: OccuranceVal indicates the search starts from the end. An OccuranceVal
:: of 0 directs instr to return all matches as a space delimited string
:: of positions in increasing order.
::
:: The resulting position(s) is always reported relative to the beginning
:: of the targetStr with 0 being the first character.
::
:: The result is an empty string if an error occurs
::
:: The result is returned in variable RtnVar
:: or the result is echoed if RtnVar is not specified
::
:: The ERRORLEVEL is set as follows:
::   0 - Success
::   1 - The Nth occurrence of SearchStr was not found
::   2 - A required argument was missing or invalid
::
  setlocal enableDelayedExpansion
  set instr.err=
  if /i "%~1"=="/I" (
    set "instr.opt=/I"
    shift /1
  ) else set "instr.opt="
  if "%~3"=="" (set instr.err=2) else (
    set "instr.targetStr=!%~1!"
    set "instr.searchStr=!%~2!"
    if not defined instr.targetStr set instr.err=2
    if not defined instr.searchStr set instr.err=2
    set /a "occ=(%~3)" 2>nul
    if errorlevel 1 set instr.err=2
  )
  if not defined instr.err (
    call :strlen instr.targetStr targetLen
    call :strlen instr.searchStr searchLen
    if !searchLen! gtr !targetLen! set instr.err=1
  )
  if not defined instr.err (
    if !occ! geq 0 (
      set /a "beg=0, step=1, end=targetLen-searchLen"
    ) else (
      set /a "beg=targetLen-searchLen, step=-1, end=0"
    )
    if !occ! neq 0 (set /a "occStep=step") else set /a "occStep=0"
    set "off="
    set "done=0"
    set /a skip=0
    for %%l in (!searchLen!) do for /l %%o in (!beg!,!step!,!end!) do if !done! equ 0 (
      if !skip! equ 0 (
        if %instr.opt% "!instr.targetStr:~%%o,%%l!"=="!instr.searchStr!" (
          set /a occ-=occStep
          if !occ! equ 0 (
            set off=!off! %%o
            set /a done=occStep
          )
          set /a skip=searchLen-1
        )
      ) else set /a skip-=1
    )
  )
  if not defined instr.err if not defined off set instr.err=1
  if defined instr.err (set rtn=) else set "rtn=!off:~1!"& set instr.err=0
  (endlocal & rem -- return values
    if "%~4" neq "" (set %~4=%rtn%) else (echo:%rtn%)
    exit /b %instr.err%
  )
exit /b

:strLen string len -- returns the length of a string
::                 -- string [in]  - variable name containing the string being measured for length
::                 -- len    [out] - variable to be used to return the string length
:: Many thanks to 'sowgtsoi', but also 'jeb' and 'amel27' dostips forum users helped making this short and efficient
:$created 20081122 :$changed 20101116 :$categories StringOperation
:$source http://www.dostips.com
(   SETLOCAL ENABLEDELAYEDEXPANSION
    set "str=A!%~1!"&rem keep the A up front to ensure we get the length and not the upper bound
                     rem it also avoids trouble in case of empty string
    set "len=0"
    for /L %%A in (12,-1,0) do (
        set /a "len|=1<<%%A"
        for %%B in (!len!) do if "!str:~%%B,1!"=="" set /a "len&=~1<<%%A"
    )
)
( ENDLOCAL & REM RETURN VALUES
    IF "%~2" NEQ "" SET /a %~2=%len%
)
EXIT /b
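
Because the function listing begins with call :%*, it can be driven from another script by passing the routine name followed by its arguments. The following is a usage sketch only: it assumes the listing above was saved as strFuncs.bat (a hypothetical file name) in the current directory or somewhere in the PATH, and the sample strings are invented for illustration.

```batch
@echo off
setlocal
set "str=One fish, two Fish, red fish"
set "search=fish"
set "repl=cat"

rem Case sensitive, OccuranceVal 0 = report every match. Expected: 4 24
call strFuncs.bat instr str search 0 pos
echo %pos%

rem Case insensitive with /I. Expected: 4 14 24
call strFuncs.bat instr /I str search 0 pos
echo %pos%

rem Replace only the last case sensitive match (OccuranceVal -1).
call strFuncs.bat replStr str search repl -1 result
echo %result%
```

Note that the routines take variable names, not literal strings, for the target, search and replacement text. That is what lets them handle characters that would otherwise break command line argument parsing.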


Dave Benham
