JREPL.BAT v8.4 - regex text processor with support for text highlighting and alternate character sets

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

dbenham
Expert
Posts: 2270
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: JREPL.BAT v7.14 - regex text processor now with Unicode and XRegExp support

#391 Post by dbenham » 15 Oct 2018 19:13

Michael.Uray wrote:
15 Oct 2018 14:33
Hmm, I had to add that again in the .cmd file.
"find" and "repl" are environment variables, so I have to use them with "%" at the beginning and at the end, right?
I also have to put it in quotes because the string (^\d+ 0 obj\s*$|^/Name /Paragraph\s*$|^endobj\s*$|^.*$) contains spaces, right?

No, the /V option instructs JREPL to read the strings from the specified environment variables. You must not expand them with percents when using /V.

Re: JREPL.BAT v7.14 - regex text processor now with Unicode and XRegExp support

#392 Post by dbenham » 16 Oct 2018 04:58

Damn, we are sunk :(

Your input contains null bytes (0x00), and JREPL requires the /M option whenever nulls are present. But your actual file is too large for /M :evil:

Code: Select all

C:\test>jrepl /?/m

      /M  - Multi-line mode. The entire input is read and processed in one
            pass instead of line by line, thus enabling search for \n. This
            also enables preservation of the original line terminators.
            The /M option is incompatible with the /A option unless the /S
            option is also present.

            Note: If working with binary data containing NULL bytes,
                  then the /M option must be used.

The limitation is a result of how JScript reads text data and processes strings.

To confirm this is the problem, I used JREPL with the /M option to substitute \xFF for every \x00. My original JREPL solution was able to process the result.

So there is no way to use JREPL in your situation.

I suggest trying PowerShell. I think you should be able to use your original regex strategy.
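
For anyone who wants to repeat the null-byte check described above without JREPL, here is a minimal sketch in Python (a swapped-in tool, not something used in this thread and not PowerShell). It streams the data in chunks, so it does not need the whole file in memory. The function name `replace_nulls`, the chunk size, and the sample bytes are made up for illustration.

```python
import io


def replace_nulls(src, dst, filler=b"\xff", chunk_size=64 * 1024):
    """Copy binary stream src to dst, replacing each 0x00 byte with filler.

    Returns the number of null bytes replaced. Hypothetical helper, not JREPL.
    """
    replaced = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        replaced += chunk.count(b"\x00")
        # A single-byte replacement cannot straddle a chunk boundary.
        dst.write(chunk.replace(b"\x00", filler))
    return replaced


if __name__ == "__main__":
    # Made-up sample: PDF-like text with embedded null bytes.
    sample = io.BytesIO(b"12 0 obj\x00\r\n/Name /Paragraph\x00\r\nendobj\r\n")
    out = io.BytesIO()
    print(replace_nulls(sample, out), "null bytes replaced")
```

With real files, the two streams would be opened in binary mode ("rb" and "wb"). A nonzero return value confirms that the input contains null bytes.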


Dave Benham

Michael.Uray
Posts: 5
Joined: 12 Oct 2018 19:02

Re: JREPL.BAT v7.14 - regex text processor now with Unicode and XRegExp support

#393 Post by Michael.Uray » 16 Oct 2018 05:45

Ok understood.
Thanks for all your help and explanations Dave.

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#394 Post by dbenham » 20 Oct 2018 21:36

Here is version 7.15
JREPL7.15.zip
Downloaded 2106 times over 6 months from the primary release post while v7.15 was current
(27.61 KiB) Downloaded 204 times

Summary of Changes

Code: Select all

C:\>jrepl /?history

    2018-10-20 v7.15: Add a string literal syntax to the /INC and /EXC options.

The new string literal syntax does not add any capability, but rather makes it easier to look for string literals because regular expression meta characters no longer need to be escaped.

/INC and /EXC BlockList syntax:

Code: Select all

            The syntax for specifying a BlockList is complex. Whitespace
            should not appear anywhere except for possibly within a Regex or
            String.

              BlockList     = {Block}[,{Block}]...
              {Block}       = {SingleLine}|{LineRange}
              {SingleLine}  = {LineSpec}[{Offset}]
              {LineRange}   = {LineSpec}[{Offset}]:{EndLineSpec}[{Offset}]
              {LineSpec}    = [-]LineNumber|{Regex}[/]|{String}[/]
              {EndLineSpec} = [-]LineNumber|+Number|{Regex}|{String}
              {Regex}       = /Regex/[i|b|e]...
              {String}      = 'String'[i|b|e]...
              {Offset}      = +Number|-Number

Any ' literal within a 'String' must be escaped as '' (two single quotes).

Both /Regex/ and 'String' may be followed by any combination of the following flags:
  • i = Ignore case
  • b = Match Beginning of line only
  • e = Match End of line only
Both /Regex/ and 'String' may include any of the /XSEQ escape sequences by default - the /XSEQ option need not be specified. So a backslash literal should always be escaped as \\

Examples:

Code: Select all

              /EXC "'[START]':'[STOP]'"
                 Exclude lines beginning with a line that contains the literal
                 [START] and ending with the next line that contains [STOP].

              /EXC "'[START]'be:'[STOP]'be"
                 Exclude lines beginning with a [START] line (exact match)
                 and ending with the next [STOP] line (exact match).
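
The matching rules for a 'String' line spec and its i, b and e flags can be modelled in a few lines. The Python sketch below is only a model of the documented behaviour, not JREPL's implementation: both function names are invented, and it ignores the /XSEQ escape sequences. It also shows why the literal form is convenient, because the regex equivalent of [START] needs its brackets escaped.

```python
import re


def unquote_string_spec(spec):
    """Turn a quoted literal such as "'it''s'" into the text it stands for."""
    if len(spec) < 2 or spec[0] != "'" or spec[-1] != "'":
        raise ValueError("expected a 'String' literal")
    return spec[1:-1].replace("''", "'")


def string_spec_matches(line, text, flags=""):
    """Model of a 'String' spec: i = ignore case, b = beginning, e = end."""
    if "i" in flags:
        line, text = line.lower(), text.lower()
    begins = "b" in flags
    ends = "e" in flags
    if begins and ends:
        return line == text          # both flags: the whole line must match
    if begins:
        return line.startswith(text)
    if ends:
        return line.endswith(text)
    return text in line              # no anchor flags: found anywhere


if __name__ == "__main__":
    print(string_spec_matches("log [START] here", "[START]"))
    print(string_spec_matches("log [START] here", "[START]", "be"))
    # As a regex, the literal must be escaped to mean the same thing.
    print(re.search(re.escape("[START]"), "log [START] here") is not None)
    # Unescaped, [START] is a character class and matches a lone "T".
    print(re.search("[START]", "T") is not None)
```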

Dave Benham

onlinestatements
Posts: 10
Joined: 24 Oct 2018 09:54

using /EXC with binary option /M

#395 Post by onlinestatements » 24 Oct 2018 10:21

I am trying to exclude lines but need to use the /M option since there are NULL values in the files I am processing.
It states that /EXC is incompatible with the /M option.
Is there another way to exclude lines when your file has NULL's in it?

I have posted my code that I have so far below.
It gets a list of all the files with a .out extension and cycles through them using the variable %%a. For each file it calls the jrepl program to delete something from a file that contains NULLs in the first 5 or 10 lines.

Code: Select all

FOR /R %%a IN ("*.out") DO call C:\qsi\jrepl.bat "ESCE" "" /f "%%a" /m /o -
FOR /R %%a IN ("*.out") DO call C:\qsi\jrepl.bat "FF*****" "*****" /f "%%a" /L /m /o -

I have attached a screenshot of the code above from Notepad++ (code.JPG).
Above the ESC represents an Escape character followed by the letter E.
The Escape E sequence resets the printer in PCL print files.
And the FF represents the Form Feed character.
The FF inserts a blank page or advances to the next page.
I only want to delete the FF code that appears in the first 10 lines of a text file.
This will prevent an extra blank page from spitting out on every print file.
I need to leave the rest of the FF's so the statements print correctly.

Is there a way to make the Escape E get deleted searching only the first 5 lines of a file?
And the FF get deleted only within the first 10 lines of a file?
The FF has asterisks after it.
I don't want to touch FF characters that do not have asterisks.
Again, all these files have NULLs within the first 5 lines, and they are not processed properly unless the /M option is included.

Sometimes there can be over 10,000 pages of code in the PCL print files.
I am aware that the /M option imposes size limits.

Thank You!

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#396 Post by onlinestatements » 24 Oct 2018 11:14

Maybe there's a way to make it work using /INC with /M instead of /EXC?

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#397 Post by dbenham » 24 Oct 2018 14:22

Both the /EXC and /INC options are dependent on JREPL reading the input file one line at a time. But JScript does not read lines containing NULL properly.

The /M option reads the entire file into memory, and is able to properly read NULL bytes, but is limited as to the size of the file that can be processed.

So there is no way to combine /EXC or /INC with /M.

I do have an ugly way to achieve your goal using the /M option, coupled with the /P and /PFLAG options, provided your input does not exceed the size limit for JREPL /M processing.

The /P option specifies which portions of the file have the find/replace applied. In the code below, the /P regex matches the first 1-5 lines or the first 1-10 lines. The empty /PFLAG option makes the filter use only the first /P match, so only the first 5 or 10 lines are subject to replacement.

With the /XSEQ option you can use the \f escape sequence to represent FormFeed, and \x1B for the escape character.

Code: Select all

@echo off
rem /M reads the whole file so NULL bytes survive; /P with an empty /PFLAG limits
rem the find/replace to the first prefilter match (the first 1-5 or 1-10 lines).
for /r %%A in ("*.out") do (
  call jrepl "\x1BE" "" /m /l /xseq /p "(?:[^\n]*\n){1,5}" /pflag "" /f "%%A" /o -
  call jrepl "\f*****" "*****" /m /l /xseq /p "(?:[^\n]*\n){1,10}" /pflag "" /f "%%A" /o -
)

But if the above fails because a file is too large (out of memory error), then I don't think there is any way to hack JREPL to work for you.
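
How /P with an empty /PFLAG narrows the replacement can be modelled outside JREPL. The following is a sketch in Python, not JREPL's code: the function name and the sample data are invented, and it covers only a literal (/L style) find string. It reuses the same prefilter regex, takes only its first match, and replaces inside that match alone.

```python
import re


def replace_in_first_lines(data, find, repl, max_lines):
    """Replace the literal `find` only within the first 1..max_lines lines."""
    prefilter = re.compile(r"(?:[^\n]*\n){1,%d}" % max_lines)
    first = prefilter.search(data)   # only the first prefilter match is used
    if first is None:
        return data                  # no complete line, nothing to replace
    head = first.group(0).replace(find, repl)
    return data[:first.start()] + head + data[first.end():]


if __name__ == "__main__":
    # Made-up sample: a null byte on line 1, form feeds (\x0c) on lines 2 and 4.
    sample = "\x00header\n\x0c*****\nbody\n\x0c*****\n"
    print(repr(replace_in_first_lines(sample, "\x0c*****", "*****", 2)))
```

Like /M, this model holds the whole input in memory, so it illustrates the logic rather than offering a way around the size limit.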


Dave Benham

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#398 Post by onlinestatements » 24 Oct 2018 15:24

The NUL characters will only ever appear in the very first line of every file I ever process.
Could we somehow tell it to always exclude or skip the first line, so that the /M option is not needed?
Then I would only need to scan lines 2 through 10 to remove the FF and the Esc E.
Then I also wouldn't have to put the asterisks after the FF, since it would never reach the other FFs.
Thanks

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#399 Post by dbenham » 24 Oct 2018 19:33

No, that is not possible.

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#400 Post by onlinestatements » 25 Oct 2018 07:05

What is the size limit for JREPL /M processing?
Is it a line limit or a file size limit?

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#401 Post by onlinestatements » 25 Oct 2018 07:18

The code you provided didn't delete the Escape E nor the FF and doesn't play nice with the NULL's on the first line and it is very slow.

The only problem I had with my original code was that I needed it to only delete any FF within the first 10 lines.

Is there maybe another text manipulation tool that might work better?
Previously I was using a macro I programmed in Notepad++, but I had to run this macro manually every time over every file in the printroom folder, so it rescanned even old files.

Squashman
Expert
Posts: 4107
Joined: 23 Dec 2011 13:59

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#402 Post by Squashman » 25 Oct 2018 22:31

onlinestatements wrote:
25 Oct 2018 07:05
Is it a line limit or a file size limit?

Let us count how many times the word "file" is written.

dbenham wrote:
24 Oct 2018 14:22
The /M option reads the entire file into memory, and is able to properly read NULL bytes, but is limited as to the size of the file that can be processed.

But if the above fails because a file is too large (out of memory error), then I don't think there is any way to hack JREPL to work for you.

mribraqdbra
Posts: 2
Joined: 25 Mar 2019 07:15

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#403 Post by mribraqdbra » 25 Mar 2019 07:58

In my strings.xml I have the following XML:
<?xml version="1.0" encoding="utf-8"?>
<resources>
<string name="apply_changes">Apply</string>
<string name="okay">Okay!</string>
<string name="skip">Skip</string>
</resources>

The question is: how can I change/replace the value between the tags using repl.bat?

EXAMPLE:
change
<string name="skip">Skip</string>
to
<string name="skip">Skip this</string>

This is my first post!
Sorry for that!

Re: JREPL.BAT v7.15 - regex text processor now with Unicode and XRegExp support

#404 Post by mribraqdbra » 29 Mar 2019 02:06

What about when the string is long?

Example:
<string name="about_app">This app is for the personal use only!</string>
How do I replace the value with:
<string name="about_app">This app is free forever</string>

Re: JREPL.BAT 8.0 - regex text processor with support for text highlighting and alternate character sets

#405 Post by dbenham » 15 May 2019 12:22

Once I learned how to configure the Windows 10 console to support ANSI escape sequences, I created a new JREPL version that supports highlighting of matched or replaced text.

Here is version 8.0
JREPL8.0.zip
Downloaded 39 times over 4 days from the main release page while 8.0 was the current version
(28.17 KiB) Downloaded 71 times

Summary of Changes

Code: Select all

C:\>jrepl /?history

    2019-05-15 v8.0: Add /Hxxx options for highlighting matched/replaced text.
                     Bug fix - /OFF was wrong when used with /P.
                     Change - /OFF is now incompatible with /PREPL.
                     Change - /PREPL may now be used with /K and /R.
                     Doc fix - corrected syntax for /F [|NB] option.
                     Doc fix - /OFF is for first match if used with /K.
   . . .                     

/H /HON /HOFF and /HU options for highlighting matched or replaced text
Note that the help output has been split across multiple code blocks to eliminate scrolling.

Code: Select all

C:\>jrepl /?/h

      /H  - Highlight all replaced or matched text in the output using the
            strings defined by /HON and /HOFF.

            /HON and /HOFF default to ANSI escape sequences that swap the
            foreground and background colors.

            /HU may be a better option if the current COLOR does not match
            the console default.

Code: Select all

            Native support for ANSI escape sequences requires Windows 10 or
            higher. ANSI escape sequences only work on the Windows 10 console
            under the following conditions:

             - The console must have the "Use legacy console" option OFF
             - The registry must have the following DWORD defined
                  [HKEY_CURRENT_USER\Console]
                  "VirtualTerminalLevel"=dword:00000001

Code: Select all

C:\>jrepl /?/hon

      /HON HighlightStart

            Defines the string to start highlighting text, normally an ANSI
            escape sequence.

            Default is \x1B[7m - Swap foreground and background colors.

Code: Select all

C:\>jrepl /?/hoff

      /HOFF HighlightEnd

            Defines the string to end highlighting text, normally an ANSI
            escape sequence.

            Default is \x1B[0m - Return to the console default format.

Code: Select all

C:\>jrepl /?/hu

      /HU - Underline all replaced or matched text in the output using ANSI
            escape sequences.

            This is the same as using /H with /HON=\x1B[4m and /HOFF=\x1B[24m

Examples
Note - I don't want to post images, so I use /HON and /HOFF to surround the matched/replaced text with braces instead of using escape sequences to highlight the text. Without the /HON and /HOFF options the text would be highlighted in reversed colors, assuming your console has been configured properly as described in the /H help.

Highlight all words that contain "o":

Code: Select all

C:\>echo Goodbye cruel world | jrepl "\w*o\w*" "" /k 0 /h /hon { /hoff }
{Goodbye} cruel {world}

Change all "o" to "O", and highlight the changes:

Code: Select all

C:\>echo Goodbye cruel world | jrepl o O /h /hon { /hoff }
G{O}{O}dbye cruel w{O}rld
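
The wrapping that /H performs can be sketched as follows. This is a model in Python, not JREPL's code: the function names are invented, and Python's regex dialect differs from JScript's in places. Each match (or each replacement) is surrounded by a start string and an end string, which default to the escape sequences listed in the /HON and /HOFF help.

```python
import re

ANSI_REVERSE = "\x1b[7m"   # default /HON per the help text: swap fg/bg colors
ANSI_RESET = "\x1b[0m"     # default /HOFF per the help text: console default


def highlight_matches(text, pattern, hon=ANSI_REVERSE, hoff=ANSI_RESET):
    """Keep the text as is, but wrap every regex match in hon/hoff."""
    return re.sub(pattern, lambda m: hon + m.group(0) + hoff, text)


def highlight_replacements(text, pattern, repl, hon=ANSI_REVERSE, hoff=ANSI_RESET):
    """Replace every regex match and wrap each replacement in hon/hoff."""
    return re.sub(pattern, lambda m: hon + m.expand(repl) + hoff, text)


if __name__ == "__main__":
    # Braces stand in for the escape sequences, as in the examples above.
    print(highlight_matches("Goodbye cruel world", r"\w*o\w*", "{", "}"))
    print(highlight_replacements("Goodbye cruel world", "o", "O", "{", "}"))
```

With braces as the start and end strings, the two calls reproduce the outputs of the two JREPL examples.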

Changed /OFF behavior
The /OFF option used to print the wrong offset when combined with the /P option. The old, buggy behavior was to print the offset within the matched /P prefilter text. The corrected behavior in version 8.0 is to print the offset within the current line, or within the entire file if the /M option is used.

As part of the bug fix, I completely reworked the /K and /R implementations such that /PREPL can now be combined with /K or /R.

I was not able to correct the /OFF output when the /PREPL option was used, so starting with version 8.0, /OFF cannot be combined with /PREPL.
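
The difference between the old and the corrected /OFF values can be illustrated with a small model. This is a Python sketch under stated assumptions, not JREPL's code: the function name and the sample line are invented, and the offsets are zero-based because that is how Python counts. For each match found inside a prefilter match, it reports the offset relative to the prefilter match (the old behavior) and the offset relative to the whole line (the corrected behavior).

```python
import re


def match_offsets(line, prefilter, find):
    """Yield (offset_in_prefilter_match, offset_in_line) for each find match."""
    for block in re.finditer(prefilter, line):
        for hit in re.finditer(find, block.group(0)):
            # Adding the block's own start converts to a line-relative offset.
            yield hit.start(), block.start() + hit.start()


if __name__ == "__main__":
    for old, new in match_offsets("id=7; id=9", r"id=\d", r"\d"):
        print("within /P match:", old, " within line:", new)
```

The two values agree only when the prefilter match starts at the beginning of the line, which is why the old output was wrong for every later match.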


Dave Benham
