Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 12 Apr 2017 13:58
by Aacini
You may directly do that with a Batch file:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

for /F "delims=" %%a in (test.txt) do set "string=%%a"

echo !string!
echo/

for /F "delims=" %%a in (^"!string:^<br^>^=^<br^>^
% Do NOT remove this line %
!^") do (
  echo %%a
)

Antonio

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 12 Apr 2017 15:40
by kyouniis
I'm trying to replace some bytes with jrepl.bat using this syntax:

Code: Select all

@echo off
jrepl "\x11\x3C\xC9\x31\x01\x0C\x60\x7C\x04\x8E\xD4\x31\x01\x0C\x60\x7C\x04\x8E\xCD\x5A" "\x11\x3C\xCE\x31\x01\x0C\x60\x7C\x04\x8E\xD4\x31\x01\x0C\x60\x7C\x04\x8E\xCD\x5A" /m /x /f dumpin.bin /o dumpout.bin
pause

But the new file is 30KB bigger than before. What's causing it? I can upload the file if you want to check, it's 1.5MB compressed.

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 13 Apr 2017 07:58
by dbenham
Argh. That should not be :?

Your search and replace strings are the same length, and you have properly used the /M option, so the size should not change.

I definitely would like to have access to the file so I can test. But I am pretty sure you cannot add an attachment that large to this site. So you will have to use some external service and provide a link in your post here. I'm familiar with dropbox, but there are lots of other options. If you prefer, you can send me a private message with the link to the file.


Dave Benham

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 16 Apr 2017 12:37
by zimxavier
Hi!

How can you search for a line break with the /INC option? I read that /M is incompatible with /INC.
This code does nothing:

Code: Select all

@echo off
for /f "delims=" %%a in ('dir /b /a-d "GAME\*.txt" ') do (
call JREPL "\n" "|" /inc "/^BEGIN$/+1:/^END$/-1" /x /f "GAME\%%~a" /o -
)

\n works fine when in Replace though.

Thanks!

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 18 Apr 2017 20:39
by dbenham
Normally (without /M) JREPL reads and processes one line at a time, and the terminating \r\n is not included in the string. So of course searching for \n is pointless. But \r\n automatically gets restored when the resultant line is written.

As stated in the documentation, the only way to productively search for \n is to use the /M option, which puts the entire binary image of the file into memory. But, as you say, the /M option is incompatible with /INC.
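
The difference between the two modes can be sketched in plain JavaScript. This is only an illustration of the idea, not JREPL's actual code; the function names `replacePerLine` and `replaceWhole` are made up for the example.

```javascript
// Line-at-a-time mode: the terminator is stripped before the regex runs
// and restored afterwards, so the pattern never sees \r or \n.
function replacePerLine(text, regex, repl) {
  return text.split('\r\n').map(function (line) {
    return line.replace(regex, repl);
  }).join('\r\n');
}

// Whole-file mode (what /M does conceptually): the regex sees every character,
// including the line terminators.
function replaceWhole(text, regex, repl) {
  return text.replace(regex, repl);
}

var sample = 'A\r\nB\r\nC';
var perLine = replacePerLine(sample, /\r?\n/g, '|'); // unchanged
var whole = replaceWhole(sample, /\r?\n/g, '|');     // terminators replaced
```

With per-line processing the pattern has nothing to match, so the text comes back unchanged.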

So you need a different approach.

I'm not 100% sure of your end goal.

Given the following input

Code: Select all

Preserve 1
Preserve 2
BEGIN
A
B
C
END
Preserve 3
Preserve 4

Then I interpret your desired output to be

Code: Select all

Preserve 1
Preserve 2
BEGIN
A|B|C
END
Preserve 3
Preserve 4

The following should give a result like the one above. I use the /M option to capture everything between BEGIN (inclusive) and END (exclusive). I use the /JQ option to apply a second find/replace on the lines after BEGIN, substituting | for each \r\n.

The code is simpler with a simple FOR instead of FOR /F:

Code: Select all

@echo off
for %%F in ("GAME\*.txt") do (
  call jrepl "(^BEGIN\r?\n)([\s\S]*?)(?=\nEND$)" "$txt=$1+$2.replace(/\r?\n/gm,'|')" /jq /m /f "%%F" /o -
)
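
For anyone who wants to test the regex outside of a batch file, here is roughly the same find/replace as a standalone JavaScript function. It is a sketch that assumes the regex behaves the same way as in JREPL's JScript engine with the multiline flag set; `joinBlock` is just a name chosen for the example.

```javascript
// Same search regex as in the jrepl call, with the g and m flags.
// $1 = the BEGIN line including its terminator,
// $2 = everything up to (not including) the \n that precedes END.
function joinBlock(text) {
  var find = /(^BEGIN\r?\n)([\s\S]*?)(?=\nEND$)/gm;
  return text.replace(find, function (match, head, body) {
    // Second find/replace, applied only to the captured body.
    return head + body.replace(/\r?\n/g, '|');
  });
}

var input = 'Preserve 1\r\nPreserve 2\r\nBEGIN\r\nA\r\nB\r\nC\r\nEND\r\nPreserve 3\r\nPreserve 4\r\n';
var output = joinBlock(input);
```

Lines outside the BEGIN/END block are never part of a match, so they pass through untouched.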


Dave Benham

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 21 Apr 2017 09:24
by Arc
Hi everyone!

I signed up to thank Dave Benham. I have also watched his videos on YouTube. He's not really an ordinary person :)

I'm not a professional user. I have used the GUI and macros of numerous text editors, just for search and replace. I tried to learn regular expressions, but it's really tedious to be tied to one application. JREPL.BAT was a great savior. Thank you for closing this gap.

I think every user should hear about this simple but effective tool. Please create a manual with more examples. I have a hard time reading the manuals. Real-life examples teach a beginner a lot; otherwise it all looks impossible. So please spread the word to all users.

Now my question: I have difficulty using the /T option. For example:

Code: Select all

<ab>.......................<td>
<\ab>......................<\td>
<efg>......................<tr>
<\efg>.....................<\tr>
<hij lang="en-us"><td>.....<td><hij lang="en-us">
<hij lang="xx"><td>........<td><hij lang="xx">
xml:lang=".................lang="


To change all of them, I know these regexes:

Code: Select all

(<\?)ab>........................\1td>
(<\?)efg>.......................\1tr>
(<hij lang=".+?">)(<td>)........\2\1
xml:lang="......................lang="


Then I created a try.bat:

Code: Select all

@echo off
  call jrepl "(<\?)ab>|(<\?)efg>|(<hij lang=".+?">)(<td>)|xml:lang=" ^
             "\1td>|\1tr>|\2\1|lang=" /x /t "|" /f "input.txt" /o "output.txt"


I only know \1 should be $1. But I don't know what else to do anymore.

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 21 Apr 2017 19:37
by dbenham
Thanks for the compliments :D

First off, you cannot include a double quote literal in your command line find or replace strings. You need to use either \q (with /X option), or \x22.

Arc wrote:I only know \1 should be $1. But I don't know what else to do anymore.
Nope :!: :twisted:

The /T option is probably the trickiest option to use. The critical piece of information that you have not grasped is in this paragraph of the built-in documentation (jrepl /?/t):

Code: Select all

            The search expressions may be regular expressions, possibly with
            captured groups. Note that each expression is itself converted into
            a captured group behind the scene, and the operation is performed
            as a single search/replace upon execution. So backreferences within
            each regex, and $n references within each replacement expression,
            must be adjusted accordingly. The total number of expressions plus
            captured groups must not exceed 99.

The concept is hard to put into words, but there is an example within the docs that effectively demonstrates the concept:

Code: Select all

          Pig Latin - This example shows how /T can be used with regular
          expressions, and it demonstrates how the numbering of captured
          groups must be adjusted. The /T delimiter is set to a space.
 
          The first regex is captured as $1, and it matches words that begin
          with a consonant. The first captured group ($2) contains the initial
          sequence of consonants, and the second captured group ($3) contains
          the balance of the word. The corresponding replacement string moves
          $2 after $3, with a "-" in between, and appends "ay".
 
          The second regex matches any word, and it is captured as $4 because
          the prior regex ended with group $3. Because the first regex matched
          all words that begin with consonants, the only thing the second
          regex can match is a word that begins with a vowel. The replacement
          string simply adds "-yay" to the end of $4. Note that $0 could have
          been used instead of $4, and it would yield the same result.
 
            echo Can you speak Pig Latin? | jrepl^
             "\b((?:qu(?=[aeiou])|[bcdfghj-np-twxz])+)([a-z']+)\b \b[a-z']+\b"^
             "$3-$2ay $4-yay" /t " " /i
 
            -- OUTPUT --
 
            an-Cay you-yay eak-spay ig-Pay atin-Lay?

Don't forget that parenthesized groups that begin with ?: or ?= or ?! are not captured.
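
To see the renumbering in action, here is a small JavaScript sketch of the idea described in the docs: wrap each expression in its own captured group, join them with |, and pick the replacement that belongs to whichever expression matched. This is an assumption about the general technique, not JREPL's real source; `countGroups` and `translate` are hypothetical helper names.

```javascript
// Count the capturing groups in a regex source string.
// Appending "|" lets the pattern match the empty string, so exec() never returns null.
function countGroups(source) {
  return new RegExp(source + '|').exec('').length - 1;
}

// finds and repls are parallel arrays; $n in repls uses the combined numbering.
// (Backreferences inside the finds would need the same adjustment; not handled here.)
function translate(text, finds, repls, flags) {
  var starts = []; // group number assigned to each whole expression
  var total = 0;   // number of groups in the combined regex
  finds.forEach(function (source) {
    starts.push(total + 1);
    total += 1 + countGroups(source);
  });
  var combined = new RegExp(
    finds.map(function (source) { return '(' + source + ')'; }).join('|'),
    flags
  );
  return text.replace(combined, function () {
    var groups = arguments; // groups[0] is the match, groups[n] is $n
    for (var i = 0; i < starts.length; i++) {
      if (groups[starts[i]] !== undefined) {
        return repls[i].replace(/\$(\d+)/g, function (ref, digits) {
          var n = Number(digits);
          var value = n <= total ? groups[n] : undefined;
          return value === undefined ? '' : value;
        });
      }
    }
    return groups[0];
  });
}

var pigLatin = translate(
  'Can you speak Pig Latin?',
  ["\\b((?:qu(?=[aeiou])|[bcdfghj-np-twxz])+)([a-z']+)\\b", "\\b[a-z']+\\b"],
  ['$3-$2ay', '$4-yay'],
  'gi'
);
```

The first expression becomes $1 with inner groups $2 and $3, so the second expression lands on $4, exactly as the Pig Latin example describes.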

So now, looking at your expressions, I have the main expression number to the left, and each captured group number above:

Code: Select all

      $2
$1 = "(<\?)ab>"    -->   "$2td>"

      $4
$3 = "(<\?)efg>"  -->  "$4tr>"

      $6                  $7
$5 = "(<hij lang=\q.+?\q>)(<td>)"  -->  "$7$6"

$8 = "xml:lang=\q"  -->  "lang=\q"

I would simplify the last expression by using a look ahead expression:

Code: Select all

$8 = "xml:(?=lang=\q)"  -->  ""

So the complete command becomes

Code: Select all

call jrepl "(<\?)ab>|(<\?)efg>|(<hij lang=\q.+?\q>)(<td>)|xml:(?=lang=\q)" ^
           "$2td>|$4tr>|$7$6|" /x /t "|" /f "input.txt" /o "output.txt"


Dave Benham

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 21 Apr 2017 20:34
by kyouniis
Hi Dave, I posted a link to the file but for some reason my post didn't show up, I'll try to upload it again and send you the link through PM.

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 22 Apr 2017 12:46
by catalinnc
dbenham wrote: Your search and replace strings are the same length, and you have properly used the /M option, so the size should not change.

What is the ETA for fixing this bug?

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 22 Apr 2017 13:34
by dbenham
Through a PM, I was able to get a copy of kyouniis' source file, and I could not reproduce his problem. The script successfully modified a single byte in the binary file without changing the total length, exactly as the script was designed to do.
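
For reference, the expected behavior (same-length replacement, unchanged file size) can be checked independently of JREPL. The sketch below uses Node.js rather than JScript, and `patchBytes` is a made-up helper; it treats every byte as one latin1 character so that nothing gets re-encoded along the way.

```javascript
// Replace every occurrence of one byte sequence with another (both given as hex).
// latin1 maps each byte 0x00-0xFF to exactly one character and back.
function patchBytes(buf, findHex, replHex) {
  var find = Buffer.from(findHex, 'hex').toString('latin1');
  var repl = Buffer.from(replHex, 'hex').toString('latin1');
  var patched = buf.toString('latin1').split(find).join(repl);
  return Buffer.from(patched, 'latin1');
}

// Tiny stand-in for the dump file: one byte (C9 -> CE) should change.
var original = Buffer.from('00113cc931ff', 'hex');
var result = patchBytes(original, '113cc931', '113cce31');
var sameSize = result.length === original.length;
```

If a comparison like `sameSize` comes out false on a real file, some step is re-encoding the bytes instead of copying them one for one.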

So as far as I know, there is no bug.

I am still working with kyouniis to try to diagnose why he is getting (or thinks he is getting) a different result.

My best guess at the moment is that either 1) some character code pages do not work properly with JREPL and corrupt the output, or 2) the output that he is looking at is not actually coming from JREPL, but from some other source. But those are truly just guesses.

I will post the final result, when (if) we figure out what is actually going on on his machine.


Dave Benham

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 01 May 2017 10:56
by LM459
I just found out about JREPL and have downloaded it. I am having trouble trying to do the following.
I have a number of text files that contain the "|" character, where a line break should be present.
I would like to replace the "|" character with the 2-character HEX value of 0D 0A (Carriage return/line break), but I am not having any luck with the formatting of the instruction.
I am attempting type oldfile.txt jrepl "\|" "\u0D0A" /X >> newfile.txt, as a test, but the result is not what I'm expecting.

Can someone please provide the proper syntax to accomplish this task.

Thanks!

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 01 May 2017 13:56
by Arc
Welcome to DosTips.com. I'm also a newbie :) but I can help. In such cases I would rather use the Unicode value. You can get this value from charmap.exe or on the net.

For only one file:

Code: Select all

jrepl.bat "\u007C" "\r\n" /x /m /f oldtextfile.txt /o newtextfile.txt


For batch mode, save as batch.bat and run:

Code: Select all

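@chcp 65001>nul
@echo off
echo.
echo Drag your txt folder!
echo.
set /p "fullpath="
rem Strip any quotes added by drag-and-drop, then process every .txt file below the folder.
for /f "delims=" %%? in ('dir /b /s "%fullpath:"=%\*.txt"') do (
  rem %%~dp? already ends with a backslash, so no extra separator is needed.
  call jrepl.bat "\u007C" "\r\n" /x /m /f "%%?" /o "%%~dp?%%~n?_newfile.txt"
)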

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 01 May 2017 14:54
by Aacini
@LM459,

You don't need Unicode characters or hundreds of lines of code to perform a replacement as simple as this one. The two-line Batch file below (save it with a .BAT extension) does what you want using the same method as JREPL.BAT...

Code: Select all

@set @a=0 // & cscript //nologo //E:JScript "%~F0" < oldfile.txt > newfile.txt & goto :EOF

WScript.Stdout.Write(WScript.Stdin.ReadAll().replace(/\|/g,"\r\n"));
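
The replace call can also be tried on its own in any JavaScript engine. The short sketch below additionally shows why "\u0D0A" from the original attempt cannot work: a \uHHHH escape always denotes one single character, not the two characters CR and LF. The name `pipesToCrLf` is only for illustration.

```javascript
// Same regex and replacement as in the one-liner above.
function pipesToCrLf(text) {
  return text.replace(/\|/g, '\r\n');
}

var converted = pipesToCrLf('line 1|line 2|line 3');

// "\u0D0A" is one character (code point U+0D0A); "\r\n" is two (U+000D, U+000A).
var singleChar = '\u0D0A'.length;
var crLfLength = '\r\n'.length;
```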

Antonio

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 02 May 2017 22:15
by brinda
Dave,

I need an add-on to the help you provided previously in the links below:

viewtopic.php?f=3&t=6044&start=90#p42892
viewtopic.php?f=3&t=6044&start=105#p42894

Actual paragraph
I couldn't believe that I could actually understand what I was reading. Using the incredible power of the human brain, according to research at Cambridge University, it doesn't matter in what order the letters in a word are, the only important thing is that the first and last letter be in the right place. The rest can be a total, mess and you can read it without a problem. This is because the human mind does not read every letter by itself, but the word as a whole. Amazing, huh? Yeah and I always thought spelling was important! See if your friends can read this too!

Jumble0
I cnduo't bvleiee taht I culod aulaclty uesdtannrd waht I was rdnaieg. Unisg the icndeblire pweor of the hmuan mnid, aocdcrnig to rseecrah at Cmabrigde Uinervtisy, it dseno't mttaer in waht oderr the lterets in a wrod are, the olny irpoamtnt tihng is taht the frsit and lsat ltteer be in the rhgit pclae. The rset can be a taotl mses and you can sitll raed it whoutit a pboerlm. Tihs is bucseae the huamn mnid deos not raed ervey ltteer by istlef, but the wrod as a wlohe. Aaznmig, huh? Yaeh and I awlyas tghhuot slelinpg was ipmorantt! See if yuor fdreins can raed tihs too.

link
https://www.ecenglish.com/learnenglish/lessons/can-you-read

Jumble0 [new add-on request]
Jumble0 mix criteria:
a) Only the first and last letter in a word should remain in its original position.
b) Maintain letter capitalization if available
c) Original position of space should remain

Jumble1 mix criteria:
a) Only the last letter in a word should remain in its original position.
b) Maintain letter capitalization
c) Original position of space should remain

Jumble2 mirror criteria:
a) Reversal of letter position and word position from left to right. E.g. "Sri Advaita" becomes "atiavdA irS"
b) Maintain letter capitalization
c) Original position of space should remain

Jumble3 reverse criteria:
a) Reversal of letter position from left to right. Word position remains. E.g. "Sri Advaita" becomes "irS atiavdA"
b) Maintain letter capitalization
c) Original position of space should remain



The normal word list input text file looks like this:

Code: Select all

Sri Advaita
Bhagavad Gita
Saptaham
Maha Bali Puram




Processed list text file (normal, jumble0, jumble, mirror, reverse):

Code: Select all

Normal,jumble0,jumble,mirror,reverse

Code: Select all

Sri Advaita,Sri Avaidta,rSi vAdiata,atiavdA irS,irS atiavdA 
Bhagavad Gita,Bhvaagad Gtia,avBgahd tiGa,atiG davagahB,davagahB atiG
Saptaham,Sahaptam,tpahaaSm,mahatpaS,mahatpaS
Maha Bali Puram,Mhaa Blai Parum,ahMa lBai arPum,maruP ilaB ahaM,ahaM ilaB maruP 
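
To make the request concrete, here is a rough JavaScript sketch of three of the transformations, written to follow the worked examples above rather than any existing JREPL feature. The function names are invented, and "maintain capitalization" is read as "each letter keeps its own case when it moves", which is what the sample output shows.

```javascript
// Reverse: letters reversed inside each word, word order kept.
function reverseWords(line) {
  return line.replace(/\S+/g, function (word) {
    return word.split('').reverse().join('');
  });
}

// Mirror: the whole line reversed, so letters and word order both flip.
function mirror(line) {
  return line.split('').reverse().join('');
}

// Jumble0: shuffle the interior letters, keep first and last in place.
// random is optional and defaults to Math.random (Fisher-Yates shuffle).
function jumble0(line, random) {
  random = random || Math.random;
  return line.replace(/\S+/g, function (word) {
    if (word.length < 4) return word; // nothing to shuffle
    var inner = word.slice(1, -1).split('');
    for (var i = inner.length - 1; i > 0; i--) {
      var j = Math.floor(random() * (i + 1));
      var tmp = inner[i];
      inner[i] = inner[j];
      inner[j] = tmp;
    }
    return word.charAt(0) + inner.join('') + word.charAt(word.length - 1);
  });
}
```

Because jumble0 is random, its output differs from run to run; only the first letter, last letter, and set of letters in each word are guaranteed.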

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

Posted: 04 May 2017 11:33
by Arc
Sample.txt

Code: Select all

"

{Delete " and the two lines with above.}
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Maecenas egestas efficitur lobortis.
"

{Don't delete the two lines above.}
Vestibulum id dui nec nisi mattis tristique.
Donec pretium felis eu odio iaculis maximus.

Ok let's try...

Code: Select all

JREPL.BAT "\q\r\n\r\n" "" /INC "1:3" /m /x /f "input.txt" /o output.txt

Not working, why?
"The /INC option is incompatible with /M and /S."

/INC and /EXC are great but they cannot be used with /M. Is there any alternative?