Batch script to extract everything between 2 regex
Moderator: DosItHelp
-
- Posts: 6
- Joined: 04 Feb 2020 09:49
Batch script to extract everything between 2 regex
Hello everyone. I am new to programming. I have about 400 *.txt files in a folder. They all contain data between two specific pieces of text, and I need that data either in a new file (one file per *.txt) or with the original file modified in place.
I have read about jrepl.bat, but I really have no idea how to use it. I have tried everything and nothing works.
Can someone help me with that? What should I add to jrepl.bat to make this happen, and where in the .bat file?
Re: Batch script to extract everything between 2 regex
You have not given enough information to give any help.
What are the text markers that identify the text to keep? Are they string literals? or regex?
Should the markers be included in the output?
Do the markers constitute an entire line? If not then the line containing the markers may need to be split.
In addition to fully describing the rules, you should provide a small test file as well as the expected output. It doesn't need to be real data, just enough to illustrate the rules.
Dave Benham
Re: Batch script to extract everything between 2 regex
OK, sorry for not giving enough info. Here is an example.
I have a lot of files in a folder, with many different names; all are *.txt.
Original_file.txt
Line 1
Line 2
Line 3
Line 4
Line 5
I want a new file (or the file itself can be modified) with this result
Output_file.txt
Line 3
Line 4
The point is to get everything that is between the regex values "Line 2" and "Line 5".
Or, at least, delete everything above "Line 3" with one script and everything after "Line 4" with another.
Keep in mind that the folder has hundreds of *.txt files, so I just want the batch script to process everything inside that folder automatically.
Thank you very much for the help.
Re: Batch script to extract everything between 2 regex
You say regex, but it looks like you are looking to match a string literal.
I'll run with what you have given.
Code: Select all
for %%F in (*.txt) do call jrepl "^" "" /k 0 /inc "'Line 2'be+1:'Line 5'be-1" /f "%%F" /o "%%~nF.mod.txt"
The /INC option specifies which lines to include in the search. It starts with the line after the "Line 2" line, and ends with the line before the "Line 5" line. The BE characters specify that the strings must match both the beginning and the end of the line (exact line match).
The /K 0 option signifies to output all lines that match the search. In this case the search simply matches the beginning of a line, so it matches all lines that pass the /INC test. The replace argument is ignored. The 0 specifies not to include any additional lines before or after the matching line.
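For readers who want the selection logic spelled out, here is a minimal Python sketch of what the command above appears to do: for every *.txt file in a folder, keep only the lines strictly between a line that is exactly "Line 2" and a later line that is exactly "Line 5", and write them to a .mod.txt file. It illustrates the logic only and is not how JREPL is implemented. The names extract_between and process_folder are hypothetical, and returning nothing when a marker is missing is an assumption, since that case is not discussed in this thread.

```python
from pathlib import Path


def extract_between(lines, start_marker, end_marker):
    """Return the lines strictly between the first line equal to
    start_marker and the next line equal to end_marker (markers excluded)."""
    kept = []
    inside = False
    for line in lines:
        if not inside:
            if line == start_marker:   # exact whole-line match
                inside = True
        elif line == end_marker:
            return kept
        else:
            kept.append(line)
    return []  # assumption: a missing start or end marker yields nothing


def process_folder(folder, start_marker, end_marker):
    """Write <name>.mod.txt next to each <name>.txt; return the file count."""
    folder = Path(folder)
    # Build the list up front and skip earlier outputs so they are not re-read.
    sources = [p for p in sorted(folder.glob("*.txt"))
               if not p.name.endswith(".mod.txt")]
    for src in sources:
        lines = src.read_text().splitlines()
        kept = extract_between(lines, start_marker, end_marker)
        out = src.with_name(src.stem + ".mod.txt")
        out.write_text("".join(line + "\n" for line in kept))
    return len(sources)


if __name__ == "__main__":
    sample = ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"]
    print(extract_between(sample, "Line 2", "Line 5"))  # ['Line 3', 'Line 4']
```

Because the comparison is against the whole line, a marker that is only part of a line would not match in this sketch.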
Re: Batch script to extract everything between 2 regex
If I add that code to a .bat file, I get
JScript runtime error opening input file: File not found
If I try to use it in a CMD window opened in the directory, I get
%%F was unexpected at this time.
All the files in that directory have different names.
Re: Batch script to extract everything between 2 regex
Also, are the single quotes around "Line 2" necessary?
Re: Batch script to extract everything between 2 regex
Sorry, I forgot the /f option, and I also had a case problem. I've updated the code in my prior post.
All quotes are needed as written. The outer double quotes are needed to treat the entire construct as a single parameter. The single quotes signify that the string is to be treated as a string literal
Re: Batch script to extract everything between 2 regex
Yes, now it's working fine. Thank you. It seems like it needs the whole line value and not just a partial value.
Now, is there any way to do the complete opposite? Keep only the content OUTSIDE that range.
Last edited by Elpolloloco on 05 Feb 2020 14:18, edited 1 time in total.
Re: Batch script to extract everything between 2 regex
Argh. My mind is obviously elsewhere, and I am making silly mistakes.
My original code was correct, except for a lower case %%f that should have been %%F. I corrected that post once again.
But I just realized you are trying to run the command from the command line. I was assuming you wanted a batch script.
To run the command from the command line you need to change all doubled %% to a single % as follows:
Code: Select all
for %F in (*.txt) do call jrepl "^" "" /k 0 /inc "'Line 2'be+1:'Line 5'be-1" /f "%F" /o "%~nF.mod.txt"
You should never have a reason to change the JREPL.BAT file (unless a new version comes out).
JREPL has extensive capability through its many command line options. Full documentation is available via JREPL /?
Re: Batch script to extract everything between 2 regex
Thanks. I get a lot of these documents every month and they are not always so predictable, so I would appreciate it if you could help me with a few more examples that cover the other possible scenarios:
1. String 1
2. String 2
3. String 3
4. String 4
5. String 5
Keep everything ABOVE String 2
Keep everything ABOVE 3. (line, not string)
Keep everything BELOW String 4
Keep everything BELOW 4. (Line, not string)
The opposite of the former code, delete everything except whatever is between String 2 and String 4
The opposite of the former code (by line), delete everything except whatever is between 2. (line, not string) and 4. (Line, not string)
With these, I'm sure I will be able to handle pretty much any scenario I get in the future.
Re: Batch script to extract everything between 2 regex
I don't mind giving an example of how to do something, but at this point you should learn to do this yourself.
Enter JREPL /?/EXC and JREPL /?/INC to get documentation on the syntax for specifying which lines to include or exclude. Study the help, including the examples, and also study the code I already gave you. You should be able to adapt that code to meet your needs.
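As a starting point for reasoning about the scenarios listed earlier in the thread, the selection rules can be sketched in Python. This is not JREPL syntax, and the /INC and /EXC help remains the authority on how to express these ranges with JREPL. The helper names are hypothetical, markers are compared against whole lines, line numbers are 1-based, and two choices are assumptions: "above" and "below" exclude the marker line itself, and keep_outside keeps the marker lines while dropping what lies strictly between them. A missing marker raises ValueError.

```python
def keep_above(lines, marker):
    """Lines before the first line equal to marker (marker excluded)."""
    return lines[:lines.index(marker)]


def keep_below(lines, marker):
    """Lines after the first line equal to marker (marker excluded)."""
    return lines[lines.index(marker) + 1:]


def keep_above_line(lines, n):
    """Lines before 1-based line number n."""
    return lines[:n - 1]


def keep_below_line(lines, n):
    """Lines after 1-based line number n."""
    return lines[n:]


def keep_outside(lines, start_marker, end_marker):
    """Drop the lines strictly between the two markers; keep the rest."""
    start = lines.index(start_marker)
    end = lines.index(end_marker, start + 1)  # end marker must follow start
    return lines[:start + 1] + lines[end:]


if __name__ == "__main__":
    sample = ["String 1", "String 2", "String 3", "String 4", "String 5"]
    print(keep_above(sample, "String 2"))                # ['String 1']
    print(keep_below(sample, "String 4"))                # ['String 5']
    print(keep_outside(sample, "String 2", "String 4"))
```

Writing the rule down this way first makes it easier to decide which lines a range should start and end on before translating it into an /INC or /EXC argument.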