Search and insert multi-line pattern in text file?

Posted: 24 Sep 2019 09:56
by pstein
Assume I have 3 text files:

1st file "searchpattern.txt" which contains one or multiple text lines with a text pattern to search
2nd file "insertpattern.txt" which contains one or multiple text lines with a text pattern to insert
3rd file "targetfile.txt" which may contain the searchpattern and where the insert pattern should be inserted.

The task is to search for the search pattern inside the target file and to insert the insert pattern immediately after it.
If the search pattern is not found, nothing should be inserted.

How can I achieve this conveniently with a DOS batch script?

If necessary a 3rd party cmdline tool is acceptable.

Thank you
Peter

Re: Search and insert multi-line pattern in text file?

Posted: 24 Sep 2019 10:40
by dbenham
Simple with JREPL.BAT with the /T FILE option.

Code:

jrepl "find.txt" "replace.txt" /l /t file /f "input.txt" /o "output.txt"

where find.txt contains n search terms, one per line, and replace.txt contains the n corresponding replacement strings, also one per line.
The /L option specifies the search strings are literal. The default is to treat them as regular expressions.

If using regular expressions then you can reference captured groups in your replace strings as &1, &2... etc.

The list of replacement pairs is order dependent. Earlier lines take precedence over later lines, and strings are not recursively replaced.
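
To make the ordering rule concrete, here is a minimal JavaScript sketch of one way a multi-term translation with these properties can be built: all literal terms are joined into a single alternation and the text is scanned once. This is an illustration only, not JREPL's actual code; the names escapeRegex and translate are hypothetical.

```javascript
// Hypothetical helper: escape regex meta-characters so a term is matched literally.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: replace finds[i] with repls[i] in a single pass over text.
function translate(text, finds, repls) {
  var parts = [];
  for (var i = 0; i < finds.length; i++) {
    // One capture group per term, in list order.
    parts.push('(' + escapeRegex(finds[i]) + ')');
  }
  var re = new RegExp(parts.join('|'), 'g');
  return text.replace(re, function () {
    // arguments[1..n] are the capture groups; only the one that matched is defined.
    for (var k = 0; k < finds.length; k++) {
      if (arguments[k + 1] !== undefined) return repls[k];
    }
    return arguments[0];
  });
}
```

With this sketch, translate("abc", ["ab", "abc"], ["X", "Y"]) picks the earlier term "ab" at position 0, and translate("a b", ["a", "b"], ["b", "c"]) does not re-scan the "b" it has just written, because the text is only traversed once.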

Use JREPL /?/T for more information about the /T option.

If you add the /XSEQ option you can then use escape sequences in both your find and replacement strings. Use JREPL /?/XSEQ for details.

You can specify character encodings for both the source and output files by appending "|charsetName" to the file name(s). Use JREPL /?/I and JREPL /?/O for details. JREPL /?CHARSET/ gives a list of all available encodings.


Dave Benham

Re: Search and insert multi-line pattern in text file?

Posted: 25 Sep 2019 05:43
by dbenham
I just reread your question and realized that I misinterpreted your requirements in multiple ways.
- you want multiple lines to represent a single search term (search across line breaks), not treat each line as a new term.
- you want to append the insert pattern after the found text, not replace it.
- you want to apply the changes to targetfile.txt itself, not create a new file.

Unfortunately JREPL does not meet your requirements directly. I imagine writing a custom JScript script (or hybrid JScript/batch) would not be too difficult.
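
For reference, the core of such a custom script could look roughly like the JavaScript sketch below. It only covers the string logic: reading the three files and writing targetfile.txt back (for example through WSH file objects in a JScript/batch hybrid) is left out. The function name insertAfterAll is hypothetical, and the sketch assumes that every occurrence should get the insert, which the question does not state.

```javascript
// Hypothetical helper: insert `insert` after every literal occurrence of `search`.
// `search` may span several lines; line breaks are compared like any other character.
function insertAfterAll(target, search, insert) {
  if (search === '') return target;      // nothing sensible to search for
  var out = '';
  var pos = 0;
  var hit = target.indexOf(search, pos);
  while (hit !== -1) {
    var end = hit + search.length;
    out += target.substring(pos, end) + insert;  // copy through the match, then insert
    pos = end;                                   // continue after the match
    hit = target.indexOf(search, pos);
  }
  return out + target.substring(pos);    // unchanged tail (or whole text if not found)
}
```

If the search text does not occur, the target comes back unchanged, which matches the "nothing should be inserted" requirement.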

But you could use JREPL to modify your searchPattern.txt and insertPattern.txt into files that could give the correct result when combined with JREPL /T.

First off, you want to search across line breaks, so the /M option is required, which limits the size of the file that can be processed (I am not sure of the exact limit, but I think 1 GB is too large).

Within searchPattern.txt all carriage returns and line feeds must be replaced by \r and \n. Also, regular expression meta-characters must be escaped.
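
The same transformation, written as a small JavaScript sketch, may make the rule easier to follow. The function name toFindTerm is hypothetical, and the escaped character set is an assumption: it mirrors the character class in the JREPL command further down, reading \c there as a caret.

```javascript
// Hypothetical helper: turn raw search text into a regex source that matches it literally.
function toFindTerm(raw) {
  return raw
    .replace(/[.^$*+?()[{\\|]/g, '\\$&')  // escape regex meta-characters first
    .replace(/\r/g, '\\r')                // then carriage return -> the two characters \r
    .replace(/\n/g, '\\n');               // and line feed -> the two characters \n
}
```

The order matters: meta-characters are escaped before the line breaks are converted, so the backslashes introduced for \r and \n are not escaped a second time.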

Within insertPattern.txt, \ must be escaped as \\ and $ as $$, carriage returns and line feeds must be replaced by \r and \n, and finally $& (the matched text) must be prepended so that the found text is kept in front of the insert.
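
As a sketch of that preparation in JavaScript (the function name toReplaceTerm is hypothetical, and this only builds the string; interpreting \r, \n and \\ afterwards is the job of JREPL's /XSEQ handling, not of this code):

```javascript
// Hypothetical helper: turn raw insert text into a replacement string
// that keeps the matched text ($&) and appends the insert after it.
function toReplaceTerm(raw) {
  return '$&' + raw
    .replace(/\\/g, '\\\\')   // backslash -> two backslashes
    .replace(/\$/g, '$$$$')   // $ -> $$ (each "$$" in this replacement emits one "$")
    .replace(/\r/g, '\\r')    // carriage return -> the two characters \r
    .replace(/\n/g, '\\n');   // line feed -> the two characters \n
}
```

Backslashes are doubled before the line breaks are converted for the same reason as in the search pattern: the backslashes that belong to \r and \n must stay single.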

Once this is complete then JREPL can be used with the /T option to perform your task.

The entire operation can be completed in 4 steps - 5 if you want to delete the intermediate find.txt and repl.txt files.

Code:

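rem Step 1: escape regex meta-characters and CR/LF in searchPattern.txt, write find.txt
call jrepl "[.\c$*+?()[{\\|] \r \n" "\\$& \\r \\n" /m /xseq /t " " /f "searchPattern.txt" /o "find.txt"
rem Step 2: start repl.txt with $& (the matched text), without a trailing line break
<nul >repl.txt set /p "=$&"
rem Step 3: escape \ and $ and CR/LF in insertPattern.txt, append the result to repl.txt
call jrepl "\\ \$ \r \n" "\\\\ $$$$ \\r \\n" /m /xseq /t " " /f "insertPattern.txt" /o "repl.txt" /app
rem Step 4: apply the find/replace pair to targetfile.txt, overwriting it in place (/o -)
call jrepl find.txt repl.txt /m /xseq /t file /f "targetfile.txt" /o -
del find.txt repl.txt

Remember that all carriage returns and line feeds are significant in both searchPattern.txt and insertPattern.txt, including any line break at the end of the file.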


Dave Benham