two findstr questions

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Sponge Belly
Posts: 216
Joined: 01 Oct 2012 13:32
Location: Ireland

two findstr questions

#1 Post by Sponge Belly » 04 Jan 2014 08:49

Happy 2014 to All! :-)

I thought I could use findstr to scan a file for lines that had more than 8191 characters. I came up with:

Code:

findstr "^" longlines.txt | findstr "^" >nul
FINDSTR: Line 3 is too long.
FINDSTR: Line 5 is too long.


Are there different language versions of findstr? Is the number of the long line always the third token in the error message? If not, is there a straightforward method of extracting only the number from the error message?
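One position-independent way to pull out the number is to ignore the tokens altogether and take the first run of digits in each message. A minimal Python sketch, assuming (not verified here) that any localized variant of the message still contains the line number as its first number; `long_line_numbers` is a hypothetical helper name, not part of findstr:

```python
import re

def long_line_numbers(stderr_text):
    """Return the first integer found in each message line (hypothetical helper)."""
    numbers = []
    for message in stderr_text.splitlines():
        match = re.search(r"\d+", message)  # first run of digits, wherever it sits
        if match:
            numbers.append(int(match.group()))
    return numbers

# The two messages shown in the post above.
sample = "FINDSTR: Line 3 is too long.\nFINDSTR: Line 5 is too long.\n"
print(long_line_numbers(sample))  # prints [3, 5]
```

Because it does not count tokens, the sketch would keep working if a translated message put the number in a different position.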

But there’s another, even knottier problem. If a line contains 8191 characters followed by a linefeed, findstr accepts it without complaint. But if 8191 characters are followed by a CR+LF pair, findstr says the line is too long. In other words, a carriage return immediately preceding a linefeed is counted as an ordinary character. Is there a workaround for this?

Thanks in advance! :-)

- SB

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: two findstr questions

#2 Post by foxidrive » 04 Jan 2014 09:02

Read dbenham's page on findstr limitations on Stack Overflow.

Line length is one of them and it will likely report lines-too-long on line lengths that are far shorter than 8K


Re: two findstr questions

#3 Post by Sponge Belly » 05 Jan 2014 16:36

Hi Foxi,

In SO: Undocumented Features and Limitations of FindStr, Dave Benham wrote:

Piped data and Redirected input is limited to 8191 bytes per line. This limit is a “feature” of FINDSTR. It is not inherent to pipes or redirection. FINDSTR using redirected stdin or piped input will never match any line that is >=8k bytes. Lines >= 8k generate an error message to stderr, but ERRORLEVEL is still 0 if the search string is found in at least one line of at least one file.


My tests indicate that it is 8192 bytes per line including the linefeed at end of line and that a carriage return immediately before a linefeed has no special significance as far as findstr is concerned.
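The rule described by these tests can be sketched outside of findstr. The Python snippet below is only an illustration, assuming the limit is 8191 bytes before the linefeed and that a CR before the LF counts as an ordinary byte; `too_long_lines` and `LIMIT` are hypothetical names:

```python
import io

LIMIT = 8191  # bytes allowed before the linefeed, per the tests described above

def too_long_lines(stream, limit=LIMIT):
    """Yield 1-based numbers of lines with more than `limit` bytes before the LF.

    A CR directly before the LF is counted like any other byte.
    """
    for number, line in enumerate(stream, start=1):
        body = line[:-1] if line.endswith(b"\n") else line
        if len(body) > limit:
            yield number

# Line 1: 8191 bytes + LF (accepted); line 2: 8191 bytes + CR+LF (too long).
data = b"a" * 8191 + b"\n" + b"b" * 8191 + b"\r\n" + b"short\r\n"
print(list(too_long_lines(io.BytesIO(data))))  # prints [2]
```

To scan a real file, pass a handle opened in binary mode, for example `open("longlines.txt", "rb")`, so that CR bytes are not translated away.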

On the other question, can anyone let me know if there are different language versions of findstr? Is the line number in the error message always the third token?

Thanks in advance! ;-)

- SB


Re: two findstr questions

#4 Post by foxidrive » 05 Jan 2014 18:23

foxidrive wrote:Read dbenham's page on findstr limitations on Stack Overflow.

Line length is one of them and it will likely report lines-too-long on line lengths that are far shorter than 8K


Oops. I was thinking of "Search String length limits".

berserker
Posts: 95
Joined: 18 Dec 2013 00:51

Re: two findstr questions

#5 Post by berserker » 05 Jan 2014 19:11

Sponge Belly wrote:On the other question, can anyone let me know if there are different language versions of findstr? Is the line number in the error message always the third token?

You can always use a better tool, if your environment allows it.
Download GNU grep

Code:

type myBigFile.txt | grep "^something"

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: two findstr questions

#6 Post by Squashman » 05 Jan 2014 20:31

berserker wrote:
Sponge Belly wrote:On the other question, can anyone let me know if there are different language versions of findstr? Is the line number in the error message always the third token?

You can always use a better tool, if your environment allows it.
Download GNU grep

Code:

type myBigFile.txt | grep "^something"

Not sure how that solves his problem. Also, why not post a link to GREP from a trusted source like UnixUtils or GNUWin?

@SpongeBelly
Not sure if FindRepl.bat or Repl.bat would help.

My only other thought on finding long lines would be to use a temp file. Read the file in one line at a time and echo each line to a temp file. Use the command modifiers in a for loop to determine the temp file's size, then compare it to the length you are looking for.

Or maybe pass each line from the For Loop to the String Length function.
http://www.dostips.com/DtCodeCmdLib.php#Function.strLen


Re: two findstr questions

#7 Post by berserker » 05 Jan 2014 20:38

Squashman wrote:Not sure how that solves his problem. Also, why not post a link to GREP from a trusted source like UnixUtils or GNUWin?

Yes, it does. The problem is the "|" pipe, right? findstr gives an error, but tools like GNU grep don't. The tools in the Dropbox were downloaded from a trusted source first and then uploaded to Dropbox. If need be, it's as easy as going to the GNU win32 website and downloading them himself. I just made it convenient by storing them in one place.



Squashman wrote:My only other thought on finding long lines would be to use a temp file. Read the file in one line at a time and echo it to a temp file. Use the command modifiers in a for loop to determine the file size and then compare it to what value of length you are looking for.

That's one way, but tedious: it means extra I/O.


Re: two findstr questions

#8 Post by Squashman » 05 Jan 2014 20:43

berserker wrote:
Squashman wrote:Not sure how that solves his problem. Also, why not post a link to GREP from a trusted source like UnixUtils or GNUWin?

Yes, it does. The problem is the "|" pipe, right? findstr gives an error, but tools like GNU grep don't. The tools in the Dropbox were downloaded from a trusted source first and then uploaded to Dropbox. If need be, it's as easy as going to the GNU win32 website and downloading them himself. I just made it convenient by storing them in one place.

Why would anyone trust a person they don't know or who is to say the dropbox link wasn't hacked?

Reread his original post on what he is trying to do and answer my question: how does GREP help him find long lines? I am pretty sure the hybrid Repl.bat and FindRepl are not limited by line length, but I could be wrong.


Re: two findstr questions

#9 Post by berserker » 05 Jan 2014 20:51

Squashman wrote:
berserker wrote:

Why would anyone trust a person they don't know or who is to say the dropbox link wasn't hacked?

Ermm, use a virus scanner? I could ask you the same question about the bhx2.1 utility: why would anyone trust the hex code in there, and who is to say it isn't a trojan?

Squashman wrote:Reread his original post on what he is trying to do and answer my question: how does GREP help him find long lines? I am pretty sure the hybrid Repl.bat and FindRepl are not limited by line length, but I could be wrong.

but you are not sure if repl or findrepl can do that either?

This one matches lines of exactly 8191 characters. grep returns errorlevel 1 when no line matches, which is what happens if the line is longer than that (or shorter).

Code:

type test.bat | grep -P "^.{8191}$"
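The difference between an exact-length pattern and a "this long or longer" pattern can be checked with a short sketch. It uses Python's `re` module rather than grep's PCRE engine, on the assumption that the `{n}` and `{n,}` quantifiers behave the same way in both; `classify` is a hypothetical helper:

```python
import re

EXACT = re.compile(r".{8191}")    # same idea as ^.{8191}$: exactly 8191 characters
LONGER = re.compile(r".{8192,}")  # 8192 characters or more

def classify(line):
    """Classify a line (without its terminator) against the two patterns."""
    if LONGER.fullmatch(line):
        return "longer"
    if EXACT.fullmatch(line):
        return "exact"
    return "shorter"

for length in (8190, 8191, 8192):
    print(length, classify("x" * length))
# prints:
# 8190 shorter
# 8191 exact
# 8192 longer
```

So a pattern along the lines of `^.{8192,}` would select only the over-long lines, whereas `^.{8191}$` fails to match lines that are longer and lines that are shorter alike.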


Re: two findstr questions

#10 Post by Squashman » 05 Jan 2014 21:03

I totally forgot that Dave gave me some code to get the length of a line and determine if the line is terminated with a CR/LF pair or just a LF.
viewtopic.php?p=12603#p12603


Re: two findstr questions

#11 Post by Squashman » 05 Jan 2014 21:10

And Antonio posted some of his own optimized code as well, and it uses FINDSTR.
viewtopic.php?p=23973#p23973


Re: two findstr questions

#12 Post by Squashman » 05 Jan 2014 21:15

SB,
I should note that Dave's and Antonio's code does a lot more than you need, so it can be pared down. But Brian's GREP solution looks quite nice.

bars143
Posts: 87
Joined: 01 Sep 2013 20:47

Re: two findstr questions

#13 Post by bars143 » 05 Jan 2014 21:26

berserker wrote:
Sponge Belly wrote:On the other question, can anyone let me know if there are different language versions of findstr? Is the line number in the error message always the third token?

You can always use a better tool, if your environment allows it.
Download GNU grep

Code:

type myBigFile.txt | grep "^something"


libintl3.dll is reported missing on my netbook after I downloaded grep.exe from the link specified above.

Thanks,

bars
Windows 7 32-bit user


Re: two findstr questions

#14 Post by berserker » 05 Jan 2014 21:52

bars143 wrote:libintl3.dll is reported missing on my netbook after I downloaded grep.exe from the link specified above.

Some of the commands need extra DLLs. I have uploaded the DLLs into the dll folder.
Alternatively, if you are interested in using GNU tools for Windows, please download them from the http://gnuwin32.sourceforge.net/packages.html website.


Re: two findstr questions

#15 Post by bars143 » 06 Jan 2014 01:30

berserker wrote:
bars143 wrote:libintl3.dll is reported missing on my netbook after I downloaded grep.exe from the link specified above.

Some of the commands need extra DLLs. I have uploaded the DLLs into the dll folder.
Alternatively, if you are interested in using GNU tools for Windows, please download them from the http://gnuwin32.sourceforge.net/packages.html website.


Hi, berserker,

Can you recommend a forum that specializes in grep.exe usage, the way dostips.com does for cmd.exe?


Thanks,

bars
