two findstr questions

Posted: 04 Jan 2014 08:49
by Sponge Belly
Happy 2014 to All! :-)

I thought I could use findstr to scan a file for lines that had more than 8191 characters. I came up with:

Code:

findstr "^" longlines.txt | findstr "^" >nul
FINDSTR: Line 3 is too long.
FINDSTR: Line 5 is too long.


Are there different language versions of findstr? Is the number of the long line always the third token in the error message? If not, is there a straightforward method of extracting only the number from the error message?
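
If the token position does vary between localized builds, one option is to pull out the first run of digits rather than the third token. Here is a minimal Python sketch of that idea. It assumes the message contains exactly one number (the line number), which holds for the English message above but has not been checked for other languages. `extract_line_number` is a hypothetical helper name.

```python
import re

def extract_line_number(message):
    """Return the first integer found in a findstr error message, or None."""
    match = re.search(r"\d+", message)
    return int(match.group()) if match else None

if __name__ == "__main__":
    # The two messages shown in the post above.
    for msg in ("FINDSTR: Line 3 is too long.", "FINDSTR: Line 5 is too long."):
        print(extract_line_number(msg))
```

The same "first run of digits" approach should carry over to a batch for /f loop, though the delimiters would have to be chosen with the same assumption in mind.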

But there’s another even knottier problem. If a line contains 8191 characters followed by a linefeed, findstr accepts it without complaint. But 8191 characters followed by a CR+LF pair and findstr says the line is too long. In other words, a carriage return immediately preceding a linefeed is counted as an ordinary character. Is there a workaround for this?
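
To cross-check what findstr reports, the line lengths can also be measured directly. The following is a minimal Python sketch, not a findstr replacement. It splits on linefeeds only and counts a carriage return before the linefeed as an ordinary byte, mirroring the behaviour described above. The name `find_long_lines` and the default limit of 8191 are assumptions for illustration.

```python
import sys

def find_long_lines(data, limit=8191):
    """Return 1-based numbers of lines whose content exceeds `limit` bytes.

    Lines are split on LF only. A CR before the LF counts as content;
    the LF itself is not counted.
    """
    long_lines = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        if len(line) > limit:
            long_lines.append(number)
    return long_lines

if __name__ == "__main__" and len(sys.argv) > 1:
    # Binary mode keeps CR bytes intact; the whole file is read into memory.
    with open(sys.argv[1], "rb") as handle:
        for number in find_long_lines(handle.read()):
            print(f"Line {number} is longer than 8191 bytes.")
```

With this counting rule, a line of 8191 characters ending in LF passes, while the same line ending in CR+LF is flagged.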

Thanks in advance! :-)

- SB

Re: two findstr questions

Posted: 04 Jan 2014 09:02
by foxidrive
Read dbenham's page on findstr limitations on Stack Overflow.

Line length is one of them and it will likely report lines-too-long on line lengths that are far shorter than 8K

Re: two findstr questions

Posted: 05 Jan 2014 16:36
by Sponge Belly
Hi Foxi,

In SO: Undocumented Features and Limitations of FindStr, Dave Benham wrote:

Piped data and Redirected input is limited to 8191 bytes per line. This limit is a “feature” of FINDSTR. It is not inherent to pipes or redirection. FINDSTR using redirected stdin or piped input will never match any line that is >=8k bytes. Lines >= 8k generate an error message to stderr, but ERRORLEVEL is still 0 if the search string is found in at least one line of at least one file.


My tests indicate that it is 8192 bytes per line including the linefeed at end of line and that a carriage return immediately before a linefeed has no special significance as far as findstr is concerned.
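
Test input for this boundary can be generated rather than typed. The Python sketch below builds four lines that straddle the limit, so the LF and CR+LF cases can be compared side by side. `make_line`, `build_test_data` and `write_test_file` are hypothetical helper names, and the lengths are chosen around the 8192-byte figure reported above.

```python
def make_line(content_length, terminator=b"\n", fill=b"x"):
    """Build one line of `content_length` fill bytes plus its terminator."""
    return fill * content_length + terminator

def build_test_data():
    """Four lines straddling the boundary, as (description, bytes) pairs."""
    return [
        ("8190 chars + LF", make_line(8190)),
        ("8190 chars + CR+LF", make_line(8190, b"\r\n")),
        ("8191 chars + LF", make_line(8191)),
        ("8191 chars + CR+LF", make_line(8191, b"\r\n")),
    ]

def write_test_file(path):
    """Write the four test lines to `path` in binary mode."""
    with open(path, "wb") as handle:
        handle.write(b"".join(line for _, line in build_test_data()))

if __name__ == "__main__":
    for description, line in build_test_data():
        # Total byte count including the terminator (CR included).
        print(f"{description}: {len(line)} bytes")
```

If the observation above holds, piping a file written by `write_test_file` through findstr would be expected to report only the last line, the one that totals 8193 bytes.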

On the other question, can anyone let me know if there are different language versions of findstr? Is the line number in the error message always the third token?

Thanks in advance! ;-)

- SB

Re: two findstr questions

Posted: 05 Jan 2014 18:23
by foxidrive
foxidrive wrote: Read dbenham's page on findstr limitations on Stack Overflow.

Line length is one of them and it will likely report lines-too-long on line lengths that are far shorter than 8K


Oops. I was thinking of "Search String length limits".

Re: two findstr questions

Posted: 05 Jan 2014 19:11
by berserker
Sponge Belly wrote: On the other question, can anyone let me know if there are different language versions of findstr? Is the line number in the error message always the third token?

You can always get a better tool, if that is allowed.
Download GNU grep

Code:

type myBigFile.txt | grep "^something"

Re: two findstr questions

Posted: 05 Jan 2014 20:31
by Squashman
Not sure how that solves his problem. Also, why not post a link to GREP from a trusted source like UnixUtils or GNUWin?

@SpongeBelly
Not sure if FindRepl.bat or Repl.bat would help.

My only other thought on finding long lines would be to use a temp file. Read the file in one line at a time and echo each line to a temp file. Use the variable modifiers in a for loop (for example %%~zA for the file size) to determine the temp file's size, and then compare it to the length you are looking for.

Or maybe pass each line from the For Loop to the String Length function.
http://www.dostips.com/DtCodeCmdLib.php#Function.strLen

Re: two findstr questions

Posted: 05 Jan 2014 20:38
by berserker
Squashman wrote: Not sure how that solves his problem. Also, why not post a link to GREP from a trusted source like UnixUtils or GNUWin?

Yes, it does. The problem is the "|" pipe, right? findstr gives an error, but tools like GNU grep don't. The tools in the Dropbox were downloaded from a trusted source first and then uploaded to Dropbox. If need be, it's as easy as going to the GnuWin32 website and downloading them himself. I just made it convenient by storing them in one place.



Squashman wrote: My only other thought on finding long lines would be to use a temp file. Read the file in one line at a time and echo it to a temp file. Use the command modifiers in a for loop to determine the file size and then compare it to what value of length you are looking for.

That's one way, but it's tedious and adds extra I/O.

Re: two findstr questions

Posted: 05 Jan 2014 20:43
by Squashman
Why would anyone trust a person they don't know or who is to say the dropbox link wasn't hacked?

Reread his original post on what he is trying to do and answer my question: how does GREP help him find long lines? I am pretty sure the hybrid Repl.bat and FindRepl are not limited by line length, but I could be wrong.

Re: two findstr questions

Posted: 05 Jan 2014 20:51
by berserker
Squashman wrote: Why would anyone trust a person they don't know or who is to say the dropbox link wasn't hacked?

Ermm, use a virus scanner? I could ask you the same question about the bhx2.1 utility: why would anyone trust the hex code in there, and who is to say it isn't a trojan?

Squashman wrote: Reread his original post on what he is trying to do and answer my question. How does GREP help him find long lines. I am pretty sure the hybrid Repl.bat and FindRepl are not limited to line length but I could be wrong.

But you are not sure if Repl or FindRepl can do that either?

This one matches lines of exactly 8191 characters. grep sets errorlevel 1 when no line matches, so errorlevel 1 is what you get if the line is longer (or shorter) than that.

Code:

type test.bat | grep -P "^.{8191}$"
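
The difference between an exact-length pattern such as `^.{8191}$` and an open-ended one such as `^.{8192,}$` can be sketched in Python. This uses Python's `re` module rather than grep, so it illustrates the quantifiers only and says nothing about grep's handling of pipes or line endings. The name `classify` is hypothetical.

```python
import re

EXACT = re.compile(r"^.{8191}$")     # exactly 8191 characters
LONGER = re.compile(r"^.{8192,}$")   # 8192 characters or more

def classify(line):
    """Label a line (without its newline) relative to the 8191 boundary."""
    if LONGER.match(line):
        return "longer"
    if EXACT.match(line):
        return "exact"
    return "shorter"

if __name__ == "__main__":
    for length in (8190, 8191, 8192):
        print(length, classify("x" * length))
```

To flag lines that exceed the limit, rather than lines that sit exactly on it, the open-ended quantifier is the one that would be needed.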

Re: two findstr questions

Posted: 05 Jan 2014 21:03
by Squashman
I totally forgot that Dave gave me some code to get the length of a line and determine if the line is terminated with a CR/LF pair or just a LF.
viewtopic.php?p=12603#p12603

Re: two findstr questions

Posted: 05 Jan 2014 21:10
by Squashman
And Antonio posted some of his own optimized code as well, and it uses FINDSTR.
viewtopic.php?p=23973#p23973

Re: two findstr questions

Posted: 05 Jan 2014 21:15
by Squashman
SB,
I should note that Dave and Antonio's code does a lot more than you need. Most of it you really don't need so it can be pared down. But Brian's GREP solution looks quite nice.

Re: two findstr questions

Posted: 05 Jan 2014 21:26
by bars143
libintl3.dll is missing when installing to my netbook after downloading a grep.exe from a link specified above.

thanks,

bars
Windows 7 32-bit user

Re: two findstr questions

Posted: 05 Jan 2014 21:52
by berserker
bars143 wrote: libintl3.dll is missing when installing to my netbook after downloading a grep.exe from a link specified above.

Some of the commands need extra DLLs. I have uploaded the DLLs into the dll folder.
Alternatively, if you are interested in using GNU tools for Windows, please download them from the http://gnuwin32.sourceforge.net/packages.html website.

Re: two findstr questions

Posted: 06 Jan 2014 01:30
by bars143
hi, berserker

Can you recommend a forum specializing in grep.exe usage, the way dostips.com specializes in cmd.exe usage?


thanks,

bars