new findstr bug

Sponge Belly
Posts: 216
Joined: 01 Oct 2012 13:32
Location: Ireland

new findstr bug

#1 Post by Sponge Belly » 16 Jun 2013 15:29

Dear DosTips,

I tried piping a single character into findstr using set /p, type, and findstr itself. The results were the same: absolutely nothing! findstr silently gobbled it up.

Code: Select all

> <nul set /p "=#" | findstr /n "^"
(no output)

> <nul set /p "=##" | findstr /n "^"
1:##


But it gets worse. I piped a multi-line text file into findstr. The file didn’t end with a newline and had a single character on the last line. You guessed it: findstr omitted the single-character last line.

Code: Select all

> type sample.txt
ordinary line<CR><LF>
#

> type sample.txt | findstr /n "^"
1:ordinary line<CR><LF>


The only exception is if the single character is a Line Feed (ASCII 10).
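
A minimal sketch for reproducing the case above from a batch file, plus one possible workaround. The workaround is an assumption based on the behaviour described in this post, not something confirmed for every Windows version: it appends a line terminator to the piped stream so that the last line is no longer an unterminated single character.

```batch
@echo off
rem Build the test file from this post: one normal line, then a lone "#"
rem with no <CR><LF> after it.
>sample.txt (
  echo ordinary line
  <nul set /p "=#"
)

rem Piped as-is: the case reported above, where the last line goes missing.
type sample.txt | findstr /n "^"

rem Assumed workaround: add a trailing <CR><LF> to the stream before
rem findstr sees it.
(type sample.txt & echo() | findstr /n "^"
```

If the file already ends with a newline, the extra echo( adds an empty final line to the stream, so this is only worth doing when the last line may be unterminated.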

Kudos to anyone who can find a way to exploit this bug. Isn’t it wonderful that Batch can still surprise us after all this time? ;-)

- SB

PS: I’m using Windows 7 Home Premium 32-bit, fwiw.

Endoro
Posts: 244
Joined: 27 Mar 2013 01:29
Location: Bozen

Re: new findstr bug

#2 Post by Endoro » 16 Jun 2013 16:32

On XP:

Code: Select all

><nul set /p "=#" | findstr /n "^"

><nul set /p =# | findstr /n "^"
1:#

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: new findstr bug

#3 Post by foxidrive » 17 Jun 2013 02:48

Same with Win 8, Endoro.

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: new findstr bug

#4 Post by Squashman » 17 Jun 2013 09:54

Win7 64bit

Code: Select all

H:\><nul set /p "=#" | findstr /n "^"

H:\><nul set /p =# | findstr /n "^"
1:#

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: new findstr bug

#5 Post by dbenham » 19 Jun 2013 19:32

Great (but unfortunate) discovery and post.

I've updated my StackOverflow Q&A, "What are the undocumented features and limitations of the Windows FINDSTR command?", with the new information.


Dave Benham

penpen
Expert
Posts: 1991
Joined: 23 Jun 2013 06:15
Location: Germany

Re: new findstr bug

#6 Post by penpen » 23 Jun 2013 17:30

Hi there,

I'm sorry, but I'm sure that this is not a bug in findstr.

Sometime in 2001 or 2002 I read how findstr handles redirected input in the XP command shell. It was long ago, so I can't remember the link, and I haven't found it in the current MSDN (sorry for that), but I'm sure it was described this way:

The program findstr expects full lines (C++ style: "String\r\n", where \r is a carriage return and \n is a line feed) on redirected input.

If findstr does not get what it expects, it may be for one of these reasons:
reason 1) the delivering process/thread may have crashed for whatever reason, or
reason 2) everything was fine, but the "\r\n" was simply forgotten.

The error handling for piped data (using |) is as follows:
It is assumed that nobody needs a string search on a single character, so if only one character is read, reason 1) is assumed to be the cause; to avoid errors, execution is aborted (for this last character).
With 2 or more characters, reason 2) is assumed to be the cause, so execution is not interrupted.
(MS uses findstr on 2-character strings, and never on 1-character strings, so this is their limit.)

Try:

Code: Select all

((for /l %i in (0,1,9) do @echo:Line_%i.)&set/p"=x"<nul)|findstr "^"
((for /l %i in (0,1,9) do @echo:Line_%i.)&set/p"=x "<nul)|findstr "^"


By the way, for file redirection (findstr "^" < file.txt) the following is assumed:
Files that contain input are valid (\r\n at the end of every file), and files may be too large to be read in a single disk read access.
So when no further characters arrive, it is assumed that the input file reader just needs some time to provide them.

This may result in an infinite loop if the file is not valid.
So validating such files is recommended.

This is all that I remember; I will try to find the full article and post or link it.


penpen

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: new findstr bug

#7 Post by Squashman » 23 Jun 2013 17:41

I don't understand the purpose of your FOR /L loop.

penpen
Expert
Posts: 1991
Joined: 23 Jun 2013 06:15
Location: Germany

Re: new findstr bug

#8 Post by penpen » 23 Jun 2013 18:30

These examples are only meant to show that this error handling is done per line,
not only on the first character in the pipe:
- the first version interrupts execution, and so ignores the single character on the last line, while
- the second version is fully executed.
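
For anyone who wants to run the two lines from a batch file rather than at the prompt, here is a commented sketch. The only change is that the percent signs are doubled, as batch files require; the commands are otherwise the ones given above.

```batch
@echo off
rem Ten terminated lines (Line_0. to Line_9.) followed by a lone,
rem unterminated "x" as the last line.
((for /l %%i in (0,1,9) do @echo:Line_%%i.)&set/p"=x"<nul)|findstr "^"

rem The same ten lines, but the unterminated last line is "x " (two characters).
((for /l %%i in (0,1,9) do @echo:Line_%%i.)&set/p"=x "<nul)|findstr "^"
```

Changing the 9 in the FOR /L range changes the number of terminated lines without needing a sample file.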

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: new findstr bug

#9 Post by Squashman » 23 Jun 2013 20:34

Still not following.
The SET command does not execute until the for loop is done.

penpen
Expert
Posts: 1991
Joined: 23 Jun 2013 06:15
Location: Germany

Re: new findstr bug

#10 Post by penpen » 24 Jun 2013 06:44

Yes, the set command should be executed after the echo commands in the for loop.
Their output forms a multi-line document with only one invalid line at the end.

It's the same thing Sponge Belly does in the opening post using the type command.
I wrote my own version just to make it easy to change the number of lines,
and to avoid creating the sample.txt file.

There is no way to make the error handling concept visible in a (batch) program.
So the code lines I gave above do not provide new information, or a clue for seeing the error handling.
The two code lines only provoke findstr's internal error handling, with different output results.

What I wanted to say in my post above, in short:
The program findstr.exe works as intended, in line with the behaviour described in the opening post.
Then I sketched the concept as far as I can remember it; this means I no longer know what
the concept says about a newline as the only character in a line, if that was described there.
The document where I read this was about programming guidelines, with some examples of
how to write correct programs in C++; it was nothing specific to findstr.exe.

penpen

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: new findstr bug

#11 Post by Squashman » 24 Jun 2013 06:53

penpen wrote: Yes, the set command should be executed after the echo commands in the for loop.

You are making it sound like the SET command is executed once for each ECHO in the FOR loop. It does not. The FOR loop executes all the iterations before it moves on to the SET command.

penpen
Expert
Posts: 1991
Joined: 23 Jun 2013 06:15
Location: Germany

Re: new findstr bug

#12 Post by penpen » 24 Jun 2013 07:21

Squashman wrote: You are making it sound like the SET command is executed once for each ECHO in the FOR loop

Sorry for the misunderstanding; it was meant the way you put it:
The FOR loop executes all the iterations before it moves on to the SET command.

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: new findstr bug

#13 Post by Squashman » 24 Jun 2013 07:33

So back to my first question. Why the need for the FOR /L loop?

I think the real question is why does it work when it is not Quoted but does not work when it is quoted.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: new findstr bug

#14 Post by dbenham » 24 Jun 2013 07:39

@Squashman - No, penpen is intentionally issuing the SET/P command after the FOR has completed. The code is simply piping 10 normal lines followed by 1 unterminated line consisting of a single character to FINDSTR. It is exactly what Sponge Belly did with sample.txt, except the content is generated dynamically instead of using a static file.

@penpen - I don't doubt that the odd behavior is intentional; it is hard to fathom how the behavior could be an accidental artifact of a bug. However, I believe it is poor design. Assuming that no one ever needs to search a single character seems a bad assumption to me. Why make a single character a special case? Why should piped results be any different than those achieved by reading the file directly? Why assume that the absence of <cr><lf> indicates a crashed pipe source?

Life would be so much simpler if the output for a given set of input were identical, regardless of whether the input comes from a pipe, redirection, or an opened file. But alas, the developer of FINDSTR was too clever for our own good. There is no inherent reason why the inconsistencies needed to be put in place. Neither SORT nor MORE suffer from any of these odd design choices that were put into FINDSTR.
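
A small sketch for comparing the input methods side by side on your own system. It assumes the sample.txt from the opening post (last line a lone "#" with no <CR><LF> after it). Redirected input (findstr "^" < sample.txt) is deliberately left out, because of the infinite-loop warning given earlier in the thread for files that do not end with a newline.

```batch
@echo off
rem Assumes sample.txt from the opening post exists in the current folder.

rem Piped input: the case reported in this thread.
type sample.txt | findstr /n "^"

rem File name as an argument: FINDSTR opens the file itself.
findstr /n "^" sample.txt

rem The same pipe into SORT and MORE, for comparison.
type sample.txt | sort
type sample.txt | more
```

Running the four commands in one go makes it easy to see which of them show the final "#" line.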


Dave Benham

penpen
Expert
Posts: 1991
Joined: 23 Jun 2013 06:15
Location: Germany

Re: new findstr bug

#15 Post by penpen » 24 Jun 2013 07:52

@Squashman
I think the real question is why does it work when it is not Quoted but does not work when it is quoted.

This is easy:
the unquoted version produces two characters (the <nul part produces the extra one, a trailing space).
You can make it visible by creating a file named piped.bat with the following content:

Code: Select all

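@echo off
setlocal
rem Start with INPUT undefined so the loop can tell when data has arrived.
set INPUT=
:readPipe
rem Read one line from the pipe; INPUT stays undefined if nothing was read.
set /p INPUT=
if not defined INPUT goto :readPipe
rem Quote the value so that trailing spaces are visible.
echo:"%INPUT%"
endlocal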


Then you get the following output:

Code: Select all

Z:\>set/p"=x"<nul|piped.bat
"x"

Z:\>set/p=x<nul|piped.bat
"x "

Z:\>set/p=x|piped.bat
"x"




@dbenham
From this point of view: Agreed.
