Re: Undocumented FINDSTR features and limitations

Posted: 22 Jan 2012 15:20
by aGerman
Haha, no, I didn't want to discredit Dave either. He already mentioned it was untested. I corrected this for all the Google users who stumble upon this thread.

Regards
aGerman

Re: Undocumented FINDSTR features and limitations

Posted: 06 Jun 2012 21:13
by dbenham
I updated my SO post: http://stackoverflow.com/a/8844873/1012053

I used to think that the Windows pipe operator appended <CR><LF> to the input if the last character in the stream was not a <LF>. But I've since discovered that FINDSTR is actually doing the alteration of the input.

FINDSTR also appends <CR><LF> to redirected input on Vista (and XP?) if the last character of the redirected file is not <LF>.

I've discovered a nasty FINDSTR "feature" on Windows 7: it hangs indefinitely if you search redirected input and the redirected file does not end with <LF>.


Dave Benham

Re: Undocumented FINDSTR features and limitations

Posted: 06 Jun 2012 21:48
by Fawers
How could I not have seen this before? I have so many questions about using FINDSTR.

Added this page to my favs. Gonna read it when I've got enough time to.
Thanks, Dave.

Re: Undocumented FINDSTR features and limitations

Posted: 06 Jun 2012 21:57
by foxidrive
dbenham wrote:FINDSTR also appends <CR><LF> to redirected input on Vista (and XP?) if the last character of the redirected file is not <LF>.

I've discovered a nasty FINDSTR "feature" running on Windows 7: it hangs indefinitely on Windows 7 if you search redirected input and the redirected file does not end with <LF>. :shock: :roll:


XP also has the issue if the file does not end with appropriate line endings. It hangs.

Re: Undocumented FINDSTR features and limitations

Posted: 27 Nov 2012 21:29
by dbenham
I updated my SO FINDSTR post with two new sections:

1) Description of XP behavior displaying most control characters as dots

2) Bugged /S and /D options may fail to find files if short 8.3 names are encountered.


Dave Benham

Re: Undocumented FINDSTR features and limitations

Posted: 27 Nov 2012 22:02
by carlos
Thanks.
I remember that the default is /R not /L.
Example:

Code: Select all

echo.#|Findstr "."

prints #, but

Code: Select all

echo.#|Findstr /L "."

doesn't print anything. So the default is /R, not /L.

Re: Undocumented FINDSTR features and limitations

Posted: 27 Nov 2012 23:18
by dbenham
I think you did not read the post carefully. It is more complicated than that.

I stated that the default for the /C option is literal.

The default for all other methods (anything other than /C option) depends on the content of the 1st search string. If the 1st search string contains an un-escaped meta character and the string is a valid regex, then all searches will be treated as regex. If the first string does not contain an un-escaped meta character, or if it is not a valid regex, then all search strings will be treated as literals.

The following is a regex search that matches because the first string is a valid regex that contains a meta character.

Code: Select all

echo #|findstr ". a"

But this next example is a literal search that does not match, because the first search string does not contain a meta character.

Code: Select all

echo #|findstr "a ."
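The rule described above can be modeled roughly as follows. This is only a sketch in Python, not FINDSTR itself: the function name `looks_like_regex` is made up for illustration, the meta character set is an assumption, and the "is it a valid regex" check is not modeled at all.

```python
# Assumed set of single-character regex meta characters (an assumption,
# not taken from this thread).
ASSUMED_META = set(".*^$[]")

def looks_like_regex(search: str) -> bool:
    """Return True if the FIRST space-delimited search string contains an
    un-escaped meta character. Regex validity is NOT checked here."""
    first = search.split(" ")[0]
    i = 0
    while i < len(first):
        ch = first[i]
        if ch == "\\":
            # \< and \> are treated as word-boundary meta sequences;
            # any other backslash pair is treated as an escaped literal.
            if i + 1 < len(first) and first[i + 1] in "<>":
                return True
            i += 2
            continue
        if ch in ASSUMED_META:
            return True
        i += 1
    return False

print(looks_like_regex(". a"))  # True: first string is "."
print(looks_like_regex("a ."))  # False: first string is "a"
```

Under these assumptions, the model agrees with the two examples: ". a" is treated as regex and "a ." as literal, because only the first search string decides.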


Dave Benham

Re: Undocumented FINDSTR features and limitations

Posted: 28 Nov 2012 00:18
by carlos
Thanks for the info.
When I specify /L or /R together with the /C option, I get a message (in Spanish on my system) saying that the /C option was ignored.

Code: Select all

C:\Users\Carlos>echo.#|findstr /c /l "#"
FINDSTR: se ha omitido /c
#

C:\Users\Carlos>echo.#|findstr /c /r "#"
FINDSTR: se ha omitido /c
#



Also, I have a question about the /O option.
I have this file:

Code: Select all

all#everybody#is#ok

If I use:

Code: Select all

findstr /N /O "#" file.txt

it prints:

Code: Select all

1:0:all#everybody#is#ok


Shouldn't the offset be 3, not 0?

Re: Undocumented FINDSTR features and limitations

Posted: 28 Nov 2012 03:13
by foxidrive
It seems buggy.

file.txt:
d#dd
aaaa#aa#
all#everybody#is#ok
aaa


d:\ABC>findstr /O /c:"#" file.txt
0:d#dd
6:aaaa#aa#
16:all#everybody#is#ok

Re: Undocumented FINDSTR features and limitations

Posted: 28 Nov 2012 06:53
by dbenham
Here again, the correct information is already in my SO post.
SO FINDSTR post wrote:lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified.

Note - it is the byte offset of the beginning of the line that matches (measured from the beginning of the file), not the offset of the beginning of the match itself. Also, don't forget to count the CarriageReturn/LineFeed line terminators.

The /N and /O options specify the same locations within the file, but the /N option counts the number of lines, whereas the /O option counts the number of bytes. The /N option is 1 based, the /O option is 0 based.

So the results given by carlos's and foxidrive's examples are correct, not bugged.
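The arithmetic can be checked with a short sketch. This is a Python model of the offset rule, not FINDSTR itself; the helper name `line_offsets` is made up, and it assumes foxidrive's file.txt uses <CR><LF> line terminators, which is consistent with the offsets reported.

```python
def line_offsets(data: bytes):
    """Yield (offset, line) pairs, where offset is the number of bytes
    from the start of the data to the start of the line (0 based)."""
    offset = 0
    for raw in data.splitlines(keepends=True):
        yield offset, raw.rstrip(b"\r\n")
        offset += len(raw)  # includes the line terminator bytes

# foxidrive's file.txt, assuming <CR><LF> terminators
sample = b"d#dd\r\naaaa#aa#\r\nall#everybody#is#ok\r\naaa\r\n"

for offset, line in line_offsets(sample):
    if b"#" in line:
        print(f"{offset}:{line.decode('ascii')}")
# 0:d#dd
# 6:aaaa#aa#
# 16:all#everybody#is#ok
```

The first line is 4 bytes plus 2 terminator bytes, so the second line starts at 6; the second line is 8 + 2 bytes, so the third starts at 16.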


Dave Benham

Re: Undocumented FINDSTR features and limitations

Posted: 28 Nov 2012 08:52
by dbenham
I've updated my SO post to clarify the /O option:
SO FINDSTR post wrote:lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.


Dave Benham

Re: Undocumented FINDSTR features and limitations

Posted: 28 Nov 2012 11:08
by foxidrive
dbenham wrote:I've updated my SO post to clarify the /O option;
SO FINDSTR post wrote:lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.


Dave Benham


Thanks Dave.

That could be used to count the length of a line also, or several lines.

It was very confusing, because you would normally expect the character offset to point to the match of the regexp/literal. The description should say "Prints file offset before each matching line." instead of the current help text:

/O Prints character offset before each matching line.

Re: Undocumented FINDSTR features and limitations

Posted: 28 Nov 2012 13:12
by dbenham
foxidrive wrote:That could be used to count the length of a line also, or several lines.

Cool idea, foxidrive!

Code: Select all

@echo off
setlocal
set "test=Hello world!"

:: Echo the length of TEST
call :strLen test

:: Store the length of TEST in LEN
call :strLen test len
echo len=%len%
exit /b

:strLen  strVar  [rtnVar]
setlocal disableDelayedExpansion
set len=0
if defined %~1 for /f "delims=:" %%N in (
  '"(cmd /v:on /c echo !%~1!&echo()|findstr /o ^^"'
) do set /a "len=%%N-3"
endlocal & if "%~2" neq "" (set %~2=%len%) else echo %len%
exit /b

I haven't figured out why I must subtract 3 instead of 2, but it appears to work.
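One possible reading of the "subtract 3" arithmetic, sketched in Python: the offset that FINDSTR reports for the second (empty) line equals the byte length of the first line including its terminator. The sketch assumes the piped ECHO emits one extra byte (for example a trailing space) before <CR><LF>. That extra byte is an unverified assumption, not something confirmed in this thread, and `second_line_offset` is a made-up helper name.

```python
def second_line_offset(text: str, extra: int = 1, terminator: str = "\r\n") -> int:
    """Byte offset of the line that follows `text`, assuming `extra`
    additional bytes (hypothetical trailing space) plus the terminator."""
    first_line = text + " " * extra + terminator
    return len(first_line.encode("ascii"))

test = "Hello world!"
offset = second_line_offset(test)
print(offset)      # 15 = 12 characters + 1 assumed extra byte + 2 for <CR><LF>
print(offset - 3)  # 12, the length of the string
```

With `extra=0` the offset would be 14 and subtracting 2 would suffice, so the need for 3 suggests exactly one additional byte reaches FINDSTR.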

Dave Benham

Matching Whole Words

Posted: 13 Dec 2017 19:19
by Squashman
What am I not understanding about matching two whole words?

Given the following input

Code: Select all

squash, 22, 14, 15, 12, 18, 19
squashman,22,14,15,12,18,19
josh,10, 16, 19, 3, 5, 19, 18, 7, 2, 4
joshua,10, 16, 19, 3, 5, 19, 18, 7, 2, 4

And using this code:

Code: Select all

@echo off
set "userid=squash"
set "number=15"
echo match whole word userid
findstr "\<%userid%\>" "wholetest.txt"
echo match whole word number
findstr "\<%number%\>" "wholetest.txt"
echo match two whole words
findstr "\<%userid%\>.*\<%number%\>" "wholetest.txt"
pause
goto :EOF

I get this output:

Code: Select all

match whole word userid
squash, 22, 14, 15, 12, 18, 19
match whole word number
squash, 22, 14, 15, 12, 18, 19
squashman,22,14,15,12,18,19
match two whole words

Why does the last search not match two whole words? I expected it to print:

Code: Select all

squash, 22, 14, 15, 12, 18, 19

Re: Undocumented FINDSTR features and limitations

Posted: 13 Dec 2017 21:07
by dbenham
The explanation is within the following quote from the 2nd answer in my SO Q&A; the relevant portions are the rules that \< must be the very first term in the regex and \> the very last:
dbenham on StackOverflow wrote: Regex word boundary
\< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.

\> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.

Dave Benham