New bug in FOR command?

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)

Re: New bug in FOR command?

#16 Post by Ed Dyreen » 28 Dec 2012 19:56

Aacini wrote: This is what I call a BUG!

Antonio
In that case the only proper bug fix would seem to be aborting the batch file with a runtime error.
It is also possible not to fix it: it is always safe to just index files, but renaming them is not, even though it may be faster :)
If the data has to be buffered, then where is the advantage over using FOR with DIR?
I try to use it only for reading :|

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: New bug in FOR command?

#17 Post by foxidrive » 28 Dec 2012 20:26

Liviu wrote:
foxidrive wrote:dir /b /o:n would fix it. :)

Only on lucky days ;-)

Code:

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\>dir /b /on images
Poster 1.jpg
Poster 10.jpg
Poster 2.jpg


It's just as flawed on NTFS though.

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: New bug in FOR command?

#18 Post by Liviu » 28 Dec 2012 21:36

Aacini wrote: I think the "expected behavior" should be the same in any case. In the original example I used to start this topic, if the FOR command checked that this condition does not happen, then the "Poster 1.jpg" file would not be renamed to "Poster 3.jpg". On the other hand, if the FOR command does NOT have this check, then it would enter an endless rename loop ("Poster 2" to "Poster 4", "Poster 3" to "Poster 5", etc.). Why does the FOR command rename an already renamed file just one time? This is what I call a BUG!

Sorry, still beg to differ, though I can certainly understand your standpoint. But "BUG" is a term generally reserved for behaviors that fail some documented promise. I can find no particular "expected behavior" being documented for cases like the ones presented here, so to me that's not a "bug" but rather "undefined behavior" where any outcome is possible, and none is guaranteed.

The next obvious question is why that was left out. My guess is that it's not (reasonably) easy to guarantee any given behavior in all possible scenarios. In the general case, the target 'for /f' directory could be on a remote machine, using an unknown file system, and running a daemon which creates/deletes/renames files every millisecond. Taking a "static snapshot" of the directory at one single point in time would not be trivial, and perhaps just not possible altogether. It may be technically possible on a known, local, NTFS volume, but even there it's not trivial - and definitely not from batch language.

Liviu

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: New bug in FOR command?

#19 Post by dbenham » 28 Dec 2012 22:12

Aacini wrote: Ok. The core of this problem is clearly defined now: a FOR set that remains static while the FOR is executed vs. a FOR set that changes because of a command in the DO part. Note that this difference is NOT necessarily related to FOR vs. FOR /F; for example:

Code:

for /F "delims=" %%a in (thefile.txt) do (
   echo %%a >> thefile.txt
)

I think the previous FOR may read lines created by itself if the file size before the FOR is larger than the buffer used to read the file.

I disagree - it has everything to do with FOR vs. FOR /F. I've done a fair bit of testing on this in the past, and I have never seen a DO clause modify the iterations of a FOR /F. I just ran a FOR /F test on a 12.5 MB file with many thousands of lines. On the very first iteration I overwrote the entire file with a single line of text. Nevertheless, my FOR /F printed out all the original lines, and the newly created line was not read.

It does not matter which flavor of FOR /F is used: string, file, or command - the entire result set is static and cached somehow before any iteration takes place. The DO clause cannot affect the content of the iterations. I've never seen documentation that states this. It is only what I have observed. I'd be very interested to hear of any reproducible example that contradicts this theory.
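
For anyone who wants to try to reproduce this, here is a minimal sketch of that kind of test. It is not the script actually used above: the file name forf_test.txt, the 1000-line size, and the variable names are all assumptions made for illustration. It builds a file, overwrites it on the first iteration, and counts how many lines FOR /F still delivers.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical test file; the name and the line count are assumptions.
set "file=forf_test.txt"
> "%file%" (for /L %%n in (1,1,1000) do echo original line %%n)

set /A count=0
for /F "usebackq delims=" %%a in ("%file%") do (
   rem On the first iteration only, replace the whole file with one line.
   if !count! equ 0 (> "%file%" echo replacement line)
   set /A count+=1
   set "last=%%a"
)

rem Report what the loop saw, then what the file contains now.
echo Iterations: !count!
echo Last line read: !last!
type "%file%"
```

If the result set really is cached before the first iteration, the iteration count should match the original number of lines, and TYPE should show only the replacement line.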

But the simple FOR iterations can most definitely be impacted by the DO clause.
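
As a rough illustration of the difference, here is a sketch that renames files from a list produced by DIR instead of from a plain FOR set. The *.jpg mask and the new_ prefix are assumptions chosen for the example, not the script that started this topic, and the "collected before the first iteration" behavior is the observed one described above, not a documented guarantee.

```batch
@echo off
rem Plain FOR: the set *.jpg is enumerated while the loop runs, so files
rem renamed by the DO clause may be picked up again:
rem    for %%a in (*.jpg) do ren "%%a" "new_%%a"

rem FOR /F over DIR: the command output is collected before the first
rem iteration, so names created by REN are not part of the list.
rem /b gives bare names, /a-d skips directories, 2^>nul hides "File Not Found".
for /F "delims=" %%a in ('dir /b /a-d *.jpg 2^>nul') do (
   ren "%%a" "new_%%a"
)
```

The trade-off is the one already raised in this thread: the whole list is buffered first, which costs time and memory on large directories.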


Aacini wrote:The possible explanations of FOR behavior may give us the way to understand it and plan methods to avoid the problem. However, I want to emphasize a particular point of this problem.

Liviu wrote:Don't know that I'd call it a "bug", since there is no "expected" behavior mandated or guaranteed in any docs I've seen about what "should" happen once the code in the loop modifies the fileset the loop itself works on...

...but then one would have to define what the "expected outcome" would be, first.

Liviu

I think the "expected behavior" should be the same in any case. In the original example I used to start this topic, if the FOR command checked that this condition does not happen, then the "Poster 1.jpg" file would not be renamed to "Poster 3.jpg". On the other hand, if the FOR command does NOT have this check, then it would enter an endless rename loop ("Poster 2" to "Poster 4", "Poster 3" to "Poster 5", etc.). Why does the FOR command rename an already renamed file just one time? This is what I call a BUG!

I don't fully understand the entire mechanism, but the before and after names play a big role in the end result. If the initial files sort after "Poster", then you only get 2 iterations. Or if you change the new name to "file n.jpg" instead of "Poster n.jpg", then again, only 2 iterations. In either case, the new names sort before the original names. (I tested on an NTFS volume; I'm not sure what happens on FAT32.)

In your example, both "Poster 1" and "Poster 2" sort after the original names. I don't understand why only one is iterated a 2nd time.

But I also agree with Liviu. I would not consider this a bug, since no documentation exists to state what the behavior should be. The behavior is undefined.

However, I do consider it a major design limitation. Some might say that is just semantics, but there is a real difference in my mind.


Dave Benham

Aacini
Expert
Posts: 1885
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: New bug in FOR command?

#20 Post by Aacini » 30 Dec 2012 15:44

dbenham wrote: I disagree - it has everything to do with FOR vs. FOR /F. I've done a fair bit of testing on this in the past, and I have never seen a DO clause modify the iterations of a FOR /F. I just ran a FOR /F test on a 12.5 MB file with many thousands of lines. On the very first iteration I overwrote the entire file with a single line of text. Nevertheless, my FOR /F printed out all the original lines, and the newly created line was not read.

It does not matter which flavor of FOR /F is used: string, file, or command - the entire result set is static and cached somehow before any iteration takes place. The DO clause cannot affect the content of the iterations. I've never seen documentation that states this. It is only what I have observed. I'd be very interested to hear of any reproducible example that contradicts this theory.

Dave Benham

I tried to find such an example, so I ran the following program with progressively larger files:

Code:

@echo off
setlocal EnableDelayedExpansion
for /F %%a in ('find /V /C "" ^< largeFile.txt') do set lines=%%a
echo Total lines: %lines%
set /A line=0, last10=lines-10
for /F "delims=" %%a in (largeFile.txt) do (
   set /A line+=1
   if !line! leq 10 (
      echo Copy of line !line!: %%a >> largeFile.txt
      echo Line !line! copied
   )
   if !line! geq %last10% echo Line !line!: %%a
)

I ran the previous program with files up to 60 MB in size with no success: the FOR /F command always read just the original file lines. There is no way that FOR /F loads the file into a buffer that large, so the only explanation is that the file handle that FOR /F uses to read the file is not the same one used by the ">> largeFile.txt" append operations inside the FOR body, although both handles refer to the same physical file. This way, the file handle used by the FOR /F command is not affected by any other access to the same file until FOR terminates and its handle is closed.

One way to test this idea is to try to force both handles to be the same, that is, to open the handle used by the append operations at the same time the FOR /F command opens the file. I wrote this program to test that:

Code:

@echo off
setlocal EnableDelayedExpansion
> Days.txt (for %%a in (Monday Tuesday Wednesday Thursday Friday Saturday Sunday) do echo %%a)
type Days.txt
>> Days.txt (for /F "delims=" %%a in (Days.txt) do (
   set /A line+=1
   echo Copy of line !line!: %%a
))

However, when I ran this program something unexpected happened:

Code:

Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
The system cannot find the file Days.txt.

After this result I am pretty sure that the FOR /F command opens the file for read-only, exclusive access. This way, if the file is already open when FOR is executed, the file just can't be opened by the FOR /F command. This mechanism ensures that there is no way to modify the content of the file being read by the FOR /F command.

Antonio
