I created a ~1.5 GB text file and used the following batch script to read the first line and exit:
Code:
echo on
for /f "delims=" %%a in (%1) do echo %%a&goto quit
:quit
First, the FOR line was echoed immediately, and then my disk drive thrashed for what seemed like over a minute before anything more happened.
Next, the DO clause was echoed, the 1st line of the file was echoed right after it, and the batch script ended.
But then my disk drive continued to thrash for what seemed like at least half an hour. I brought up the Resource Monitor to look at the disk activity and saw a process named "svchost.exe (LocalServiceNetworkRestricted)" that was continuously reading from pagefile.sys (virtual memory). There was virtually no write activity. At long last the process terminated and all was quiet again.
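For anyone who wants to reproduce the initial delay, here is a rough timing sketch. It is not the script I ran above. It assumes the path to a large text file is passed as the first argument, and the label name and messages are just illustrative. It prints a timestamp before the FOR /F starts and another when the first iteration actually runs, so the gap between the two approximates the time spent before the first line is delivered.

```batch
@echo off
setlocal enabledelayedexpansion
rem Illustrative usage: save under any name and pass a large text file as the first argument.
if "%~1"=="" echo Usage: %~nx0 file& exit /b 1

echo Before FOR /F  : %time%
for /f "usebackq delims=" %%a in ("%~1") do (
    rem !time! is expanded when this line runs, not when the block is parsed
    echo First iteration: !time!
    goto done
)
:done
echo After loop     : %time%
```

Delayed expansion is used only so that the timestamp inside the loop reflects when the iteration runs; a plain %time% there would be expanded when the whole FOR block is parsed, before the file is touched.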
So I have two questions:
1) Why the long delay before reading and echoing the 1st line? - I think I know this one.
2) What was my machine reading for so long after the batch script terminated? - I have a theory.
1) Why the long delay before reading and echoing the 1st line?
I figure that FOR /F must load the entire file into memory (virtual memory if it is large enough) prior to reading any of the lines. If this is true, then lines that are read should be immune to any changes that are made to the file by the DO clause. So I decided to test this theory.
I created a small test.txt file:
Code:
1
2
3
4
5
and processed it with this batch script:
Code:
echo off
echo Within append loop
for /f %%a in (test.txt) do (
echo %%a
if %%a==1 echo %%a>>test.txt
)
echo(
echo After append loop
type test.txt
echo(
echo Within modify loop
set "flag="
for /f %%a in (test.txt) do (
if not defined flag del test.txt&set flag=1
echo %%a
echo Line %%a>>test.txt
)
echo(
echo After modify loop
type test.txt
echo(
echo Delete test
for /f "delims=" %%a in (test.txt) do (
echo %%a
if exist test.txt del test.txt
)
with these results:
Code:
Within append loop
1
2
3
4
5
After append loop
1
2
3
4
5
1
Within modify loop
1
2
3
4
5
1
After modify loop
Line 1
Line 2
Line 3
Line 4
Line 5
Line 1
Delete test
Line 1
Line 2
Line 3
Line 4
Line 5
Line 1
Indeed, FOR /F reads the file as it existed at the start of the command, ignoring any changes that the DO clause makes to the file while the loop is running.
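
If the goal is only to get the first line of a huge file, one way to sidestep FOR /F entirely is SET /P with redirected input. This is a minimal sketch rather than a drop-in replacement: it assumes the file path is passed as the first argument, and as far as I know SET /P returns at most 1023 characters of the line, so longer lines would be truncated.

```batch
@echo off
setlocal enabledelayedexpansion
rem Illustrative usage: pass the text file as the first argument.
if "%~1"=="" echo Usage: %~nx0 file& exit /b 1

rem Clear the variable first: SET /P leaves it unchanged if it reads an empty line.
set "line="
set /p "line=" < "%~1"

rem Delayed expansion keeps characters such as ^& and ^| in the line from being parsed as commands.
echo(!line!
```

SET /P reads from the redirected handle only until it has one line, so it does not depend on how FOR /F buffers the file.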

2) What was my machine reading for so long after the batch script terminated?
Edit - I disprove this theory later in the thread. I haven't a clue what is actually happening.
I have a theory, but I don't know how to test it. I'm guessing that FOR /F creates an asynchronous process to actually read the "file" from virtual memory and then consumes the output in a similar fashion to how pipes work. The parent batch process physically ends quickly when the FOR loop terminates after the 1st iteration because of the GOTO. But the auxiliary process is left open and it continues to read all 1.5GB of data to completion. I'm wondering if the auxiliary process might be responsible for the actual parsing of each line into tokens?
Well, that is my theory. I'd be curious if anyone can provide more evidence that this is correct. Or perhaps someone has a better theory?
Dave Benham