directory listing - optimize code

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

brinda
Posts: 78
Joined: 25 Apr 2012 23:51

directory listing - optimize code

#1 Post by brinda » 01 May 2012 07:08

Hi all,

I need some help.

Below is the code I have been using to list a directory tree recursively. The tree contains long filenames with spaces in them.

Code:

for /f "delims=" %%i in ('dir c:\files /b /s /a') do (
  echo %%~ftzai >> m.txt
)

The code above works fine and gives output like the sample below:

Code:

--a------ 04/26/12 03:47p 127 c:\files\Copy of ctest.bat 


I can then use findstr to search the output by attribute, file size, etc.

This worked until the number of files and directories grew. They are now at:

directories = 250 and growing
files = 67005 and growing

Running the script above now takes more than 25 minutes to finish on a Windows 2000 laptop.

Running plain dir /b /s /a (around 1 minute) or attrib /s /d (around 2 minutes) is much faster, but neither produces the combined output on one line like the sample.

Any tips to improve the time would be appreciated. Thanks.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: directory listing - optimize code

#2 Post by foxidrive » 01 May 2012 07:57

brinda wrote: Any tips to improve the time would be appreciated. Thanks.


A) Write it in a higher level language

B) get a faster laptop

C) Store your files on a solid state drive

D) write the output file on a RAMdrive

If the drive is formatted as FAT32, then convert it to NTFS?

Squashman
Expert
Posts: 4488
Joined: 23 Dec 2011 13:59

Re: directory listing - optimize code

#3 Post by Squashman » 01 May 2012 08:01

A faster laptop would definitely help.

I just ran that code against one of my network drives at work, which had about 50,000 files. I'm not sure how many directories, but there are a bunch with nested folders as well, and it took 14 minutes. If you were running this against a local hard drive, then it is definitely just a really slow laptop.

brinda
Posts: 78
Joined: 25 Apr 2012 23:51

Re: directory listing - optimize code

#4 Post by brinda » 01 May 2012 08:23

foxidrive/Squashman,

Thank you for answering. I have tested this on NTFS and it shaves around 5 minutes off the FAT32 time.

I was wondering whether there is code that could join dir /a /s and attrib /s /d to give the same output as the line below:

Code:

--a------ 04/26/12 03:47p 127 c:\files\Copy of ctest.bat


The reason is that these two commands run much faster than the for /f loop.

thanks again.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: directory listing - optimize code

#5 Post by foxidrive » 01 May 2012 08:32

I ran the code on a local HDD and it processed 27,000 files in 33 seconds.
It took 26 seconds when redirected to a file on an SSD

If yours is taking 25 minutes then maybe your HDD has slid into PIO transfer mode instead of DMA.
Check the IDE controller properties on each channel to see which mode it is in.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: directory listing - optimize code

#6 Post by foxidrive » 01 May 2012 10:06

I think I've cracked the issue here and it's not necessarily a slow machine (but do check your PIO vs DMA transfer speed).

It seems that when parsing very large lists, the FOR-IN-DO takes longer and longer to start the task - an exponential curve perhaps.

For example:
A) Running the code on a tree with 27,000 files/folders took 33 seconds.
B) Running the code on a tree with 137,000 files/folders was still thinking 12 minutes later and had not even created a file.

I think there is a better solution and this code with 137,000 files and folders took 4:45 minutes to complete:
It splits the task into smaller chunks - creating a list of folders only, and then parsing each folder to list the information about the files.

You can launch it like this: mybatch "c:\files"
and it will create the file in the current directory.


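Code:

@echo off
rem Output file, created in the directory the script is launched from.
set "file=%cd%\m.txt"

rem Outer loop: every subfolder under the folder passed as %1.
for /f "delims=" %%a in ('dir "%~1" /b /s /a:d') do (
  echo %%~ftzaa>> "%file%"
  rem Inner loop: the visible files in that one folder.
  pushd "%%a"
  for %%b in (*.*) do (
    echo %%~ftzab>> "%file%"
  )
  popd
)
goto :eof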


foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: directory listing - optimize code

#7 Post by foxidrive » 01 May 2012 10:19

This is the same code but it will include hidden files etc.

Code:

@echo off
set "file=%cd%\m.txt"

for /f "delims=" %%a in ('dir "%~1" /b /s /a:d') do (
echo %%~ftzaa>> "%file%"
pushd "%%a"
for /f "delims=" %%b in ('dir /b /a-d') do (
echo %%~ftzab>> "%file%"
)
popd
)
goto :eof



Edit: modified /a-d in inner loop.
Last edited by foxidrive on 02 May 2012 08:42, edited 1 time in total.
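
One gap worth noting: as far as I can tell, dir /b /s /a:d returns only the subfolders, so files sitting directly in the starting folder never reach the inner loop. Below is a minimal sketch of one way to cover them, assuming the same m.txt output file and the same %1 argument as the script above. The extra first step and the 2^>nul redirections (which hide "File Not Found" for empty folders) are additions for illustration, not part of the original post.

```bat
@echo off
set "file=%cd%\m.txt"

rem Step 1 (added for illustration): list the files that sit directly
rem in the starting folder, which the subfolder loop never visits.
pushd "%~1" || exit /b 1
for /f "delims=" %%b in ('dir /b /a-d 2^>nul') do echo %%~ftzab>> "%file%"
popd

rem Step 2: the loop from the post above, one subfolder at a time.
for /f "delims=" %%a in ('dir "%~1" /b /s /a:d 2^>nul') do (
  echo %%~ftzaa>> "%file%"
  pushd "%%a"
  for /f "delims=" %%b in ('dir /b /a-d 2^>nul') do (
    echo %%~ftzab>> "%file%"
  )
  popd
)
goto :eof
```

It is launched the same way, for example mybatch "c:\files". If the script is started from inside the folder being listed, m.txt will show up in its own listing.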

Fawers
Posts: 187
Joined: 08 Apr 2012 17:11

Re: directory listing - optimize code

#8 Post by Fawers » 01 May 2012 11:16

foxidrive wrote: This is the same code but it will include hidden files etc.

Code:

@echo off
set "file=%cd%\m.txt"

for /f "delims=" %%a in ('dir "%~1" /b /s /a:d') do (
echo %%~ftzaa>> "%file%"
pushd "%%a"
for /f "delims=" %%b in ('dir /b /a') do (
echo %%~ftzab>> "%file%"
)
popd
)
goto :eof


It's a very clever workaround, but I think the for /f ... %%b part will also output duplicates of the folders.
Your for /f ... %%a instruction already echoes the folders, thanks to the /a:d switch.
But the for /f ... %%b part will echo folders too, since the /a switch alone returns hidden/system files as well as folders.

In essence, I would only change that line to

Code:

for /f "delims=" %%b in ('dir /b /a-d') do (

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: directory listing - optimize code

#9 Post by foxidrive » 01 May 2012 11:47

Fawers wrote: In essence, I would only change that line to

Code:

for /f "delims=" %%b in ('dir /b /a-d') do (


Yes, good point. Thanks.

aGerman
Expert
Posts: 4741
Joined: 22 Jan 2010 18:01
Location: Germany

Re: directory listing - optimize code

#10 Post by aGerman » 01 May 2012 12:16

Well, it's more or less the same as brinda's, because the /a switch means that all (hidden, system, etc.) files/folders are listed in that case.

I guess the reason it takes so much time is that the for /f loop buffers the whole output of dir first. Since that is only a list of file/folder names (not a reference to the files/folders themselves), cmd has to read the properties of each file/folder separately when %%~ftzai is expanded.

I'm afraid I can't pull a solution out of the hat though :(

Regards
aGerman

neorobin
Posts: 47
Joined: 01 May 2012 12:18

Re: directory listing - optimize code

#11 Post by neorobin » 01 May 2012 12:32

Try

Code:

@echo off
for /r c: %%i in (*) do echo %%~ftzai


If you want to output to a file, try this:

Code:

@echo off
>m.txt (for /r c: %%i in (*) do echo %%~ftzai)


Code:

> file ( for ... do (... echo text) )
runs faster than

Code:

for ... do (... echo text >> file)
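
A rough way to see the difference for yourself is a sketch like the one below. The scratch file names under %temp% and the loop count of 20000 are arbitrary choices for illustration. The first loop reopens the output file for every line, while the second opens it once for the whole loop.

```bat
@echo off
setlocal
rem Hypothetical scratch files; 20000 is an arbitrary loop count.
set "a=%temp%\append_test.txt"
set "s=%temp%\single_test.txt"
del "%a%" "%s%" 2>nul

rem Variant 1: the file is opened and closed once per echoed line.
echo per-line append, start: %time%
for /L %%n in (1,1,20000) do >>"%a%" echo line %%n
echo per-line append, end:   %time%

rem Variant 2: one redirection wraps the whole loop.
echo single redirect, start: %time%
>"%s%" (for /L %%n in (1,1,20000) do echo line %%n)
echo single redirect, end:   %time%

del "%a%" "%s%" 2>nul
endlocal
```

Note that the single redirect uses > and so overwrites the target file; use >> in front of the parenthesised block instead if earlier content has to be kept.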

aGerman
Expert
Posts: 4741
Joined: 22 Jan 2010 18:01
Location: Germany

Re: directory listing - optimize code

#12 Post by aGerman » 01 May 2012 12:38

The for /r runs faster, but it only processes the objects that are displayed in the Explorer window. Hidden objects are skipped.

Regards
aGerman

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: directory listing - optimize code

#13 Post by foxidrive » 01 May 2012 12:54

It's odd - CMD doesn't respond to Ctrl+C while it is parsing the list either.

I wrote a test that generates a batch file like the one below, which merely repeats "d:\ABC\A.BAT" a given number of times, then runs that batch file and measures the elapsed time.

Code:

@echo off
for %%a in (
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
"d:\ABC\A.BAT"
) do echo %%~ftzaa>>file.txt


This doesn't take as long as a real tree with long pathnames so there may be fewer buffers allocated to the shorter line.


Here is the test code:

Code:

@echo off
set "file=%cd%\file.bat"

call :next 10000
call :next 30000
call :next 60000
call :next 90000
call :next 100000
call :next 150000


pause
goto :eof
:next
del file.txt 2>nul
> "%file%" echo.@echo off
>>"%file%" echo for %%%%a in ^(
for /L %%a in (1,1,%1) do (
>> "%file%" echo "%~f0"
)
>> "%file%" echo ^) do echo %%%%~ftzaa^>^>file.txt
echo running... with %1 lines
set a=%time%
call "%file%"
echo %a%
echo %time%
del file.txt 2>nul
goto :eof


and the results:

running... with 10000 lines ~8 seconds
4:27:01.95
4:27:09.26
running... with 30000 lines ~ 30 seconds
4:27:18.87
4:27:52.09
running... with 60000 lines ~ 1 minute 40 seconds
4:28:11.93
4:29:53.23
running... with 90000 lines ~ 3 minutes 20 seconds
4:30:23.09
4:33:48.60
running... with 100000 lines ~ 4 minutes 5 seconds
4:34:21.84
4:38:29.34
running... with 150000 lines ~ 8 minutes 35 seconds
4:39:18.96
4:47:58.85


For this next test instead of lines containing this
"d:\ABC\A.BAT"
I used this
c:\WINDOWS\Microsoft.NET\assembly\GAC_32\CustomMarshalers\v4.0_4.0.0.0__b03f5f7f11d50a3a\CustomMarshalers.dll

and this was the result, so parsing a list with longer elements takes a heavy toll.

running... with 10000 lines ~ 23 seconds
4:58:39.57
4:59:03.85
running... with 30000 lines ~ 2 minutes 48 seconds
4:59:12.70
5:02:00.85
running... with 60000 lines ~ 10 minutes 30 seconds
5:02:17.68
5:12:55.85
running... with 90000 lines ~ 23 minutes 30 seconds
5:13:22.78
5:36:52.89


And writing the same line 150,000 times in a separate for-in-do command takes a vastly shorter time.

5:39:17.70 ~2 minutes 40 seconds
5:41:58.37


using this code:

Code:

@echo off
echo %time%
for /L %%a in (1,1,150000) do (
for %%b in ("c:\WINDOWS\Microsoft.NET\assembly\GAC_32\CustomMarshalers\v4.0_4.0.0.0__b03f5f7f11d50a3a\CustomMarshalers.dll") do echo %%~ftzab>>file.txt
)
echo %time%
pause



So the upshot is that if you have a huge list then writing it in individual commands using SED etc will reduce the time taken markedly.
Last edited by foxidrive on 01 May 2012 13:52, edited 4 times in total.

neorobin
Posts: 47
Joined: 01 May 2012 12:18

Re: directory listing - optimize code

#14 Post by neorobin » 01 May 2012 13:04

Yes, I forgot about that problem with "for /r".
Maybe combine "for /r" with "dir /ah":

Code:

for /f "delims=" %%i in ('dir C:\ /ah /s /b') do echo %%~ftzai
for /r C:\ %%i in (*) do echo %%~ftzai

neorobin
Posts: 47
Joined: 01 May 2012 12:18

Re: directory listing - optimize code

#15 Post by neorobin » 01 May 2012 13:29

foxidrive wrote: It's odd - CMD doesn't respond to control C during the time it is parsing the list either.


for /L %%a in (1,1,%1) do (
>> "%file%" echo "%~f0"
)


It will write at least 10,000 lines to the file, and all of that content becomes the parameter of "for %%a in (param)", but that exceeds the length (about 8192 characters) that can be processed.
Last edited by neorobin on 01 May 2012 13:45, edited 1 time in total.
