Splitting large text file at specific line without for loop

vin97
Posts: 35
Joined: 17 Apr 2020 08:30

Splitting large text file at specific line without for loop

#1 Post by vin97 » 13 Jun 2022 07:12

I have a very large text file that I need to split at a certain line number.
Can this be done without having to for-loop through all the lines?
I'm open to external utilities.


Unrelated question: Does using delayed expansion where it isn't needed (! instead of %) significantly reduce speed?

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Splitting large text file at specific line without for loop

#2 Post by aGerman » 13 Jun 2022 08:33

Imagine a file stream as a chain of bytes. Unless all lines have the same length, even 3rd party utilities can't predict where the bytes 0D 0A that mark a newline appear in this chain. So every tool has to compare the bytes it reads to find the line ends, just like the internal implementation of the FOR /F loop does. The disadvantage of FOR /F is that it buffers the file before it even begins to iterate. However, there are plenty of 3rd party tools out there, and a web search will find them easily.

"Does using unnecessary delayed variables (! instead of %) significantly reduce speed?"
Not that I'm aware of.

Steffen

vin97
Posts: 35
Joined: 17 Apr 2020 08:30

Re: Splitting large text file at specific line without for loop

#3 Post by vin97 » 13 Jun 2022 09:19

Hmm, ok. I thought there might be a native way in batch, because findstr can process the whole file far faster than a for /f read loop.

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Splitting large text file at specific line without for loop

#4 Post by aGerman » 13 Jun 2022 09:49

There's more than just FOR /F for reading a file line by line. E.g.

Code:

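rem Name of the file to read, a placeholder.
set "file=some.txt"
setlocal EnableDelayedExpansion
rem Redirect the file to stdin once for the whole block, so each SET /P continues where the previous one stopped.
<"!file!" (
  rem FIND /C /V "" counts the lines, then FOR /L runs once per line.
  for /f %%i in ('type "!file!"^|find /c /v ""') do for /l %%j in (1 1 %%i) do (
    rem Clear the variable first because SET /P leaves it unchanged on an empty line.
    set "line="&set /p "line="
    echo(!line!
  )
)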
This might be faster, particularly because reading happens in the body of the FOR /L loop, where no huge buffering takes place. The disadvantage is that SET /P can only reliably read lines of up to 1021 characters.
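
To get an actual split out of this, the same SET /P technique could be run as two FOR /L loops inside one redirected block, each writing to its own output file. This is only a minimal sketch: the split point splitAt and the output names part1.txt and part2.txt are made-up placeholders, it still iterates over every line (just without the FOR /F buffering), and the 1021 character limit of SET /P applies here as well.

```batch
@echo off
setlocal EnableDelayedExpansion
set "file=some.txt"
rem Hypothetical split point: part1.txt receives this many lines.
set "splitAt=1000"
rem Count all lines, then work out how many are left for the second part.
for /f %%i in ('type "!file!"^|find /c /v ""') do set /a "rest=%%i-splitAt"
rem Both loops share one stdin handle, so the second loop continues after line splitAt.
<"!file!" (
  >"part1.txt" (for /l %%j in (1 1 %splitAt%) do (
    set "line="&set /p "line="
    echo(!line!
  ))
  >"part2.txt" (for /l %%j in (1 1 %rest%) do (
    set "line="&set /p "line="
    echo(!line!
  ))
)
```

If splitAt is larger than the number of lines in the file, rest becomes negative, the second loop does not run, and part1.txt is padded with empty lines, so a real script would probably want to check for that first.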

Steffen
