Formatting a File

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

shadeclan
Posts: 61
Joined: 02 Jun 2011 11:29
Location: USA - Somewhere between Albany NY and Bennington VT

Formatting a File

#1 Post by shadeclan » 17 May 2013 08:25

Suppose you have a rather large text file and you need to format each line for consumption by another application. You could do a FOR loop through the file, arrange the data and ECHO it out to a text file simply enough ...

for /f "tokens=1-14,* delims=:. " %%a in (%cDataFileTmp%) do (
      @echo.%%d%%e%%f%%g%%h,%%i %%k:%%l%%m%%n,%%o>>%cDataFile%
      @echo.%%d %%e %%f %%g %%h,%%i>>%cM204File%
)

... but that takes about 14 minutes. Is there a faster way to do this?

3rd party applications are OK if they are free, portable and reliable, so REGEXP apps like GREP are out. I'd prefer a purely batch solution if there is one.

Squashman
Expert
Posts: 4470
Joined: 23 Dec 2011 13:59

Re: Formatting a File

#2 Post by Squashman » 17 May 2013 11:06

shadeclan wrote: 3rd party applications are OK if they are free, portable and reliable, so REGEXP apps like GREP are out.

That confuses me, as grep is free, portable and reliable.

If you have a very large text file, it is going to take a while, because a FOR /F loop reads the entire file into memory before it starts processing it.

Endoro
Posts: 244
Joined: 27 Mar 2013 01:29
Location: Bozen

Re: Formatting a File

#3 Post by Endoro » 17 May 2013 12:42

Look at awk to extract columns.

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)

Re: Formatting a File

#4 Post by Ed Dyreen » 17 May 2013 20:11

Batch is interpreted line by line by cmd.exe, so it is slow. External executables are usually faster because they are already compiled to machine code and do not need to be interpreted. You could opt for another language like VBScript, which should be about 200 times faster here. I wonder if this is a little faster:

@echo off &prompt $G

set cDataFileTmp=0.TXT
set    cDataFile=1.TXT
set    cM204File=2.TXT

:: > "%cDataFile%" type nul
:: > "%cM204File%" type nul

3>> "%cDataFile%" (

     4>> "%cM204File%" (

          for /f "usebackq tokens=1-14,* delims=:. " %%a in (

               "%cDataFileTmp%"

          ) do (

               >&3 (echo(%%d%%e%%f%%g%%h,%%i %%k:%%l%%m%%n,%%o)
               >&4 (echo(%%d %%e %%f %%g %%h,%%i)
          )
     )
)

pause

See ya :wink:
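
A note on why the version above may be faster: in the original loop, every `>>` reopens the output file, appends one line and closes it again, once per input line. Wrapping the loop in a block that opens each file a single time on a spare handle avoids that repeated work. Below is a minimal sketch of the pattern. The file names demoA.txt and demoB.txt are hypothetical, a counting loop stands in for the real data, and the actual speedup is not measured here.

```bat
@echo off
setlocal

rem Hypothetical file names, for illustration only.
set "outA=demoA.txt"
set "outB=demoB.txt"

rem Start from empty files so repeated runs do not accumulate lines.
type nul > "%outA%"
type nul > "%outB%"

rem Each file is opened once, on handle 3 or 4, for the whole block.
3>> "%outA%" (
    4>> "%outB%" (
        for /l %%n in (1,1,1000) do (
            rem Standard output of each echo goes to an already open handle.
            >&3 echo line %%n for file A
            >&4 echo line %%n for file B
        )
    )
)
```

Handles 3 and 4 are used because 0, 1 and 2 are taken by standard input, output and error.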

shadeclan
Posts: 61
Joined: 02 Jun 2011 11:29
Location: USA - Somewhere between Albany NY and Bennington VT

Re: Formatting a File

#5 Post by shadeclan » 20 May 2013 06:31

Squashman wrote:
shadeclan wrote: 3rd party applications are OK if they are free, portable and reliable, so REGEXP apps like GREP are out.

That confuses me, as grep is free, portable and reliable.

I didn't want to add the grep dependencies to our server. I have a hard enough time convincing my boss to let me use stuff I find on the internet without having to do a major install.

Squashman wrote: If you have a very large text file, it is going to take a while, because a FOR /F loop reads the entire file into memory before it starts processing it.

I know. That's my problem. I was hoping to find a solution close to the speed of the FindStr function. Even if I could cut the time down to 5 minutes, that would be helpful.

shadeclan
Posts: 61
Joined: 02 Jun 2011 11:29
Location: USA - Somewhere between Albany NY and Bennington VT

Re: Formatting a File

#6 Post by shadeclan » 20 May 2013 06:37

Ed Dyreen wrote: Batch is interpreted line by line by cmd.exe, so it is slow. External executables are usually faster because they are already compiled to machine code and do not need to be interpreted. You could opt for another language like VBScript, which should be about 200 times faster here. I wonder if this is a little faster:

I've seen this sort of notation before. The double colons don't appear to be comments in this context and the redirection symbols don't make sense to me - they seem to be pointing in the wrong direction. Still, I'll give it a try. Thanks, Ed.

I know that VBScript would be faster. I just wanted to get this done as quickly as possible without having to figure out how to do it in another language.
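
For readers puzzled by the same notation, here is a minimal sketch of the three pieces in isolation. The file name demo.txt is hypothetical. A line starting with `::` is treated like a label and never executed, so it works as a comment outside parenthesized blocks. A redirection can be written before the command as well as after it. And `N>>` opens handle N for appending, while `>&N` points standard output at that handle.

```bat
@echo off
setlocal

rem Hypothetical file name, for illustration only.
set "demo=demo.txt"

rem The next line starts with a double colon, so it is skipped entirely.
:: > "%demo%" echo this line never runs

rem Redirection may come after the command or before it.
echo first> "%demo%"
>> "%demo%" echo second

rem Handle 3 is opened once for appending, then standard output is pointed at it.
3>> "%demo%" (
    >&3 echo third
)

type "%demo%"
```

Read this way, the arrows in Ed's code point at their targets as usual: `3>> "%cDataFile%"` appends handle 3 to the file, and `>&3` sends the output of `echo` into handle 3.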
