Remove Last 2 Lines of CSV Files Using For Loop

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Post Reply
Message
Author
modulus
Posts: 5
Joined: 17 Jun 2013 10:42

Remove Last 2 Lines of CSV Files Using For Loop

#1 Post by modulus » 17 Jun 2013 10:52

Need some help. I'm trying to create a .bat script to automatically scan the current working directory for .csv files and delete the last 2 lines from each file. I successfully created a script to scan all of the .csv files and I created another script to successfully remove the last 2 lines of a single text file, but I can't seem to get them to work when I put them together. Here's my code:

Finds All CSV Files

Code: Select all

For /F "Tokens=1 Delims=" %%a In ('Dir /A-D /B *.csv') Do (
   Echo %%a
)


Removes Last 2 Lines of a CSV File Named w1.csv

Code: Select all

find /v /c "#$@$#@" < w1.csv > %temp%\#
set /p total=<%temp%\#
set /a num2keep=%total%-2
set lineNUM=0
for /f "tokens=*" %%L in (w1.csv) do (
call :sub1 %%L)
:sub1
set /a lineNUM+=1
if %lineNUM% gtr %num2keep% goto :eof
echo %* >> newfile.csv


This Is How I'm Trying to Combine - Not Working

Code: Select all

For /F "Tokens=1 Delims=" %%a In ('Dir /A-D /B *.csv') Do (
   find /v /c "#$@$#@" < %%a > %temp%\#
   set /p total=<%temp%\#
   set /a num2keep=%total%-2
   set lineNUM=0
   for /f "tokens=*" %%L in (%%a) do (
   call :sub1 %%L)
   :sub1
   set /a lineNUM+=1
   if %lineNUM% gtr %num2keep% goto :eof
   echo %* >> %%a
)

Endoro
Posts: 244
Joined: 27 Mar 2013 01:29
Location: Bozen

Re: Remove Last 2 Lines of CSV Files Using For Loop

#2 Post by Endoro » 17 Jun 2013 11:34

you can try:

Code: Select all

@echo off &SETLOCAL
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof

:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "delims=:" %%i IN ('findstr /n "^" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
SET /a counter=0
(FOR /f "delims=" %%i IN ('findstr /n "^" "%fname%"') DO (
   SET "line=%%i"
   SET /a counter+=1
   SETLOCAL ENABLEDELAYEDEXPANSION
   SET "line=!line:*:=!"
   IF !counter! leq %lines% ECHO(!line!
   ENDLOCAL
))>"%fname%.new"
goto:eof

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: Remove Last 2 Lines of CSV Files Using For Loop

#3 Post by Squashman » 17 Jun 2013 11:57

I can't recall if I have tested if it is faster to use FIND or FINDSTR to get the number of lines in a file. I normally use the FIND command.

But I think it might be faster to use the FOR /L loop with the number of lines-2 as the end and stream in the input file. This way you wouldn't have to set a counter to check the number of lines you have processed and it wouldn't have to parse the entire file to get your output.

Endoro
Posts: 244
Joined: 27 Mar 2013 01:29
Location: Bozen

Re: Remove Last 2 Lines of CSV Files Using For Loop

#4 Post by Endoro » 17 Jun 2013 12:49

yes, this is maybe faster

Code: Select all

@echo off &SETLOCAL
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof

:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "tokens=2delims=:" %%i IN ('find /c /v "" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
<"%fname%" (
FOR /l %%i IN (1,1,%lines%) DO (
   SET "line="
   SET/p "line="
   SETLOCAL ENABLEDELAYEDEXPANSION
   ECHO(!line!
   ENDLOCAL
))>"%fname%.new"
goto:eof
Last edited by Endoro on 17 Jun 2013 13:12, edited 2 times in total.

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Remove Last 2 Lines of CSV Files Using For Loop

#5 Post by aGerman » 17 Jun 2013 13:04

@Endoro

Reading from files with SET /P is safe also with delayed expansion enabled all the time. That's one of the greatest advantages of this technique btw.

Endoro
Posts: 244
Joined: 27 Mar 2013 01:29
Location: Bozen

Re: Remove Last 2 Lines of CSV Files Using For Loop

#6 Post by Endoro » 17 Jun 2013 13:11

thanks, never tried :roll:

Code: Select all

@echo off &SETLOCAL
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof

:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "tokens=2delims=:" %%i IN ('find /c /v "" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
SETLOCAL ENABLEDELAYEDEXPANSION
<"%fname%" (
FOR /l %%i IN (1,1,%lines%) DO (
   SET "line="
   SET/p "line="
   ECHO(!line!
))>"%fname%.new"
ENDLOCAL
goto:eof


But it must be toggled in the sub program for file names.

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Remove Last 2 Lines of CSV Files Using For Loop

#7 Post by aGerman » 17 Jun 2013 13:41

You might be interrested in that thread:
http://www.dostips.com/forum/viewtopic.php?f=3&t=2128
Note that this technique fails with non-Windows line breaks (such as LF instead of CR LF) and lines with a length greater than 1023 characters (if I remember correctly).

modulus
Posts: 5
Joined: 17 Jun 2013 10:42

Re: Remove Last 2 Lines of CSV Files Using For Loop

#8 Post by modulus » 17 Jun 2013 14:40

Instead of creating a new .csv file with the last 2 lines removed, is there a way for the script to overwrite each of the original .csv files?

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: Remove Last 2 Lines of CSV Files Using For Loop

#9 Post by Squashman » 17 Jun 2013 15:04

modulus wrote:Instead of creating a new .csv file with the last 2 lines removed, is there a way for the script to overwrite each of the original .csv files?

Delete the old file and rename the new file to the old file name.

modulus
Posts: 5
Joined: 17 Jun 2013 10:42

Re: Remove Last 2 Lines of CSV Files Using For Loop

#10 Post by modulus » 17 Jun 2013 20:45

Thank you everybody for your help! I'm getting super close, but still running into a few snags. I'm trying to perform the 3 operations below on a handful of large (200+ MB) .csv files. I'm able to complete 1 and 2, but not 3. I can't seem to get the script to continue running after #2; I assume this is because goto:eof is being used. I've tried to setup #2 to run within a loop so it can continue after completion, but am having no luck. I'm attaching my full code below:

1. Remove the first line of each .csv file in the directory
2. Remove the last 2 lines of each .csv file in the directory
3. Combine all .csv files into a single .csv file and append column-headers.txt to the top of that file

Code: Select all

@echo off

:: use this script to combine multiple .csv files into one file
:: .csv files must be exported from DFP in Excel CSV format without header and with ID's
:: the first line of each .csv file will be removed. column headers are inserted from a separate file

:: ========== USER PREREQUISITES ========== ::
echo Did you complete the following?
echo   1. Save data files from DFP in Excel CSV format without headers and with ID's
echo   2. Save each .csv file using the following naming convention: "w##.csv"
echo   3. Save each .csv file to the same folder as this script
echo   4. Save column-headers.txt to same folder as this script
echo   5. Make sure this script has not been run previously for these files
echo Enter y for yes or n for no
set /p warning=
if %warning% == y goto process
if not %warning% == y goto stop


:: ========== WARN USER & EXIT PROGRAM ========== ::
:stop
echo.
echo.
echo Please complete the steps above before running this file.
pause
exit


:: ========== RUN PROGRAM IF PREREQUISITES MET ========== ::
:process
cls

:: get current working directory
Set _Path=%cd%
PushD %_Path%


:: ~~~~~~~~~~ CHECK FOR REQUIRED FILES ~~~~~~~~~~ ::
if not exist column-headers.txt (
   echo Please upload the following file and start again: column-headers.txt
   echo.
   pause
   exit
)

if not exist w*.csv (
   echo Please upload the following file and start again: w##.csv
   echo.
   pause
   exit
)


:: ~~~~~~~~~~ INFORM USER TO WAIT FOR SCRIPT TO RUN ~~~~~~~~~~ ::
cls
echo Processing - Please wait 45-60 minutes for confirmation message to appear...
echo.
echo User Running Script: %USERNAME%
echo Start Time: %time%
echo.


:: ~~~~~~~~~~ DELETE PREVIOUS COMBINED FILE ~~~~~~~~~~ ::
:: delete previous combined .csv file, if one was created
IF EXIST combined.csv del combined.csv


:: ~~~~~~~~~~ REMOVE FIRST LINE OF EACH CSV FILE ~~~~~~~~~~ ::
:: delete first 1 lines of each file -- this contains the column headers
For /F "Tokens=1 Delims=" %%a In ('Dir /A-D /B *.csv') Do (
   Echo %%a

   :: define how many lines to remove. 0 lines includes column headers; 1 lines excludes column headers
   For /F "Usebackq Tokens=1 skip=1 Delims=" %%I In ("%%a") Do >>"%temp%\%%a" Echo.%%I
   Move /Y "%temp%\%%a" .\
)
PopD


:: ~~~~~~~~~~ CREATE TEMP FOLDER ~~~~~~~~~~ ::
:: create a 'temp' directory. delete directory and contents if already exists
IF EXIST temp rmdir temp /s/q
mkdir temp


:: ~~~~~~~~~~ REMOVE LAST TWO LINES OF EACH CSV FILE ~~~~~~~~~~ ::
:: delete last 2 lines of each file -- these contain totals calculations
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof

:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "tokens=2delims=:" %%i IN ('find /c /v "" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
<"%fname%" (
FOR /l %%i IN (1,1,%lines%) DO (
   SET "line="
   SET/p "line="
   SETLOCAL ENABLEDELAYEDEXPANSION
   ECHO(!line!
   ENDLOCAL
))>"temp\%fname%"
goto:eof


:: ~~~~~~~~~~ COMBINE CSV FILES ~~~~~~~~~~ ::
:: combine all .csv files into one file and use column-headers.txt as headers for the combined file
copy column-headers.txt+/b temp\*.csv combined.csv


:: ~~~~~~~~~~ DELETE TEMP FOLDER ~~~~~~~~~~ ::
rmdir temp\ /s/q


:: ~~~~~~~~~~ CONFIRMATION MESSAGE ~~~~~~~~~~ ::
echo.
echo.
echo Success! All CSV files were combined into the following file: combined.csv
echo End Time: %time%
echo.
pause

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Remove Last 2 Lines of CSV Files Using For Loop

#11 Post by foxidrive » 17 Jun 2013 22:23

put all your combine routines between these two lines:

Code: Select all

For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
:: routines go here.

:: batch file finishes here
goto:eof

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: Remove Last 2 Lines of CSV Files Using For Loop

#12 Post by Squashman » 18 Jun 2013 05:34

modulus wrote:3. Combine all .csv files into a single .csv file and append column-headers.txt to the top of that file

copy /a headers.txt + *.csv combined.tmp
rename combined.tmp combined.csv

modulus
Posts: 5
Joined: 17 Jun 2013 10:42

Re: Remove Last 2 Lines of CSV Files Using For Loop

#13 Post by modulus » 18 Jun 2013 12:40

The catch is I'll need to run a few additional commands (delete temp folder, show confirmation message, etc) after the last 2 lines are stripped out of each file. Is there a way where I can run the last 2 lines routine in a loop without using goto:eof?

Post Reply