Page 1 of 1
Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 10:52
by modulus
Need some help. I'm trying to create a .bat script to automatically scan the current working directory for .csv files and delete the last 2 lines from each file. I successfully created a script to scan all of the .csv files and I created another script to successfully remove the last 2 lines of a single text file, but I can't seem to get them to work when I put them together. Here's my code:
Finds All CSV FilesCode: Select all
For /F "Tokens=1 Delims=" %%a In ('Dir /A-D /B *.csv') Do (
Echo %%a
)
Removes Last 2 Lines of a CSV File Named w1.csvCode: Select all
find /v /c "#$@$#@" < w1.csv > %temp%\#
set /p total=<%temp%\#
set /a num2keep=%total%-2
set lineNUM=0
for /f "tokens=*" %%L in (w1.csv) do (
call :sub1 %%L)
:sub1
set /a lineNUM+=1
if %lineNUM% gtr %num2keep% goto :eof
echo %* >> newfile.csv
This Is How I'm Trying to Combine - Not WorkingCode: Select all
For /F "Tokens=1 Delims=" %%a In ('Dir /A-D /B *.csv') Do (
find /v /c "#$@$#@" < %%a > %temp%\#
set /p total=<%temp%\#
set /a num2keep=%total%-2
set lineNUM=0
for /f "tokens=*" %%L in (%%a) do (
call :sub1 %%L)
:sub1
set /a lineNUM+=1
if %lineNUM% gtr %num2keep% goto :eof
echo %* >> %%a
)
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 11:34
by Endoro
you can try:
Code: Select all
@echo off &SETLOCAL
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof
:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "delims=:" %%i IN ('findstr /n "^" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
SET /a counter=0
(FOR /f "delims=" %%i IN ('findstr /n "^" "%fname%"') DO (
SET "line=%%i"
SET /a counter+=1
SETLOCAL ENABLEDELAYEDEXPANSION
SET "line=!line:*:=!"
IF !counter! leq %lines% ECHO(!line!
ENDLOCAL
))>"%fname%.new"
goto:eof
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 11:57
by Squashman
I can't recall if I have tested if it is faster to use FIND or FINDSTR to get the number of lines in a file. I normally use the FIND command.
But I think it might be faster to use the FOR /L loop with the number of lines-2 as the end and stream in the input file. This way you wouldn't have to set a counter to check the number of lines you have processed and it wouldn't have to parse the entire file to get your output.
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 12:49
by Endoro
yes, this is maybe faster
Code: Select all
@echo off &SETLOCAL
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof
:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "tokens=2delims=:" %%i IN ('find /c /v "" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
<"%fname%" (
FOR /l %%i IN (1,1,%lines%) DO (
SET "line="
SET/p "line="
SETLOCAL ENABLEDELAYEDEXPANSION
ECHO(!line!
ENDLOCAL
))>"%fname%.new"
goto:eof
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 13:04
by aGerman
@Endoro
Reading from files with SET /P is safe also with delayed expansion enabled all the time. That's one of the greatest advantages of this technique btw.
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 13:11
by Endoro
thanks, never tried
Code: Select all
@echo off &SETLOCAL
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof
:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "tokens=2delims=:" %%i IN ('find /c /v "" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
SETLOCAL ENABLEDELAYEDEXPANSION
<"%fname%" (
FOR /l %%i IN (1,1,%lines%) DO (
SET "line="
SET/p "line="
ECHO(!line!
))>"%fname%.new"
ENDLOCAL
goto:eof
But it must be toggled in the sub program for file names.
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 13:41
by aGerman
You might be interrested in that thread:
http://www.dostips.com/forum/viewtopic.php?f=3&t=2128Note that this technique fails with non-Windows line breaks (such as LF instead of CR LF) and lines with a length greater than 1023 characters (if I remember correctly).
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 14:40
by modulus
Instead of creating a new .csv file with the last 2 lines removed, is there a way for the script to overwrite each of the original .csv files?
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 15:04
by Squashman
modulus wrote:Instead of creating a new .csv file with the last 2 lines removed, is there a way for the script to overwrite each of the original .csv files?
Delete the old file and rename the new file to the old file name.
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 20:45
by modulus
Thank you everybody for your help! I'm getting super close, but still running into a few snags. I'm trying to perform the 3 operations below on a handful of large (200+ MB) .csv files. I'm able to complete 1 and 2, but not 3. I can't seem to get the script to continue running after #2; I assume this is because
goto:eof is being used. I've tried to setup #2 to run within a loop so it can continue after completion, but am having no luck. I'm attaching my full code below:
1. Remove the first line of each .csv file in the directory
2. Remove the last 2 lines of each .csv file in the directory
3. Combine all .csv files into a single .csv file and append column-headers.txt to the top of that file
Code: Select all
@echo off
:: use this script to combine multiple .csv files into one file
:: .csv files must be exported from DFP in Excel CSV format without header and with ID's
:: the first line of each .csv file will be removed. column headers are inserted from a separate file
:: ========== USER PREREQUISITES ========== ::
echo Did you complete the following?
echo 1. Save data files from DFP in Excel CSV format without headers and with ID's
echo 2. Save each .csv file using the following naming convention: "w##.csv"
echo 3. Save each .csv file to the same folder as this script
echo 4. Save column-headers.txt to same folder as this script
echo 5. Make sure this script has not been run previously for these files
echo Enter y for yes or n for no
set /p warning=
if %warning% == y goto process
if not %warning% == y goto stop
:: ========== WARN USER & EXIT PROGRAM ========== ::
:stop
echo.
echo.
echo Please complete the steps above before running this file.
pause
exit
:: ========== RUN PROGRAM IF PREREQUISITES MET ========== ::
:process
cls
:: get current working directory
Set _Path=%cd%
PushD %_Path%
:: ~~~~~~~~~~ CHECK FOR REQUIRED FILES ~~~~~~~~~~ ::
if not exist column-headers.txt (
echo Please upload the following file and start again: column-headers.txt
echo.
pause
exit
)
if not exist w*.csv (
echo Please upload the following file and start again: w##.csv
echo.
pause
exit
)
:: ~~~~~~~~~~ INFORM USER TO WAIT FOR SCRIPT TO RUN ~~~~~~~~~~ ::
cls
echo Processing - Please wait 45-60 minutes for confirmation message to appear...
echo.
echo User Running Script: %USERNAME%
echo Start Time: %time%
echo.
:: ~~~~~~~~~~ DELETE PREVIOUS COMBINED FILE ~~~~~~~~~~ ::
:: delete previous combined .csv file, if one was created
IF EXIST combined.csv del combined.csv
:: ~~~~~~~~~~ REMOVE FIRST LINE OF EACH CSV FILE ~~~~~~~~~~ ::
:: delete first 1 lines of each file -- this contains the column headers
For /F "Tokens=1 Delims=" %%a In ('Dir /A-D /B *.csv') Do (
Echo %%a
:: define how many lines to remove. 0 lines includes column headers; 1 lines excludes column headers
For /F "Usebackq Tokens=1 skip=1 Delims=" %%I In ("%%a") Do >>"%temp%\%%a" Echo.%%I
Move /Y "%temp%\%%a" .\
)
PopD
:: ~~~~~~~~~~ CREATE TEMP FOLDER ~~~~~~~~~~ ::
:: create a 'temp' directory. delete directory and contents if already exists
IF EXIST temp rmdir temp /s/q
mkdir temp
:: ~~~~~~~~~~ REMOVE LAST TWO LINES OF EACH CSV FILE ~~~~~~~~~~ ::
:: delete last 2 lines of each file -- these contain totals calculations
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
goto:eof
:ProcessFile
SET "fname=%~1"
ECHO(processing %fname%
FOR /f "tokens=2delims=:" %%i IN ('find /c /v "" "%fname%"') DO SET "lines=%%i"
SET /a lines-=2
<"%fname%" (
FOR /l %%i IN (1,1,%lines%) DO (
SET "line="
SET/p "line="
SETLOCAL ENABLEDELAYEDEXPANSION
ECHO(!line!
ENDLOCAL
))>"temp\%fname%"
goto:eof
:: ~~~~~~~~~~ COMBINE CSV FILES ~~~~~~~~~~ ::
:: combine all .csv files into one file and use column-headers.txt as headers for the combined file
copy column-headers.txt+/b temp\*.csv combined.csv
:: ~~~~~~~~~~ DELETE TEMP FOLDER ~~~~~~~~~~ ::
rmdir temp\ /s/q
:: ~~~~~~~~~~ CONFIRMATION MESSAGE ~~~~~~~~~~ ::
echo.
echo.
echo Success! All CSV files were combined into the following file: combined.csv
echo End Time: %time%
echo.
pause
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 17 Jun 2013 22:23
by foxidrive
put all your combine routines between these two lines:
Code: Select all
For %%a In (*.csv) Do CALL:ProcessFile "%%~a"
:: routines go here.
:: batch file finishes here
goto:eof
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 18 Jun 2013 05:34
by Squashman
modulus wrote:3. Combine all .csv files into a single .csv file and append column-headers.txt to the top of that file
copy /a headers.txt + *.csv combined.tmp
rename combined.tmp combined.csv
Re: Remove Last 2 Lines of CSV Files Using For Loop
Posted: 18 Jun 2013 12:40
by modulus
The catch is I'll need to run a few additional commands (delete temp folder, show confirmation message, etc) after the last 2 lines are stripped out of each file. Is there a way where I can run the last 2 lines routine in a loop without using goto:eof?