Detecting same file size and deleting

nerd
Posts: 7
Joined: 01 Jul 2010 03:39

Detecting same file size and deleting

#1 Post by nerd » 10 Jul 2010 10:24

I am trying to write a script that goes through the sub-folders of a main folder, detects files with the same size (using "dir /os"), and then deletes the duplicate files in each sub-folder. How do I combine these steps, using the IF command to delete a file when its size is the same as another's?

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Detecting same file size and deleting

#2 Post by aGerman » 10 Jul 2010 11:41

Comparing and deleting files based only on their size is not a good idea, because two files of the same size can still have different contents. You should use an MD5 tool instead.
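
As a rough illustration of the hash-based comparison, here is a minimal sketch that compares two files by their MD5 hash. It assumes the built-in certutil tool is available (it is not mentioned in this thread and is missing on some older Windows versions), and the file names a.bin and b.bin are hypothetical placeholders.

```batch
@echo off
setlocal
rem Hypothetical file names, for illustration only.
set "file1=a.bin"
set "file2=b.bin"

call :md5 "%file1%" hash1
call :md5 "%file2%" hash2

if not defined hash1 goto :failed
if not defined hash2 goto :failed
if "%hash1%"=="%hash2%" (
  echo Same MD5: the files are very likely identical.
) else (
  echo Different MD5: the files differ even if their sizes match.
)
goto :eof

:failed
echo Could not hash one of the files.
goto :eof

:md5
rem %1 = file to hash, %2 = name of the variable that receives the hash.
rem Assumption: certutil prints a header line, then the hash, then a
rem status line. Skip the header and keep the first remaining line.
set "%~2="
if not exist "%~1" goto :eof
for /f "skip=1 delims=" %%h in ('certutil -hashfile "%~1" MD5 2^>nul') do (
  if not defined %~2 set "%~2=%%h"
)
goto :eof
```

The sketch only reports the result; a del command would go in the "Same MD5" branch once the output has been checked on real data.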

I don't understand what you want to do. Where exactly are the files located, and which of the files found should be deleted?

Regards
aGerman

!k
Expert
Posts: 378
Joined: 17 Oct 2009 08:30
Location: Russia

Re: Detecting same file size and deleting

#3 Post by !k » 11 Jul 2010 00:00

nerd

Code:

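@echo off
echo Used RHash http://rhash.anz.ru/
setlocal enableextensions
set "folder=c:\del dups"

rem List "hash path" for every file under %folder% (recursively) and
rem sort the list so that files with equal hashes are on adjacent lines.
set "hash="
for /f "tokens=1,*" %%a in (
'rhash.exe -H -r "%folder%" ^|sort'
) do call :d "%%a" "%%b"
goto :eof

:d
rem %1 = hash, %2 = quoted path. Delete the file when its hash equals
rem the hash of the previous line, so the first file of each group is kept.
if "%hash%" == "%~1" del /q %2
set "hash=%~1"
goto :eof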

nerd
Posts: 7
Joined: 01 Jul 2010 03:39

Re: Detecting same file size and deleting

#4 Post by nerd » 11 Jul 2010 20:19

Hi,

aGerman: I am grabbing some files online for a project, and somehow I end up with lots of duplicated files. These duplicated files have the same file size, so I am trying to delete the duplicates while keeping the original file. I create a new folder each time I grab files. For example, I create the "ABC" folder, grab the files and place them in that folder. Next, I create the "BCA" folder, grab the files and place them in that folder.

So say the "ABC" folder has 2 duplicated files of 1111 KB each, and the "BCA" folder has 1 file of 1111 KB as well. The batch program should go through the "ABC" folder and delete the duplicated copy of 1111 KB, because it has detected the same file size within the same folder. It should not delete the 1111 KB file in the "BCA" folder. I have never tried MD5 before, so I will take a look at it.

!k: Thanks for the input, I will try that out!

miskox
Posts: 553
Joined: 28 Jun 2010 03:46

Re: Detecting same file size and deleting

#5 Post by miskox » 12 Jul 2010 06:34

I would suggest using

Code:

fc /b file1 file2 >nul 2>nul


for this.

I have a batch file that finds .pdf files in different (sub)folders and compares them with the .pdf files located in one single folder. The files in the (sub)folders have different filenames than the ones in the single folder, so I write a list of the files to a file and then do a binary compare.
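
The binary compare itself can be sketched as follows. This is only an illustration under assumptions, not the batch file described above: the file names file1.pdf and file2.pdf are hypothetical, and the del command is left commented out so nothing is removed by accident.

```batch
@echo off
setlocal
rem Hypothetical file names, for illustration only.
set "original=file1.pdf"
set "candidate=file2.pdf"

rem fc /b compares the files byte by byte. It returns errorlevel 0 when
rem they are identical, 1 when they differ, 2 when a file is not found.
fc /b "%original%" "%candidate%" >nul 2>nul
if errorlevel 1 (
  echo Files differ or could not be compared, keeping "%candidate%".
) else (
  echo Files are identical, "%candidate%" could be deleted.
  rem del "%candidate%"
)
```

Because "if errorlevel 1" is true for any value of 1 or higher, the missing-file case falls into the same branch as the "files differ" case and nothing is deleted.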

I can post the batch program later or tomorrow, as I don't have it here.

Saso

aGerman
Expert
Posts: 4654
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Detecting same file size and deleting

#6 Post by aGerman » 12 Jul 2010 11:57

If you are sure there are never two files with the same size but different contents, you could use this:

Code:

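@echo off &setlocal
set "rootfolder=c:\your data root"
pushd "%rootfolder%" ||goto :eof
rem Process each direct sub-folder of the root folder separately.
for /d %%a in (*) do (
  set "subfolder=%%~fa"
  call :proc
)
popd
pause
goto :eof

:proc
pushd "%subfolder%" ||goto :eof
rem Compare every pair of files in this sub-folder by size (%%~z) and
rem delete the second file of each pair whose sizes match.
for %%a in (*) do (
  if exist "%%a" for %%b in (*) do (
    if "%%a" neq "%%b" (
      if "%%~za"=="%%~zb" (
        del "%%b"
      )
    )
  )
)
popd
goto :eof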


BTW: !k's tool can also calculate the MD5 hash of files (option -M instead of -H).

Regards
aGerman

nerd
Posts: 7
Joined: 01 Jul 2010 03:39

Re: Detecting same file size and deleting

#7 Post by nerd » 14 Jul 2010 10:26

Thanks aGerman :D
