Extract UNIQUE rows ONLY from .txt file.

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

PAB
Posts: 89
Joined: 12 Aug 2019 13:57

#1 Post by PAB » 07 Jun 2020 10:08

Good afternoon,

I hope you are all keeping well and safe!

I have a .txt file that gets produced with many lines of text, which could include many duplicates.
Here is the part of the code that creates the .txt file . . .

Code: Select all

type "%tmp%" | findstr /I /G:"%Filter%" >> "%Output_File%"
[1] I want to exclude duplicates.
[2] I want the original order kept [ excluding the duplicates of course! ].

I already had a bit of code in my collection, which I have adapted to do the above. It works, but is there any way to incorporate it into my existing code above instead of running it separately, please?

Code: Select all

@echo off

set "InputFile=C:\Users\System-Admin\Desktop\Errors.txt"
set "OutputFile=C:\Users\System-Admin\Desktop\DISM_Errors2.txt"

set "PSScript=%Temp%\~tmpRemoveDupe.ps1"
if exist "%PSScript%" del /q /f "%PSScript%"
echo Get-Content "%InputFile%" ^| Get-Unique ^> "%OutputFile%" >> "%PSScript%"
set "PowerShellDir=C:\Windows\System32\WindowsPowerShell\v1.0"
cd /D "%PowerShellDir%"
Powershell -ExecutionPolicy Bypass -Command "& '%PSScript%'"
del "%PSScript%"
pause
goto :EOF
Thanks in advance.
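
One way the temporary .ps1 script might be folded into the batch file itself is to hand the pipeline straight to PowerShell with -Command. This is only a sketch, not a tested drop-in: it assumes the same InputFile and OutputFile variables as in the post above, and paths that contain no single quotes. Note that Get-Unique drops a line only when it is identical to the line immediately before it, so this relies on the duplicates being adjacent.

```batch
@echo off
rem Sketch only: same variable names and paths as the post above.
set "InputFile=C:\Users\System-Admin\Desktop\Errors.txt"
set "OutputFile=C:\Users\System-Admin\Desktop\DISM_Errors2.txt"

rem Inside the double quotes cmd does not interpret the pipe character,
rem so no caret escaping is needed. Get-Unique removes a line only when
rem it equals the line directly before it.
powershell -NoProfile -Command "Get-Content '%InputFile%' | Get-Unique | Out-File '%OutputFile%'"
pause
```

Because the command is passed inline rather than as a script file, there is no temporary file to create or delete, and the -ExecutionPolicy switch should not be needed.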

Hackoo
Posts: 86
Joined: 15 Apr 2014 17:59

#2 Post by Hackoo » 07 Jun 2020 10:41

Hi :)
You can try it like this:

Code: Select all

@echo off
set "InputFile=C:\Users\System-Admin\Desktop\Errors.txt"
set "OutputFile=C:\Users\System-Admin\Desktop\DISM_Errors2.txt"
Call :RemoveDuplicateEntry %InputFile% %OutputFile%
Pause & Exit
::----------------------------------------------------
:RemoveDuplicateEntry <InputFile> <OutPutFile>
Powershell  ^
$Contents=Get-Content '%1';  ^
$LowerContents=$Contents.ToLower(^);  ^
$LowerContents ^| select -unique ^| Out-File '%2'
Exit /b
::----------------------------------------------------

PAB
Posts: 89
Joined: 12 Aug 2019 13:57

#3 Post by PAB » 07 Jun 2020 11:51

Thanks for the reply, it is appreciated.

I have tried several different ways of getting this to work, but I always get at least one error on the PowerShell side. One of them is . . .
Method invocation failed because [System.Object[]] doesn't contain a method named 'ToLower'.
At line:1 char:97
+ $Contents=Get-Content 'C:\Users\System-Admin\Desktop\Dups.txt'; $LowerContents=$Contents.ToLower <<<< (); $LowerContents | select -unique | Out-File 'C:\Users\System-Admin\Desktop\DISM_Errors.txt'
+ CategoryInfo : InvalidOperation: (ToLower:String) [], RuntimeException
+ FullyQualifiedErrorId : MethodNotFound

Code: Select all

) else (
  type "%tmp%" | findstr /I /G:"%Filter%" >> "%InputFile%"
  Call :RemoveDuplicateEntry %InputFile% %OutputFile%
  echo. & echo ^>Press ANY key to EXIT . . . & pause >nul
  goto :Exit
)
:Exit

:RemoveDuplicateEntry <InputFile> <OutputFile>
Powershell  ^
$Contents=Get-Content '%1';  ^
$LowerContents=$Contents.ToLower(^);  ^
$LowerContents ^| select -unique ^| Out-File '%2'
Exit /b
UPDATE:

It is important that the file is NOT sorted.
The .txt file is a log file ordered by timestamp (yyyy-mm-dd hh-mm-secs), so when there are duplicate rows they are already next to each other, much as if the file had been sorted.
The code I posted previously works great, except that I need this done within the same batch file rather than having a separate file do it!

Thanks in advance.
Last edited by PAB on 07 Jun 2020 17:39, edited 2 times in total.
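
The 'ToLower' error quoted above looks like what Windows PowerShell 2.0 reports when a method is called on the array that Get-Content returns for a multi-line file; calling a member on every element of an array in that way only became possible in PowerShell 3.0. A minimal sketch of one workaround, assuming the same paths as the earlier posts and no single quotes in them, is to lower-case each line inside the pipeline instead:

```batch
@echo off
rem Sketch only: paths copied from the posts above.
set "InputFile=C:\Users\System-Admin\Desktop\Errors.txt"
set "OutputFile=C:\Users\System-Admin\Desktop\DISM_Errors2.txt"
call :RemoveDuplicateEntry "%InputFile%" "%OutputFile%"
pause
exit /b

:RemoveDuplicateEntry
rem %~1 and %~2 are the two arguments with surrounding quotes removed.
rem ForEach-Object lower-cases one line at a time, so no method is ever
rem called on the array itself. Select-Object -Unique keeps the first
rem occurrence of each line in its original position (no sorting).
powershell -NoProfile -Command "Get-Content '%~1' | ForEach-Object { $_.ToLower() } | Select-Object -Unique | Out-File '%~2'"
exit /b
```

Dropping the ForEach-Object stage would keep the original casing of each line, but Select-Object -Unique compares strings case-sensitively, so lines that differ only in case would then both be kept.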

PAB
Posts: 89
Joined: 12 Aug 2019 13:57

#4 Post by PAB » 07 Jun 2020 15:02

OK, this actually works . . .

Code: Select all

) else (
  cls
  del "%Input_File%"
  type "%tmp%" | findstr /I /G:"%Filter%" >> "%Input_File%"
  del "%tmp%"
  echo. > "%Output_File%" & echo ERRORS FOUND . . . >> "%Output_File%" & echo. >> "%Output_File%"
  for /f "tokens=* delims= " %%a in (%Input_File%) do (
  find "%%a" < "%Output_File%" >nul || >> "%Output_File%" echo.%%a
  )
  del "%Input_File%"
  echo. & echo ^>Press ANY key to EXIT . . . & pause >nul
  goto :Exit
)
:Exit
One question please.

Throughout my code I have added speech marks around my path variables as I always do.
For some reason, however, if I put speech marks around the file name in the for /f loop, as in in ("%Input_File%") do, it returns a single line containing just the path and the file name.
Without the speech marks it returns the list of unique rows as expected!

Thanks in advance.
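
The behaviour described above matches how for /f parses its in-clause: a double-quoted item is treated as a literal string to tokenize, unless the usebackq option is given, in which case the double-quoted item is treated as a file name. A small self-contained sketch follows; the file names under %TEMP% and the sample lines are hypothetical and exist only for illustration.

```batch
@echo off
setlocal
rem Hypothetical paths (with a space, to show why the quotes matter).
set "Input_File=%TEMP%\dedupe input.txt"
set "Output_File=%TEMP%\dedupe output.txt"

rem Build a small sample file containing duplicate lines.
> "%Input_File%" (
  echo alpha
  echo alpha
  echo beta
  echo alpha
)
rem Start with an empty output file.
type nul > "%Output_File%"

rem usebackq makes the double-quoted item a file name, not a literal string.
for /f "usebackq tokens=* delims= " %%a in ("%Input_File%") do (
  rem Append the line only if find does not already see it in the output.
  find "%%a" < "%Output_File%" >nul || >> "%Output_File%" echo.%%a
)
type "%Output_File%"
del "%Input_File%" "%Output_File%"
endlocal
```

With usebackq in place the path can stay quoted like every other path in the script. One caveat with this approach in general: find matches substrings, so a line that happens to be contained in a line already written to the output would be skipped as well.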
