Count string occurrences in text file

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Count string occurrences in text file

#16 Post by foxidrive » 30 Jul 2012 07:53

kmbarre wrote: The problem is my files are very large and consist of one line.



How large are they? Is all the data on a single line? Is it plain text?

Find.exe and Findstr.exe have fairly small line length limits.

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#17 Post by kmbarre » 30 Jul 2012 08:47

The files are 25 MB and contain 25000 records, but there are no carriage returns, so it is all one line.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Count string occurrences in text file

#18 Post by foxidrive » 30 Jul 2012 09:39

What do you want to find? You could possibly do it in WSH using VBScript.

Do you want to find something at a defined byte position in each record, for example if each record is 1024 bytes long?

It is a text file, right, using printable characters?

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#19 Post by kmbarre » 30 Jul 2012 10:59

The purpose is to confirm the number of records in the file. I am searching for the delimiter, for example a "$", which separates each record. I've got it working for one file, but I have to update the script for each individual file I want to scan. What I'd like to do is have it display what find does.

find /c "$" *.txt

---------- file1.txt 5
---------- file2.txt 10
---------- file3.txt 25
. . . etc . .

The problem is that find counts the number of lines that contain the given character, not the number of occurrences. And there are no carriage returns in this file, so it is all one line.

This is what I have.

@echo off
set "str=$"
set file=file1.txt
set cnt=0
for /f ^"eol^=^

delims^=^" %%a in ('"findstr /i "/c:%str%" %file%"') do set "ln=%%a"&call :countStr

echo '%str%' appears %cnt% times in %file% (case insensitive)
exit /b

:countStr
setlocal enableDelayedExpansion
:loop
set "ln2=!ln:*%str%=!"
if "!ln2!" neq "!ln!" (
set "ln=!ln2!"
set /a "cnt+=1"
goto :loop
)
endlocal & set cnt=%cnt%
exit /b

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Count string occurrences in text file

#20 Post by foxidrive » 30 Jul 2012 12:52

Try this on one of your files.

You'll need to download GnuSED for Windows.

Launch it like this: batch1.bat "fileabc.txt"

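@echo off
rem Usage: batch1.bat "fileabc.txt"
set file=%1
rem Put each $-delimited record on its own line (needs GNU sed on the PATH).
sed "s/\$/\n/g" "%~1" >file.tmp
rem find /c /v "" counts every line in the temp file.
for /f %%a in ('find /c /v "" ^<file.tmp') do echo %file%  - %%a
del file.tmp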


And to run it on multiple files, use a second batch file to call the first one:

@echo off
for %%a in (filespec.txt) do call batch1 "%%a"

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#21 Post by kmbarre » 30 Jul 2012 13:38

Do you have a good link for GnuSED?

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#22 Post by kmbarre » 30 Jul 2012 14:02

OK, got it! And the script works GREAT!! Thank you!

One last question. Is there a way I could put the script files in a subdirectory of the one I am searching? I wonder what syntax in the scripts would need to change?

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Count string occurrences in text file

#23 Post by foxidrive » 30 Jul 2012 15:04

kmbarre wrote: OK, got it! And the script works GREAT!! Thank you!


That's nice to hear, thanks.

kmbarre wrote: One last question. Is there a way I could put the script files in a subdirectory of the one I am searching? I wonder what syntax in the scripts would need to change?


How do you mean? What you can do is put the two scripts in a folder that is on the PATH, such as c:\windows, and then type the command from the folder you want to search. I'm not sure how you use them.
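
One way to read the question is that both scripts live in a subfolder of the folder being searched. A minimal sketch of the second batch file under that assumption follows. The `*.txt` pattern and the `batch1.bat` name are taken from the earlier posts, and the folder layout is hypothetical. `%~dp0` expands to the folder that holds the running script.

```bat
@echo off
rem Assumption: this file and batch1.bat sit in a subfolder of the data folder.
rem %~dp0 is this script's own folder, with a trailing backslash,
rem so "%~dp0.." is the parent folder that holds the data files.
for %%a in ("%~dp0..\*.txt") do call "%~dp0batch1.bat" "%%~fa"
```

sed still has to be on the PATH, and batch1.bat still writes file.tmp to whatever the current directory is when the script is started.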

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#24 Post by kmbarre » 31 Jul 2012 07:46

One thing I just noticed is that this script adds 1 to the count for each file. For example, I'm expecting 25000 and it is giving me 25001.

Any thoughts?

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: Count string occurrences in text file

#25 Post by Squashman » 31 Jul 2012 07:52

kmbarre wrote:One thing I just noticed is that this script adds 1 to the count for each file. For example, I'm expecting 25000 and it is giving me 25001.

Any thoughts?

Are there blank lines that you think it shouldn't be counting? I assume your file would have a double $.

Just run the code to create the temp file. Then open the file with any text editor that shows line numbers along the side. You should be able to identify why it is counting an extra line.

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#26 Post by kmbarre » 31 Jul 2012 08:12

No luck. No blank lines. One file is actually just one short line with none of the search string text in it, and it still shows up as one.

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: Count string occurrences in text file

#27 Post by Squashman » 31 Jul 2012 08:17

kmbarre wrote: One file is actually just one short line with none of the search string text in it, and it still shows up as one.

It is one line. Of course it is going to come out as 1.

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#28 Post by kmbarre » 31 Jul 2012 08:43

Well my intent is to count the number of times a string occurs in each file. So what might I check or change to remove this extra count?

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: Count string occurrences in text file

#29 Post by Squashman » 31 Jul 2012 08:49

kmbarre wrote:Well my intent is to count the number of times a string occurs in each file. So what might I check or change to remove this extra count?

So basically what you are saying is you don't understand what the code you were given is doing?

kmbarre
Posts: 13
Joined: 30 Jul 2012 07:09

Re: Count string occurrences in text file

#30 Post by kmbarre » 31 Jul 2012 10:45

Yes, I understand what it is doing. It's the syntax that is a bit confusing. This is a support forum, is it not?
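
The extra 1 reported in this thread appears to come from the counting method rather than from the data. Replacing each `$` with a newline turns N delimiters into N+1 lines, and `find /c /v ""` counts lines, which is also why a file with no `$` at all reports 1. A minimal sketch of batch1.bat with that adjustment follows, assuming the one-extra-line behaviour holds for every file being scanned. The `count` variable is introduced here for illustration only.

```bat
@echo off
set file=%1
sed "s/\$/\n/g" "%~1" >file.tmp
rem find /c /v "" counts lines; N delimiters leave N+1 lines, so subtract 1.
for /f %%a in ('find /c /v "" ^<file.tmp') do set /a count=%%a-1
echo %file%  - %count%
del file.tmp
```

If a file produces exactly one line per record instead, the subtraction would undercount by one, so it is worth checking the result against a small file whose record count is known.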
