Count string occurrences in text file
Moderator: DosItHelp
Count string occurrences in text file
Experts,
how do you count string occurrences? For example, I have a text file with these contents:
hand randomtexthand randomtext
hand randomtext
hand randomtext
hand randomtext
I want to know the total count of hand, so it should be 5. But when using find /c the output is 4; it just counts the lines. How can I solve this?
Last edited by renzlo on 18 May 2011 15:55, edited 1 time in total.
Re: Count string occurrences
Code:
@echo off &SetLocal EnableExtensions EnableDelayedExpansion
echo.SKIP>"TST1.TXT"
::
for /f "useback tokens=*" %%! in (
"TST.TXT"
) do echo.%%~! >>"TST1.TXT"
set "ReadLine="
::
set /a COUNT = 0
set /a MEM = !COUNT!
::
set /a SKIP = 1
::
:LOOP ()
::
if !COUNT! equ !MEM! echo.SKIP=!SKIP!_
for /f "useback skip=%SKIP% tokens=*" %%! in (
"TST1.TXT"
) do (
if defined ReadLine (
::
echo.if /i ["!ReadLine:~0,4!"] == ["hand"]
if /i ["!ReadLine:~0,4!"] == ["hand"] (
::
set /a COUNT += 1
echo.COUNT=!COUNT!_
set "ReadLine=!ReadLine:~4!"
echo.ReadLine2=!ReadLine!_
) else (
set "ReadLine=!ReadLine:~1!"
echo.ReadLine0=!ReadLine!_
)
if not defined ReadLine (
::
set /a MEM = !COUNT!
::
set /a SKIP += 1
)
) else (
set "ReadLine=%%~!"
echo.ReadLine1=!ReadLine!_
::
if /i ["!ReadLine:~0,4!"] == ["hand"] (
::
set /a COUNT += 1
echo.COUNT=!COUNT!_
set "ReadLine=!ReadLine:~4!"
echo.ReadLine2=!ReadLine!_
)
if not defined ReadLine (
::
set /a MEM = !COUNT!
::
set /a SKIP += 1
)
)
::
goto :LOOP "()"
)
echo.COUNT=!COUNT!_
echo.end
pause
exit
TST.TXT
Code:
hand1 randomtext hand2 handhand
hand5 randomtext
hand6 randomtext
**censored** I lost a whole hour helping you
censored == Kut in het nederlands
censored == Sheisse in deutscher Sprache
censored == Merde en francais
Why are our languages not being censored!
That is discrimination and racism
Re: Count string occurrences
Thanks but it seems inaccurate. It should be 5 not 6.
Re: Count string occurrences
Can you solve it for me? No offence, I am tired.
Gotta learn the language, it's not that hard. Use pause to debug.
Last edited by Ed Dyreen on 18 May 2011 15:58, edited 1 time in total.
Re: Count string occurrences
Ed, thanks for your time, I really appreciate it. If only I knew what the problem is, but I'm still a newbie.
Re: Count string occurrences in text file
Maybe delete isdefined += 1 somewhere
Re: Count string occurrences in text file
Code:
@echo off &SetLocal EnableExtensions EnableDelayedExpansion
echo.SKIP>"TST1.TXT"
::
for /f "useback tokens=*" %%! in (
"TST.TXT"
) do echo.%%~! >>"TST1.TXT"
set "ReadLine="
::
set /a COUNT = 0
set /a MEM = !COUNT!
::
set /a SKIP = 1
::
:LOOP ()
::
if !COUNT! equ !MEM! echo.SKIP=!SKIP!_
for /f "useback skip=%SKIP% tokens=*" %%! in (
"TST1.TXT"
) do (
if defined ReadLine (
::
echo.if /i ["!ReadLine:~0,4!"] == ["hand"]
if /i ["!ReadLine:~0,4!"] == ["hand"] (
::
set /a COUNT += 1
echo.COUNT=!COUNT!_
set "ReadLine=!ReadLine:~4!"
echo.ReadLine2=!ReadLine!_
) else (
set "ReadLine=!ReadLine:~1!"
echo.ReadLine0=!ReadLine!_
)
if not defined ReadLine (
::
set /a MEM = !COUNT!
::
set /a SKIP += 1
)
) else (
set "ReadLine=%%~!"
echo.ReadLine1=!ReadLine!_
::
if /i ["!ReadLine:~0,4!"] == ["hand"] (
::
set /a COUNT += 1
echo.COUNT=!COUNT!_
set "ReadLine=!ReadLine:~4!"
echo.ReadLine2=!ReadLine!_
)
if not defined ReadLine (
::
set /a MEM = !COUNT!
::
set /a SKIP += 1
)
)
::
goto :LOOP "()"
)
echo.COUNT=!COUNT!_
echo.end
pause
exit
the count is six
hand1 randomtext hand2 handhand
hand5 randomtext
hand6 randomtext
Re: Count string occurrences in text file
Thanks Ed. I'm gonna test the above code when I get back home.
Re: Count string occurrences in text file
If a case insensitive search is acceptable then the following is simpler and I presume significantly faster:
Code:
@echo off
set "str= hand"
set file=hands.txt
set cnt=0
for /f ^"eol^=^

delims^=^" %%a in ('"findstr /i "/c:%str%" %file%"') do set "ln=%%a"&call :countStr
echo '%str%' appears %cnt% times in hands.txt (case insensitive)
exit /b
:countStr
setlocal enableDelayedExpansion
:loop
set "ln2=!ln:*%str%=!"
if "!ln2!" neq "!ln!" (
set "ln=!ln2!"
set /a "cnt+=1"
goto :loop
)
endlocal & set cnt=%cnt%
exit /b
hands.txt
Code:
 HAND randomtext handrandomtext
 handrandomtext
; hand randomtext
 hand randomtext
hand
results:
Code:
' hand' appears 5 times in hands.txt (case insensitive)
I changed the search string to include a leading space and changed the text file to demonstrate issues with the FOR /F eol option. See the eol discussion embedded within "Sorting tokens within a string" for more info. Note that the last hand in hands.txt should not be counted because it is not preceded by a space.
I also changed the text file to include uppercase to demonstrate that the search is indeed case insensitive. FINDSTR is by default case sensitive, but the string substitution technique that I used can only be case insensitive.
Using FINDSTR is still very worthwhile because we don't want to waste time parsing lines that don't have the search string anywhere within them.
Dave Benham
Re: Count string occurrences in text file
Thanks Dave for the script. It is working. The problem now is that when hand was mixed with some words like handrandomtext, it is not counted. Is there a way to solve this?
Re: Count string occurrences in text file
renzlo wrote: the problem now is that when hand was mixed with some words like handrandomtext, it is not counted, is there a way to solve this?
I don't understand - the example text file I ran has that exact string and it is counted properly. If you mean that something like "randomHANDrandom" is not counted, that is because my script intentionally is looking for a " hand" with a space in the front. You can simply modify my script to remove the leading space from the search string.
In other words set "str= hand" becomes set "str=hand"
Dave
Re: Count string occurrences in text file
Yes, I did that, Dave.
the code:
Code:
@echo off
set "str=hand"
set file=hands.txt
set cnt=0
for /f ^"eol^=^

delims^=^" %%a in ('"findstr /i "/c:%str%" %file%"') do set "ln=%%a"&call :countStr
echo '%str%' appears %cnt% times in hands.txt (case insensitive)
exit /b
:countStr
setlocal enableDelayedExpansion
:loop
set "ln2=!ln:*%str%=!"
if "!ln2!" neq "!ln!" (
set "ln=!ln2!"
set /a "cnt+=1"
goto :loop
)
endlocal & set cnt=%cnt%
exit /b
the contents of hands.txt:
Code:
HAND randomtexthandrandomtext
handrandomtext
hand randomtext
hand randomtext
hand
output:
Code:
'hand' appears 8 times in hands.txt (case insensitive)
it should be 6.
Re: Count string occurrences in text file
Oops! There was a bug when the entire line consists of nothing but "hand"(s)
Here is the fix:
Code:
@echo off
set "str=hand"
set file=hands.txt
set cnt=0
for /f ^"eol^=^

delims^=^" %%a in ('"findstr /i "/c:%str%" %file%"') do set "ln=%%a"&call :countStr
echo '%str%' appears %cnt% times in hands.txt (case insensitive)
exit /b
:countStr
setlocal enableDelayedExpansion
:loop
if defined ln (
set "ln2=!ln:*%str%=!"
if "!ln2!" neq "!ln!" (
set "ln=!ln2!"
set /a "cnt+=1"
goto :loop
)
)
endlocal & set cnt=%cnt%
exit /b
Dave
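For anyone wondering why the "if defined ln" guard matters, here is a minimal sketch of the situation it protects against. The assumption is that once the last "hand" on a line has been consumed, ln is set to an empty value and so becomes undefined; substring substitution on an undefined variable then appears to leave leftover text rather than an empty string, which would explain the extra counts (8 instead of 6) reported above. Presumably the same applies to any line that ends with the search string, not only lines that consist of nothing else. This is an illustration only, not part of Dave's script:

```batch
@echo off
setlocal enableDelayedExpansion
set "str=hand"
rem Simulate the state after the last "hand" on a line has been consumed:
rem assigning an empty value leaves ln undefined.
set "ln="
rem The same substitution the counting loop performs, now on an undefined variable.
set "ln2=!ln:*%str%=!"
rem Print both values in brackets so an empty or undefined value is visible.
echo ln=[!ln!] ln2=[!ln2!]
if defined ln (echo ln is defined) else echo ln is undefined
```

If ln2 comes back non-empty here, the unguarded loop would see "!ln2!" differ from "!ln!" and keep counting, which is what the guard prevents.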
Re: Count string occurrences in text file
Working great, thanks Dave. Thanks for your time, I really appreciate it.
Re: Count string occurrences in text file
I know this is a really old thread. I've tried this and it works great. I was wondering how I might get it to loop through a series of files in a directory. I've tried the DOS find command, but the problem is that my files are very large and consist of one line, so find does not work: it counts the number of lines that contain the string value. What the find command does do is print a list of files (if I use *.txt as my file identifier) with the count for each. Can someone help me convert this script to do something similar?
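One possible way to adapt the fixed script to a set of files, offered as an untested sketch rather than a definitive answer: wrap the counting routine in a plain FOR loop over a file mask and reset the counter for each file. The names mask and :countFile are made up for this example, and the FOR /F options are written in the shorter "delims^=^ eol^=" form instead of the linefeed trick used above. One caveat that may matter here: cmd variables hold at most roughly 8191 characters, so a very large file that consists of a single line is unlikely to be handled correctly by FOR /F, and this approach may not suit such files.

```batch
@echo off
setlocal disableDelayedExpansion
rem Sketch only: print a per-file count of %str% for every file matching %mask%.
rem "mask" and ":countFile" are illustrative names, not from the original script.
set "str=hand"
set "mask=*.txt"
for %%F in (%mask%) do (
  set "file=%%F"
  set cnt=0
  call :countFile
)
exit /b

:countFile
rem Only lines that contain the string are passed on to the counting loop.
for /f delims^=^ eol^= %%a in ('findstr /i /c:"%str%" "%file%"') do set "ln=%%a"&call :countStr
echo "%file%": %cnt%
exit /b

:countStr
setlocal enableDelayedExpansion
:loop
if defined ln (
  rem Remove everything up to and including the first match; a change means one hit.
  set "ln2=!ln:*%str%=!"
  if "!ln2!" neq "!ln!" (
    set "ln=!ln2!"
    set /a "cnt+=1"
    goto :loop
  )
)
endlocal & set cnt=%cnt%
exit /b
```

Run it from the directory that holds the files; each output line shows a file name followed by its case-insensitive count.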