DosTips.com

A Forum all about DOS Batch

All times are UTC-06:00




Posted: 18 May 2011 09:19 by renzlo
Experts,

How do you count string occurrences? For example, I have a text file with these contents:
hand randomtexthand randomtext
hand randomtext
hand randomtext
hand randomtext

I want to know the total count of "hand", so it should be 5. But when I use find /c the output is 4, because it only counts the lines that contain the string. How can I solve this?


Last edited by renzlo on 18 May 2011 15:55, edited 1 time in total.

Posted: 18 May 2011 10:33 by Ed Dyreen
Code:
@echo off &SetLocal EnableExtensions EnableDelayedExpansion

echo.SKIP>"TST1.TXT"
::
for /f "useback tokens=*" %%! in (

   "TST.TXT"

) do    echo.%%~! >>"TST1.TXT"

set "ReadLine="
::
set /a COUNT = 0
set /a MEM = !COUNT!
::
set /a SKIP = 1
::
:LOOP ()
::
if !COUNT! equ !MEM! echo.SKIP=!SKIP!_

for /f "useback skip=%SKIP% tokens=*" %%! in (

   "TST1.TXT"

) do (
   if defined ReadLine (
      ::
      echo.if /i ["!ReadLine:~0,4!"] == ["hand"]
      if /i ["!ReadLine:~0,4!"] == ["hand"] (
         ::
         set /a COUNT += 1
         echo.COUNT=!COUNT!_

         set "ReadLine=!ReadLine:~4!"
         echo.ReadLine2=!ReadLine!_

      ) else (
         set "ReadLine=!ReadLine:~1!"
         echo.ReadLine0=!ReadLine!_
      )

      if not defined ReadLine (
         ::
         set /a MEM = !COUNT!
         ::
         set /a SKIP += 1
      )

   ) else (
      set "ReadLine=%%~!"
      echo.ReadLine1=!ReadLine!_
      ::
      if /i ["!ReadLine:~0,4!"] == ["hand"] (
         ::
         set /a COUNT += 1
         echo.COUNT=!COUNT!_

         set "ReadLine=!ReadLine:~4!"
         echo.ReadLine2=!ReadLine!_
      )

      if not defined ReadLine (
         ::
         set /a MEM = !COUNT!
         ::
         set /a SKIP += 1
      )
   )
   ::
   goto :LOOP "()"
)
echo.COUNT=!COUNT!_
echo.end
pause
exit


TST.TXT
Code:
hand1 randomtext hand2 handhand
hand5 randomtext
hand6 randomtext


**censored** I lost a whole hour helping you!
censored == Kut in Dutch
censored == Scheisse in German
censored == Merde in French
Why are our languages not being censored?
That is discrimination and racism :mrgreen:


Posted: 18 May 2011 15:44 by renzlo
Thanks but it seems inaccurate. It should be 5 not 6.


Posted: 18 May 2011 15:45 by Ed Dyreen
Can you solve it for me? No offence, I am tired.

Gotta learn the language; it's not that hard. Use pause to debug.


Last edited by Ed Dyreen on 18 May 2011 15:58, edited 1 time in total.

Posted: 18 May 2011 15:54 by renzlo
Ed, thanks for your time, I really appreciate it. If only I knew what the problem is, but I'm still a newbie. :(


Posted: 18 May 2011 15:58 by Ed Dyreen
Maybe delete isdefined += 1 somewhere


Posted: 18 May 2011 16:04 by Ed Dyreen
Code:
@echo off &SetLocal EnableExtensions EnableDelayedExpansion

echo.SKIP>"TST1.TXT"
::
for /f "useback tokens=*" %%! in (

   "TST.TXT"

) do    echo.%%~! >>"TST1.TXT"

set "ReadLine="
::
set /a COUNT = 0
set /a MEM = !COUNT!
::
set /a SKIP = 1
::
:LOOP ()
::
if !COUNT! equ !MEM! echo.SKIP=!SKIP!_

for /f "useback skip=%SKIP% tokens=*" %%! in (

   "TST1.TXT"

) do (
   if defined ReadLine (
      ::
      echo.if /i ["!ReadLine:~0,4!"] == ["hand"]
      if /i ["!ReadLine:~0,4!"] == ["hand"] (
         ::
         set /a COUNT += 1
         echo.COUNT=!COUNT!_

         set "ReadLine=!ReadLine:~4!"
         echo.ReadLine2=!ReadLine!_

      ) else (
         set "ReadLine=!ReadLine:~1!"
         echo.ReadLine0=!ReadLine!_
      )

      if not defined ReadLine (
         ::
         set /a MEM = !COUNT!
         ::
         set /a SKIP += 1
      )

   ) else (
      set "ReadLine=%%~!"
      echo.ReadLine1=!ReadLine!_
      ::
      if /i ["!ReadLine:~0,4!"] == ["hand"] (
         ::
         set /a COUNT += 1
         echo.COUNT=!COUNT!_

         set "ReadLine=!ReadLine:~4!"
         echo.ReadLine2=!ReadLine!_
      )

      if not defined ReadLine (
         ::
         set /a MEM = !COUNT!
         ::
         set /a SKIP += 1
      )
   )
   ::
   goto :LOOP "()"
)
echo.COUNT=!COUNT!_
echo.end
pause
exit


The count is six for this TST.TXT:

hand1 randomtext hand2 handhand
hand5 randomtext
hand6 randomtext


Posted: 18 May 2011 17:36 by renzlo
Thanks Ed. I'm gonna test the above code when I get back home.


Posted: 18 May 2011 19:13 by Dave Benham
If a case insensitive search is acceptable then the following is simpler and I presume significantly faster:
Code:
@echo off
set "str= hand"
set file=hands.txt
set cnt=0
for /f ^"eol^=^

delims^=^" %%a in ('"findstr /i "/c:%str%" %file%"') do set "ln=%%a"&call :countStr

echo '%str%' appears %cnt% times in hands.txt (case insensitive)
exit /b

:countStr
  setlocal enableDelayedExpansion
  :loop
  set "ln2=!ln:*%str%=!"
  if "!ln2!" neq "!ln!" (
    set "ln=!ln2!"
    set /a "cnt+=1"
    goto :loop
  )
  endlocal & set cnt=%cnt%
exit /b

hands.txt
Code:
  HAND randomtext handrandomtext
  handrandomtext
; hand randomtext
  hand randomtext
hand

results:
Code:
' hand' appears 5 times in hands.txt (case insensitive)

I changed the search string to include a leading space and changed the text file to demonstrate issues with the FOR /F eol option. See the eol discussion embedded within "Sorting tokens within a string" for more info. Note that the last hand in hands.txt should not be counted because it is not preceded by a space.

I also changed the text file to include uppercase to demonstrate that the search is indeed case insensitive. FINDSTR is by default case sensitive, but the string substitution technique that I used can only be case insensitive.

Using FINDSTR is still well worthwhile because we don't want to waste time parsing lines that don't contain the search string at all.
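
A minimal sketch of that substitution step on its own, using made-up values (ln and str simply mirror the names in the script above):

```batch
@echo off
setlocal enableDelayedExpansion
rem Illustrative values only, not taken from a real file.
set "str=hand"
set "ln=HAND randomtext handrandomtext"
rem The leading asterisk matches everything up to and including the first
rem occurrence of the search string, ignoring case, and replaces it with nothing.
set "ln2=!ln:*%str%=!"
echo [!ln2!]
```

This should print [ randomtext handrandomtext]: the upper-case HAND was matched and removed along with everything before it, which is why counting this way can only be case insensitive.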

Dave Benham


Posted: 18 May 2011 20:32 by renzlo
Thanks, Dave, for the script. It is working. The problem now is that when hand was mixed with some words like handrandomtext, it is not counted, is there a way to solve this?


Posted: 18 May 2011 20:57 by Dave Benham
renzlo wrote:
the problem now is that when hand was mixed with some words like handrandomtext, it is not counted, is there a way to solve this?

I don't understand - the example text file I ran has that exact string and it is counted properly. If you mean that something like "randomHANDrandom" is not counted, that is because my script intentionally is looking for a " hand" with a space in the front. You can simply modify my script to remove the leading space from the search string.

In other words set "str= hand" becomes set "str=hand"

Dave


Posted: 18 May 2011 21:23 by renzlo
Yes, I did that, Dave.

the code:
Code:
@echo off
set "str=hand"
set file=hands.txt
set cnt=0
for /f ^"eol^=^

delims^=^" %%a in ('"findstr /i "/c:%str%" %file%"') do set "ln=%%a"&call :countStr

echo '%str%' appears %cnt% times in hands.txt (case insensitive)
exit /b

:countStr
  setlocal enableDelayedExpansion
  :loop
  set "ln2=!ln:*%str%=!"
  if "!ln2!" neq "!ln!" (
    set "ln=!ln2!"
    set /a "cnt+=1"
    goto :loop
  )
  endlocal & set cnt=%cnt%
exit /b


the contents of hands.txt:
Code:
HAND randomtexthandrandomtext
handrandomtext
hand randomtext
hand randomtext
hand


output:

Code:
'hand' appears 8 times in hands.txt (case insensitive)


it should be 6.


Posted: 18 May 2011 21:34 by Dave Benham
Oops! There was a bug when the entire line consists of nothing but "hand"(s).
Here is the fix:
Code:
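@echo off
rem Count case-insensitive occurrences of a search string in a text file.
set "str=hand"
set file=hands.txt
set cnt=0
rem These FOR /F options keep each line whole, including lines that start with a semicolon.
rem FINDSTR pre-filters the file so only lines containing the string are parsed.
for /f ^"eol^=^

delims^=^" %%a in ('"findstr /i "/c:%str%" %file%"') do set "ln=%%a"&call :countStr

echo '%str%' appears %cnt% times in %file% (case insensitive)
exit /b

:countStr
  setlocal enableDelayedExpansion
  :loop
  rem Stop once the line is used up, otherwise the substitution below miscounts.
  if defined ln (
    rem Drop everything up to and including the next match.
    set "ln2=!ln:*%str%=!"
    rem If something was removed, a match was found: count it and keep going.
    if "!ln2!" neq "!ln!" (
      set "ln=!ln2!"
      set /a "cnt+=1"
      goto :loop
    )
  )
  rem Carry the updated count back across endlocal.
  endlocal & set cnt=%cnt%
exit /b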
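
To see why the if defined ln guard matters, here is a small sketch (illustrative only) that simulates the state after the last "hand" on a line has been removed. It assumes, as far as I understand cmd's behaviour, that a search-and-replace applied to an undefined variable leaves literal text behind instead of an empty string, which would be consistent with the count of 8 reported above:

```batch
@echo off
setlocal enableDelayedExpansion
rem Illustrative only: ln is empty, as it is after the last match was stripped.
set "ln="
rem Assumption: with ln undefined the substitution is not applied to an empty
rem string, so ln2 ends up holding leftover literal text.
set "ln2=!ln:*hand=!"
if defined ln2 (echo ln2 is not empty: [!ln2!]) else (echo ln2 is empty)
```

If ln2 comes back non-empty, the unguarded loop sees a difference between ln2 and ln and keeps counting matches that are not in the file.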


Dave


Posted: 18 May 2011 21:50 by renzlo
Working great, thanks Dave. Thanks for your time. I really appreciate it.


Posted: 30 Jul 2012 07:15
I know this is a really old thread. I've tried this and it works great. I was wondering how I might get it to loop through a series of files in a directory. I've tried the DOS find command, but the problem is that my files are very large and consist of a single line, so find does not work because it counts the number of lines that contain the string. What find does do is print a list of files (if I use *.txt as my file identifier) with a count for each. Can someone help me convert this script to do something similar?

