String extract question 

Joined: 03 May 2011 19:06
Posts: 116
Post String extract question
Experts,

How do you write this in batch?

Source file (source.txt):
Code:
C:\randomdir\randomdir\randomdir\randomdir:note.txt  thisfile_C1.jpg 10458791 with handwrittenC:\randomdir\randomdir\randomdir\randomdir\randomdir:note  thisfile_C1.jpg 10458791 with handwritten
C:\randomdir\randomdir\andomdir\randomdir:notes.txt  thisfile_C2.jpg 50459491 with handwritten
C:\randomdir\randomdir:note.txt  thisfile_C2.jpg 78458791 with handwritten
C:\randomdir\randomdir\randomdir\randomdir\randomdir\randomdir\randomdir:note.txt  thisfile_C1.jpg 90458791 with handwritten


The trailing part of each record, e.g. "thisfile_C1.jpg 10458791 with handwritten", should be extracted and written to newfile.txt. The result should look like this:
Output in newfile.txt:
Quote:
Img-------------Serial-------Remarks

thisfile_C1.jpg 10458791 with handwritten
thisfile_C1.jpg 10458791 with handwritten
thisfile_C2.jpg 50459491 with handwritten
thisfile_C2.jpg 78458791 with handwritten
thisfile_C1.jpg 90458791 with handwritten


Thanks in advance.

renzlo


21 May 2011 22:49
Expert

Joined: 17 Oct 2009 08:30
Posts: 378
Location: Russia
Post Re: String extract question
From the command line (requires GNU sed):
Code:
type source.txt |sed -n -r -e s/(\swith\shandwritten)/\1\n/p |sed -n -r -e s/.*\s(thisfile_C[0-9]\.jpg\s[0-9]*\swith\shandwritten)/\1/p > newfile.txt
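
The two sed stages can be hard to read in one line, so here is a rough Python sketch of the same idea: first break fused records apart after "with handwritten", then keep only the "file serial remark" tail of each piece. It is only an illustration under the assumption that Python's `re` treats these patterns like GNU sed does; the function name `extract` is made up for this sketch.

```python
import re

# Sample lines copied from source.txt in the question; the first holds two fused records.
lines = [
    r"C:\randomdir\randomdir\randomdir\randomdir:note.txt  thisfile_C1.jpg 10458791 with handwritten"
    r"C:\randomdir\randomdir\randomdir\randomdir\randomdir:note  thisfile_C1.jpg 10458791 with handwritten",
    r"C:\randomdir\randomdir\andomdir\randomdir:notes.txt  thisfile_C2.jpg 50459491 with handwritten",
]

# Second sed stage: everything up to a whitespace, then the part we want to keep.
tail = re.compile(r".*\s(thisfile_C[0-9]\.jpg\s[0-9]*\swith\shandwritten)")

def extract(line):
    # First sed stage: insert a line break after "with handwritten".
    # (re.sub replaces every occurrence; the sed command, having no g flag,
    # only replaces the first one per line.)
    pieces = re.sub(r"(\swith\shandwritten)", r"\1\n", line).split("\n")
    return [m.group(1) for m in map(tail.match, pieces) if m]

records = [rec for line in lines for rec in extract(line)]
for rec in records:
    print(rec)
```

The fused first line yields two records and the second line yields one, which is the behaviour the question asks for.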


Last edited by !k on 22 May 2011 03:35, edited 1 time in total.



22 May 2011 03:31

Joined: 03 May 2011 19:06
Posts: 116
Post Re: String extract question
Thanks for the reply.

I got this:

Code:
'sed' is not recognized as an internal or external command,
operable program or batch file.


22 May 2011 03:34
Expert

Joined: 17 Oct 2009 08:30
Posts: 378
Location: Russia
Post Re: String extract question
Download: http://sourceforge.net/projects/gnuwin32/files/sed/4.2.1/sed-4.2.1-bin.zip/download (317.9 KB)
Homepage: http://sourceforge.net/projects/gnuwin32/files/sed/


22 May 2011 03:42

Joined: 03 May 2011 19:06
Posts: 116
Post Re: String extract question
Thanks, !k.

By the way, is this possible in pure DOS batch, without using another program like sed?


22 May 2011 03:45
Expert

Joined: 17 Oct 2009 08:30
Posts: 378
Location: Russia
Post Re: String extract question
Code:
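@echo off
rem Pass every line of source.txt, as one quoted argument, to the :p subroutine.
for /f "delims=" %%a in (source.txt) do call :p "%%a"
goto :eof

:p
rem Start collecting at the token that matches the file name, then append every later token.
set "str="
for %%b in (%~1) do (
echo %%b |findstr /r /c:"thisfile_C[0-9]\.jpg" >nul &&set "str=%%b" ||if defined str call set "str=%%str%% %%b"
)
echo,%str%>>newfile.txt
goto :eof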

Very dirty, and it fails on the fused "... handwrittenC:\..." record.
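
To see why the fused record breaks the batch script, here is a small Python model of the token loop in the :p subroutine. It is a simplification (Python's `split()` only splits on whitespace, while the batch `for` also splits on commas, semicolons and equals signs), and the name `last_record` is made up for this sketch.

```python
import re

def last_record(line):
    # Mirrors :p, restart at a file-name token, otherwise append to what was collected.
    collected = None
    for token in line.split():
        if re.search(r"thisfile_C[0-9]\.jpg", token):
            collected = token          # a second file name discards the first record
        elif collected is not None:
            collected += " " + token
    return collected

# First line of source.txt from the question: two records fused together.
fused = (
    r"C:\randomdir\randomdir\randomdir\randomdir:note.txt  thisfile_C1.jpg 10458791 with handwritten"
    r"C:\randomdir\randomdir\randomdir\randomdir\randomdir:note  thisfile_C1.jpg 10458791 with handwritten"
)
result = last_record(fused)
print(result)
```

Only one record comes out of a line that contains two, because the variable is reset as soon as the second file name is seen.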


Last edited by !k on 22 May 2011 05:20, edited 1 time in total.



22 May 2011 04:19

Joined: 03 May 2011 19:06
Posts: 116
Post Re: String extract question
Thanks for the script. Will this still work if "thisfile" changes? The real JPG file names have the form "8character"_C[0-9].jpg.

Thanks for your time.


22 May 2011 04:53
Expert

Joined: 17 Oct 2009 08:30
Posts: 378
Location: Russia
Post Re: String extract question
findstr/? wrote:
Regular expression quick reference:
. Wildcard: any character
* Repeat: zero or more occurances of previous character or class
^ Line position: beginning of line
$ Line position: end of line
[class] Character class: any one character in set
[^class] Inverse class: any one character not in set
[x-y] Range: any characters within the specified range
\x Escape: literal use of metacharacter x
\<xyz Word position: beginning of word
xyz\> Word position: end of word


So use eight dots, one per character of the name: findstr /r /c:"\<........_C[0-9]\.jpg"
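
As a rough check of what the eight-dot pattern accepts, here is a Python analogue. It assumes `\b` is a fair stand-in for findstr's `\<` and `.{8}` for the eight dots; the token list, including the name "abcd1234", is invented for illustration.

```python
import re

# Python analogue of findstr's "\<........_C[0-9]\.jpg" (an approximation, not the same engine).
name = re.compile(r"\b.{8}_C[0-9]\.jpg")

# Hypothetical tokens: a path, an 8-character name, a serial number, and a name that is too short.
tokens = ["C:\\randomdir:note.txt", "abcd1234_C7.jpg", "10458791", "short_C1.jpg"]
hits = [t for t in tokens if name.search(t)]
print(hits)
```

Only the token with exactly eight characters before "_C" is kept; the shorter name is rejected.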


22 May 2011 05:06

Joined: 03 May 2011 19:06
Posts: 116
Post Re: String extract question
Thanks, !k. It is working great. The only problem is that my source is dirty, and I can't do anything about it. Thanks again.


22 May 2011 05:15
Expert

Joined: 22 Jan 2010 18:01
Posts: 1750
Location: Germany
Post Re: String extract question
Well, FINDSTR will find the line that contains the pattern you're looking for, but extracting the matched text from that line is always a problem.
In this case (if you don't want to install a 3rd party app) I suggest using VBScript, for example.

*.vbs
Code:
' Input file, output file, and the pattern to extract
strSourceFile = "source.txt"
strDestFile = "newfile.txt"
strPattern = "thisfile_C[0-9][0-9]*\.jpg\s[0-9][0-9]*\swith\shandwritten"

Set objRegEx = New RegExp
With objRegEx
  .Global = True      ' find every match in a line, not just the first
  .IgnoreCase = True
  .Pattern = strPattern
End With

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objSourceFile = objFSO.OpenTextFile(strSourceFile, 1)     ' 1 = ForReading
Set objDestFile = objFSO.OpenTextFile(strDestFile, 2, True)   ' 2 = ForWriting, True = create if missing

' Write each match on its own line
While Not objSourceFile.AtEndOfStream
  Set colMatches = objRegEx.Execute(objSourceFile.ReadLine)
  For Each objMatch In colMatches
    objDestFile.WriteLine objMatch.Value
  Next
Wend

objDestFile.Close
objSourceFile.Close


Regards
aGerman


22 May 2011 05:28

Joined: 03 May 2011 19:06
Posts: 116
Post Re: String extract question
Thanks, aGerman. It is working great, but what if there's a line in the source text that looks like this:

Code:
C:\randomdir\randomdir:note.txt      thisfile_C2.jpg       78458791  with  handwritten
C:\randomdir\randomdir:note.txtthisfile_C2.jpg<tab>78458791<tab>with  handwritten


Also, can I use a wildcard in place of "thisfile"?

Thanks.


22 May 2011 05:50
Expert

Joined: 22 Jan 2010 18:01
Posts: 1750
Location: Germany
Post Re: String extract question
Test it with the following line:
Code:
strPattern = "\w*_C[0-9][0-9]*\.jpg\t[0-9][0-9]*\twith\shandwritten"


Here you can find a reference.

Regards
aGerman


22 May 2011 06:10

Joined: 03 May 2011 19:06
Posts: 116
Post Re: String extract question
Thanks, aGerman. It seems that I need to experiment with the regular expression. Currently, only the lines with several whitespace characters between fields are not extracted.


22 May 2011 06:29
Expert

Joined: 22 Jan 2010 18:01
Posts: 1750
Location: Germany
Post Re: String extract question
Haha, one more try:
Code:
strPattern = "\w*_C[0-9][0-9]*\.jpg\s\s*[0-9][0-9]*\s\s*with\shandwritten"


Regards
aGerman


22 May 2011 07:23

Joined: 03 May 2011 19:06
Posts: 116
Post Re: String extract question
Got it with this:

Code:
strPattern = "\w*_C[0-9][0-9]*\.jpg[\s\t]*[0-9][0-9]*[\s\t]*with[\s\t]*handwritten"
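
A quick way to check this pattern against the two problem lines quoted earlier is the Python sketch below. It assumes Python's `re` handles the pattern the same way VBScript's RegExp does. The `strict` pattern is a hypothetical variant that is not from the thread: it relies on the earlier statement that the real names have eight characters, and it requires at least one whitespace character between fields.

```python
import re

# Pattern from the post above, unchanged. \s already includes tabs, so [\s\t] behaves like \s.
loose = re.compile(
    r"\w*_C[0-9][0-9]*\.jpg[\s\t]*[0-9][0-9]*[\s\t]*with[\s\t]*handwritten",
    re.IGNORECASE,  # mirrors .IgnoreCase = True in the VBScript
)
# Hypothetical stricter variant: exactly eight name characters before "_C".
strict = re.compile(r"\w{8}_C[0-9]+\.jpg\s+[0-9]+\s+with\s+handwritten", re.IGNORECASE)

# The two sample lines from the earlier post, with <tab> written as a real tab.
samples = [
    "C:\\randomdir\\randomdir:note.txt      thisfile_C2.jpg       78458791  with  handwritten",
    "C:\\randomdir\\randomdir:note.txtthisfile_C2.jpg\t78458791\twith  handwritten",
]

def extract(pattern):
    # Collapse each run of spaces or tabs to a single space so the output is uniform.
    return [" ".join(m.group(0).split()) for s in samples for m in pattern.finditer(s)]

loose_out = extract(loose)
strict_out = extract(strict)
print(loose_out)
print(strict_out)
```

In this sketch the original pattern picks up the "txt" of the preceding "note.txt" on the second line, because `\w*` also matches those letters. Fixing the name length at eight characters avoids that, provided every real file name follows that form.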


thanks aGerman.


22 May 2011 08:38