-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#1
Post
by renzlo » 21 May 2011 22:49
Experts,
How do you write this in batch?
Source file: source.txt
Code:
C:\randomdir\randomdir\randomdir\randomdir:note.txt thisfile_C1.jpg 10458791 with handwrittenC:\randomdir\randomdir\randomdir\randomdir\randomdir:note thisfile_C1.jpg 10458791 with handwritten
C:\randomdir\randomdir\andomdir\randomdir:notes.txt thisfile_C2.jpg 50459491 with handwritten
C:\randomdir\randomdir:note.txt thisfile_C2.jpg 78458791 with handwritten
C:\randomdir\randomdir\randomdir\randomdir\randomdir\randomdir\randomdir:note.txt thisfile_C1.jpg 90458791 with handwritten
The trailing part of each record (for example, "thisfile_C2.jpg 50459491 with handwritten") should be extracted and written to newfile.txt. The result should look like this:
Output of newfile.txt
Img-------------Serial-------Remarks
thisfile_C1.jpg 10458791 with handwritten
thisfile_C1.jpg 10458791 with handwritten
thisfile_C2.jpg 50459491 with handwritten
thisfile_C2.jpg 78458791 with handwritten
thisfile_C1.jpg 90458791 with handwritten
Thanks in advance.
renzlo
-
!k
- Expert
- Posts: 378
- Joined: 17 Oct 2009 08:30
- Location: Russia
#2
Post
by !k » 22 May 2011 03:31
On the command line:
Code:
type source.txt |sed -n -r -e s/(\swith\shandwritten)/\1\n/p |sed -n -r -e s/.*\s(thisfile_C[0-9].jpg\s[0-9]*\swith\shandwritten)/\1/p > newfile.txt
Last edited by
!k on 22 May 2011 03:35, edited 1 time in total.
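For readers without sed, the two stages of the pipeline above can be sketched in Python. Python is swapped in here purely for illustration; it is not what the thread uses. The first substitution breaks the line after every "with handwritten", which separates fused records, and the second keeps only the image name, serial, and remark from each piece. The function name `extract_records` is hypothetical, and the sample is a shortened version of the fused first line of source.txt.

```python
import re

def extract_records(text):
    """Mimic the two sed stages: split after each remark, then keep the tail."""
    # Stage 1: put a newline after every " with handwritten" so that
    # records fused on one line end up on separate lines.
    split_text = re.sub(r"(\swith\shandwritten)", r"\1\n", text)
    # Stage 2: on each line, keep only "<image> <serial> with handwritten".
    tail = re.compile(r".*\s(thisfile_C[0-9]\.jpg\s[0-9]*\swith\shandwritten)")
    results = []
    for line in split_text.splitlines():
        match = tail.match(line)
        if match:
            results.append(match.group(1))
    return results

# Shortened stand-in for the fused first line of source.txt.
sample = (
    r"C:\randomdir:note.txt thisfile_C1.jpg 10458791 with handwritten"
    r"C:\randomdir\randomdir:note thisfile_C1.jpg 10458791 with handwritten"
)
for record in extract_records(sample):
    print(record)
```

Because the split happens before the extraction, the fused line yields two records, as in the expected newfile.txt.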
-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#3
Post
by renzlo » 22 May 2011 03:34
Thanks for the reply.
I got this:
Code:
'sed' is not recognized as an internal or external command,
operable program or batch file.
-
!k
- Expert
- Posts: 378
- Joined: 17 Oct 2009 08:30
- Location: Russia
#4
Post
by !k » 22 May 2011 03:42
-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#5
Post
by renzlo » 22 May 2011 03:45
Thanks, !k.
By the way, is this possible in pure DOS batch, without using any other program such as sed?
-
!k
- Expert
- Posts: 378
- Joined: 17 Oct 2009 08:30
- Location: Russia
#6
Post
by !k » 22 May 2011 04:19
Code:
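@echo off
rem Pass every line of source.txt to the subroutine :p
for /f "delims=" %%a in (source.txt) do call :p "%%a"
goto :eof

:p
set "str="
rem Walk the space-separated tokens: start collecting at the image name,
rem then append every token that follows it
for %%b in (%~1) do (
  echo %%b |findstr /r /c:"thisfile_C[0-9]\.jpg" >nul &&set "str=%%b" ||if defined str call set "str=%%str%% %%b"
)
echo,%str%>>newfile.txt
goto :eof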
Very dirty, and it fails on input like "... handwrittenC:\..." (two records fused on one line).
Last edited by
!k on 22 May 2011 05:20, edited 1 time in total.
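The token walk in the batch script above, and why it loses a record on the fused line, may be easier to see in a short Python sketch. Python is used only for illustration. The name `collect_from_image` is hypothetical, and `str.split()` only approximates how `for %%b in (...)` tokenizes a line, since batch also splits on commas, semicolons, and equals signs.

```python
import re

# Same test the batch script performs with findstr on each token.
START = re.compile(r"thisfile_C[0-9]\.jpg")

def collect_from_image(line):
    """Walk whitespace-separated tokens; start collecting at the image name."""
    collected = None
    for token in line.split():
        if START.search(token):
            collected = token             # (re)start at a token holding the image name
        elif collected is not None:
            collected += " " + token      # append every later token
    return collected or ""

clean = r"C:\randomdir\randomdir:note.txt thisfile_C2.jpg 78458791 with handwritten"
fused = (
    r"C:\randomdir:note.txt thisfile_C1.jpg 10458791 with handwritten"
    r"C:\randomdir\randomdir:note thisfile_C1.jpg 10458791 with handwritten"
)
print(collect_from_image(clean))
print(collect_from_image(fused))
```

On the fused line the second image name resets the collected string, so only one record comes out instead of two. That is the failure mentioned in the post.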
-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#7
Post
by renzlo » 22 May 2011 04:53
Thanks for the script. Will it still work if "thisfile" changes? The real JPG filenames have the form "8character"_c[0-9].jpg.
Thanks for your time.
-
!k
- Expert
- Posts: 378
- Joined: 17 Oct 2009 08:30
- Location: Russia
#8
Post
by !k » 22 May 2011 05:06
findstr /? wrote:
Regular expression quick reference:
  .         Wildcard: any character
  *         Repeat: zero or more occurances of previous character or class
  ^         Line position: beginning of line
  $         Line position: end of line
  [class]   Character class: any one character in set
  [^class]  Inverse class: any one character not in set
  [x-y]     Range: any characters within the specified range
  \x        Escape: literal use of metacharacter x
  \<xyz     Word position: beginning of word
  xyz\>     Word position: end of word
so
findstr /r /c:"\<........_C[0-9]\.jpg"
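As a rough cross-check of the eight-dot pattern above, here is a small Python sketch. It is an approximation, not findstr itself: `\b` stands in for the word-start anchor `\<`, and `.{8}` for the eight dots. The sample names other than thisfile_C1.jpg are made up.

```python
import re

# Rough Python counterpart of findstr's "\<........_C[0-9]\.jpg":
# \b approximates the word-start anchor \<, and .{8} replaces the eight dots.
IMAGE_NAME = re.compile(r"\b.{8}_C[0-9]\.jpg")

# Hypothetical tokens: two 8-character names, one too-short name, one serial.
for token in ["thisfile_C1.jpg", "ab12cd34_C7.jpg", "short_C1.jpg", "10458791"]:
    print(token, bool(IMAGE_NAME.search(token)))
```

Note that findstr is case-sensitive unless `/i` is given, so names ending in a lowercase `_c1.jpg` would need that switch.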
-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#9
Post
by renzlo » 22 May 2011 05:15
Thanks, !k. It is working great. The only problem is that my source is dirty, and I can't do anything about it. Thanks again.
-
aGerman
- Expert
- Posts: 4654
- Joined: 22 Jan 2010 18:01
- Location: Germany
#10
Post
by aGerman » 22 May 2011 05:28
Well, FINDSTR will find the line that contains the pattern you're looking for, but extracting the matching part is always a problem.
In this case (if you don't want to install a third-party app), I suggest using, for example, VBScript.
*.vbs
Code:
' Input file, output file, and the pattern to extract
strSourceFile = "source.txt"
strDestFile = "newfile.txt"
strPattern = "thisfile_C[0-9][0-9]*\.jpg\s[0-9][0-9]*\swith\shandwritten"

Set objRegEx = New RegExp
With objRegEx
  .Global = True      ' find every match on a line, not only the first
  .IgnoreCase = True
  .Pattern = strPattern
End With

Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objSourceFile = objFSO.OpenTextFile(strSourceFile, 1)     ' 1 = ForReading
Set objDestFile = objFSO.OpenTextFile(strDestFile, 2, True)   ' 2 = ForWriting, create if missing

' Write each match on its own line
While Not objSourceFile.AtEndOfStream
  Set colMatches = objRegEx.Execute(objSourceFile.ReadLine)
  For Each objMatch In colMatches
    objDestFile.WriteLine objMatch.Value
  Next
Wend

objDestFile.Close
objSourceFile.Close
Regards
aGerman
-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#11
Post
by renzlo » 22 May 2011 05:50
Thanks, aGerman. It is working great, but what if the source text contains lines like these:
Code:
C:\randomdir\randomdir:note.txt thisfile_C2.jpg 78458791 with handwritten
C:\randomdir\randomdir:note.txtthisfile_C2.jpg<tab>78458791<tab>with handwritten
Also, can I use a wildcard in place of "thisfile"?
Thanks.
-
aGerman
- Expert
- Posts: 4654
- Joined: 22 Jan 2010 18:01
- Location: Germany
#12
Post
by aGerman » 22 May 2011 06:10
Test it with the following line:
Code:
strPattern = "\w*_C[0-9][0-9]*\.jpg\t[0-9][0-9]*\twith\shandwritten"
Here you can find a reference.
Regards
aGerman
-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#13
Post
by renzlo » 22 May 2011 06:29
Thanks, aGerman. It seems that I need to play with regular expressions. Currently, only the lines with several whitespace characters between fields are not extracted.
-
aGerman
- Expert
- Posts: 4654
- Joined: 22 Jan 2010 18:01
- Location: Germany
#14
Post
by aGerman » 22 May 2011 07:23
Haha, one more try:
Code:
strPattern = "\w*_C[0-9][0-9]*\.jpg\s\s*[0-9][0-9]*\s\s*with\shandwritten"
Regards
aGerman
-
renzlo
- Posts: 116
- Joined: 03 May 2011 19:06
#15
Post
by renzlo » 22 May 2011 08:38
Got it with this:
Code:
strPattern = "\w*_C[0-9][0-9]*\.jpg[\s\t]*[0-9][0-9]*[\s\t]*with[\s\t]*handwritten"
Thanks, aGerman.
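A quick way to see how the final pattern behaves on the awkward lines from this thread is the Python sketch below (Python for illustration only; the thread's solution is VBScript). `FINAL` is the pattern from the post above, unchanged. `STRICT` is a tightened variant and purely an assumption: it uses `\w{8}` because the real filenames were said to have eight characters before `_C`, and `\s+` because `\s` already covers tabs. The third sample line, with runs of spaces, is made up.

```python
import re

# Pattern from the post above, with IGNORECASE mirroring .IgnoreCase = True.
FINAL = re.compile(
    r"\w*_C[0-9][0-9]*\.jpg[\s\t]*[0-9][0-9]*[\s\t]*with[\s\t]*handwritten",
    re.IGNORECASE,
)
# Assumed tighter variant: exactly 8 name characters, at least one whitespace.
STRICT = re.compile(
    r"\w{8}_C[0-9]+\.jpg\s+[0-9]+\s+with\s+handwritten",
    re.IGNORECASE,
)

lines = [
    r"C:\randomdir\randomdir:note.txt thisfile_C2.jpg 78458791 with handwritten",
    "C:\\randomdir\\randomdir:note.txtthisfile_C2.jpg\t78458791\twith handwritten",
    r"C:\randomdir:note.txt thisfile_C1.jpg   10458791    with handwritten",
]
for line in lines:
    print(FINAL.findall(line), STRICT.findall(line))
```

On the second line, where the filename is glued to "note.txt", `\w*` also picks up the leading "txt", while the eight-character variant returns only the image name. Whether that matters depends on how dirty the real source is.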