Searching html? Can a batch file do this?

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Searching html? Can a batch file do this?

#31 Post by foxidrive » 10 May 2012 10:34

Will each entry always appear in this format?

Will the bit in blue (the ');"><td> part) always appear between the number that you want and the reference that you have, ABC1234567 in this case?

patient_id=PREFIX-ID&sps_id=1234567');"><td>ABC1234567</td>


And what is the task?
Do you have an acc.txt that contains a list of these reference numbers, and do you want to extract the 7-digit number for each of them from the html?

Re: Searching html? Can a batch file do this?

#32 Post by foxidrive » 10 May 2012 10:43

Matt20687 wrote:I have been testing a few things and I have found that the sed command is not actually referring to my variable at all! I deleted the variable set "sps=');" which is defined at the start of my script and it still returned the same values.


Yes, I wondered why you put that there. You assumed it was going to do something, but the addition you made wasn't going to change the result. I didn't comment on it because the task was changing anyway.

Matt20687
Posts: 54
Joined: 02 May 2012 14:42

Re: Searching html? Can a batch file do this?

#33 Post by Matt20687 » 10 May 2012 10:51

Hello foxi,

Yes, you are right: the bit in blue always appears between the sps ID and the reference, which is ABC1234567 in this example. I have a text file named acc.txt which holds all of the reference numbers, such as ABC1234567. For each one I would like to extract the 7-digit number that follows sps_id= on the same line of the html as that reference number.

Re: Searching html? Can a batch file do this?

#34 Post by foxidrive » 10 May 2012 11:11

This works here with the sample html that you posted.


Code: Select all

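@echo off
rem Read each reference number from acc.txt, one per line
for /f "delims=" %%a in ('type "acc.txt"') do (
    rem sed prints the 7 digits after sps_id= from the line in lms.txt that holds %%a
    rem the single dot in the pattern stands in for the double quote in the html
    for /f "delims=" %%b in ('sed -n "s/.*sps_id=\([0-9][0-9][0-9][0-9][0-9][0-9][0-9]\)');.><td>%%a.*/\1/p" "lms.txt"') do (
        echo the reference "%%a" returns number "%%b"
        rem run your commands here for the 7 digit number contained in %%b
    )
)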
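
To try the sed expression on its own, here is a minimal sketch run from a POSIX shell instead of from inside the batch file. That is an assumption made for testing; the quoting differs inside for /f. It uses the sample line posted earlier in the thread. The file name sample.txt and the variable names ref and number are made up for illustration.

```shell
#!/bin/sh
# Save the sample html line from the thread to a scratch file
# (sample.txt is a hypothetical name, standing in for lms.txt).
cat > sample.txt <<'EOF'
patient_id=PREFIX-ID&sps_id=1234567');"><td>ABC1234567</td>
EOF

# One reference number, as it would be read from acc.txt.
ref="ABC1234567"

# -n turns off normal printing; the trailing p prints only the lines
# where the substitution matched. \( \) captures the 7 digits and \1
# writes them back as the whole result. The single dot after ');
# matches the double quote in the html.
number=$(sed -n "s/.*sps_id=\([0-9][0-9][0-9][0-9][0-9][0-9][0-9]\)');.><td>${ref}.*/\1/p" sample.txt)

echo "the reference \"$ref\" returns number \"$number\""
```

Lines that do not contain the reference produce no output at all, which is why the inner for /f loop in the batch file only runs for matching lines.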

Re: Searching html? Can a batch file do this?

#35 Post by Matt20687 » 10 May 2012 11:51

You are a legend, foxi!!! Works a treat, thanks. I do appreciate all of your help; I only hope I can pass on what I have learnt to someone else who is in need of assistance.

Thanks Again.
Matt

Re: Searching html? Can a batch file do this?

#36 Post by Matt20687 » 11 May 2012 09:38

Hello foxi,

I thought I had finished!! I got right near the end and discovered I need something similar to what you have provided. I need to search the same lms.txt file, but instead of returning the sps number I need to return an ID. The ID sits between a - and &sps_id, so the layout would be:

-ID&sps_id

The difference this time is that I need it to return the ID shown above. The ID is not a fixed length and can contain both numeric and alphabetical characters (such as ABC123 or C12345677).

Are you able to tweak your code to return the text between the - and &sps_id as specified above?

Re: Searching html? Can a batch file do this?

#37 Post by foxidrive » 11 May 2012 09:49

I think this is what you want:

Code: Select all

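@echo off
rem Read each reference number from acc.txt, one per line
for /f "delims=" %%a in ('type "acc.txt"') do (
    rem sed prints the ID and the 7 digit number separated by a space
    rem tokens=1,2 puts the ID in %%b and the number in %%c
    for /f "tokens=1,2" %%b in ('sed -n "s/.*-\(.*\)&sps_id=\([0-9][0-9][0-9][0-9][0-9][0-9][0-9]\)');.><td>%%a.*/\1 \2/p" "lms.txt"') do (
        echo the reference "%%a" returns ID="%%b" and number "%%c"
        rem run your commands here for the ID in %%b and the 7 digit number in %%c
    )
)
pause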

Re: Searching html? Can a batch file do this?

#38 Post by Matt20687 » 11 May 2012 09:56

Going to try it now, thanks.

Out of interest where does it pull %%c from?

Re: Searching html? Can a batch file do this?

#39 Post by foxidrive » 11 May 2012 10:10

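The "tokens=1,2" option splits each line that sed returns at the space and allocates two variables to the two items. %%b, the variable named in the for command, gets the first item, and the next letter, %%c, is created automatically for the second. The sed command captures the two items in \1 and \2, and the replacement "\1 \2" prints them separated by a space.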
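
As a rough illustration of the two capture groups and the split on the space, here is a small sketch run from a POSIX shell rather than from the batch file (an assumption made for testing). The shell's read command plays the part of "tokens=1,2". The file name sample.txt and the variable names ref, pair, id and number are made up for the example; the html line is the sample posted earlier in the thread.

```shell
#!/bin/sh
# Save the sample html line from the thread to a scratch file
# (sample.txt is a hypothetical name, standing in for lms.txt).
cat > sample.txt <<'EOF'
patient_id=PREFIX-ID&sps_id=1234567');"><td>ABC1234567</td>
EOF

# One reference number, as it would be read from acc.txt.
ref="ABC1234567"

# Two capture groups: \1 is the text between the last "-" and "&sps_id",
# \2 is the 7 digits. The replacement "\1 \2" joins them with one space.
pair=$(sed -n "s/.*-\(.*\)&sps_id=\([0-9][0-9][0-9][0-9][0-9][0-9][0-9]\)');.><td>${ref}.*/\1 \2/p" sample.txt)

# Split the result at the space: first word to id, second to number.
# This mirrors %%b and %%c in the batch file.
read -r id number <<EOF
$pair
EOF

echo "the reference \"$ref\" returns ID=\"$id\" and number \"$number\""
```

For the sample line this gives ID and 1234567, because the text between the hyphen and &sps_id in PREFIX-ID is just ID. If an ID ever contained a space, the split would land in the wrong place, so this only holds while IDs are single words.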
