how to rename files

Posted: 17 Feb 2012 05:51
by doscode
Hello,
I have *.html files with names like "Collection-396.html". Each file contains a title such as "<title>Aasiaat (BGAA)</title>". Is there any way to extract the title, for example "Aasiaat (BGAA)", and rename each HTML file to its title? The resulting file would then be named Aasiaat (BGAA).html

Re: how to rename files

Posted: 17 Feb 2012 07:54
by foxidrive
This requires 32-bit Windows, as 64-bit Windows cannot run the .com file that converts the files from Unix format to DOS format.

It also requires GnuSED for Windows.

The rename commands are written to tempren.bat. You will have to check the target filenames manually in Notepad or a similar editor (or just try it on a copy of the .htm files and see what doesn't rename), because titles can contain characters that are illegal in filenames.

Code: Select all

@echo off

:: On 32-bit Windows a unix->dos converter can be embedded in the
:: batch file with a few echo lines:
echo hD1X-s0P_kUHP0UxGWX4ax1y1ieimnfeinklddmemkjanmndnadmndnpbbn>u2d.com
echo hhpbbnpljhoxolnhaigidpllnbkdnhlkfhlflefblffahfUebdfahhfkokh>>u2d.com
echo oyHAPbP//c@g0oQ7mJG1/HL7t5///u2g0cER8Qg0k5E29Yf//E@Ev5//u1P>>u2d.com
echo /B6WQ1gSns1/HB61//31/.>>u2d.com

:: usage u2d <infile >outfile

echo.@echo off>tempren.bat

for /f "delims=" %%a in ('dir /b *.htm') do (
echo "%%a"
u2d <"%%a" |find /i "<title>" > "temp.tmp"
for /f "delims=" %%c in ('sed "s/.*<title>\(.*\)<\/title>.*/\1/i" "temp.tmp"') do >>tempren.bat echo ren "%%a" "%%c%%~xa"
del "temp.tmp"
)
del u2d.com
echo done - check tempren.bat

Re: how to rename files

Posted: 17 Feb 2012 16:02
by doscode

Code: Select all

echo hD1X-s0P_kUHP0UxGWX4ax1y1ieimnfeinklddmemkjanmndnadmndnpbbn>u2d.com


I thought the > operator is used to write data to a file. So why doesn't this overwrite u2d.com? It makes no sense to me.

Re: how to rename files

Posted: 17 Feb 2012 20:31
by foxidrive
If you do not have a u2d.com file in the current folder then it is creating one; that is the purpose of those lines. Only the first echo line uses >, which creates the file; the others use >> to append to it. It's an ASCII binary.
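
A minimal sketch of how > and >> differ, using a hypothetical scratch file demo.txt that is not part of the script above: > creates the file (or truncates an existing one), and >> appends to it.

```batch
@echo off
rem Illustration only: demo.txt is a hypothetical scratch file.
rem A single redirection arrow creates the file, or empties it first
rem if it already exists.
echo first line>demo.txt
rem A double arrow appends to the existing file instead of replacing it.
echo second line>>demo.txt
rem demo.txt now holds both lines.
type demo.txt
del demo.txt
```

The echo lines that build u2d.com follow the same pattern: one line with > to start the file, then several lines with >> to add the rest.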

Re: how to rename files

Posted: 18 Feb 2012 06:01
by doscode
Would it be possible to save the output of the command to a variable instead of redirecting it to "temp.tmp"?

Re: how to rename files

Posted: 18 Feb 2012 06:09
by foxidrive
There is a reason why it is not in a variable: the string can contain quotes and other poison characters, which are very difficult to process with vanilla batch.
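
For output that is known to be tame, capturing it in a variable is straightforward. The sketch below is only an illustration with a harmless command (cd, which prints the current directory) and a hypothetical variable name, here; it is not a replacement for the temp.tmp approach.

```batch
@echo off
setlocal
rem Capture the single line printed by "cd" (the current directory)
rem in the hypothetical variable "here".
for /f "delims=" %%v in ('cd') do set "here=%%v"
rem This echo is the fragile part: an ampersand or pipe character
rem inside the captured text would be treated as command syntax.
echo Current directory: %here%
```

With HTML titles there is no guarantee that the captured text is that well behaved, which is why the script keeps it in a file.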

Is this for a one off task or is it a regular task? Have you tried it?

If you want to improve it then consider using transliteration in SED and map the illegal characters to your choice of filename characters.
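
A minimal sketch of that transliteration idea, assuming GNU sed is on the PATH. The file sample.txt and its content are hypothetical, and only four of the illegal characters are mapped here; the full set would need more care with batch quoting.

```batch
@echo off
rem Assumes GNU sed is on the PATH; sample.txt is a hypothetical test file.
>sample.txt echo AC/DC: Live?
rem The y command maps each character of the first list to the character
rem at the same position in the second list. Here slash, colon,
rem question mark and asterisk are all mapped to a dash.
sed "y/\/:?*/----/" sample.txt
del sample.txt
```

With the sample line this should print AC-DC- Live-. The same y command could be added to the existing sed call so that the generated ren lines contain only legal names.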

Re: how to rename files

Posted: 18 Feb 2012 06:23
by doscode
Yeah, it is a one-off task, not a regular one. And I must understand things before I run them. But this seems too complicated and I would like to simplify it. It seems to me that u2d.com is not necessary. There are no poison characters in the titles, only possibly a slash "/". But the slash is not a problem: if it is there, it could be changed to "-" or removed. I expect the titles to be simple, with no traps like quotes. I was going to use your loop with this command:

Code: Select all

for /f "delims=" %%a in ('dir /b *.a') do (type %%a | sed.exe -n "s/<title>\(.*\)<\/title>/\1/Ip")

I am just looking for a way to pass the result to a variable. Here are some tips that I have found: http://justgeeks.blogspot.com/2008/07/save-output-of-dos-command-to-variable.html
http://www.computerhope.com/forum/index.php?topic=65683.0
But maybe our command is too long or too complicated for this.

Re: how to rename files

Posted: 18 Feb 2012 07:10
by foxidrive
doscode wrote:Yeah, it is a one-off task, not a regular one. And I must understand things before I run them. But this seems too complicated and I would like to simplify it. It seems to me that u2d.com is not necessary.


You run an antivirus program without understanding it, I'm sure. You probably run dozens of utility programs without seeing the code.

BTW, converting Unix line endings to MS-DOS line endings is necessary for many batch functions. That is what this implementation of u2d does.

The author of u2d and many other useful batch and command line tools is Herbert Kleebauer and the source code is available on the web.

Re: how to rename files

Posted: 18 Feb 2012 07:11
by Squashman
Just quick and dirty, without taking any poison characters into consideration.
Take this as your input, which is actually the first few lines of the source of this webpage:

Code: Select all

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-gb" xml:lang="en-gb">
<head>

<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta http-equiv="content-language" content="en-gb" />
<meta http-equiv="content-style-type" content="text/css" />
<meta http-equiv="imagetoolbar" content="no" />
<meta name="resource-type" content="document" />
<meta name="distribution" content="global" />
<meta name="copyright" content="2000, 2002, 2005, 2007 phpBB Group" />
<meta name="keywords" content="" />
<meta name="description" content="" />

<title>DosTips.com - View topic - how to rename files</title>

<link rel="stylesheet" href="./styles/avalon/theme/stylesheet.css" type="text/css" />
<!--[if IE]>
<link rel="stylesheet" type="text/css" href="./styles/avalon/theme/ie7.css" />
<![endif]-->



You can use this code

Code: Select all

FOR /F "Tokens=2 delims=><" %%G in ('find /i "<title>" Dostips.html') do set title=%%G

Re: how to rename files

Posted: 18 Feb 2012 07:24
by foxidrive
Squashman, that's fine for neat and tidy html.

I tested mine on normal HTML files I have saved from the web - here is an example of the line containing the title:

Code: Select all

<html><head><meta HTTP-EQUIV="content-type" CONTENT="text/html; charset=ISO-8859-1"><title>Google Search:  </title><style><!--


Try your code on that. :)

Re: how to rename files

Posted: 18 Feb 2012 07:35
by Squashman
I see your point.
So not only do we have to take into consideration the evil characters batch doesn't like, we also have to take into account that the TITLE tag may not be on a line by itself.

Re: how to rename files

Posted: 18 Feb 2012 07:38
by foxidrive
Squashman wrote:I see your point.
So not only do we have to take into consideration the evil characters batch doesn't like, we also have to take into account that the TITLE tag may not be on a line by itself.



Yes. My assumption in the code is that both <title> and </title> are on the same line but even that could be false.

Re: how to rename files

Posted: 18 Feb 2012 08:11
by Squashman
I can get it down to this:
Google Search: </title><style><!--
I can't seem to get everything after the closing title tag to go away.

Re: how to rename files

Posted: 18 Feb 2012 08:49
by foxidrive
Here's another option - I don't think it is anywhere near as robust but it worked on the few files I tested it on.

Code: Select all

@echo off
echo.@echo off>tempren.bat

for /f "delims=" %%a in ('dir /b *.htm') do (
for /f "delims=" %%b in ('find /i "<title>" ^< "%%a"') do (
set "source=%%a"
set "var=%%b"
call :a
)
)
echo done - check tempren.bat
pause
goto :EOF

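:a
:: strip everything up to and including the first opening title tag
set "var=%var:*<title>=%"
:: keep only the text before the next less-than sign
for /f "delims=<" %%c in ("%var%") do set "target=%%c.html"
:: replace some characters that are illegal in filenames with a dash
set "target=%target::=-%"
set "target=%target:\=-%"
set "target=%target:/=-%"
set "target=%target:|=-%"
set "target=%target:?=-%"
>>tempren.bat echo ren "%source%" "%target%"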

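To see what the :a routine does with a messy line, the sketch below runs the same string operations on a fixed sample, the Google title line quoted earlier in the thread, slightly shortened. It only echoes the resulting name and renames nothing.

```batch
@echo off
setlocal
rem Sample input: the Google title line from earlier in the thread.
set "var=<html><head><title>Google Search:  </title><style>"
rem Drop everything up to and including the first opening title tag.
set "var=%var:*<title>=%"
rem Keep the text before the next less-than sign and add the extension.
for /f "delims=<" %%c in ("%var%") do set "target=%%c.html"
rem A colon is illegal in a filename, so map it to a dash.
set "target=%target::=-%"
echo "%target%"
```

The echoed name should start with Google Search- and end in .html, with the colon replaced. The other substitutions in :a work the same way for their characters.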

Re: how to rename files

Posted: 18 Feb 2012 09:50
by doscode
foxidrive wrote:You run an antivirus program without understanding it, I'm sure. You probably run dozens of utility programs without seeing the code.


It is in my nature to always try to understand code. I am not satisfied with long code when I do not know how it works. It is better to understand it, so that I can find my own way next time. This matters especially on Windows, where I often work; CMD is such a basic tool there that I think I should know how to work with it. It can save quite a lot of time sometimes.