
some absurdities w/ if exist and escaping with caret and ""

Posted: 11 Sep 2013 15:52
by taripo

I don't know if this is just with if exist,

can anybody explain this behavior?

exhibit a
I test whether a directory exists, but get a different result depending on whether I include quotes or not.

Code: Select all

C:\>if exist c:\abcde\nul echo yeah
yeah

C:\>if exist "c:\abcde\nul" echo yeah

C:\>


exhibit b
Normally one only needs to place quotes around the part where there is an issue and it's fine. But here, I get a different result with each of these

Code: Select all

C:\>IF exist c:\program fi"les (x86)"\gnuwin32\bin echo yeah

C:\>IF exist "c:\program files (x86)"\gnuwin32\bin echo yeah
yeah

C:\>



exhibit c
Normally one can interchange ^ and "" when it comes to a single character, but not here. I don't see why.

Code: Select all

(i'll just do a cd and dir /ad/b so you know what directory tree I have)
C:\abcde\a(b)c>

C:\>dir c:\abcde /ad/b
a(b)c
asd

C:\>


C:\>if exist c:\ab" "cde\a(b)c echo yeah
yeah

C:\>if exist c:\ab^ cde\a(b)c echo yeah

C:\>

Re: some absurdities w/ if exist and escaping with caret and

Posted: 11 Sep 2013 17:10
by penpen
@exhibit a
The name nul is a reserved word in MS-DOS/Windows file systems (all of them).
One of the reasons is what you have named exhibit a.
The cause: checking devices has higher priority than checking file objects.
With the above you are checking for the existence of the nul device in the path "C:\abcde",
which fails, as the device is not located in the path "C:\abcde".
So the if command works correctly.

You may call the nul device by typing nul in the dos shell.
A message should pop up that says "cannot be accessed" or something similar.
With all allowed filenames, "if exist" works as you expect.
You may read this for more information:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

@exhibit b, c
Characters preceded by a caret are escaped, which means the next character has no special meaning,
so the caret-space combination is expanded to a space.
The tokenization (of the escaped string) explains the behavior you see:

Code: Select all

[IF] [exist] [c:\program] [fi"les (x86)"\gnuwin32\bin] [echo] [yeah]
[IF] [exist] ["c:\program files (x86)"\gnuwin32\bin] [echo] [yeah]

[if] [exist] [c:\ab" "cde\a(b)c] [echo] [yeah]
[if] [exist] [c:\ab] [cde\a(b)c] [echo] [yeah]

During path normalization the string is expanded to its content, so the "" in your examples simply vanish.
I assume you have an additional directory path "c:\ab cde\a(b)c", as I tested this without that path and
it failed on WinXP Home/Prof and Win7 Home 32-bit.

penpen

Re: some absurdities w/ if exist and escaping with caret and

Posted: 12 Sep 2013 00:40
by taripo
thanks..

where can I read about "path normalization" specifically, "path normalization" in the dos prompt or msdos?

(I don't see anything about it on HELP.COM in html form here http://www.vfrazee.com/ms-dos/6.22/help/)

regarding exhibit a,
I have that directory, and I know about the nul device so was intending to use the nul device to test for the directory.

I guess the quotes make any reference to the nul device, literal in the sense that now it will look only for file objects and not the nul device?

If so, how can the unexpected result in exhibit d be explained?


exhibit d

Code: Select all

C:\crp>if exist "c:\crp\nul" echo yeah  <--- expected

C:\crp>copy con nul
fd^Z
        1 file(s) copied.

C:\crp>if exist "c:\crp\nul" echo yeah  <-- unexpected

C:\crp>



regarding exhibit c


this tokenization makes sense to me (and I do have that directory)

Code: Select all

C:\>if exist c:\ab" "cde\a(b)c echo yeah   <-- expected. so, no surprise there
yeah

I see it's tokenizing as  [c:\ab cde\a(b)c]  and I guess that's a file object.  It matches my directory, fine.


this one below seems strange, I don't know what the tokenization is. I would expect the ^ preceding the space, to have the same
effect as " " and so I would have expected one token [c:\ab cde\a(b)c] and thus I would expect it to see my directory.

exhibit c -improved

Code: Select all

C:\ab cde\a(b)c>if exist c:\ab^ cde\a(b)c echo yeah <--- unexpected. i would expect yeah to be echoed.

C:\ab cde\a(b)c>

Re: some absurdities w/ if exist and escaping with caret and

Posted: 12 Sep 2013 00:41
by foxidrive
taripo wrote:I test to see if a directory exists. but get a different result based on whether I included quotes or not.

Code: Select all

C:\>if exist c:\abcde\nul echo yeah



The trick with nul worked in pre NT versions of windows.

Now you would use this, with a trailing backslash.

if exist "C:\abcde\" echo the folder exists


For a file you would use this: the quotes protect against poison characters.

IF exist "c:\program files (x86)\gnuwin32\bin.exe" echo file exists
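
A minimal sketch of the trailing-backslash test described above, wrapped in a small script. The path in `target` is only a placeholder borrowed from exhibit a, and the behaviour relied on is the one reported in this thread for NT-based Windows, not something taken from official documentation.

```batch
@echo off
setlocal
rem Placeholder path from exhibit a; change it to the path you want to test.
set "target=C:\abcde"

rem A trailing backslash only matches when the name refers to a directory.
if exist "%target%\" (
    echo "%target%" is a directory
) else if exist "%target%" (
    echo "%target%" exists but is not a directory
) else (
    echo "%target%" does not exist
)
```

Keeping the whole path inside one pair of quotes also sidesteps the argument-splitting questions raised in exhibits b and c.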

Re: some absurdities w/ if exist and escaping with caret and

Posted: 12 Sep 2013 01:40
by taripo
@foxidrive, that is totally not what i'm asking (and by the way, nul still works). And I did not ask how to look for a file or directory that contains a space. That is obvious.

penpen understood my question, though there are still some things that seem odd to me, as mentioned in my reply to him.

Re: some absurdities w/ if exist and escaping with caret and

Posted: 12 Sep 2013 02:33
by foxidrive
taripo wrote:@foxidrive, that is totally not what i'm asking (and by the way, nul still works).


Then don't use code that no longer works in NT systems and you won't be corrected. Nul doesn't work reliably as a test for directories.

You proved it yourself in your question, so I answered it. Show some appreciation.

taripo wrote:I don't know if this is just with if exist,

can anybody explain this behavior?

exhibit a
I test to see if a directory exists. but get a different result based on whether I included quotes or not.

Code: Select all

C:\>if exist c:\abcde\nul echo yeah
yeah

C:\>if exist "c:\abcde\nul" echo yeah

C:\>


Re: some absurdities w/ if exist and escaping with caret and

Posted: 12 Sep 2013 04:14
by taripo
taripo wrote:@foxidrive, that is totally not what i'm asking (and by the way, nul still works).


foxidrive wrote:Then don't use code that no longer works in NT systems and you won't be corrected.
Nul doesn't work reliably as a test for directories.


taripo wrote:that is news to me, good that you make that correction now. But you didn't make it before.




foxidrive wrote:You proved it yourself in your question



For the record, no, that didn't prove that using nul to test for directories no longer works reliably in NT.

It may be though that in practice one shouldn't bother with nul to test for directories in NT, because it is too unpredictable and undocumented.


The only way to come close to proving what you say, would be to test it in DOS and get a different result to NT. I have since done that.

Technically that doesn't prove that nul is unreliable as a test for directories; it just proves that it works differently in NT. A test shows that a trailing backslash works when testing for a directory in NT, while my test on DOS shows that it doesn't work there.

I don't know if you read in some documentation that nul shouldn't be used in NT to test for directories?

It's only unreliable if you don't know how it works. But not knowing how it works in NT, and if nobody has figured out and described how it works, i'd use trailing backslash after the directory instead of nul when testing for directories.

And somebody might have figured out how nul behaves in NT, which would enable a technical answer stating the differences between them.


And that still does not explain this which doesn't involve nul

Code: Select all

C:\ab cde\a(b)c>if exist c:\ab^ cde\a(b)c echo yeah <--- unexpected. i would expect yeah to be echoed.

C:\ab cde\a(b)c>


yes obviously foxtrot.. I can do if exist "c:\ab cde\a(b)c\"

But caret and "" -do- work in NT

So i'm asking anybody that might know, why -that- which I typed there in the code tags, with the caret and the quotes used, isn't working?

Re: some absurdities w/ if exist and escaping with caret and

Posted: 12 Sep 2013 17:15
by penpen
taripo wrote:where can I read about "path normalization" specifically, "path normalization" in the dos prompt or msdos?
Normalization is a term from mathematics that means bringing something into one specific representation. In this case normalization is
just the transformation of an arbitrary path into the fully qualified path. I'm sure you can do this intuitively;
if not, see the rules at the URL I gave above. The quotes and escape sequences are replaced by the
command line interpreter itself, but the resulting string is handled as one token. So the path normalization
is done on a 'normal' string. It uses only that string and the current directory value, which is
tracked by the system; it doesn't check the existence of any paths, files, ... :

Code: Select all

Z:\>dir "i bet this path does not exist, but that doesn't matter even if it contains these characters: \/:*?""<>| \..\..\." /B
tokenize.bat
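
One way to look at the normalized form without relying on the file system is the `~f` modifier of a `for` variable, which expands to a fully qualified path. This is only a sketch with a made-up path (`no such dir` and `other` are hypothetical and do not need to exist):

```batch
@echo off
rem The path below is hypothetical and does not need to exist.
rem %%~fI expands to the fully qualified path, resolving . and .. on the way.
for %%I in ("C:\no such dir\..\other\.\file.txt") do echo %%~fI
```

This should print C:\other\file.txt, which illustrates that the transformation works on the string alone.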


taripo wrote:I guess the quotes make any reference to the nul device, literal in the sense that now it will look only for file objects and not the nul device?
I'm not sure whether that is true; it may be, but I won't bet on it. The Namespace paragraph in the given link is a little vague.
If more information is available, it can surely be found in some MS C++ documentation.
I would handle it this way, as the namespaces of nul may be linked explicitly to a folder:
foxidrive wrote:Nul doesn't work reliably as a test for directories.
Edit: This means don't use nul, com1, ... within filenames.
taripo wrote:If so, how can the unexpected result in exhibit d be explained?
exhibit d

Code: Select all

C:\crp>if exist "c:\crp\nul" echo yeah  <--- expected

C:\crp>copy con nul
fd^Z
        1 file(s) copied.

C:\crp>if exist "c:\crp\nul" echo yeah  <-- unexpected

C:\crp>

The copy command (built into cmd.exe rather than a separate copy.exe), or functionality it calls, may be able to set, unset,
link, unlink, hide, ... namespaces and symbolic links visible to each file system object.
Or it is a result of how Windows handles the namespaces mentioned in the second point
directly above; I don't know.

taripo wrote:this one below seems strange, I don't know what the tokenization is. I would expect the ^ preceding the space, to have the same
effect as " " and so I would have expected one token [c:\ab cde\a(b)c] and thus I would expect it to see my directory.

exhibit c -improved

Code: Select all

C:\ab cde\a(b)c>if exist c:\ab^ cde\a(b)c echo yeah <--- unexpected. i would expect yeah to be echoed.

C:\ab cde\a(b)c>

No, your expectation is wrong; I posted it above. The tokenization is done after ^SPACE has been replaced by a plain SPACE.
Here is a small batch file that should display the tokenization:

Code: Select all

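@echo off
rem tokenize.bat: print each argument in brackets, as the script receives them
rem If at most one argument is left, print it and stop
if "%~2" == "" goto :last
:loop
rem Print the current argument without a trailing newline
set /P "=[%1] " < nul
shift
if not "%~2" == "" goto loop
:last
echo [%1]

Then just use it: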

Code: Select all

Z:\>tokenize if exist c:\ab^ cde\a(b)c echo yeah
[if] [exist] [c:\ab] [cde\a(b)c] [echo] [yeah]


penpen

Re: some absurdities w/ if exist and escaping with caret and

Posted: 13 Sep 2013 17:19
by taripo
thanks pen that's great. I didn't realise caret didn't escape space, but indeed it doesn't.
I see that while the caret gets removed from ^SPACE, the space remains a separator of parameters and not a literal space.
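
To put the three spellings from exhibit c side by side, here is a small sketch. It assumes that the directory C:\ab cde\a(b)c from the thread exists and that no C:\ab exists; the echoed messages are made up for illustration.

```batch
@echo off
rem Assumes C:\ab cde\a(b)c exists and C:\ab does not (see exhibit c).

rem Whole path quoted: one argument, the space is part of the path.
if exist "c:\ab cde\a(b)c" echo quoted path: found

rem Only the space quoted: still one argument.
if exist c:\ab" "cde\a(b)c echo quoted space: found

rem Caret before the space: the caret is dropped, but the space still ends
rem the argument, so IF tests c:\ab and treats the rest as the command.
if exist c:\ab^ cde\a(b)c echo caret and space: found
```

Going by the results reported in this thread, only the first two lines are expected to print their message.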

Re: some absurdities w/ if exist and escaping with caret and

Posted: 15 Sep 2013 15:07
by penpen
A separator may be implemented by adding the character to the list of control characters and attaching a special function (end token) to it, although this is rather laborious.
But there is a hint that this is not the case in batch scripts, with a more or less funny side effect (first):

Code: Select all

Z:>set see the hint
Environment variable see the not defined

Z:>set see the hint=true
see the hint=true

Z:>set see
see the hint=true

If the space were a control character, then you would have needed to escape the space in the second set instruction above to make it work.
Sadly, being able to define such variables is not of much use within batch scripts.
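
The same leniency of set can be reproduced in a batch file. The following sketch uses made-up variable names; it shows that spaces before the equals sign become part of the variable name and spaces after it become part of the value, which is the practical consequence of set not treating the space as a separator.

```batch
@echo off
setlocal
rem Hypothetical variable names, used only for illustration.

rem The space is part of the name: this defines a variable called "see the hint".
set see the hint=true
echo [%see the hint%]

rem Spaces around = are kept too: the name is "demo " and the value is " value".
set demo = value
echo [%demo %]
```

This should print [true] followed by [ value]. Writing the assignment as set "name=value" avoids such accidental spaces.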

And in addition:
The caret is a control character whose function is to prevent the following character from being recognized as a control character by a program, so the caret is an escape character of the command line interpreter. This is commonly done by changing the state of a program, which may fail if such a state is unreachable from the current state.
So it is common practice that the escape character is removed only if its function could be executed successfully, to show where errors occur.
So if the escape character disappears, it has 'escaped' the following character successfully.
The caret vanishes from the escape sequence (caret, space), so the space was escaped successfully.

penpen

Re: some absurdities w/ if exist and escaping with caret and

Posted: 16 Sep 2013 04:52
by jeb
penpen wrote:If the space were a control character, then you would have needed to escape the space in the second set instruction above to make it work.


True and false, as the problem is that the batch parser uses different parser rules for many of the internal commands.

See the difference here

Code: Select all

set / ? 
if / ?

The first one works, the second fails.

Even the carets aren't working everywhere

Code: Select all

echo This is one ^
line

Here the caret works as a multiline concatenation.

But not in the next example

Code: Select all

REM This doesn't ^
work

Re: some absurdities w/ if exist and escaping with caret and

Posted: 16 Sep 2013 06:31
by penpen
jeb wrote:True and false, as the problem is that the batch parser uses for many of the internal commands different parser rules.
I agree that there are multiple parser rules. A character's token type (example: space == separator) might be
ignored within a special parser rule, but a control character's function is callable from any phase (by definition).

It might not be performed successfully from every state, and in the case of the command line interpreter there is at least
one state for the caret control character where the execution of its function fails (intended behaviour), and so the caret is not removed:
jeb wrote:But not in the next example

Code: Select all

REM This doesn't ^
work

I'm sure that when it reaches the caret, the lexer is in a state (phase) where all characters are handled as normal characters
(or it ignores all characters up to the next \r\n, but I don't think so because of rem:).
So I don't see this as an example where the caret doesn't work (if that means its function is not called).
I see this as an (intended) example where the function is called but fails to execute, and because the caret is an escape
character it remains there, similar to "^" or the second caret in ^^, so the default behaviour of an escape character is
completely implemented. (I'm not sure whether they really implemented it this way, as I've never seen the code of the
command line interpreter, but the behaviour of the caret is indistinguishable from the default behaviour of an escape character.)

In addition, I have no doubt that the space is a separator, but I believe it is not associated with any functionality of its own in the form of a control character.

penpen