Dir undocumented wildcards

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: Dir undocumented wildcards

#16 Post by dbenham » 25 Jan 2015 00:48

The non-standard wildcards work with FINDSTR, and they have an interesting effect that could be really useful!

Normally, FINDSTR prefixes each matching line of output with the file name if it searches multiple files due to wildcards, or multiple named files.

Code:

C:\test>for %n in (1 2 3) do @(echo ignore&echo file = "test%n.txt"&echo ignore) >test%n.txt

C:\test>type test?.txt

test1.txt


ignore
file = "test1.txt"
ignore

test2.txt


ignore
file = "test2.txt"
ignore

test3.txt


ignore
file = "test3.txt"
ignore

C:\test>findstr /v ignore "test?.txt"
test1.txt:file = "test1.txt"
test2.txt:file = "test2.txt"
test3.txt:file = "test3.txt"


But if the file mask includes a non-standard wildcard then the file name prefix is suppressed:

Code:

C:\test>findstr /v ignore "test>.txt"
file = "test1.txt"
file = "test2.txt"
file = "test3.txt"

This can be very useful! I've seen multiple requests where people want to use FINDSTR to extract lines from multiple files and put the result in a single file. Normally a solution requires either a simple FOR to loop through the files and run FINDSTR on each one, or else FOR /F to process the output of a single FINDSTR with wildcards and strip off the file name prefix.

But the non-standard wildcards provide a simple solution with a single FINDSTR, and no need for FOR.
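
For comparison, the three approaches described above might look roughly like this inside a batch file. This is only a sketch: it assumes the test?.txt files from the session above, and result.txt is a made-up output name. The last line relies on the prefix suppression reported in this post, so check it on your own Windows version before depending on it.

```bat
@echo off
setlocal

rem Approach 1: loop over the files and run FINDSTR once per file.
rem With a single explicit file name FINDSTR prints no file name prefix.
(for %%F in (test?.txt) do findstr /v "ignore" "%%F") >result.txt

rem Approach 2: one FINDSTR with a standard wildcard, then strip the
rem "name:" prefix with FOR /F. "tokens=1*" puts everything after the
rem first run of colons into %%B, so a line whose own text starts with
rem a colon loses its leading colon(s).
(for /f "tokens=1* delims=:" %%A in ('findstr /v "ignore" "test?.txt"') do echo(%%B) >result.txt

rem Approach 3 (from this post): a non-standard wildcard, no FOR needed.
findstr /v "ignore" "test>.txt" >result.txt
```

Each of the three commands overwrites result.txt, so run them one at a time when comparing the output.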


Dave Benham

penpen
Expert
Posts: 1991
Joined: 23 Jun 2013 06:15
Location: Germany

#17 Post by penpen » 25 Jan 2015 19:24

Liviu wrote:From the horse's mouth: <snip: only two links allowed>
- "All file systems follow the same general naming conventions for an individual file: a base file name and an optional extension, separated by a period" - which implies that the name is mandatory, the extension optional.
- "However, it is acceptable to specify a period as the first character of a name. For example, ".temp"." - which implies that ".temp" is the name.
I think you misunderstood the purpose of the linked document.
The underlying file system (NTFS, FAT12/16, ...) and the Win32 file API each have their own definition of a file name.
The file systems (only) define a file name as a base file name with an optional extension.
The Win32 file API is written to be as general as possible, so it allows nearly any non-empty string as a file name:
If the underlying file system supports it, you can create a file named "." (I had to use a backup to get rid of it; it was impossible to delete using the Win32 file API).
The file name is simply split at the last '.' character into the base file name and the extension.
But I'm not sure whether the dot belongs to the base name, to the extension, or to neither:
I've seen all three variants and no documentation about it, so you could still be right if some definition says the Win32 file API treats it the same way, but I doubt it (or it would be the worst implementation ever).
For example, here the dot belongs to the extension (although this isn't the Win32 file API):

Code:

Z:\>>.c echo a

Z:\>>c.c echo a

Z:\>@for %a in (*.c) do @echo "%~na"  "%~xa"
""  ".c"
"c"  ".c"
So that no file system causes problems, Microsoft created the "Naming Conventions"; let's call them the "Windows shell conventions".

So the first part ("All file systems follow the...") describes the file system convention,
while the second part ("However, it is acceptable to specify a...") is part of the Windows shell conventions.
So you cannot conclude that ".temp" is the name, because that depends on the file system; ".temp" is only the Win32 API representation of the file name.

I prefer using the "Windows shell naming convention" in this forum, because batch can't directly access or manipulate the file system or the Win32 API.

Liviu wrote:That said, names like ".temp" are uncommon in Windows, and have always been confusing even to Microsoft's own. For example, one of the well respected MSDN blogs offers the following, which directly contradicts the formal spec: http://blogs.msdn.com/b/oldnewthing/archive/2008/04/14/8389268.aspx - "Such files are considered to have an extension but no name."
This need not be a contradiction if Microsoft has defined it that way within the Win32 API.
(Although I've never seen such a definition... I have read statements like that several times, but that is no proof at all.)


@dbenham
Finally, I'm sad to say that I've found out (accidentally) that XP behaves... exotically:

Code:

Z:\>dir /x
 Volume in drive Z is Test
 Volume Serial Number is 0438-EEA7

 Directory of Z:\

25.01.2015  18:04    <DIR>                       .
25.01.2015  18:04    <DIR>                       ..
               0 File(s)              0 bytes
               2 Dir(s), 133.163.368.448 bytes free

Z:\>> 123456789.1234 echo a

Z:\>> .c echo a

Z:\>dir /X
 Volume in drive Z is Test
 Volume Serial Number is 0438-EEA7

 Directory of Z:\

25.01.2015  17:41    <DIR>                       .
25.01.2015  17:41    <DIR>                       ..
25.01.2015  17:41                 3 C36E2~1      .c
25.01.2015  17:41                 3              123456789.1234
               2 File(s)              6 bytes
               2 Dir(s), 133.163.368.448 bytes free

Z:\>fsutil behavior query disable8dot3
disable8dot3 = 1
If I use a disk editor to set the short name to null, the 'dir "<"' command no longer lists the ".c" file.
So 'dir "<"' matched ".c" only through its short name, and its behaviour is not really an exception.

penpen

Edit: Added "@dbenham" to better divide sections.
Last edited by penpen on 26 Jan 2015 06:45, edited 1 time in total.

#18 Post by dbenham » 25 Jan 2015 19:34

Gah! Damn short names got me again.

Great investigation penpen, thanks!

The short vs. long name situation is ridiculous. Check this out:

Code:

C:\test>dir /x .bat
 Volume in drive C has no label.
 Volume Serial Number is 5ED1-638E

 Directory of C:\test

01/25/2015  01:22 AM                19 BAT~1        .bat
               1 File(s)             19 bytes
               0 Dir(s)  861,738,713,088 bytes free

C:\test>type .bat
@echo It's alive!

C:\test>.bat
It's alive!

C:\test>bat~1
'bat~1' is not recognized as an internal or external command,
operable program or batch file.

C:\test>type bat~1
@echo It's alive!


Dave Benham

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

#19 Post by Liviu » 25 Jan 2015 20:16

penpen wrote:The underlying file system (NTFS, FAT12/16, ...) and the Win32 file API both have definitions of how a filename is defined.
The file systems (only) define a filename as a base file name with an optional extension.
The Win32 file API is written as common as possible, so it also allows nearly any non empty String as a filename

I don't think I misunderstood. The Win32 API documentation for, for example, CreateFile https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858(v=vs.85).aspx references a page https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx which is virtually identical to the one I linked in the previous post.

As for experiments hex-editing raw sectors on disk, those are interesting on their own, but do not apply to the point I was making - about the Win32 API rules for filenames, by which rules a file must always have a non-empty name.

That said, everyone is free to read and interpret the given references their own way, of course.

Liviu

#20 Post by penpen » 26 Jan 2015 09:51

dbenham wrote:Great investigation penpen, thanks :D
No problem, but I led you into that trap... so sorry for that.
I just didn't expect a short name to be created, as that should be disabled... (see "123456789.1234").


Liviu wrote:I don't think I misunderstood. The Win32 API documentation for, for example, CreateFile (...) references a page (...) which is virtually identical to the one I linked in the previous post.
Maybe I misunderstood your post above; I only wanted to say that (the Naming Convention listed in) the referenced page is just a recommendation; it doesn't show the Win32 API definition of (base) file name and extension.

Proof:
The Win32 API and NTFS both allow the creation of files with names that violate the linked Naming Conventions, for example "..." or "abba ".

I admit that it is a little bit tricky to create such directories (or files) using the Win32 API, but you can download FAR Manager (which uses only the Win32 API) to create them:
http://www.farmanager.com/

Liviu wrote:As for experiments hex-editing raw sectors on disk, those are interesting on their own, but do not apply to the point I was making - about the Win32 API rules for filenames, by which rules a file must always have a non-empty name.
Sorry, I didn't separate the last part of my post from the one before it. The disk editing was only meant to tell me whether 'dir "<"' is able to list files with names like ".c".


penpen

mcnd
Posts: 27
Joined: 08 Jan 2014 07:29

#21 Post by mcnd » 26 Jan 2015 11:44

Just to be precise, the term "undocumented wildcards" is not completely accurate. The [MS-FSA] File System Algorithms document contains:

Algorithm for Determining If a Character Is a Wildcard
Algorithm for Determining if a FileName Is in an Expression

#22 Post by dbenham » 26 Jan 2015 12:19

mcnd wrote:Just to be precise, the term "undocumented wildcards" is not completely accurate. The [MS-FSA] File System Algorithms document contains:

Algorithm for Determining If a Character Is a Wildcard
Algorithm for Determining if a FileName Is in an Expression

Hmmm. I'm not convinced.
Either that documentation is wrong, or it does not apply here.

For example:
1) I cannot get " to function as a wildcard under any circumstances, yet the doc claims it can match a period. It is curious that they call " a DOS_DOT. Perhaps it is really referring to the period, and not a quote.
2) The doc claims ? matches a single character (implying any character), but I see that it matches 0 or 1 character, and it cannot match a period.


Dave Benham

#23 Post by penpen » 26 Jan 2015 17:55

I wanted to test whether the dir command might use the Win32 API functions "FindFirstFile", "FindNextFile", ... .
So I've written a little program, "list.exe"; you can compile it using "listCompile.bat" (below).

The behaviour seems to be identical, with one exception: the " character.
But I suspect that the double quotes get consumed somehow by the dir command or the command line interpreter.
=> I think the above functions are called by the dir command.
In that case all the wildcard characters are handled by these functions, so I think mcnd is partially right:
The algorithm is used, though not on the complete file name but (somehow) on the base file name (=:b) and the extension (=:e), which may explain the '?' behaviour at the '.' separator.
(Maybe it suffices if the filter matches b<NUL>*e during the match, so that ? can always match one character.)

If you use "list.exe", you can pass double quotes by escaping them with a \

Code:

Z:\>list "c*"
Filter:c

#c.c#

Z:\>list "c\"c"
Filter:c"c

#c.c#

"listCompile.bat":

Code:

// // >nul 2> nul & @goto :main
/*
:main
   @echo off
   setlocal
   cls

   set "csc="

   pushd "%SystemRoot%\Microsoft.NET\Framework"
   for /f "tokens=* delims=" %%i in ('dir /b /o:n "v*"') do (
      dir /a-d /b "%%~fi\csc.exe" >nul 2>&1 && set "csc="%%~fi\csc.exe""
   )
   popd

   if defined csc (
      echo most recent C#.NET compiler located in:
      echo %csc%.
   ) else (
      echo C#.NET compiler not found.
      goto :eof
   )

   %csc% /nologo /optimize /warnaserror /nowin32manifest /debug- /target:exe /out:"%~dp0List.exe" "%~f0"
   goto :eof
*/


using System;
using System.Threading;
using System.Runtime.InteropServices;


// dllImport: filetypes
using HANDLE = System.IntPtr;
using DWORD = System.Int32;
//using LPCTSTR = string;

class List {
   public const int SUCCESS = 0;
   public const int FAIL = 1;


   static int Main (string [] args) {
      WIN32_FIND_DATA findFileData;
      HANDLE hFindFile;
      string lpFileName = "*.*";

      if (args.Length >= 1) lpFileName = args [0];
      Console.WriteLine ("Filter:{0}\n", lpFileName);

      hFindFile = FindFirstFile (lpFileName, out findFileData);

      if (hFindFile == INVALID_HANDLE_VALUE) {
         // Read the error code that the runtime saved right after FindFirstFile returned
         // (needs SetLastError=true on the import); calling GetLastError directly via
         // P/Invoke may return a value that the runtime has overwritten in the meantime.
         DWORD lastError = Marshal.GetLastWin32Error ();
         switch (lastError) {
            case ERROR_NO_MORE_FILES:
            case ERROR_FILE_NOT_FOUND:
               break;
            case ERROR_PATH_NOT_FOUND:
               Console.WriteLine ("Path not found.");
               break;
            case ERROR_NOT_ENOUGH_MEMORY:
               Console.WriteLine ("Out of memory.");
               break;
            default:
               Console.WriteLine ("Unexpected error: {0}", lastError);
               break;
         }
      } else {
         do {
            Console.WriteLine ("#{0}#", findFileData.cFileName);
         } while (FindNextFile (hFindFile, out findFileData));
         FindClose (hFindFile);
      }


      return SUCCESS;
   }



   // dllimport: functions
   [DllImport ("kernel32.dll", CharSet=CharSet.Auto, SetLastError=true)]
   static extern IntPtr FindFirstFile (string lpFileName, out WIN32_FIND_DATA lpFindFileData);

   [DllImport ("kernel32.dll", CharSet=CharSet.Auto)]
   static extern bool FindNextFile (IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

   [DllImport ("kernel32.dll", CharSet=CharSet.Auto)]
   public static extern bool FindClose (IntPtr hFindFile);


   // dllimport: structures
   [StructLayout (LayoutKind.Sequential, CharSet=CharSet.Auto)]
   struct WIN32_FIND_DATA {
      public uint dwFileAttributes;
      public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
      public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
      public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
      public uint nFileSizeHigh;
      public uint nFileSizeLow;
      public uint dwReserved0;
      public uint dwReserved1;

      [MarshalAs(UnmanagedType.ByValTStr, SizeConst=260)]
      public string cFileName;

      [MarshalAs(UnmanagedType.ByValTStr, SizeConst=14)]
      public string cAlternateFileName;
   }

   // dllimport: constants
   public const DWORD ERROR_FILE_NOT_FOUND    = (DWORD)  2;
   public const DWORD ERROR_PATH_NOT_FOUND    = (DWORD)  3;
   public const DWORD ERROR_NOT_ENOUGH_MEMORY = (DWORD)  8;
   public const DWORD ERROR_NO_MORE_FILES     = (DWORD) 18;
   public static readonly HANDLE INVALID_HANDLE_VALUE = new IntPtr (-1);
}


penpen

Samir
Posts: 384
Joined: 16 Jul 2013 12:00
Location: HSV

#24 Post by Samir » 02 Feb 2015 11:16

Fascinating. And I thought there was nothing new I'd ever learn about dir. 8)

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

#25 Post by Squashman » 10 Dec 2015 12:14

Can someone test this stuff on Windows 10? It seems that some of the wildcards do not work, according to a reply to one of my answers on Stack Overflow.
http://stackoverflow.com/a/34206104/1417694

#26 Post by mcnd » 10 Dec 2015 13:56

Squashman wrote:Can someone test this stuff on Windows 10? It seems that some of the wildcards do not work, according to a reply to one of my answers on Stack Overflow.
http://stackoverflow.com/a/34206104/1417694


Tested and working on Windows 10.0.10586 (64-bit, Spanish locale)

#27 Post by Squashman » 10 Dec 2015 14:14

mcnd wrote:
Tested and working on Windows 10.0.10586 (64-bit, Spanish locale)

Well I started a chat room with them. They said they basically used the same code I gave them.
http://chat.stackoverflow.com/rooms/975 ... ilenicusor

#28 Post by Squashman » 04 Nov 2016 13:42

dbenham wrote:
Below are some interesting test cases:

Code:

                |                         |      GREEDY NATURE                                                          
file            | "??.??.??" | ">>.>>.>>" | "?a?.??.??" | ">a>.??.??"
----------------+------------+------------+-------------+-------------
a               |  match     |  no match  |  no match   |  no match
ab              |  match     |  no match  |  no match   |  no match
abc             |  no match  |  no match  |  no match   |  no match
a.1             |  match     |  no match  |  no match   |  no match
ab.12           |  match     |  no match  |  no match   |  no match
abc.123         |  no match  |  no match  |  no match   |  no match
a.1.x           |  match     |  match     |  no match   |  no match
ab.12.xy        |  match     |  match     |  no match   |  no match
abc.123.xyz     |  no match  |  no match  |  no match   |  no match
a.1.x.7         |  no match  |  no match  |  no match   |  no match
ab.12.xy.78     |  no match  |  no match  |  no match   |  no match
abc.123.xyz.789 |  no match  |  no match  |  no match   |  no match

                                                         | NON-GREEDY
file            | "*.*.*"  | "*."     | "**." | "abc.*." | "*a*"
----------------+----------+----------+-------+----------+-----------
abc             |  match   | match    | match | match    | match
abc.123         |  match   | no match | match | match    | match
abc.123.xyz     |  match   | no match | match | match    | match
abc.123.xyz.789 |  match   | no match | match | match    | match

                                                         |      NON-GREEDY
file            | "<.<.<"  | "<"      | "<<"  | "abc.<"  | "<a<"    | "<a<<"
----------------+----------+----------+-------+----------+----------+--------
abc             | no match | match    | match | no match | match    | match
abc.123         | no match | no match | match | match    | no match | match
abc.123.xyz     | match    | no match | match | no match | no match | match
abc.123.xyz.789 | match    | no match | match | no match | no match | match


=========================================================

The other fascinating (and scary) discovery that jeb made was that < and > wildcards may be used in command names when executing an external command! "<st.bat" can successfully execute "test.bat".

Question: So is there no case where you would use an asterisk together with one of the < > wildcards?

#29 Post by penpen » 05 Nov 2016 09:08

Well, something like ">>->>->>>> *" could be useful, but you could always replace the "*" with "<<".
Depends on your preferences.

penpen

#30 Post by dbenham » 05 Nov 2016 12:29

Squashman wrote:
Question: So is there no case where you would use an asterisk together with one of the < > wildcards?

No, I was not implying that. Those were just some test cases I used to try to help determine the rules. But I think those tests were run with short 8.3 names enabled, which complicates the interpretation.

As far as I know, the correct rules are at viewtopic.php?f=3&t=6207#p39420. I have yet to see a case that violates (or disproves) the rules that I laid out.
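
To re-run test cases like the ones in the quoted tables on a particular machine, a small harness along these lines may help. It is only a sketch: the scratch folder name and the :try label are made up for illustration, and no particular results are implied. It shows the files with dir /x first, so that any 8.3 short names (which can influence the matching) are visible.

```bat
@echo off
setlocal

rem Hypothetical scratch folder below %TEMP%; change it as needed.
set "work=%TEMP%\wildtest_%RANDOM%"
md "%work%" || exit /b 1
pushd "%work%" || exit /b 1

rem Create empty test files with the names used in the tables.
for %%F in (a ab abc a.1 ab.12 abc.123 a.1.x ab.12.xy abc.123.xyz a.1.x.7 ab.12.xy.78 abc.123.xyz.789) do type nul >"%%F"

rem The short name column reveals whether 8.3 names exist on this volume.
dir /x /a-d

rem The masks are passed through CALL rather than a FOR list, because
rem FOR would expand any item that contains * or ? into file names.
call :try "??.??.??"
call :try ">>.>>.>>"
call :try "<.<.<"
call :try "<"
call :try "<<"
call :try "abc.<"

popd
rd /s /q "%work%"
exit /b 0

:try
rem %1 keeps its quotes, so < and > are not treated as redirection.
echo === mask %1
dir /b /a-d %1 2>nul
exit /b 0
```

A mask that prints nothing under its header matched no file. Running the same script on volumes with and without short names should show whether 8.3 names change the picture.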


Dave Benham
