It has come up before that the "<!-- ::" syntax (or "<!-- :") could be used to build hybrid cmd+xml files. There have been examples of it for cmd+html/mshta/wsf, but I don't recall a cmd+xslt one. This post seeks to fill that gap. Anyone not interested in xml/xslt might as well stop here.
The two techniques below provide for:
1. a hybrid cmd+xslt batch file with an embedded stylesheet, used to process an xml file of known format using an external command line XML processor of the user's choice;
2. a hybrid cmd+wsf+xslt all-in-one batch file with an embedded stylesheet, and also the necessary jscript code to apply it to a given xml, using just Windows' builtin parser, without relying on external processors.
I think it's easier to follow if I start with a completely made up, yet somewhat believable, use case. Suppose one had an XML file holding file information and respective hashes, such as...
Code:
<?xml version='1.0' encoding='UTF-8'?>
<dfxml xmloutputversion='1.0'>
<fileobject>
<filename>file-0.txt</filename>
<hashdigest type='MD5'>a3037a8c309f79fbccc1cfb0bce59634</hashdigest>
<hashdigest type='SHA1'>c2f71e783ed2b5e78d30917a7363ed041de87d02</hashdigest>
</fileobject>
<fileobject>
<filename>dir-1\file-1-1.txt</filename>
<hashdigest type='MD5'>2ae67a9bf2060a55033796e4ef0d023e</hashdigest>
<hashdigest type='SHA1'>49eb490983196ca2b829b0885c8880afd9e52f5d</hashdigest>
</fileobject>
<fileobject>
<filename>dir-2\file-2-1.txt</filename>
<hashdigest type='MD5'>c83a699edda408fec4340e3f7aa5acfd</hashdigest>
<hashdigest type='SHA1'>a83b85c8cca34c484d31ec116247583453a29dab</hashdigest>
</fileobject>
<fileobject>
<filename>dir-2\file-2-2.txt</filename>
<hashdigest type='MD5'>21bb58475e5a9b7146730e6eb8e74bb1</hashdigest>
<hashdigest type='SHA1'>76418189163e89d9fc38fdf1acaa36e15046d86f</hashdigest>
</fileobject>
</dfxml>
...and needed to convert it to a flat text format like...
Code:
a3037a8c309f79fbccc1cfb0bce59634* file-0.txt
2ae67a9bf2060a55033796e4ef0d023e* dir-1\file-1-1.txt
c83a699edda408fec4340e3f7aa5acfd* dir-2\file-2-1.txt
21bb58475e5a9b7146730e6eb8e74bb1* dir-2\file-2-2.txt
...where incidentally the xml file is a trimmed down dfxml format such as generated by hashdeep/md5deep, and the text is in the de-facto standard accepted by most every md5sum clone.
Such a conversion between formats would not be a trivial batch task, due to potential extra/irrelevant information in the input file, xml's tolerance for whitespace (spaces/tabs/newlines), the '<>' characters that are special to batch, etc. On the other hand, extracting the very specific information needed from the xml is just what XSLT was made for, and all it takes is a straightforward "transformation" (which is what the "T" in XSLT stands for). Note that the XSLT as written does not include the '*' binary marker after the MD5s, and puts quotes around the filename - both formatting issues will be taken care of in the batch code further down.
Code:
<?xml version='1.0' encoding='iso-8859-1'?>
<!DOCTYPE xsl:stylesheet [<!ENTITY eol "&#xA;">]>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:output method='text' encoding='iso-8859-1'/>
<xsl:strip-space elements='*'/>
<xsl:template match='dfxml/fileobject'>
<xsl:value-of select="concat(hashdigest[@type='MD5'],' "',filename,'"&eol;')"/>
</xsl:template>
</xsl:stylesheet>
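As a cross-check of what the stylesheet above is expected to emit (one line per fileobject: the MD5, a space, then the quoted filename), here is a minimal Python sketch. Python is not part of the approach in this post; the function name `stylesheet_lines` and the shortened `SAMPLE` are made up for illustration, and the standard library's ElementTree only mimics the selection logic, it does not run the XSLT.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample dfxml (raw string, so the backslash in the
# path is kept literally). The <?xml?> declaration is left out for brevity.
SAMPLE = r"""<dfxml xmloutputversion='1.0'>
<fileobject>
<filename>file-0.txt</filename>
<hashdigest type='MD5'>a3037a8c309f79fbccc1cfb0bce59634</hashdigest>
<hashdigest type='SHA1'>c2f71e783ed2b5e78d30917a7363ed041de87d02</hashdigest>
</fileobject>
<fileobject>
<filename>dir-1\file-1-1.txt</filename>
<hashdigest type='MD5'>2ae67a9bf2060a55033796e4ef0d023e</hashdigest>
<hashdigest type='SHA1'>49eb490983196ca2b829b0885c8880afd9e52f5d</hashdigest>
</fileobject>
</dfxml>"""


def stylesheet_lines(xml_text):
    """Return the lines the stylesheet is meant to produce: MD5 "filename"."""
    root = ET.fromstring(xml_text)  # root element is <dfxml>
    lines = []
    for fileobject in root.findall("fileobject"):
        # Same selection as hashdigest[@type='MD5'] in the XSLT.
        md5 = fileobject.findtext("hashdigest[@type='MD5']", default="").strip()
        name = fileobject.findtext("filename", default="").strip()
        lines.append(f'{md5} "{name}"')
    return lines


if __name__ == "__main__":
    for line in stylesheet_lines(SAMPLE):
        print(line)
```

The SHA1 digests are ignored, just as the XSLT only selects the hashdigest whose type attribute is 'MD5'.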
To run an XSLT on an XML file, one needs an XSLT processor. An older, yet still functioning one, is Microsoft's own MSXSL.EXE (http://www.microsoft.com/en-us/download/details.aspx?id=21714). With that in hand (i.e. somewhere in the PATH, or in the current directory), the following works, assuming the XML above is saved as 'dfx.xml' and the XSLT as 'dfx2md5.xslt'. The for /f loop splits each output line at the first space, strips the quotes around the filename via %~Y, and appends the '*' marker to the hash.
Code:
C:\tmp>for /f "tokens=1,*" %X in ('msxsl "dfx.xml" "dfx2md5.xslt"') do @echo %~X* %~Y
a3037a8c309f79fbccc1cfb0bce59634* file-0.txt
2ae67a9bf2060a55033796e4ef0d023e* dir-1\file-1-1.txt
c83a699edda408fec4340e3f7aa5acfd* dir-2\file-2-1.txt
21bb58475e5a9b7146730e6eb8e74bb1* dir-2\file-2-2.txt
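For those less familiar with for /f, the post-processing done by the one-liner above can be approximated in a few lines of Python. This is only a sketch of the idea, not an exact model of cmd's parsing: the name `to_md5sum_line` is made up, and edge cases such as trailing whitespace or lines with a single token are simplified.

```python
def to_md5sum_line(line):
    """Approximate: for /f "tokens=1,*" %X in (...) do echo %~X* %~Y"""
    # tokens=1,* : first whitespace-delimited token, then the rest of the line.
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        return None  # simplification: skip lines without both tokens
    digest, name = parts
    # %~Y : drop the surrounding double quotes, if present.
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        name = name[1:-1]
    # Append the '*' marker to the hash, as the echo in the batch line does.
    return f"{digest}* {name}"


if __name__ == "__main__":
    print(to_md5sum_line('a3037a8c309f79fbccc1cfb0bce59634 "file-0.txt"'))
```

Because the second token is "the rest of the line", a filename containing spaces stays in one piece.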
This, however, requires having both the external XSLT file and MSXSL.EXE available. The rest of this post is about removing those restrictions, one by one - and both scripts below return the same output as the one above.
1. hybrid cmd+xslt batch file with an embedded stylesheet, used to process an xml file of known format using an external command line XML processor of the user's choice
Code:
<!-- :: dfx2md5-msXsl.cmd :: converts dfxml file to md5sum format using msXsl.exe
@echo off
for /f "tokens=1,*" %%X in ('msxsl "%~1" "%~f0"') do echo %%~X* %%~Y
goto :eof & rem -->
<!-- embedded stylesheet identical to standalone one, except for
the <?xml?> declaration which cannot be inserted at the very top of the file -->
<!-- <?xml version='1.0' encoding='iso-8859-1'?> -->
<!DOCTYPE xsl:stylesheet [<!ENTITY eol "&#xA;">]>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:output method='text' encoding='iso-8859-1'/>
<xsl:strip-space elements='*'/>
<xsl:template match='dfxml/fileobject'>
<xsl:value-of select="concat(hashdigest[@type='MD5'],' "',filename,'"&eol;')"/>
</xsl:template>
</xsl:stylesheet>
2. hybrid cmd+wsf+xslt all-in-one batch file with an embedded stylesheet, and also the necessary jscript code to apply it to a given xml, using just Windows' builtin parser, without relying on external processors
Code:
<!-- :: dfx2md5.cmd :: converts dfxml file to md5sum format - - - - - .cmd - -
@echo off
for /f "tokens=1,*" %%X in ('cscript //nologo "%~f0?.wsf" "%~1"') do (
echo %%~X* %%~Y
)
goto :eof & rem -->
<job> <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - .wsf -->
<xslt id='default'> <!-- - - - - - - - - - - - - - - - - - - - - - .xslt -->
<!-- embedded stylesheet identical to standalone one, except for
the <?xml?> declaration which cannot be inserted at the very top of the file
&eol; entity replaced with $eol variable since !doctype breaks wsf parser -->
<!-- <?xml version='1.0' encoding='iso-8859-1'?> -->
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:output method='text' encoding='iso-8859-1'/>
<xsl:strip-space elements='*'/>
<xsl:variable name='eol' select='"&#xA;"'/>
<xsl:template match='dfxml/fileobject'>
<xsl:value-of select="concat(hashdigest[@type='MD5'],' "',filename,'"',$eol)"/>
</xsl:template>
</xsl:stylesheet>
</xslt>
<script language="JScript">//<![CDATA[ <!-- - - - - - - - - - - - - .js -->
function u32hex(arg)
{ return (arg >>> 0).toString(16).toUpperCase(); }
function xmlErr(err)
{ return "error 0x" + u32hex(err.errorCode) + " (" + err.line + "," + err.linepos + "): " + err.reason; }
var vArgs = WScript.Arguments;
if(vArgs.length < 1 || !vArgs(0).length)
{ WScript.Echo("syntax: dfx2md5.cmd <xmlFile>"); WScript.Quit(1); }
var xmlFile = vArgs(0);
var wsfFile = WScript.ScriptFullName.slice(0, -5); // drop '?.wsf' tail
// ms-recommended msXml6, in-band from xp.sp3 up, except server2k3 requires separate install
var xmlDOMDocProgID = "MSXML2.DOMDocument.6.0";
// load external xml file
var xmlDoc = new ActiveXObject(xmlDOMDocProgID);
xmlDoc.setProperty("NewParser", true);
xmlDoc.validateOnParse = false;
xmlDoc.async = false;
xmlDoc.load(xmlFile);
if(xmlDoc.parseError.errorCode)
{ WScript.Echo("XML " + xmlErr(xmlDoc.parseError)); WScript.Quit(1); }
// load self, then retrieve 'xslt' node at next step
var wsfDoc = new ActiveXObject(xmlDOMDocProgID);
wsfDoc.setProperty("NewParser", true);
wsfDoc.validateOnParse = false;
wsfDoc.async = false;
wsfDoc.load(wsfFile);
if(wsfDoc.parseError.errorCode)
{ WScript.Echo("XSL " + xmlErr(wsfDoc.parseError)); WScript.Quit(1); }
// msXml3 only, override legacy default 'xslPattern'
//wsfDoc.setProperty("SelectionLanguage", 'XPath');
// required in order for 'xpath' to recognize 'xsl' namespace
wsfDoc.setProperty("SelectionNamespaces", "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'");
// could store multiple 'xslt' stylesheets with different 'id' tags, and choose at runtime which one to apply
var xslNode = wsfDoc.documentElement.selectSingleNode("/job/xslt[@id='default']/xsl:stylesheet");
if(!xslNode)
{ WScript.Echo("XSL error: embedded stylesheet not found"); WScript.Quit(1); }
// 'xslDoc' only needed for 'xmlPi' otherwise 'xslNode' could be passed to 'transformNode' directly
var xslDoc = new ActiveXObject(xmlDOMDocProgID);
// 'encoding' is parsed and observed, but left out of the 'xslDoc.xml' property
var xmlPi = xslDoc.createProcessingInstruction("xml", "version='1.0' encoding='iso-8859-1' standalone='yes'");
xslDoc.appendChild(xmlPi);
xslDoc.appendChild(xslNode.parentNode.removeChild(xslNode));
// apply transformation, output via stdout.write to avoid extra .echo newline
try
{ WScript.StdOut.Write(xmlDoc.transformNode(xslDoc)); }
catch(err)
{ WScript.Echo("XSLT error 0x" + u32hex(err.number) + ": " + err.description); WScript.Quit(1); }
//]]></script>
</job> <!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
Note that one would only need to customize the batch and xslt parts in the code above - the script/js block is completely generic.
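The way the script block finds its stylesheet can also be tried outside Windows. The sketch below is an illustration only, not the method used above: Python's ElementTree stands in for MSXML's selectSingleNode plus the "SelectionNamespaces" property, and the `demo.cmd` header, the second `alt` stylesheet and the name `find_stylesheet` are all made up. The standard library has no XSLT engine, so the transformation itself is not reproduced.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# A cut-down hybrid file: the batch header sits inside an xml comment,
# so an xml parser sees only the <job> element and its children.
HYBRID = """<!-- :: demo.cmd :: batch header hidden inside an xml comment
@echo off
goto :eof & rem -->
<job>
<xslt id='default'>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:output method='text'/>
</xsl:stylesheet>
</xslt>
<xslt id='alt'>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:output method='xml'/>
</xsl:stylesheet>
</xslt>
</job>
"""


def find_stylesheet(hybrid_text, sheet_id="default"):
    """Return the xsl:stylesheet element stored under <xslt id=...>, or None."""
    root = ET.fromstring(hybrid_text)  # root is <job>
    # The prefix map plays the role of the 'SelectionNamespaces' property.
    return root.find(f"xslt[@id='{sheet_id}']/xsl:stylesheet", {"xsl": XSL_NS})


if __name__ == "__main__":
    for sheet_id in ("default", "alt", "missing"):
        node = find_stylesheet(HYBRID, sheet_id)
        print(sheet_id, "->", None if node is None else node.tag)
```

Picking the stylesheet by its id at runtime is the same idea as the comment in the script about storing multiple 'xslt' blocks with different 'id' tags.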
Liviu