Page 1 of 3

CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 09:56
by rojo
Here's the latest version of CSV.BAT:

Code: Select all

@if (@CodeSection == @Batch) @Then

:: -----------------------------------------------------
:: csv.bat v1.1.2
:: csv.bat /? for usage
::
:: For updates and discussion, visit
:: http://www.dostips.com/forum/viewtopic.php?f=3&t=6184
:: -----------------------------------------------------

@echo off
setlocal enabledelayedexpansion

if "%~1"=="" goto usage

for %%I in (sqlfile tmpfile found) do set %%I=
for %%I in (_E _S _Debug) do set %%I=0
set "_D=,"

set keywords="SELECT" "INSERT" "UPDATE" "DELETE" "DESCRIBE" "TRUNCATE"

:: loop through arguments until either a keyword or a filename is reached
for %%I in (%*) do (
   if "!_D!"=="found" set "_D=%%~I"
   if /i "%%~I"=="/d" set "_D=found"
   if /i "%%~I"=="/e" set "_E=1"
   if /i "%%~I"=="/s" set "_S=1"
   if /i "%%~I"=="/debug" set "_Debug=1"
   if not "!keywords!"=="!keywords:%%~I=!" (
      goto break
   )
   if exist "%%~I" (
      set "sqlfile=%%~I"
      goto break
   )
)
:break

if not defined sqlfile (

   if "%_S%"=="1" (
      set "sqlfile=%CD%"
   ) else (

      set "args=%*"
      for %%I in (SELECT INSERT UPDATE DELETE DESCRIBE TRUNCATE /D) do (
         if not "!args!"=="!args:%%I=!" set found=1
      )

      if "%_E%"=="1" goto usage

      if not defined found goto usage
      set "tmpfile=%temp%\raw%time::=%.tmp"
   )
)

:: pass arguments via text file to avoid stripping quotation marks
if defined tmpfile (
   >"%tmpfile%" echo(%*
   set "sqlfile=%tmpfile%"
)

:: if 64-bit Windows, use 32-bit cscript for Jet driver compat
set "cscript=%systemroot%\SysWOW64\cscript.exe"
if not exist "%cscript%" set "cscript=cscript"

"%cscript%" /nologo /e:jscript "%~f0" "%sqlfile%" "/d:%_D%" "/e:%_E%" "/s:%_S%" "/debug:%_Debug%"

:: end normal runtime
set "exitcode=%ERRORLEVEL%"
if defined tmpfile del "%tmpfile%"
exit /b %exitcode%

:usage
setlocal enabledelayedexpansion
set "self=%~nx0"
call :heredoc usage >"%temp%\%~n0.txt" && goto end_usage
Usage: !self! [/D "?"] query
       !self! [/D "?"] [/E] file
       !self! [/S [path]]

    ... where query is a SQL query using SELECT, INSERT, UPDATE, DELETE,
        DESCRIBE, or TRUNCATE

    ... or file is a file containing a list of SQL queries (one query per line)

  Switches:
    /D "?"
        specifies the delimiter (default: comma)
        For tab delimited files, use /D "tab"

    /E file
        On caught error, continue executing subsequent lines in file
        Omitting the /E switch will cause the script to halt on error.

    /S [path]
        Generate Schema.ini for all text files in the specified directory
        (default: current)

  Examples:
    !self! SELECT column1 FROM test.csv WHERE column2="value" ORDER BY column1
    !self! INSERT INTO test.csv (h1, h2, h3) VALUES ("v1", "v2", "v3")
    !self! DELETE FROM test.csv WHERE col1="value"
    !self! UPDATE TOP 1 test.csv SET col2="value" WHERE col1="value"
    !self! DESCRIBE TABLE test.csv

  Notes:
    * Values are treated as strings.  Math operations (col+=int) won't work.
    * SELECT count(*) works, though.  So does UPDATE file SET col1=col2.
    * If the csv file does not exist, INSERT will attempt to create it.
    * The csv file MUST include a header row.
    * The SQL keywords are not case-sensitive.  Be lazy if you wish.
    * To avoid having to escape parentheses, you can enclose your statement in
      quotation marks.

    Example:
    !self! "INSERT INTO test.csv (h1, h2, h3) VALUES ('val1', 'val2', 'val3')"

    * If you need to refer to a column name containing spaces, enclose it in
      backticks.

    Example:
    !self! UPDATE test.csv SET `column 1`=`column 2` WHERE idx="5"

  Complex extra-credit example:
    !self! "INSERT INTO `new file.csv` SELECT a.col1 AS `column 1`, a.col2 AS
    `column 2`, a.col3 AS `column 3`, b.`col1` AS `column 4` FROM test.csv AS a
    LEFT JOIN `test 2.csv` AS b ON a.`col1`=b.`col1`"
:end_usage
2>NUL ( endlocal & more /e /t4 "%temp%\%~n0.txt" & del "%temp%\%~n0.txt" )
goto :EOF

:: http://stackoverflow.com/a/15032476/1683264
:heredoc <uniqueIDX>
setlocal enabledelayedexpansion
set go=
for /f "delims=" %%A in ('findstr /n "^" "%~f0"') do (
    set "line=%%A" && set "line=!line:*:=!"
    if defined go (if #!line:~1!==#!go::=! (goto :EOF) else echo(!line!)
    if "!line:~0,13!"=="call :heredoc" (
        for /f "tokens=3 delims=>^ " %%i in ("!line!") do (
            if #%%i==#%1 (
                for /f "tokens=2 delims=&" %%I in ("!line!") do (
                    for /f "tokens=2" %%x in ("%%I") do set "go=%%x"
                )
            )
        )
    )
)
goto :EOF

@end

// JScript portion

String.prototype.strip = function() {
   return this.replace(/^(['"`])(.*)\1$/,"$2");
}

String.prototype.csvQuotes = function() {
   return this.strip().replace(/"([^"]+)"/g, '""$1""');
}

String.prototype.sqlQuotes = function() {
   return this.strip().replace(/'/g, "''");
}

// Array.prototype.indexOf Polyfill from MDN
// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
if (!Array.prototype.indexOf) {
   Array.prototype.indexOf = function(e, t) {
      var n;
      if (this == null) {
         throw new TypeError('"this" is null or not defined')
      }
      var r = Object(this);
      var i = r.length >>> 0;
      if (i === 0) return -1;
      var s = +t || 0;
      if (Math.abs(s) === Infinity) s = 0;
      if (s >= i) return -1;
      n = Math.max(s >= 0 ? s : i - Math.abs(s), 0);
      while (n < i) {
         if (n in r && r[n] === e) return n;
         n++
      }
      return -1
   }
}

Array.prototype.toLowerCase = function() {
   for (var i=0, arr = []; i<this.length; i++) {
      arr.push(this[i].toLowerCase());
   }
   return arr;
}

Array.prototype.moveDotsLeft = function() {
   for (var i=0; i<this.length; i++) {
      if (this[i] == '.' && i+1 < this.length) {
         this[--i] += this.splice(i+1,2).join('');
      }
   }
   return this;
}

function fatal(err) {
   if (!suppressErrors) {
      if (typeof err == 'string') WSH.StdErr.WriteLine(err);
      else for (var i in err)
         if (i != 'message') WSH.StdErr.WriteLine(i + ': ' + err[i]);
   }
   if (!errorResume) WSH.Quit(1);

   else exitCode += 1;
}

function fetch(what) {

   var attempts = arguments[1] || 0;

   if (!oConn.State)
      oConn.Open('Provider=Microsoft.Jet.OLEDB.4.0;Data Source="' + path
         + '";Extended Properties="text;HDR=YES;FMT=Delimited"');
   try {
      if (oRS.State) oRS.Close();
      oRS.Open(what, oConn, 3, 3, 1);
      if (debug) WSH.Echo(what);
      if (debug && oRS.State) WSH.Echo(oRS.RecordCount + ' records returned\n');
      return true;
   }
   catch(e) {
      if (e.number == -2147467259 && /exclusive/i.test(e.message)) {
         // exclusive lock error.  If 20 failed attempts, give up.
         if (++attempts > 19) {
            e.command = what;
            fatal(e);
         }

         // Otherwise, sleep a little and try again.
         WSH.Sleep(1);
         return fetch(what, attempts);
      }
      e.command = what; fatal(e);
      return false;
   }
}

function flushBuffer(proc) {
   while (!proc.StdOut.AtEndOfStream) WSH.Echo(proc.StdOut.ReadLine());
   while (!proc.StdErr.AtEndOfStream) WSH.StdErr.WriteLine(proc.StdErr.ReadLine());
}

function fixSchema() {
   var schema = fso.OpenTextFile(path + 'Schema.ini', 1),
      fixed = fso.CreateTextFile(path + 'Schema2.ini', true);
   while (!schema.AtEndOfStream) {
      fixed.WriteLine(schema.ReadLine().replace(/^\[__/,'['));
   }
   schema.Close();
   fixed.Close();
   var cmd = 'cmd /c move /y "'+path+'Schema2.ini" "'+path+'Schema.ini"',
      proc = oSH.Exec(cmd);
   while (!proc.Status) WSH.Sleep(1);
}

var oConn = WSH.CreateObject('adodb.connection'),
   oRS = WSH.CreateObject('adodb.recordset'),
   oSH = WSH.CreateObject('wscript.shell'),
   fso = WSH.CreateObject('scripting.filesystemobject'),
   exitCode = 0,
   skip = WSH.Arguments.Named('skip') || 0,
   processRoot = !skip,
   errorResume = (WSH.Arguments.Named('e') || 0) * 1,
   suppressErrors = 0,
   delim = WSH.Arguments.Named('d') || ',',
   debug = (WSH.Arguments.Named('debug') || 0) * 1;
   threads = [];

if (delim.toLowerCase() == 'tab') delim='\t';

if (WSH.Arguments.Unnamed(0) == 'query') {
   var path = WSH.Arguments(1);
   fetch(WSH.Arguments(2));
   WSH.Quit(0);
}
else if (WSH.Arguments.Unnamed(0) == 'debug') {
   var debug = path = WSH.Arguments(1);
   fetch(WSH.Arguments(2));
   WSH.Quit(0);
}

/*
Registry tweaks for greater compatibility

File extensions:
http://msdn.microsoft.com/en-us/library/ms974559.aspx
See the section "Working with Custom File Extensions".

Delimiters:
http://www.connectionstrings.com/textfile/
*/
function setDelim(delim) {
   var jet = "HKLM\\Software\\Microsoft\\Jet\\4.0\\Engines\\Text\\",
      ext = jet + "DisabledExtensions",
      del = jet + "Format",
      ch = delim == '\t' ? 'TabDelimited' : 'Delimited('+delim+')';

   try {
      oSH.RegWrite(ext,"","REG_SZ");
      if (delim == ',') {
         try { oSH.RegDelete(del); }
         catch(e) {}
      }
      else oSH.RegWrite(del,ch,"REG_SZ");
   }
   catch(e) {
      ext = ext.replace('HKLM','HKCU');
      del = del.replace('HKLM','HKCU');
      try {
         oSH.RegWrite(ext,"","REG_SZ");
         if (delim == ',') {
            try { oSH.RegDelete(del); }
            catch(e) {}
         }
         else oSH.RegWrite(del,ch,"REG_SZ");
      }
      catch(e) { fatal(e) }
   }
}

function resetDefaultDelim() {
   try { oSH.RegDelete(del); }
   catch(e) {
      try { oSH.RegDelete(del.replace('HKLM','HKCU')); }
      catch(e) {}
   }
}

if (processRoot) {
   setDelim(delim);
}

if (WSH.Arguments.Named('s') * 1) {
   var path = WSH.Arguments(0) + '\\',
      folder = fso.GetFolder(path),
      errorResume = suppressErrors = 1,
      total = folder.files.Count,
      col = new Enumerator(folder.files);
   if (!debug) WSH.StdOut.Write('Working...  0%');
   if (fso.FileExists(path + 'Schema.ini'))
      fso.DeleteFile(path + 'Schema.ini');
   for (var i=0; !col.atEnd(); col.moveNext(), i++) {
      var file = col.item().Name,
         size = col.item().Size,
         f = fso.OpenTextFile(path + file, 1),
         sample = f.Read(Math.min(1024, size));
      f.Close();
      if (sample.indexOf("\x00") == -1
         && !/^schema.+ini$/i.test(file)
         && size > 2
      ) {
         if (debug) WSH.Echo('adding ' + file);
         sample = sample.split(/\r?\n/)[0].replace(/".+?"/g,'');

         // if first line contains no commas, auto detect delimiter
         if (sample.indexOf(',') > -1) most = ',';
         else {
            var chars = {}, most = '';
            sample.replace(
               /[\s\x21\x23-\x26\x28-\x2D\/\x3A-\x40\x5B-\x60\x7B-\x7E]/g,
               function($1) { chars[$1] = (chars[$1] || 0) + 1; }
            );
            for (var x in chars) {
               if (!most || chars[x] > chars[most]) most = x;
            }
            if (debug) WSH.Echo('delimiter "' + most + '": ' + chars[most]);
         }
         setDelim(most);

         try {
            // force Jet driver to reacquire delimiter setting from registry
            if (oConn.State) oConn.Close();
            fetch('SELECT * INTO `__' + file + '` FROM `' + file + '` WHERE 0');
            if (oRS.State) oRS.Close();
         }
         catch(e) {}
         if (fso.FileExists(path + '__' + file))
            fso.DeleteFile(path + '__' + file);
      } else if (debug) WSH.Echo('skipping ' + file);
      if (!debug) {
         var pct = (' ' + Math.floor(100 * i / total) + '%').slice(-3);
         WSH.StdOut.Write("\x08\x08\x08"+pct);
      }
   }
   if (!debug) WSH.Echo("\x08\x08\x08100%");
   fixSchema();
   resetDefaultDelim();
   WSH.Quit(0);
}

var rawfile = fso.OpenTextFile(WSH.Arguments(0), 1);
for (var i=skip, raw=''; i && !rawfile.AtEndOfStream; i--) rawfile.SkipLine();
if (rawfile.AtEndOfStream) WSH.Quit(0);
var raw = rawfile.ReadLine();
rawfile.Close();
if (!/\w/.test(raw)) WSH.Quit(0);

while (1) {

   var raw = raw.replace(/\r?\n/g, '').strip(),
      argRxp = /(`.+?`|'.+?'|".+?"|[,=\.\(\)]|[^,=\.\(\) ]+)/g,
      args = raw.match(argRxp).moveDotsLeft(),
      lCaseArgs = args.toLowerCase();

   for (var i=file=path=0; i<args.length; i++) {

      if (/^\/d$/i.test(args[i])) {
         delim = args.splice(i,2)[1].strip();
         if (delim.toLowerCase() == 'tab') delim='\t';
         args = args.join(' ').strip().match(argRxp).moveDotsLeft();
         lCaseArgs = args.toLowerCase();
         i=-1;
      }

      if (/^\/debug$/i.test(args[i])) {
         var debug = 1;
         args.splice(i,1);
         args = args.join(' ').strip().match(argRxp).moveDotsLeft();
         lCaseArgs = args.toLowerCase();
         i=-1;
      }

      if (file && !path) {

         if (fso.FileExists(args[file].strip())) {
            path = fso.GetAbsolutePathName(args[file].strip() + '\\..') + '\\';
            args[file] = fso.GetFileName(args[file].strip());

         } else { // if file doesn't exist

            var filename = args[file].strip();
            if (lCaseArgs[0] == 'insert') { // allow insert to create the file
               var headers = [],
                  csvfile = fso.CreateTextFile(filename, true);

               if (args[++i] == '(') { // if insert defines the headers
                  while (args[++i] != ')') {
                     if (args[i] != ',') {
                        args[i] = args[i].strip();
                        headers.push('"' + args[i].csvQuotes() + '"');
                     }
                  }
                  csvfile.WriteLine(headers.join(delim));
               }
               // else if insert...select
               else if (lCaseArgs.indexOf('select') > -1) {

                  var select = lCaseArgs.indexOf('select') + 1,
                     from = lCaseArgs.indexOf('from'),
                     lSelectFrom = lCaseArgs.slice(select, from);

                  args[from+1] = '`' + args[from+1].strip() + '`';

                  // if select col as alias, write alias
                  if (lSelectFrom.indexOf('as') > -1) {
                     var selectFrom = args.slice(select, from),
                        headers = [];

                     for (var i=0; i<selectFrom.length; i++) {

                        if (lSelectFrom[++i] == 'as')
                           headers.push(selectFrom[++i].strip());

                        else if (selectFrom[--i] != ',') {
                           headers.push(selectFrom[i].strip());
                        }
                     }

                     csvfile.WriteLine('"' + headers.join('"'+delim+'"') + '"');

                  // otherwise, just copy the header row from queried file
                  } else {
                     var othercsv = args[lCaseArgs.indexOf('from') + 1],
                        otherfile = fso.OpenTextFile(othercsv, 1);
                     csvfile.WriteLine(otherfile.ReadLine());
                     otherfile.Close();
                  }
               }
               csvfile.Close();
               path = fso.GetAbsolutePathName(filename + '\\..') + '\\';
               args[file] = fso.GetFileName(filename);

            }
         }

      } else if (!path && /^(from|into|update|table)$/i.test(args[i])) {
         if (lCaseArgs[i+2] != 'from') {
            file=i+1;
            if (lCaseArgs[file] == 'top') {
               i += 2;
               file=i+1;
            }
         }
      }

      // join [var, =, value] into [var=value]
      if (args[i] == '=') args[i-1] += args.splice(i--,2).join('');
   }

   if (!path) {
      fatal({
         name: 'Error',
         description: 'Unsupported syntax.',
         command: args.join(' ')
      });
      continue;
   }

   args[file] = '`' + args[file].strip() + '`';

   if (debug) WSH.Echo(': ' + args.join('\n: '));

   if (lCaseArgs.length != args.length) lCaseArgs = args.toLowerCase();

   switch (lCaseArgs[0]) {
      case 'insert':
      case 'select':

         var command = args.join(' ');

         if (!fetch(command)) break;

         while (oRS.State && args[0].toLowerCase() == 'select' && !oRS.eof) {
            for (var i=0; i<oRS.Fields.Count; i++) {
               var val = oRS.Fields(i).Value;
               if (typeof val === 'object' && val == null) val = '';
               WSH.Echo(oRS.Fields(i).Name + '=' + val);
            }
            oRS.MoveNext();
         }

      break;

      case 'describe':

         function readSchema(file) {
            var schema = fso.OpenTextFile(path + 'Schema.ini', 1);
            for (var found = ret = 0; !schema.AtEndOfStream;) {
               var line = schema.ReadLine().replace(/\r?\n/,'');
               if (line.toLowerCase() == '[' + file + ']') found=ret=1;
               else if (/^\[.*\]$/.test(line)) found=0;
               if (found) WSH.Echo(line);
            }
            schema.Close();
            return ret;
         }

         var leaveSchema = fso.FileExists(path + 'Schema.ini');

         if (leaveSchema && readSchema(lCaseArgs[file].strip())) break;

         var command = 'SELECT * INTO `__' + args[file].strip() + '` FROM '
            + args[file] + ' WHERE 0';

         if (!fetch(command)) break;
         if (oRS.State) oRS.Close();
         fso.DeleteFile(path + '__'+args[file].strip());

         fixSchema();
         readSchema(lCaseArgs[file].strip());

         if (!leaveSchema) fso.DeleteFile(path + 'Schema.ini');

      break;

      /* The MS Jet OLEDB text driver does not natively support UPDATE or
      DELETE statements.  Work around this by copying unaffected rows to a
      new file, appending modified rows, and replacing the original csv. */
      case 'update':
      case 'delete':
      case 'truncate':   // *shrug* Why not?

         var where = lCaseArgs.indexOf('where'),
            set = lCaseArgs.indexOf('set'),
            top = lCaseArgs.indexOf('top') + 1,
            whereNot = (where > -1) ? args.slice(where+1) : [],
            update = 0;

         // copy header row to new file
         var othercsv = '_' + args[file].strip(),
            csvfile = fso.OpenTextFile(args[file].strip(), 1),
            otherfile = fso.CreateTextFile(othercsv, true);

         otherfile.WriteLine(csvfile.ReadLine());
         otherfile.Close();
         csvfile.Close();

         if (args[0].toLowerCase() == 'update') {
            update = 1;

            // convert "set var1=val1, var2=val2" to JS object
            var setArgs = (where > set) ?
                  args.slice(set + 1, where) :
                  args.slice(set + 1),
               set = {};

            for (var i=0; i<setArgs.length; i++) {
               if (setArgs[i].indexOf('=') > 0) {
                  var varval=setArgs[i].split('=');
                  set[varval[0].strip()] = varval[1].replace(/^`(.+)`$/,"$1");
               }
            }
         }

         if (whereNot.length) {

            for (var i=0; i<whereNot.length; i++) {
               if (whereNot[i].indexOf('=') > 0) {
                  var varval = whereNot[i].split('=');
                  whereNot[i] = '`' + varval[0].strip() + '`=' + varval[1];
               }
            }

            var command = 'INSERT INTO `_' + args[file].strip()
               + '` SELECT * FROM ' + args[file]
               + ' WHERE NOT ' + whereNot.join(' '),
               cscript = oSH.Environment('Process')('cscript'),
               cmd = 'cmd /c ' + cscript + ' /nologo /e:JScript "' + WSH.ScriptFullName
                  + (debug ? '" debug "' : '" query "') + path + '" "' + command.replace(/"/g,"'") + '"';

            if (debug) WSH.Echo('\nforking new thread for query:\n' + cmd + '\n');
            threads.push(oSH.Exec(cmd));

            if (update) {
               var command = 'SELECT * FROM ' + args[file] + ' WHERE '
                  + whereNot.join(' ');

               if (!fetch(command)) break;
            }

         } else if (update) {
            var command = 'SELECT * FROM ' + args[file];

            if (!fetch(command)) break;
         }

         if (update) {

            topN = top ? args[top] * 1 : 0;

            var insert = [], r = 0;
            while (!oRS.eof) {
               var line = [];
               for (var i=0; i<oRS.Fields.Count; i++) {
                  if (!topN || r < topN) {
                     var val = set[oRS.Fields(i).Name] || oRS.Fields(i).Value;
                     if (isNaN(val)) {
                        try { val = oRS.Fields(val); }
                        catch(e) {}
                     }
                  }
                  else var val = oRS.Fields(i).Value;
                  line.push(typeof val === 'object' && val == null ?
                     "''" : "'" + (''+val).sqlQuotes() + "'")
               }
               insert.push(line.join(','));
               oRS.MoveNext();
               r++;
            }
            if (oRS.State) oRS.Close();

            for (var i=0; i<insert.length; i++) {
               var command = 'INSERT INTO `_' + args[file].strip() + '` VALUES ('
                  + insert[i] + ')';

               if (!fetch(command)) break;
               if (oRS.State) oRS.Close();
            }
            i = Math.min(i, topN || i);
            WSH.Echo(i + (i==1 ? ' row' : ' rows') + ' affected' + (debug ? '\n' : ''));
         }

         while (threads.length) {
            if (!threads[0].Status) WSH.Sleep(1);
            else {
               flushBuffer(threads[0]);
               threads.splice(0,1);
            }
         }

         if (debug) WSH.Echo('Updated csv file: _'+args[file].strip());

         else {
            // replace old csv file
            try {
               fso.CopyFile(path + '_' + args[file].strip(), path + args[file].strip(), true);
               fso.DeleteFile(path + '_' + args[file].strip(), true);
            }
            catch(e) {
               // in case of pointless "Permission Denied" failures
               var cmd = 'cmd /c move /y "'+path+'_'+args[file].strip()+'" "'+path+args[file].strip()+'"',
                  proc = oSH.Exec(cmd);
               while (!proc.Status) WSH.Sleep(1);
            }
         }

      break;
   }   // end switch(lCaseArgs[0])

   break;
}   // end while (1)

if (oRS.State) oRS.Close();
if (oConn.State) oConn.Close();

if ((errorResume || !exitCode) && processRoot) {

   var sqlScript = fso.OpenTextFile(WSH.Arguments(0), 1);
   for (var i=0; !sqlScript.AtEndOfStream; i++) {
      sqlScript.SkipLine();
   }
   sqlScript.Close();

   for (var j=1; j<=i; j++) {
      cscript = oSH.Environment('Process')('cscript');
      var cmd = 'cmd /c ' + cscript + ' /nologo /e:JScript "' + WSH.ScriptFullName + '" "'
         + WSH.Arguments(0) + '" "/d:' + delim + '" "/e:' + (WSH.Arguments.Named('e') || 0)
         + '" "/debug:' + (WSH.Arguments.Named('debug') || 0) + '" "/skip:' + j + '"';

      if (debug) WSH.Echo('\nspawning new thread for next line:\n' + cmd + '\n');
      var proc = oSH.Exec(cmd);

      while (!proc.Status) {
         WSH.Sleep(1);
         flushBuffer(proc);
      }
      flushBuffer(proc);

      exitCode += proc.ExitCode;

      if (!errorResume && exitCode) break;
   }
}

if (processRoot) {
   try { oSH.RegDelete(del); }
   catch(e) {
      try { oSH.RegDelete(del.replace('HKLM','HKCU')); }
      catch(e) {}
   }
}

WSH.Quit(exitCode);

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 11:58
by rojo
CSV.BAT allows interaction with CSV (or TSV or similar) files with SQL syntax. It's not as fast as direct search-and-replace manipulation of the text, but hopefully it will provide enough flexibility to be useful as a utility for other projects and a solution to CSV challenges. Rather than counting lines and tokens, you can use this script to treat CSV files as rudimentary databases, querying and manipulating data, and merging multiple CSV files with SQL JOIN syntax. You can manage the column data types with Schema.ini if you wish, and automatically generate a default Schema.ini for all text files in a directory with the /S switch. It will probably run on any Windows machine 2000 and newer, although a Windows Script 5.7 update might be needed on old Windows versions.

If CSV.BAT is executed from within another batch script, make sure you use CALL to allow your script to continue after CSV.BAT has exited.

CSV.BAT exits with an %ERRORLEVEL% of 0 on success, or 1 on fail. With the /E switch, it exits with status 0 on success, or the number of caught errors on fail. Use the following to see example usage:

Code: Select all

csv /?


So, why did I make this when Dave Benham has already shared his excellent CSV parser? It was a challenge that's been an occasional itch on my brain for a long time. I don't have a particular purpose in mind. I just finally got around to sitting down and writing code a couple days ago, and thought I'd share my efforts.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 12:43
by Squashman
Could you post some common examples that you would use your script for.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 16:01
by rojo
Sure. Say you own a few rental properties and have some records in CSV format.

properties.csv:

Code: Select all

id,address,city,state,rent,available
1,"123 Smith St","Johnson City",TN,$1200,1
2,"384 Jones Ave",Erwin,TN,$800,1
3,"4250 Harbison Blvd",Jonesborough,TN,$900,0


applicants.csv:

Code: Select all

"First Name","Last Name","Current Address","City","State","Applied For","Application Fee Paid"
"Michael","Williams","218 Pine St","Johnson City","TN",1,1
"Josh","Douglas","84 Lover's Lane Apt. 7","Elizabethton","TN",3,0

Now, say you want to know which property Josh Douglas applied for.

Code: Select all

csv select a.address as `Applied For` from properties.csv as a left join applicants.csv as b on a.id=b.`Applied For` where b.`First Name` like "josh%" or b.`last name` like "douglas"

The resulting output will be:

Code: Select all

Applied For=4250 Harbison Blvd

Now, say you want a list of all applicants with the street addresses of the properties for which they applied:

Code: Select all

>applications.csv echo "Name1","Name2","Applied For","Rent"
csv insert into applications.csv select a.`First Name` as Name1, a.`Last Name` as Name2, b.`address` as `Applied For`, b.rent as Rent from applicants.csv as a left join properties.csv as b on b.`id`=a.`Applied For`

... which results in the creation of applications.csv with the following contents:

Code: Select all

"Name1","Name2","Applied For","Rent"
"Michael","Williams","123 Smith St","1200"
"Josh","Douglas","4250 Harbison Blvd","900"

I'm still squashing bugs as I find them. For example, in describing this scenario I discovered a problem with the CSV INSERT auto detection of the headers where column names contain spaces. That's why I did ">applications.csv echo col1,col2,etc." before calling csv.bat. I'll squash that bug tomorrow. Edit: Squashed in 1.0.1, but a more reliable syntax would be "SELECT args INTO applications.csv FROM applicants.csv", etc. See note below.*

And if someone moves out of 4250 Harbison Blvd, then

Code: Select all

csv update properties.csv set available=1 where address like "%harbison%"

... or at least that'll work after I find and squash another bug... tomorrow. Edit: fixed in 1.0.2.

By the way, wonder how many posts I need before I don't need moderator approval for new posts and edits?

-----------------------------------------------

* Note on "SELECT INTO newfile" versus "INSERT INTO newfile SELECT": The former syntax allows the MS Jet driver to copy the header row exactly as it should be, rather than my regexp and for loop guesswork on the latter syntax. For example:

Code: Select all

csv INSERT INTO newfile.csv SELECT * FROM properties.csv

... will fail grievously because of the wildcard, whereas

Code: Select all

csv SELECT * INTO newfile.csv FROM properties.csv

... will succeed.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 16:16
by Squashman
rojo wrote:By the way, wonder how many posts I need before I don't need moderator approval for new posts and edits?

Only two. I didn't have to approve your 3rd post.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 17:38
by dbenham
Oooh - intriguing concept. 8)

At first I thought you were crazy trying to bring SQL to the batch environment. But now that I see what you can do with it, I see a lot of potential.

I hope to God no one attempts to use this to maintain a database in CSV files. But it could be really useful for extracting data from, or transforming CSV files.

Have you thought about providing connectivity to a true relational database? That might be useful if you have a CSV file and you want to augment it with content from a proper relational database via lookup (join)


Dave Benham

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 17:49
by rojo
I've thought about it, but more in the context of reading csv and importing to MS Access or sqlite.

My more immediate goal is to fix the bugs I discovered this afternoon though. They're gnawing at me. Life afk is interfering at the moment though, so it'll have to wait. I hate leaving things broken....

Anyway, thank you guys for your interest! I'm glad y'all don't think I'm nuts for doing this. :)

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 20:20
by Squashman
Pretty sure this is going to come in handy.
Can I still do queries if I do not have a header row that defines my delimited fields. I work with a lot of files like this. I know what the fields are already. I do most of my work on our mainframe with sas but this may come in handy for quick reporting on files. We have the files backed up on the network and mainframe storage. It is a pita on the mainframe to work with lots of files whereas in batch I can just point it at a directory.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 20:29
by Squashman
dbenham wrote:
I hope to God no one attempts to use this to maintain a database in CSV files.

Years ago I used flat files for pseudo databases with the pascal programs I wrote. Three programs I can remember were my VHS Video Collection, Golf Scores and Car parts inventory for a friend's dad who owned a salvage yard. That was over 20 years ago.

The first forum software I ran on my linux server used csv files. Ikonboard.

The good ole days.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 20:33
by rojo
Squashman wrote:Pretty sure this is going to come in handy.
Can I still do queries if I do not have a header row that defines my delimited fields. I work with a lot of files like this. I know what the fields are already. I do most of my work on our mainframe with sas but this may come in handy for quick reporting on files. We have the files backed up on the network and mainframe storage. It is a pita on the mainframe to work with lots of files whereas in batch I can just point it at a directory.


Not sure. I've not played with csv files with this script without a header row. I'm pretty sure at the very least you'll need to modify the oConn= line and set "HDR=No", and add a Schema.ini file to your CSV directory to define the columns. You might be able to reference the columns by index number without Schema.ini, but I'm not sure how much the script would need to be modified to allow that.

Edit: Since I added syntax for "describe table csvfile" I think this is doable. Just modify the oConn= line and set "HDR=No", then "csv describe table csvfile" to see the driver-detected column names and types of your CSV file. I'll probably add an /N option for no header in a future update.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 20:48
by Aacini
rojo wrote:Anyway, thank you guys for your interest! I'm glad y'all don't think I'm nuts for doing this. :)


Of course that you are nuts for doing this! :D The same way than several of us around here, so I think this site is perfect for you! :mrgreen:

(Welcome here, rojo, I frequently enjoyed your original Batch-file answers at S.O.)

Some time ago I developed FilePointer.exe auxiliary program that allows to move the file pointer of a redirected file inside a code block, and as example of such program I wrote a very simple Data Base Relational application (or something like that, the link is in that post) that uses files with .NDX extension as index files to process another file in certain order, or to directly take a record via a key. I wonder if such method could be used as DBR engine and then program the SQL query part in pure Batch. This way the application don't require JScript nor ADODB, but require FilePointer.exe instead.

Crazy idea! Isn't it? 8)

Antonio

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 22:10
by rojo
Version 1.0.1

Fixed:
  • setting var=number sometimes used to set var=column[number].value. Now it doesn't.
  • joined table.column arguments for better header generation

Added:
  • /debug switch to show intermediate sql operations. It's mostly for my own benefit, so I'll probably leave it undocumented.

Still struggling with:
  • In my [update... set where col like val] test case, [insert...select where not like] is somehow not the inverse of [select where like] as it should be. /debug switch shows the SQL commands being composed correctly, but this duck isn't quacking the way it's supposed to. On the other hand, if I directly [select where not like] and [select where like] I get the correct results. I'll have to sleep on it.... Edit: Fixed in 1.0.2.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 15 Jan 2015 22:23
by rojo
Aacini wrote:Of course that you are nuts for doing this! :D The same way than several of use around here, so I think this site is perfect for you! :mrgreen:

(Welcome here, rojo, I frequently enjoyed your original Batch-file answers at S.O.)


Thanks! Likewise!

Aacini wrote:Some time ago I developed FilePointer.exe auxiliary program that allows to move the file pointer of a redirected file inside a code block... blah blah math math math... This way the application don't require JScript nor ADODB, but require FilePointer.exe instead.

Crazy idea! Isn't it? 8)


Who am I to judge? I'm nuts. o°)

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 16 Jan 2015 09:42
by rojo
Version 1.0.2

Fixed:
  • It seems the MS Jet driver has difficulty executing more than one WHERE clause per process thread (possibly only WHERE...LIKE?). Workaround by forking a query into a separate process. As an unintended benefit, this actually sped up execution a little. Cool.
  • "SELECT * INTO bar FROM foo" syntax was not recognized. It is now.
  • Exclusive lock error now sleeps and retries rather than failing outright.

Re: CSV.BAT - CSV file manipulation with SQL syntax

Posted: 16 Jan 2015 09:54
by dbenham
SQL statements can become quite long - perhaps too long to fit on the command line.

A nice feature would be an option to read the statement from a file. Even better would be the abililty to include multiple statements in a single script file.


Dave Benham