JSORT.BAT v4.2 - Case sensitive sort with option for numeric sort

Moderator: DosItHelp

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

JSORT.BAT v4.2 - Case sensitive sort with option for numeric sort

#1 Post by dbenham » 13 May 2014 22:10

Here at the top is the current version. The original post follows the current code.

Code:

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment
@goto :Batch
 
::************ Documentation ***********
::JSORT.BAT version 4.2
:::
:::JSORT [File] [/Option [Value]]...
:::
:::  Sort lines of text from stdin and write the result to stdout.
:::  JSORT uses an ascending, case sensitive text sort by default.
:::
:::    File - If the optional File argument is specified, then JSORT reads lines
:::           from the file instead of from stdin. If specified, the File must
:::           be the very first argument.
:::
:::  Options:
:::
:::    /C n - Number of sorted lines to print. Skipped lines are always printed
:::           and do not contribute to the count. Default is -1 (all lines).
:::
:::    /D String - Specifies the string used to delimit tokens. The delimiter
:::           string is always case sensitive. A quote literal " must be escaped
:::           as \q, and a backslash literal \ must be escaped as \\.
:::           The default value is an empty string, meaning treat the entire
:::           line as a single token.
:::
:::    /I   - Ignore case when sorting (or when checking for uniqueness)
:::
:::    /N   - Sort consecutive digits as numbers instead of text. The numbers
:::           may be embedded within alpha text. JSort supports numbers up to
:::           20 digits long.
:::
:::    /O File - Writes the output to File instead of stdout.
:::
:::    /P n - Begin sorting at character position n relative to the beginning
:::           of the selected token. Lines that do not extend that far are
:::           treated as equivalent values, and collate before all other lines.
:::           The default value is 1 (first character).
:::
:::    /R   - Sort the lines in Reverse (descending) order.
:::
:::    /S n - Number of lines to skip - default is 0.
:::           Skipped lines are not sorted (remain in place)
:::
:::    /T n - Specify the token at which to begin sorting. The sort is not
:::           restricted to the selected token, it just helps to identify where
:::           to begin sorting. The default value is 1 (first token). A value of
:::           -1 represents the last token, -2 the penultimate token, etc.
:::           A value of 0 is invalid. Note that negative values only recognize
:::           tokens that occur after a delimiter. If the requested token cannot
:::           be found, then the line will collate before all other lines.
:::
:::    /U   - Only write unique lines (discard duplicates)
:::
:::    /V   - Display the version of JSORT.BAT.
:::
:::    /?   - Display this help
:::
:::JSORT.BAT was written by Dave Benham and originally posted at
:::http://www.dostips.com/forum/viewtopic.php?f=3&t=5595
:::

============== :Batch portion =============
@echo off
setlocal disableDelayedExpansion

:: Get optional input file
set "infile="
set "test=%~1"
if defined test (
  setlocal enableDelayedExpansion
  if "!test:~0,1!" neq "/" (
    endlocal
    set ^"infile=^<"%~1""
    shift /1
  ) else endlocal
)

:: Define options
set "options= /?: /c:-1 /d:"" /i: /n: /o:"" /p:1 /r: /s:0 /t:1 /u: /v: "

:: Set default option values
for %%O in (%options%) do for /f "tokens=1,* delims=:" %%A in ("%%O") do set "%%A=%%~B"

:: Get options
:loop
if not "%~1"=="" (
  setlocal enableDelayedExpansion
  set "test=!options:* %~1:=! "
  if "!test!"=="!options! " (
      >&2 echo Error: Invalid option %~1
      exit /b 1
  ) else if "!test:~0,1!"==" " (
      endlocal
      set "%~1=1"
  ) else (
      endlocal
      set "%~1=%~2"
      shift /1
  )
  shift /1
  goto :loop
)

:: Display help
if defined /? (
  for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
  exit /b 0
)

:: Display version
if defined /v (
  for /f "delims=: tokens=*" %%A in ('findstr /bc:"::JSORT.BAT version" "%~f0"') do echo %%A
  exit /b 0
)

:: Transform and validate options
set /a "case=0%/i%, num=0%/n%, pos=%/p%-1, tok=%/t%, unique=0%/u%, order=1-2*0%/r%, 1/!(0x80000000&pos), 1/tok" 2>nul || (
  >&2 echo Error: Invalid option value.
  exit /b 1
)
if %tok% gtr 0 set /a tok-=1
set "outfile="
if defined /o set ^"outfile=^>"%/o%""

:: Perform the sort
%infile% %outfile% cscript //E:JScript //nologo "%~f0" %case% %num% %pos% %order% %/s% %/c% %tok% "%/d%" %unique%

exit /b 0

************* JScript portion **********/
var array=new Array(),
    nocase =WScript.Arguments.Item(0),
    numeric=WScript.Arguments.Item(1),
    pos    =parseInt(WScript.Arguments.Item(2)),
    order  =WScript.Arguments.Item(3),
    skip   =WScript.Arguments.Item(4),
    count  =WScript.Arguments.Item(5),
    token  =WScript.Arguments.Item(6),
    delim  =WScript.Arguments.Item(7).replace(/\\(?!q|\\)/g,'').replace(/\\\\/g,'\\s').replace(/\\q/g,'"').replace(/\\s/g,'\\'),
    unique =WScript.Arguments.Item(8);
while (!WScript.StdIn.AtEndOfStream) {
  if (skip > 0) {
    WScript.Echo(WScript.StdIn.ReadLine());
    skip-=1
  } else {
    var expanded="", num="", raw=WScript.StdIn.ReadLine(), upper=((nocase==1)?raw.toUpperCase():raw);
    for( var i=pos+FindToken(raw,delim,token); i<raw.length; i++ ) {
      var c=upper.substr(i,1);
      if (numeric==1 && c>="0" && c<="9") {
        num+=c;
      } else {
        if (num != "") {
          num="00000000000000000000" + num;
          expanded+=num.substr(num.length-20);
          num="";
        }
        expanded+=c;
      }
    }
    if (num != "") {
      num="00000000000000000000" + num;
      expanded+=num.substr(num.length-20);
    }
    var obj={expanded:expanded, raw:raw};
    array.push(obj);
  }
}
if (count<0) count=array.length;
if (count>array.length) count=array.length;
if (unique==1) {
  array.sort( function(a,b){
                var rtn = (a.expanded>b.expanded)-(a.expanded<b.expanded);
                if (rtn==0) {
                  var a2=(nocase==1)?a.raw.toUpperCase():a.raw;
                  var b2=(nocase==1)?b.raw.toUpperCase():b.raw;
                  rtn = (a2>b2)-(a2<b2);
                }
                return order*rtn;
              });
  if (count>=1) WScript.Echo(array[0].raw);
  for (var i=1; i<count; i++) {
    if (nocase==1 && array[i].raw.toUpperCase() == array[i-1].raw.toUpperCase()) continue;
    if (nocase==0 && array[i].raw == array[i-1].raw) continue;
    WScript.Echo(array[i].raw);
  }
} else {
  array.sort(function(a,b){return order*((a.expanded>b.expanded)-(a.expanded<b.expanded));});
  for (var i=0; i<count; i++) WScript.Echo(array[i].raw);
}

function FindToken(str, str2, n) {
  if (n!=0 && str2=="") return str.length;
  if (n>=0) {
    var rtn = 0;
    for( var i=n; i>0; i-- ) {
      rtn = str.indexOf(str2,rtn);
      if (rtn<0) return str.length;
      rtn+=str2.length;
    }
  } else {
    var rtn = str.length;
    for (var i=n; i<0; i++ ) {
      rtn-=1;
      rtn = str.lastIndexOf(str2,rtn);
      if (rtn<0) return str.length;
    }
    rtn+=str2.length;
  }
  return rtn;
}
------------------------------------------------------------------------------------------
Original post:

I've often been disappointed that SORT did not have a case sensitive option, but never thought much about doing anything about it.

It's also not uncommon to want to sort strings with embedded numbers, and I've always parsed and reformatted the strings with 0 padded numbers so that I could use SORT. It works, but it's a pain.

But then, in the thread "How can i order the results consecutively in a batch?", rootx asked how to sort numeric strings. Squashman linked to the DosTips SortN.bat solution, which I had never noticed before. It works well, but it can be slow with moderately sized input.

Then I thought, why not write a hybrid JScript/batch SORT utility that can solve both problems efficiently and conveniently? So I did. I'm no JScript expert, and I did not spend much time optimizing the code, but it seems to work well in my limited testing. Be aware that the utility loads the entire input data set into memory, using more than twice the original space.
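
For readers who want the idea without the batch wrapper, here is a minimal sketch of the zero-padding approach in plain JavaScript (ES3 style, so it should be close to what JScript accepts). The names expandKey and sortLines are hypothetical and do not appear in JSORT.BAT. The sketch always expands digit runs, which corresponds to running with /N. It also shows why memory use more than doubles: every line is kept twice, once raw and once as an expanded sort key.

```javascript
// Sketch only: expandKey and sortLines are illustrative names, not part of JSORT.BAT.
var PAD = "00000000000000000000"; // 20 zeros, matching the documented 20-digit limit

// Replace each run of digits with the same digits left-padded to 20 characters,
// so that ordinary string comparison orders the numbers by value.
function expandKey(line, ignoreCase) {
  var text = ignoreCase ? line.toUpperCase() : line;
  return text.replace(/[0-9]+/g, function (digits) {
    var padded = PAD + digits;
    return padded.slice(padded.length - 20);
  });
}

// Sort by the expanded key, but hand back the original (raw) lines.
function sortLines(lines, ignoreCase, descending) {
  var order = descending ? -1 : 1;
  var rows = [];
  for (var i = 0; i < lines.length; i++) {
    // Each line is stored twice: the raw text and its expanded key.
    rows.push({ expanded: expandKey(lines[i], ignoreCase), raw: lines[i] });
  }
  rows.sort(function (a, b) {
    return order * ((a.expanded > b.expanded) - (a.expanded < b.expanded));
  });
  var out = [];
  for (var j = 0; j < rows.length; j++) out.push(rows[j].raw);
  return out;
}
```

For example, sortLines(["file10", "file9", "File2"], false, false) yields File2, file9, file10: the uppercase F collates before the lowercase f, and 9 sorts before 10 because both are compared as 20-digit strings.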

Full documentation is embedded within the script. Use JSORT /? to display the help.

Code:

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment

::************ Documentation ***********
::JSORT.BAT version 1.0
:::
:::JSORT [/Option [Value]]...
:::
:::  Sort lines of text from stdin and write the result to stdout.
:::  JSORT uses an ascending, case sensitive text sort by default.
:::
:::  Options:
:::
:::    /I - Ignore case
:::
:::    /N - Sort consecutive digits as numbers instead of text. The numbers
:::         may be embedded within alpha text.
:::
:::    /P n - Begin sorting at character position n. Lines that have fewer than
:::           n characters are treated as equivalent values, and collate before
:::           all other lines. The default value is 1 (first character).
:::
:::    /R - Sort the lines in reverse (descending) order.
:::
:::    /V - Display the version of JSORT.BAT.
:::
:::    /? - Display this help
:::
:::JSORT.BAT was written by Dave Benham and originally posted at
:::http://www.dostips.com/forum/viewtopic.php?f=3&t=5595
:::

::************ Batch portion ***********
@echo off
setlocal enableDelayedExpansion

:: Define options
set "options= /?: /i: /n: /p:1 /r: /v:"

:: Set default option values
for %%O in (%options%) do for /f "tokens=1,* delims=:" %%A in ("%%O") do set "%%A=%%~B"

:: Get options
:loop
if not "%~1"=="" (
  set "test=!options:* %~1:=! "
  if "!test!"=="!options! " (
      >&2 echo Error: Invalid option %~1
      exit /b 1
  ) else if "!test:~0,1!"==" " (
      set "%~1=1"
  ) else (
      set "%~1=%~2"
      shift /1
  )
  shift /1
  goto :loop
)

:: Display help
if defined /? (
  for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
  exit /b 0
)

:: Display version
if defined /v (
  for /f "delims=: tokens=*" %%A in ('findstr /bc:"::JSORT.BAT version" "%~f0"') do echo %%A
  exit /b 0
)

:: Transform and validate options
set /a "case=0%/i%, num=0%/n%, pos=%/p%-1, order=1-2*0%/r%, val=1/^!(0x80000000&pos)" 2>nul || (
  >&2 echo Error: Invalid /P value.
  exit /b 1
)

:: Perform the sort
cscript //E:JScript //nologo "%~f0" %case% %num% %pos% %order%
exit /b 0


************* JScript portion **********/
var array=new Array(),
    nocase =WScript.Arguments.Item(0),
    numeric=WScript.Arguments.Item(1),
    pos    =WScript.Arguments.Item(2),
    order  =WScript.Arguments.Item(3);
while (!WScript.StdIn.AtEndOfStream) {
  var expanded="", num="", raw=WScript.StdIn.ReadLine(), upper=((nocase==1)?raw.toUpperCase():raw);
  for( var i=pos; i<raw.length; i++ ) {
    var c=upper.substr(i,1);
    if (numeric==1 && c>="0" && c<="9") {
      num+=c;
    } else {
      if (num != "") {
        num="00000000000000000000" + num;
        expanded+=num.substr(num.length-20);
        num="";
      }
      expanded+=c;
    }
  }
  if (num != "") {
    num="00000000000000000000" + num;
    expanded+=num.substr(num.length-20);
  }
  var obj={expanded:expanded, raw:raw};
  array.push(obj);
}
array.sort(function(a,b){return order*((a.expanded>b.expanded)-(a.expanded<b.expanded));})
for (var i=0; i<array.length; i++) WScript.Echo(array[i].raw);
Dave Benham
Last edited by dbenham on 24 Mar 2015 21:55, edited 8 times in total.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#2 Post by dbenham » 21 Oct 2014 13:50

There was an interesting StackOverflow question, "Find top 10 processes with wmic command". JSORT.BAT could already sort numerically at a given line position, but it could not preserve the header, and it could not list just the top 10.

So I modified JSORT.BAT to include the /S n (skip n lines), and /C n (print top n sorted lines) options.
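
The effect of the two options can be sketched in a few lines of plain JavaScript. This is only an illustration under my own assumptions: sortWithSkipAndCount is a hypothetical helper, not code from JSORT.BAT, and it takes an array instead of reading stdin. Skipped lines stay in place at the top and do not count toward the limit, and a negative count means all lines.

```javascript
// Sketch only: sortWithSkipAndCount is an illustrative name, not part of JSORT.BAT.
// skip  - number of leading lines passed through unsorted (like /S n)
// count - number of sorted lines to keep, negative for all (like /C n)
function sortWithSkipAndCount(lines, skip, count, compare) {
  var head = lines.slice(0, skip); // header lines, kept in their original order
  var body = lines.slice(skip);    // only these lines take part in the sort
  body.sort(compare);
  if (count < 0 || count > body.length) count = body.length;
  return head.concat(body.slice(0, count));
}
```

With a header line, skip set to 1, count set to 10 and a descending numeric comparison, the result is the header followed by the ten largest entries.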

The solution becomes quite simple and efficient. The only tricky bit is that the position of the WorkingSetSize column varies depending on the longest listed file name, so extra code had to be included to compute the line position at which to start sorting.
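
The position computation itself is simple once the header line is in hand. The batch script in this post derives it from the size of a temporary file; the sketch below swaps in a plain string search instead, written in JavaScript. The name columnPosition is hypothetical. It returns a 1-based position, which is the form /P expects.

```javascript
// Sketch only: columnPosition is an illustrative name, not part of JSORT.BAT.
// Returns the 1-based character position of a column title in a header line,
// or -1 when the title is not present.
function columnPosition(header, title) {
  var index = header.indexOf(title); // 0-based, -1 when absent
  return index < 0 ? -1 : index + 1;
}
```

For a header such as "Name  WorkingSetSize" the function returns 7 for "WorkingSetSize", and that number would be passed as the /P value.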

Code:

@echo off
setlocal

:: Get full proc list in unicode
wmic process list brief >procList.tmp

:: Convert Unicode to ANSI
type procList.tmp >procList.tmp2

:: Read the header line
<procList.tmp2 set /p "header="

:: Identify position of WorkingSetSize column
for /f "delims=W" %%A in ("%header%") do echo %%A>skipSize.tmp
for %%A in (skipSize.tmp) do set /a pos=%%~zA-1

:: Print the header, followed by the top 10 sorted by WorkingSetSize
type procList.tmp2 | jsort /n /r /p %pos% /s 1 /c 10

:: Delete the temp files
del procList.tmp procList.tmp2 skipSize.tmp


Here is version 2 with the new options (with the 2.1 bug fix for the /C option default):

Code:

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment

::************ Documentation ***********
::JSORT.BAT version 2.1
:::
:::JSORT [/Option [Value]]...
:::
:::  Sort lines of text from stdin and write the result to stdout.
:::  JSORT uses an ascending, case sensitive text sort by default.
:::
:::  Options:
:::
:::    /I   - Ignore case
:::
:::    /C n - Number of sorted lines to print. Skipped lines are always printed
:::           and do not contribute to the count. Default is -1 (all lines).
:::
:::    /N   - Sort consecutive digits as numbers instead of text. The numbers
:::           may be embedded within alpha text. JSort supports numbers up to
:::           20 digits long.
:::
:::    /P n - Begin sorting at character position n. Lines that have fewer than
:::           n characters are treated as equivalent values, and collate before
:::           all other lines. The default value is 1 (first character).
:::
:::    /R   - Sort the lines in Reverse (descending) order.
:::
:::    /S n - Number of lines to skip - default is 0.
:::           Skipped lines are not sorted (remain in place)
:::
:::    /V   - Display the version of JSORT.BAT.
:::
:::    /?   - Display this help
:::
:::JSORT.BAT was written by Dave Benham and originally posted at
:::http://www.dostips.com/forum/viewtopic.php?f=3&t=5595
:::

::************ Batch portion ***********
@echo off
setlocal enableDelayedExpansion

:: Define options
set "options= /?: /i: /c:-1 /n: /p:1 /r: /s:0 /v:"

:: Set default option values
for %%O in (%options%) do for /f "tokens=1,* delims=:" %%A in ("%%O") do set "%%A=%%~B"

:: Get options
:loop
if not "%~1"=="" (
  set "test=!options:* %~1:=! "
  if "!test!"=="!options! " (
      >&2 echo Error: Invalid option %~1
      exit /b 1
  ) else if "!test:~0,1!"==" " (
      set "%~1=1"
  ) else (
      set "%~1=%~2"
      shift /1
  )
  shift /1
  goto :loop
)

:: Display help
if defined /? (
  for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
  exit /b 0
)

:: Display version
if defined /v (
  for /f "delims=: tokens=*" %%A in ('findstr /bc:"::JSORT.BAT version" "%~f0"') do echo %%A
  exit /b 0
)

:: Transform and validate options
set /a "case=0%/i%, num=0%/n%, pos=%/p%-1, order=1-2*0%/r%, 1/^!(0x80000000&pos)" 2>nul || (
  >&2 echo Error: Invalid /P value.
  exit /b 1
)

:: Perform the sort
cscript //E:JScript //nologo "%~f0" %case% %num% %pos% %order% %/s% %/c%
exit /b 0


************* JScript portion **********/
var array=new Array(),
    nocase =WScript.Arguments.Item(0),
    numeric=WScript.Arguments.Item(1),
    pos    =WScript.Arguments.Item(2),
    order  =WScript.Arguments.Item(3),
    skip   =WScript.Arguments.Item(4),
    count  =WScript.Arguments.Item(5);
while (!WScript.StdIn.AtEndOfStream) {
  if (skip > 0) {
    WScript.Echo(WScript.StdIn.ReadLine());
    skip-=1
  } else {
    var expanded="", num="", raw=WScript.StdIn.ReadLine(), upper=((nocase==1)?raw.toUpperCase():raw);
    for( var i=pos; i<raw.length; i++ ) {
      var c=upper.substr(i,1);
      if (numeric==1 && c>="0" && c<="9") {
        num+=c;
      } else {
        if (num != "") {
          num="00000000000000000000" + num;
          expanded+=num.substr(num.length-20);
          num="";
        }
        expanded+=c;
      }
    }
    if (num != "") {
      num="00000000000000000000" + num;
      expanded+=num.substr(num.length-20);
    }
    var obj={expanded:expanded, raw:raw};
    array.push(obj);
  }
}
if (count<0) count=array.length;
if (count>array.length) count=array.length;
array.sort(function(a,b){return order*((a.expanded>b.expanded)-(a.expanded<b.expanded));});
for (var i=0; i<count; i++) WScript.Echo(array[i].raw);


Dave Benham
Last edited by dbenham on 22 Oct 2014 09:49, edited 1 time in total.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: JSORT.BAT - Case sensitive sort with option for numeric

#3 Post by foxidrive » 21 Oct 2014 22:29

Thanks Dave, as always, for your useful tools.

Just thinking that you might like to change the batch to provide help when no options are given.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#4 Post by dbenham » 22 Oct 2014 06:14

No, JSORT is much like MORE or SORT - it is perfectly reasonable to run the command without any arguments or options, and it will wait silently for input, gathering everything that is entered, and then print out the sorted lines after <Ctrl-Z> is pressed.

Running it without any options performs a case sensitive sort.


Dave Benham

Re: JSORT.BAT - Case sensitive sort with option for numeric

#5 Post by dbenham » 22 Oct 2014 09:52

Updated to version 2.1 - fixed the /C option default. It was 0, corrected to -1. The earlier post has been edited.


Dave Benham

Re: JSORT.BAT - Case sensitive sort with option for numeric

#6 Post by foxidrive » 22 Oct 2014 17:17

dbenham wrote: JSORT is much like MORE or SORT - it is perfectly reasonable to run the command without any arguments or options


Okelidokily.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#7 Post by dbenham » 05 Nov 2014 17:02

I saw another StackOverflow question that led me to add yet another feature to JSort.

JSort now allows you to sort beginning at a given token position, rather than an absolute character position. The /T option specifies the token, and the /D option specifies the delimiter string (note - the delimiter can have length > 1). The /P option now specifies the position relative to the start of the selected token, rather than from the beginning of the string. The default token is 1, so it does not change the behavior when /T is not specified.
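
To make the interaction of /T, /D and /P concrete, here is a minimal JavaScript sketch of how a starting offset can be derived from a token number, a delimiter string and a relative position. The names sortStart and sortKey are hypothetical, and the sketch only covers positive token numbers. As described above, the comparison is not limited to the selected token: it starts there and runs to the end of the line. Lines that lack the requested token get an empty key, so they collate first.

```javascript
// Sketch only: sortStart and sortKey are illustrative names, not part of JSORT.BAT.
// token and pos are 1-based, as on the command line; delim may be longer than
// one character, and an empty delim means the whole line is a single token.
function sortStart(line, delim, token, pos) {
  var start = 0;
  for (var i = 1; i < token; i++) {       // step over token-1 delimiters
    if (delim === "") return line.length; // no delimiter: only token 1 exists
    var hit = line.indexOf(delim, start);
    if (hit < 0) return line.length;      // the line has too few tokens
    start = hit + delim.length;
  }
  start += pos - 1;                       // position is relative to the token
  return start > line.length ? line.length : start;
}

// The text that takes part in the comparison: from the start offset to end of line.
function sortKey(line, delim, token, pos) {
  return line.substring(sortStart(line, delim, token, pos));
}
```

For the line "Hello world,A100" with delimiter "," and token 2, the key is "A100"; raising the position to 2 gives "100".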

I also provide the option to specify the file to sort as the first argument, so you are no longer forced to use redirection or a pipe.

So, if given file test.txt:

Code:

goodbye,Z570
Hello world,A100
Ivanhoe,Z36
tiddleywinks,B36
Variable length text,A9

Then the following command:

Code:

jsort test.txt /t 2 /d "," /n

produces this output:

Code:

Variable length text,A9
Hello world,A100
tiddleywinks,B36
Ivanhoe,Z36
goodbye,Z570


Here is version 3.0

Code:

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment

::************ Documentation ***********
::JSORT.BAT version 3.0
:::
:::JSORT [File] [/Option [Value]]...
:::
:::  Sort lines of text from stdin and write the result to stdout.
:::  JSORT uses an ascending, case sensitive text sort by default.
:::
:::    File - If the optional File argument is specified, then JSORT reads lines
:::           from the file instead of from stdin. If specified, the File must
:::           be the very first argument.
:::
:::  Options:
:::
:::    /C n - Number of sorted lines to print. Skipped lines are always printed
:::           and do not contribute to the count. Default is -1 (all lines).
:::
:::    /D string - Specifies the string used to delimit tokens. The delimiter
:::           string is always case sensitive. A quote literal " must be escaped
:::           as \q, and a backslash literal \ must be escaped as \\.
:::           The default value is an empty string, meaning treat the entire
:::           line as a single token.
:::
:::    /I   - Ignore case when sorting
:::
:::    /N   - Sort consecutive digits as numbers instead of text. The numbers
:::           may be embedded within alpha text. JSort supports numbers up to
:::           20 digits long.
:::
:::    /P n - Begin sorting at character position n relative to the beginning
:::           of the selected token. Lines that do not extend that far are
:::           treated as equivalent values, and collate before all other lines.
:::           The default value is 1 (first character).
:::
:::    /R   - Sort the lines in Reverse (descending) order.
:::
:::    /S n - Number of lines to skip - default is 0.
:::           Skipped lines are not sorted (remain in place)
:::
:::    /T n - Specify the token at which to begin sorting. The default value
:::           is 1 (first token).
:::
:::    /V   - Display the version of JSORT.BAT.
:::
:::    /?   - Display this help
:::
:::JSORT.BAT was written by Dave Benham and originally posted at
:::http://www.dostips.com/forum/viewtopic.php?f=3&t=5595
:::

::************ Batch portion ***********
@echo off
setlocal disableDelayedExpansion

:: Get optional input file
set "redirect="
set "file=%~1"
setlocal enableDelayedExpansion
if defined file if "!file:~0,1!" neq "/" (
  set "redirect=<"!file!""
  shift /1
)

:: Define options
set "options= /?: /i: /c:-1 /n: /p:1 /r: /s:0 /v: /d:"" /t:1 "

:: Set default option values
for %%O in (%options%) do for /f "tokens=1,* delims=:" %%A in ("%%O") do set "%%A=%%~B"

:: Get options
:loop
if not "%~1"=="" (
  set "test=!options:* %~1:=! "
  if "!test!"=="!options! " (
      >&2 echo Error: Invalid option %~1
      exit /b 1
  ) else if "!test:~0,1!"==" " (
      set "%~1=1"
  ) else (
      set "%~1=%~2"
      shift /1
  )
  shift /1
  goto :loop
)

:: Display help
if defined /? (
  for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
  exit /b 0
)

:: Display version
if defined /v (
  for /f "delims=: tokens=*" %%A in ('findstr /bc:"::JSORT.BAT version" "%~f0"') do echo %%A
  exit /b 0
)

:: Transform and validate options
set /a "case=0%/i%, num=0%/n%, pos=%/p%-1, tok=%/t%-1, order=1-2*0%/r%, 1/^!(0x80000000&pos), 1/^!(0x80000000&tok)" 2>nul || (
  >&2 echo Error: Invalid option value.
  exit /b 1
)

:: Perform the sort
%redirect% cscript //E:JScript //nologo "%~f0" %case% %num% %pos% %order% %/s% %/c% %tok% "%/d%"

exit /b 0
************* JScript portion **********/
var array=new Array(),
    nocase =WScript.Arguments.Item(0),
    numeric=WScript.Arguments.Item(1),
    pos    =WScript.Arguments.Item(2),
    order  =WScript.Arguments.Item(3),
    skip   =WScript.Arguments.Item(4),
    count  =WScript.Arguments.Item(5),
    token  =WScript.Arguments.Item(6),
    delim  =WScript.Arguments.Item(7).replace(/\\(?!q|\\)/g,'').replace(/\\\\/g,'\\s').replace(/\\q/g,'"').replace(/\\s/g,'\\');
while (!WScript.StdIn.AtEndOfStream) {
  if (skip > 0) {
    WScript.Echo(WScript.StdIn.ReadLine());
    skip-=1
  } else {
    var expanded="", num="", raw=WScript.StdIn.ReadLine(), upper=((nocase==1)?raw.toUpperCase():raw);
    for( var i=pos+FindToken(raw,delim,token); i<raw.length; i++ ) {
      var c=upper.substr(i,1);
      if (numeric==1 && c>="0" && c<="9") {
        num+=c;
      } else {
        if (num != "") {
          num="00000000000000000000" + num;
          expanded+=num.substr(num.length-20);
          num="";
        }
        expanded+=c;
      }
    }
    if (num != "") {
      num="00000000000000000000" + num;
      expanded+=num.substr(num.length-20);
    }
    var obj={expanded:expanded, raw:raw};
    array.push(obj);
  }
}
if (count<0) count=array.length;
if (count>array.length) count=array.length;
array.sort(function(a,b){return order*((a.expanded>b.expanded)-(a.expanded<b.expanded));});
for (var i=0; i<count; i++) WScript.Echo(array[i].raw);

function FindToken(str, str2, n) {
  if (n>0 && str2=="") return str.length;
  var rtn = 0;
  for( var i=n; i>0; i-- ) {
    rtn = str.indexOf(str2,rtn);
    if (rtn<0) return str.length;
    rtn+=str2.length;
  }
  return rtn;
}


Dave Benham

Squashman
Expert
Posts: 4465
Joined: 23 Dec 2011 13:59

Re: JSORT.BAT - Case sensitive sort with option for numeric

#8 Post by Squashman » 05 Nov 2014 19:49

Sweet! I am definitely going to use that option.
Just wondering if jsort would run faster if you could specify the output filename instead of redirecting stdout, similar to the built-in SORT utility?

Re: JSORT.BAT - Case sensitive sort with option for numeric

#9 Post by dbenham » 06 Nov 2014 11:02

Perhaps there is a small performance gain to be had, but I'm lazy.

As I said earlier, I've done very little to optimize the JScript code. It works fine for the small tasks that I've used it for, but it may not perform well with very large files. Given that the JScript loads the entire file in memory (twice over!), I suspect that the utility will completely fail at some size threshold.

I've implemented an /O option to specify the output file in lieu of using redirection. However, I took the easy way out and performed the redirection within the batch code, so there is no performance benefit to using /O.

Version 3.2 also fixes a bug with the File argument implementation. Version 3.0 failed if the file path included an exclamation point (!). (Version 3.1 had major argument parsing bugs)

Version 3.2

Code:

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment

::************ Documentation ***********
::JSORT.BAT version 3.2
:::
:::JSORT [File] [/Option [Value]]...
:::
:::  Sort lines of text from stdin and write the result to stdout.
:::  JSORT uses an ascending, case sensitive text sort by default.
:::
:::    File - If the optional File argument is specified, then JSORT reads lines
:::           from the file instead of from stdin. If specified, the File must
:::           be the very first argument.
:::
:::  Options:
:::
:::    /C n - Number of sorted lines to print. Skipped lines are always printed
:::           and do not contribute to the count. Default is -1 (all lines).
:::
:::    /D String - Specifies the string used to delimit tokens. The delimiter
:::           string is always case sensitive. A quote literal " must be escaped
:::           as \q, and a backslash literal \ must be escaped as \\.
:::           The default value is an empty string, meaning treat the entire
:::           line as a single token.
:::
:::    /I   - Ignore case when sorting
:::
:::    /N   - Sort consecutive digits as numbers instead of text. The numbers
:::           may be embedded within alpha text. JSort supports numbers up to
:::           20 digits long.
:::
:::    /O File - Writes the output to File instead of stdout.
:::
:::    /P n - Begin sorting at character position n relative to the beginning
:::           of the selected token. Lines that do not extend that far are
:::           treated as equivalent values, and collate before all other lines.
:::           The default value is 1 (first character).
:::
:::    /R   - Sort the lines in Reverse (descending) order.
:::
:::    /S n - Number of lines to skip - default is 0.
:::           Skipped lines are not sorted (remain in place)
:::
:::    /T n - Specify the token at which to begin sorting. The default value
:::           is 1 (first token).
:::
:::    /V   - Display the version of JSORT.BAT.
:::
:::    /?   - Display this help
:::
:::JSORT.BAT was written by Dave Benham and originally posted at
:::http://www.dostips.com/forum/viewtopic.php?f=3&t=5595
:::

::************ Batch portion ***********
@echo off
setlocal disableDelayedExpansion

:: Get optional input file
set "infile="
set "test=%~1"
setlocal enableDelayedExpansion
if defined test if "!test:~0,1!" neq "/" (
  endlocal
  set ^"infile=^<"%~1""
  shift /1
) else endlocal

:: Define options
set "options= /?: /i: /c:-1 /n: /p:1 /r: /s:0 /v: /d:"" /t:1 /o:"" "

:: Set default option values
for %%O in (%options%) do for /f "tokens=1,* delims=:" %%A in ("%%O") do set "%%A=%%~B"

:: Get options
:loop
if not "%~1"=="" (
  setlocal enableDelayedExpansion
  set "test=!options:* %~1:=! "
  if "!test!"=="!options! " (
      >&2 echo Error: Invalid option %~1
      exit /b 1
  ) else if "!test:~0,1!"==" " (
      endlocal
      set "%~1=1"
  ) else (
      endlocal
      set "%~1=%~2"
      shift /1
  )
  shift /1
  goto :loop
)

:: Display help
if defined /? (
  for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
  exit /b 0
)

:: Display version
if defined /v (
  for /f "delims=: tokens=*" %%A in ('findstr /bc:"::JSORT.BAT version" "%~f0"') do echo %%A
  exit /b 0
)

:: Transform and validate options
set /a "case=0%/i%, num=0%/n%, pos=%/p%-1, tok=%/t%-1, order=1-2*0%/r%, 1/!(0x80000000&pos), 1/!(0x80000000&tok)" 2>nul || (
  >&2 echo Error: Invalid option value.
  exit /b 1
)
set "outfile="
if defined /o set ^"outfile=^>"%/o%""

:: Perform the sort
%infile% %outfile% cscript //E:JScript //nologo "%~f0" %case% %num% %pos% %order% %/s% %/c% %tok% "%/d%"

exit /b 0
************* JScript portion **********/
var array=new Array(),
    nocase =WScript.Arguments.Item(0),
    numeric=WScript.Arguments.Item(1),
    pos    =WScript.Arguments.Item(2),
    order  =WScript.Arguments.Item(3),
    skip   =WScript.Arguments.Item(4),
    count  =WScript.Arguments.Item(5),
    token  =WScript.Arguments.Item(6),
    delim  =WScript.Arguments.Item(7).replace(/\\(?!q|\\)/g,'').replace(/\\\\/g,'\\s').replace(/\\q/g,'"').replace(/\\s/g,'\\');
while (!WScript.StdIn.AtEndOfStream) {
  if (skip > 0) {
    WScript.Echo(WScript.StdIn.ReadLine());
    skip-=1
  } else {
    var expanded="", num="", raw=WScript.StdIn.ReadLine(), upper=((nocase==1)?raw.toUpperCase():raw);
    for( var i=pos+FindToken(raw,delim,token); i<raw.length; i++ ) {
      var c=upper.substr(i,1);
      if (numeric==1 && c>="0" && c<="9") {
        num+=c;
      } else {
        if (num != "") {
          num="00000000000000000000" + num;
          expanded+=num.substr(num.length-20);
          num="";
        }
        expanded+=c;
      }
    }
    if (num != "") {
      num="00000000000000000000" + num;
      expanded+=num.substr(num.length-20);
    }
    var obj={expanded:expanded, raw:raw};
    array.push(obj);
  }
}
if (count<0) count=array.length;
if (count>array.length) count=array.length;
array.sort(function(a,b){return order*((a.expanded>b.expanded)-(a.expanded<b.expanded));});
for (var i=0; i<count; i++) WScript.Echo(array[i].raw);

function FindToken(str, str2, n) {
  if (n>0 && str2=="") return str.length;
  var rtn = 0;
  for( var i=n; i>0; i-- ) {
    rtn = str.indexOf(str2,rtn);
    if (rtn<0) return str.length;
    rtn+=str2.length;
  }
  return rtn;
}


Dave Benham
Last edited by dbenham on 06 Nov 2014 13:34, edited 1 time in total.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#10 Post by Squashman » 06 Nov 2014 11:39

Well, maybe that won't work for me then. I mostly work with very large files: hundreds of megabytes, and sometimes a gigabyte or two.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#11 Post by dbenham » 06 Nov 2014 12:40

Can you give a large file a try and report back? I'm curious.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#12 Post by Squashman » 06 Nov 2014 12:44

dbenham wrote: Can you give a large file a try and report back? I'm curious.

I will put it on my todo list. Been putting out fires every day this week.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#13 Post by dbenham » 06 Nov 2014 13:35

Updated prior post to version 3.2.

Version 3.1 had major argument parsing bugs.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#14 Post by dbenham » 06 Nov 2014 14:09

I successfully sorted a ~15 MB file in 64 seconds.

I attempted to sort a ~30 MB file and it failed with an "Out of memory" error.

So it looks like this utility is no good for large files. Sorry, Squashman.

Re: JSORT.BAT - Case sensitive sort with option for numeric

#15 Post by Squashman » 06 Nov 2014 14:24

dbenham wrote: I successfully sorted a ~15 MB file in 64 seconds.

I attempted to sort a ~30 MB file and it failed with an "Out of memory" error.

So it looks like this utility is no good for large files. Sorry, Squashman.


How much RAM do you have?

I know that when I only had 1 GB of RAM on my computer, I had issues reading large files with FOR /F because it also loads the whole file into memory first. I could usually process about a 600 MB file. Now I have 4 GB and can process larger files, but it is still pretty slow.
