Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 05 Jan 2015 16:38
by dbenham
OK - below is release 3.3, which supports negative token numbers: tokens are counted from right to left, with -1 being the last token. Note that a negative value only recognizes tokens that occur after a delimiter.

The solution to foxidrive's problem is now as simple as:

Code: Select all

jsort test.txt /d "(" /t -1
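
To illustrate how a negative token number resolves to a starting offset, here is a minimal sketch in plain JavaScript (not WSH JScript). It is a port of the FindToken function from the script, with two assumptions of mine: the name findToken, and n passed as a number rather than a string. The sample file name is hypothetical. Note that the batch portion converts a positive /T value to a zero-based count before calling the function, while negative values are passed through unchanged.

```javascript
// Returns the 0-based offset at which sorting begins, or str.length
// when the requested token cannot be found.
function findToken(str, delim, n) {
  if (n !== 0 && delim === "") return str.length;
  var rtn;
  if (n >= 0) {
    // Walk forward past n delimiters.
    rtn = 0;
    for (var i = n; i > 0; i--) {
      rtn = str.indexOf(delim, rtn);
      if (rtn < 0) return str.length;
      rtn += delim.length;
    }
  } else {
    // Walk backward: -1 finds the last delimiter, -2 the one before it, etc.
    rtn = str.length;
    for (var j = n; j < 0; j++) {
      rtn -= 1;
      rtn = str.lastIndexOf(delim, rtn);
      if (rtn < 0) return str.length;
    }
    rtn += delim.length;
  }
  return rtn;
}

// Hypothetical sample line with two "(" delimiters.
var line = "Some Movie (remake) (1999).avi";
console.log(line.substring(findToken(line, "(", -1))); // 1999).avi
console.log(line.substring(findToken(line, "(", -2))); // remake) (1999).avi
```

With n = -3 the function returns the line length, because the text before the first delimiter is never treated as a token when counting from the right.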


JSORT.BAT version 3.3

Code: Select all

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment

::************ Documentation ***********
::JSORT.BAT version 3.3
:::
:::JSORT [File] [/Option [Value]]...
:::
:::  Sort lines of text from stdin and write the result to stdout.
:::  JSORT uses an ascending, case sensitive text sort by default.
:::
:::    File - If the optional File argument is specified, then JSORT reads lines
:::           from the file instead of from stdin. If specified, the File must
:::           be the very first argument.
:::
:::  Options:
:::
:::    /C n - Number of sorted lines to print. Skipped lines are always printed
:::           and do not contribute to the count. Default is -1 (all lines).
:::
:::    /D String - Specifies the string used to delimit tokens. The delimiter
:::           string is always case sensitive. A quote literal " must be escaped
:::           as \q, and a backslash literal \ must be escaped as \\.
:::           The default value is an empty string, meaning treat the entire
:::           line as a single token.
:::
:::    /I   - Ignore case when sorting
:::
:::    /N   - Sort consecutive digits as numbers instead of text. The numbers
:::           may be embedded within alpha text. JSort supports numbers up to
:::           20 digits long.
:::
:::    /O File - Writes the output to File instead of stdout.
:::
:::    /P n - Begin sorting at character position n relative to the beginning
:::           of the selected token. Lines that do not extend that far are
:::           treated as equivalent values, and collate before all other lines.
:::           The default value is 1 (first character).
:::
:::    /R   - Sort the lines in Reverse (descending) order.
:::
:::    /S n - Number of lines to skip - default is 0.
:::           Skipped lines are not sorted (remain in place)
:::
:::    /T n - Specify the token at which to begin sorting. The sort is not
:::           restricted to the selected token, it just helps to identify where
:::           to begin sorting. The default value is 1 (first token). A value of
:::           -1 represents the last token, -2 the penultimate token, etc.
:::           A value of 0 is invalid. Note that negative values only recognize
:::           tokens that occur after a delimiter. If the requested token cannot
:::           be found, then the line will collate before all other lines.
:::
:::    /V   - Display the version of JSORT.BAT.
:::
:::    /?   - Display this help
:::
:::JSORT.BAT was written by Dave Benham and originally posted at
:::http://www.dostips.com/forum/viewtopic.php?f=3&t=5595
:::

::************ Batch portion ***********
@echo off
setlocal disableDelayedExpansion

:: Get optional input file
set "infile="
set "test=%~1"
setlocal enableDelayedExpansion
if defined test if "!test:~0,1!" neq "/" (
  endlocal
  set ^"infile=^<"%~1""
  shift /1
) else endlocal

:: Define options
set "options= /?: /i: /c:-1 /n: /p:1 /r: /s:0 /v: /d:"" /t:1 /o:"" "

:: Set default option values
for %%O in (%options%) do for /f "tokens=1,* delims=:" %%A in ("%%O") do set "%%A=%%~B"

:: Get options
:loop
if not "%~1"=="" (
  setlocal enableDelayedExpansion
  set "test=!options:* %~1:=! "
  if "!test!"=="!options! " (
      >&2 echo Error: Invalid option %~1
      exit /b 1
  ) else if "!test:~0,1!"==" " (
      endlocal
      set "%~1=1"
  ) else (
      endlocal
      set "%~1=%~2"
      shift /1
  )
  shift /1
  goto :loop
)

:: Display help
if defined /? (
  for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
  exit /b 0
)

:: Display version
if defined /v (
  for /f "delims=: tokens=*" %%A in ('findstr /bc:"::JSORT.BAT version" "%~f0"') do echo %%A
  exit /b 0
)

:: Transform and validate options
set /a "case=0%/i%, num=0%/n%, pos=%/p%-1, tok=%/t%, order=1-2*0%/r%, 1/!(0x80000000&pos), 1/tok" 2>nul || (
  >&2 echo Error: Invalid option value.
  exit /b 1
)
if %tok% gtr 0 set /a tok-=1
set "outfile="
if defined /o set ^"outfile=^>"%/o%""

:: Perform the sort
%infile% %outfile% cscript //E:JScript //nologo "%~f0" %case% %num% %pos% %order% %/s% %/c% %tok% "%/d%"

exit /b 0
************* JScript portion **********/
var array=new Array(),
    nocase =WScript.Arguments.Item(0),
    numeric=WScript.Arguments.Item(1),
    pos    =WScript.Arguments.Item(2),
    order  =WScript.Arguments.Item(3),
    skip   =WScript.Arguments.Item(4),
    count  =WScript.Arguments.Item(5),
    token  =WScript.Arguments.Item(6),
    delim  =WScript.Arguments.Item(7).replace(/\\(?!q|\\)/g,'').replace(/\\\\/g,'\\s').replace(/\\q/g,'"').replace(/\\s/g,'\\');
while (!WScript.StdIn.AtEndOfStream) {
  if (skip > 0) {
    WScript.Echo(WScript.StdIn.ReadLine());
    skip-=1
  } else {
    var expanded="", num="", raw=WScript.StdIn.ReadLine(), upper=((nocase==1)?raw.toUpperCase():raw);
    // pos arrives as a string argument; convert it so that + adds instead of concatenating
    for( var i=Number(pos)+FindToken(raw,delim,token); i<raw.length; i++ ) {
      var c=upper.substr(i,1);
      if (numeric==1 && c>="0" && c<="9") {
        num+=c;
      } else {
        if (num != "") {
          num="00000000000000000000" + num;
          expanded+=num.substr(num.length-20);
          num="";
        }
        expanded+=c;
      }
    }
    if (num != "") {
      num="00000000000000000000" + num;
      expanded+=num.substr(num.length-20);
    }
    var obj={expanded:expanded, raw:raw};
    array.push(obj);
  }
}
if (count<0) count=array.length;
if (count>array.length) count=array.length;
array.sort(function(a,b){return order*((a.expanded>b.expanded)-(a.expanded<b.expanded));});
for (var i=0; i<count; i++) WScript.Echo(array[i].raw);

function FindToken(str, str2, n) {
  if (n!=0 && str2=="") return str.length;
  if (n>=0) {
    var rtn = 0;
    for( var i=n; i>0; i-- ) {
      rtn = str.indexOf(str2,rtn);
      if (rtn<0) return str.length;
      rtn+=str2.length;
    }
  } else {
    var rtn = str.length;
    for (var i=n; i<0; i++ ) {
      rtn-=1;
      rtn = str.lastIndexOf(str2,rtn);
      if (rtn<0) return str.length;
    }
    rtn+=str2.length;
  }
  return rtn;
}
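
As a rough illustration of the /N handling in the listing above, the sketch below (plain JavaScript, not WSH JScript) builds the same kind of comparison key: every run of digits is left-padded with zeros to a fixed width, so an ordinary string comparison orders the numbers by value. The function names expandNumbers and sortNumeric are my own, and the file names are made up.

```javascript
// Left-pad every run of digits to `width` characters. Runs longer than
// `width` keep only their last `width` digits, as in the script.
function expandNumbers(text, width) {
  var zeros = new Array(width + 1).join("0");
  return text.replace(/\d+/g, function (digits) {
    return (zeros + digits).slice(-width);
  });
}

// Sort a copy of the lines by their expanded keys, returning the raw lines.
function sortNumeric(lines) {
  return lines
    .map(function (raw) { return { key: expandNumbers(raw, 20), raw: raw }; })
    .sort(function (a, b) { return (a.key > b.key) - (a.key < b.key); })
    .map(function (item) { return item.raw; });
}

console.log(sortNumeric(["file10.txt", "file9.txt", "file100.txt"]));
// [ 'file9.txt', 'file10.txt', 'file100.txt' ]
```

A plain text sort would place file10.txt and file100.txt before file9.txt; the padded keys avoid that without parsing the digits as numbers.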


Dave Benham

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 06 Jan 2015 04:08
by foxidrive
dbenham wrote:Just in case the source file already contains slashes.

ok, ta. I gather that slashes are treated as regexp characters there too, then.
I don't see the problem. The /T option does not specify that only one token is sorted. It simply is part of the equation that determines the beginning position that is used for sorting. Each line is sorted based on the year through the end of the line. The content before the year is simply ignored.

Oh, I wasn't aware of that last bit.
It's just what I need here so thanks.
dbenham wrote:release 3.3 that supports negative token numbers

The solution to foxidrive's problem is now as simple as:

Code: Select all

jsort test.txt /d "(" /t -1


Thanks for that - you're mucho helpful.

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 06 Jan 2015 15:18
by Squashman
So is there a way to sort on multiple tokens? Say, token 5 and then token 3? Or am I going to have to do it the old-fashioned way: move those two variables to the beginning of the line and then do a normal sort?

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 06 Jan 2015 15:27
by dbenham
No, I opted not to implement that feature. Coming up with useful syntax and rules became too messy for my taste.


Dave Benham

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 06 Jan 2015 15:38
by Squashman
I guess I will just do it the old-fashioned way then.

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 06 Jan 2015 21:59
by carlsomo
It would be nice to have a date option so that date fields in the format MM/DD/YYYY or m/d/y could be sorted by year, month, day. Just a suggestion. I currently have to write to a new file after converting the dates to YYYYMMDD, run the sort with the date in the first position, then convert the dates back again and put the date field back where it started. Kind of tedious coding.

carl

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 06 Jan 2015 23:39
by foxidrive
carl, FWIW I think that could be done with a single line using jrepl, also using sort, if the format is predictable.

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 08 Jan 2015 00:31
by carlsomo
That would be cool. Here is a sample CSV file to sort by date, most recent year on top:

Sample CSV File Title line
FRUIT,Dish,Date submitted,item
apple,taco,12/31/2000,bus ticket
banana,fruit pie,3/17/1958,table
orange,julius,1/5/1999,stool
pea pod,stew,04/9/2014,glider
key lime,pie,12/1/2000,ladder

Is there a one liner for this?

TIA, carl

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 08 Jan 2015 01:35
by foxidrive
carlsomo wrote:Is there a one liner for this?


I'm not quite sure what output you want from that sample data.

Edit: I shortened the code

Code: Select all

@echo off
type "file.csv" | jrepl ".*,(\d\d|\d)/(\d\d|\d)/(\d\d\d\d),.*" "$3+lpad($1,'00')+lpad($2,'00')+'|'+$0" /j | sort /r | jrepl "^.*?\|" ""

This uses a helper batch file called `Jrepl.bat` (by dbenham) - download from: https://www.dropbox.com/s/4otci4d4s8x5ni4/Jrepl.bat

Place `Jrepl.bat` in the same folder as the batch file or in a folder that is on the path.


Sample CSV File Title line
FRUIT,Dish,Date submitted,item
pea pod,stew,04/9/2014,glider
apple,taco,12/31/2000,bus ticket
key lime,pie,12/1/2000,ladder
orange,julius,1/5/1999,stool
banana,fruit pie,3/17/1958,table
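
The one-liner above follows a decorate, sort, undecorate pattern: prefix each line with a YYYYMMDD key, sort in reverse, then strip the key. Here is a minimal JavaScript sketch of the same idea. The function name sortByDateDescending and the helper pad2 are my own, and the comparison is a plain code-unit comparison, which may not match Windows SORT's case-insensitive collation for other data.

```javascript
function sortByDateDescending(lines) {
  // Same shape as the JREPL search: month, day, 4-digit year between commas.
  var datePattern = /^.*,(\d\d|\d)\/(\d\d|\d)\/(\d\d\d\d),.*$/;
  function pad2(s) { return ("00" + s).slice(-2); }

  // Decorate: prepend "YYYYMMDD|" to lines that contain a date.
  var decorated = lines.map(function (line) {
    var m = datePattern.exec(line);
    return m ? m[3] + pad2(m[1]) + pad2(m[2]) + "|" + line : line;
  });

  // Sort descending, like SORT /R.
  decorated.sort(function (a, b) { return (a < b) - (a > b); });

  // Undecorate: remove everything up to and including the first "|".
  return decorated.map(function (line) { return line.replace(/^.*?\|/, ""); });
}

var sample = [
  "Sample CSV File Title line",
  "FRUIT,Dish,Date submitted,item",
  "apple,taco,12/31/2000,bus ticket",
  "banana,fruit pie,3/17/1958,table",
  "orange,julius,1/5/1999,stool",
  "pea pod,stew,04/9/2014,glider",
  "key lime,pie,12/1/2000,ladder"
];
console.log(sortByDateDescending(sample).join("\n"));
```

The title and header lines carry no key, so they begin with letters and land ahead of the digit-prefixed lines in a descending sort. The sketch also assumes, like the one-liner, that no line contains a "|" of its own.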

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 24 Mar 2015 20:51
by dbenham
Here is version 4.1

I've added a /U option to write only Unique lines, after sorting. (Discard Duplicates)

I've also fixed the option parser so that it properly supports no options. Versions 3.2, 3.3, and 4.0 had a bug that raised an error if no options were given.

Code: Select all

@if (@X)==(@Y) @end /* Harmless hybrid line that begins a JScript comment
@goto :Batch
 
::************ Documentation ***********
::JSORT.BAT version 4.1
:::
:::JSORT [File] [/Option [Value]]...
:::
:::  Sort lines of text from stdin and write the result to stdout.
:::  JSORT uses an ascending, case sensitive text sort by default.
:::
:::    File - If the optional File argument is specified, then JSORT reads lines
:::           from the file instead of from stdin. If specified, the File must
:::           be the very first argument.
:::
:::  Options:
:::
:::    /C n - Number of sorted lines to print. Skipped lines are always printed
:::           and do not contribute to the count. Default is -1 (all lines).
:::
:::    /D String - Specifies the string used to delimit tokens. The delimiter
:::           string is always case sensitive. A quote literal " must be escaped
:::           as \q, and a backslash literal \ must be escaped as \\.
:::           The default value is an empty string, meaning treat the entire
:::           line as a single token.
:::
:::    /I   - Ignore case when sorting (or when checking for uniqueness)
:::
:::    /N   - Sort consecutive digits as numbers instead of text. The numbers
:::           may be embedded within alpha text. JSort supports numbers up to
:::           20 digits long.
:::
:::    /O File - Writes the output to File instead of stdout.
:::
:::    /P n - Begin sorting at character position n relative to the beginning
:::           of the selected token. Lines that do not extend that far are
:::           treated as equivalent values, and collate before all other lines.
:::           The default value is 1 (first character).
:::
:::    /R   - Sort the lines in Reverse (descending) order.
:::
:::    /S n - Number of lines to skip - default is 0.
:::           Skipped lines are not sorted (remain in place)
:::
:::    /T n - Specify the token at which to begin sorting. The sort is not
:::           restricted to the selected token, it just helps to identify where
:::           to begin sorting. The default value is 1 (first token). A value of
:::           -1 represents the last token, -2 the penultimate token, etc.
:::           A value of 0 is invalid. Note that negative values only recognize
:::           tokens that occur after a delimiter. If the requested token cannot
:::           be found, then the line will collate before all other lines.
:::
:::    /U   - Only write unique lines (discard duplicates)
:::
:::    /V   - Display the version of JSORT.BAT.
:::
:::    /?   - Display this help
:::
:::JSORT.BAT was written by Dave Benham and originally posted at
:::http://www.dostips.com/forum/viewtopic.php?f=3&t=5595
:::

============== :Batch portion =============
@echo off
setlocal disableDelayedExpansion

:: Get optional input file
set "infile="
set "test=%~1"
if defined test (
  setlocal enableDelayedExpansion
  if "!test:~0,1!" neq "/" (
    endlocal
    set ^"infile=^<"%~1""
    shift /1
  ) else endlocal
)

:: Define options
set "options= /?: /c:-1 /d:"" /i: /n: /o:"" /p:1 /r: /s:0 /t:1 /u: /v: "

:: Set default option values
for %%O in (%options%) do for /f "tokens=1,* delims=:" %%A in ("%%O") do set "%%A=%%~B"

:: Get options
:loop
if not "%~1"=="" (
  setlocal enableDelayedExpansion
  set "test=!options:* %~1:=! "
  if "!test!"=="!options! " (
      >&2 echo Error: Invalid option %~1
      exit /b 1
  ) else if "!test:~0,1!"==" " (
      endlocal
      set "%~1=1"
  ) else (
      endlocal
      set "%~1=%~2"
      shift /1
  )
  shift /1
  goto :loop
)

:: Display help
if defined /? (
  for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A
  exit /b 0
)

:: Display version
if defined /v (
  for /f "delims=: tokens=*" %%A in ('findstr /bc:"::JSORT.BAT version" "%~f0"') do echo %%A
  exit /b 0
)

:: Transform and validate options
set /a "case=0%/i%, num=0%/n%, pos=%/p%-1, tok=%/t%, unique=0%/u%, order=1-2*0%/r%, 1/!(0x80000000&pos), 1/tok" 2>nul || (
  >&2 echo Error: Invalid option value.
  exit /b 1
)
if %tok% gtr 0 set /a tok-=1
set "outfile="
if defined /o set ^"outfile=^>"%/o%""

:: Perform the sort
%infile% %outfile% cscript //E:JScript //nologo "%~f0" %case% %num% %pos% %order% %/s% %/c% %tok% "%/d%" %unique%

exit /b 0

************* JScript portion **********/
var array=new Array(),
    nocase =WScript.Arguments.Item(0),
    numeric=WScript.Arguments.Item(1),
    pos    =WScript.Arguments.Item(2),
    order  =WScript.Arguments.Item(3),
    skip   =WScript.Arguments.Item(4),
    count  =WScript.Arguments.Item(5),
    token  =WScript.Arguments.Item(6),
    delim  =WScript.Arguments.Item(7).replace(/\\(?!q|\\)/g,'').replace(/\\\\/g,'\\s').replace(/\\q/g,'"').replace(/\\s/g,'\\'),
    unique =WScript.Arguments.Item(8);
while (!WScript.StdIn.AtEndOfStream) {
  if (skip > 0) {
    WScript.Echo(WScript.StdIn.ReadLine());
    skip-=1
  } else {
    var expanded="", num="", raw=WScript.StdIn.ReadLine(), upper=((nocase==1)?raw.toUpperCase():raw);
    // pos arrives as a string argument; convert it so that + adds instead of concatenating
    for( var i=Number(pos)+FindToken(raw,delim,token); i<raw.length; i++ ) {
      var c=upper.substr(i,1);
      if (numeric==1 && c>="0" && c<="9") {
        num+=c;
      } else {
        if (num != "") {
          num="00000000000000000000" + num;
          expanded+=num.substr(num.length-20);
          num="";
        }
        expanded+=c;
      }
    }
    if (num != "") {
      num="00000000000000000000" + num;
      expanded+=num.substr(num.length-20);
    }
    var obj={expanded:expanded, raw:raw};
    array.push(obj);
  }
}
if (count<0) count=array.length;
if (count>array.length) count=array.length;
if (unique==1) {
  array.sort( function(a,b){
                var rtn = (a.expanded>b.expanded)-(a.expanded<b.expanded);
                if (rtn==0) {
                  var a2=(nocase==1)?a.raw.toUpperCase():a.raw;
                  var b2=(nocase==1)?b.raw.toUpperCase():b.raw;
                  rtn = (a2>b2)-(a2<b2);
                }
                return order*rtn;
              });
  if (count>=1) WScript.Echo(array[0].raw);
  for (var i=1; i<count; i++) {
    if (nocase==1 && array[i].raw.toUpperCase() == array[i-1].raw.toUpperCase()) continue;
    if (nocase==0 && array[i].raw == array[i-1].raw) continue;
    WScript.Echo(array[i].raw);
  }
} else {
  array.sort(function(a,b){return order*((a.expanded>b.expanded)-(a.expanded<b.expanded));});
  for (var i=0; i<count; i++) WScript.Echo(array[i].raw);
}

function FindToken(str, str2, n) {
  if (n!=0 && str2=="") return str.length;
  if (n>=0) {
    var rtn = 0;
    for( var i=n; i>0; i-- ) {
      rtn = str.indexOf(str2,rtn);
      if (rtn<0) return str.length;
      rtn+=str2.length;
    }
  } else {
    var rtn = str.length;
    for (var i=n; i<0; i++ ) {
      rtn-=1;
      rtn = str.lastIndexOf(str2,rtn);
      if (rtn<0) return str.length;
    }
    rtn+=str2.length;
  }
  return rtn;
}
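
For readers who want the gist of the /U logic without the option handling, here is a simplified JavaScript sketch. In the listing above, lines are sorted on the expanded key, the full line is used as a tie-breaker so that identical lines end up adjacent, and a line equal to its predecessor is then skipped. The sketch below assumes the whole line is the sort key (no /P, /T or /N), and the name sortUnique is hypothetical.

```javascript
function sortUnique(lines, ignoreCase) {
  // Comparison key: the whole line, upper-cased when case is ignored.
  function key(s) { return ignoreCase ? s.toUpperCase() : s; }

  var sorted = lines.slice().sort(function (a, b) {
    var ka = key(a), kb = key(b);
    return (ka > kb) - (ka < kb);
  });

  // After sorting, duplicates are adjacent, so compare each line
  // with the one before it.
  var out = [];
  for (var i = 0; i < sorted.length; i++) {
    if (i > 0 && key(sorted[i]) === key(sorted[i - 1])) continue;
    out.push(sorted[i]);
  }
  return out;
}

console.log(sortUnique(["b", "a", "B", "a"], false)); // [ 'B', 'a', 'b' ]
```

With ignoreCase set, "b" and "B" count as duplicates and only one of them is kept.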


Dave Benham

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 01 May 2015 01:29
by foxidrive
Dave, what would be the best way to use the leading hash on each line to delete all but one line from each set of lines that share the same hash?

Can JSORT remove the last (or first) line in a matching set, where the match only considers field 1?

If not, would a syntax like this be useful:
jsort file.txt /HB or
jsort file.txt /HE
to filter out the Beginning line or End line in each set of lines with the same leading hash?

For example, filter out (or maybe just include) the lines in blue which would be the End lines, after sorting the file.

007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_14.28.11\reminderfox.ics.bak2
007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_15.11.09\reminderfox.ics.bak2
007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_16.13.10\reminderfox.ics.bak3
007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_16.15.12\reminderfox.ics.bak3
007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_16.27.24\reminderfox.ics.bak3
007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_16.29.17\reminderfox.ics.bak3
007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_16.29.24\reminderfox.ics.bak3
007B9E14 Reminderfox backups\Firefox profile-4938hge8.default-2015-03-28_16.59.08\reminderfox.ics.bak3
050EB09F Reminderfox backups\Firefox profile-4938hge8.default-2015-02-27_20.08.33\reminderfox.ics
050EB09F Reminderfox backups\Firefox profile-4938hge8.default-2015-02-27_21.48.20\reminderfox.ics
050EB09F Reminderfox backups\Firefox profile-4938hge8.default-2015-02-27_21.48.20\reminderfox.ics.bak1
069D347B Reminderfox backups\Firefox profile-4938hge8.default-2015-02-12_01.01.25\calDAVmap.css
07B5268C Reminderfox backups\reminderfox.ics.2014-07-12_03.09

07C51335 Reminderfox backups\reminderfox.ics.2014-03-31_02.08
07C51335 Reminderfox backups\reminderfox.ics.2014-03-31_15.07
07C51335 Reminderfox backups\reminderfox.ics.2014-03-31_15.14
07C51335 Reminderfox backups\reminderfox.ics.2014-03-31_15.15
0862C70B Reminderfox backups\reminderfox.ics.2014-12-13_03.54
0862C70B Reminderfox backups\reminderfox.ics.2014-12-13_04.19
0A077988 Reminderfox backups\reminderfox.ics.2014-02-06_15.50
0A077988 Reminderfox backups\reminderfox.ics.2014-02-06_17.28
0AC5D19A Reminderfox backups\reminderfox.ics.2015-01-11_03.56
0AC5D19A Reminderfox backups\Thunderbird profile-6978he18.default-2015-01-14_14.42.13\reminderfox.ics.bak2
0AC5D19A Reminderfox backups\Thunderbird profile-6978he18.default-2015-01-14_14.46.05\reminderfox.ics.bak2
0AC5D19A Reminderfox backups\Thunderbird profile-6978he18.default-2015-01-15_23.31.58\reminderfox.ics.bak3
0AFDCCA6 Reminderfox backups\reminderfox.ics.2014-08-10_03.29
0B58297D Reminderfox backups\reminderfox.ics.2014-08-19_00.12
0BF8B098 Reminderfox backups\reminderfox.ics.2014-02-25_11.08


======================================================

I have a script that does this task with files, and recursively, but it might be a useful thing and I'm not clever enough to figure out any jscript to do it.

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 01 May 2015 05:13
by dbenham
I wouldn't mess with JSORT at all. I would use SORT piped to JREPL

Code: Select all

sort yourfile.log | jrepl "^(.{8} ).*\n(\1.*\n)*(?=\1)" "" /m
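
To see what that regex does, here is a small JavaScript sketch that applies the same pattern to already-sorted text. It assumes the replacement is global, that ^ matches at the start of every line, and that lines end in a bare \n; the function name and the sample descriptions are made up, only the hash values are borrowed from the listing.

```javascript
// For each run of lines sharing the same 8-character hash plus space,
// delete every line except the last one in the run.
function keepLastOfEachGroup(sortedText) {
  // ^(.{8} ).*\n   the first line of a run, capturing the hash prefix
  // (\1.*\n)*      any further lines with the same prefix
  // (?=\1)         but stop short of the final line with that prefix
  return sortedText.replace(/^(.{8} ).*\n(\1.*\n)*(?=\1)/gm, "");
}

var text =
  "007B9E14 first\n" +
  "007B9E14 second\n" +
  "007B9E14 third\n" +
  "050EB09F only\n" +
  "07C51335 one\n" +
  "07C51335 two\n";

console.log(keepLastOfEachGroup(text));
```

Lines whose hash appears only once never satisfy the look-ahead, so they pass through untouched; the result keeps "third", "only" and "two".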


Dave Benham

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 01 May 2015 05:23
by foxidrive
dbenham wrote:I wouldn't mess with JSORT at all. I would use SORT piped to JREPL

Code: Select all

sort yourfile.log | jrepl "^(.{8} ).*\n(\1.*\n)*(?=\1)" "" /m



My brain just exploded.

Two questions though: will SORT work effectively when case has to be taken into consideration? And, you may have read one of my edits, but I'd like to filter out the matches and return the rest.

On re-reading, I wasn't clear that filtering out was my preference, and that the inclusive form might be useful as a secondary option if it were added to JSORT. That was my thinking.

Thanks for your code.

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 01 May 2015 07:21
by dbenham
SORT is not case sensitive. If you need case sensitive, then you can switch to JSORT.

As for the rest of your questions, I'm having trouble understanding the collective meaning of your words (in both your posts). For example, I'm not sure what you mean by "filtering out" - are the filtered out lines discarded, or preserved? And what do you mean by "inclusive"?

The way I derived my code was to look for a way to preserve the blue lines in your first post.

The only feature I am thinking of adding to JSORT is the ability to specify a regex search/replace pair. The sort would be performed on the replace result, but the original lines would be preserved (except for duplicate removal if /U specified, though definition of duplicate is a bit hazy yet). But I haven't decided to leap and develop this feature yet. It would require merging a lot of code from JSORT and JREPL. Perhaps it would be an entirely new utility - REGXSORT.

A utility like REGXSORT would easily solve your problem in one step.
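
Purely as an illustration of the idea, and not of any existing utility, a sort driven by a regex search/replace pair could look like the JavaScript sketch below. The name regexSort, its signature, and the sample data are all hypothetical. The sort key is the result of the replacement, while the original lines are what get returned.

```javascript
// Sort lines by a key derived with String.prototype.replace,
// returning the original lines untouched.
function regexSort(lines, search, replacement) {
  var keyed = lines.map(function (line, index) {
    return { key: line.replace(search, replacement), raw: line, index: index };
  });
  keyed.sort(function (a, b) {
    // Compare keys; fall back to input order so equal keys stay stable.
    return (a.key > b.key) - (a.key < b.key) || a.index - b.index;
  });
  return keyed.map(function (item) { return item.raw; });
}

// Sort three-field CSV lines on field 3, then field 1.
var rows = ["b,x,2", "a,y,2", "c,z,1"];
console.log(regexSort(rows, /^([^,]*),([^,]*),([^,]*)$/, "$3,$1"));
// [ 'c,z,1', 'a,y,2', 'b,x,2' ]
```

The same shape would cover sorting on multiple tokens or on a reordered date, since the replacement string can rearrange captured groups into whatever key order is needed.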


Dave Benham

Re: JSORT.BAT - Case sensitive sort with option for numeric

Posted: 01 May 2015 16:39
by foxidrive
dbenham wrote:I'm having trouble understanding the collective meaning of your words (in both your posts). For example, I'm not sure what you mean by "filtering out" - are the filtered out lines discarded, or preserved? And what do you mean by "inclusive"?


Yeah, sometimes I write like a retarded monkey with dyslexia.

Filtering out - I meant discarding those lines.
Inclusive - I meant printing only those lines (which is what your code does well).

dbenham wrote:It would require merging a lot of code from JSORT and JREPL. Perhaps it would be an entirely new utility - REGXSORT.

A utility like REGXSORT would easily solve your problem in one step.


If you find the time and enthusiasm then it sounds like another great tool for the batch community. :thumbsup: