Multi-process Advanced Encryption Standard (AES)

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Message
Author
Aacini
Expert
Posts: 1670
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: Multi-process Advanced Encryption Standard (AES)

#31 Post by Aacini » 11 Nov 2013 16:51

Are you aware of the small fix I did in my code? In the first appearance of AddRoundKey routine I originally posted the XOR operation with two carets, that produce the "Missing operand" error. Here it is:

Code: Select all

rem AES:    AddRoundKey(state, w[0, Nb-1]^)
   for /L %%c in (0,1,%NbM1%) do (
      for %%b in (0 1 2 3) do (
         set /A "state[%%b][%%c]^=w%%b[%%c]"      <- This line had two carets: "...[%%c]^^=w%%b..."
         %TRACE% set sch[%%b][%%c]=!w%%b[%%c]!
      )
   )
   %TRACE% call :ShowMatrix "round[0].k_sch " sch >&2


I think that fixing this error will also correct the wrong result...

Antonio

Sponge Belly
Posts: 200
Joined: 01 Oct 2012 13:32
Location: Ireland
Contact:

Re: Multi-process Advanced Encryption Standard (AES)

#32 Post by Sponge Belly » 12 Nov 2013 08:56

Hi Magialisk! :-)

I haven’t been following this topic too closely because I know nothing of encryption, but I read your plea for help on how to stop a for /l loop in its tracks. It is possible, but at the expense of starting a subshell. Here's a trivial example:

Code: Select all

@echo off & setlocal enableextensions

for /f %%a in ('set stop^=^&^
for /l %%i in (1 1 10^) do @if not defined stop (^
if %%i equ 5 set stop^=1^) else (echo(%%i ^& exit 0^)
') do echo(%%a

endlocal & goto :eof


Note that stop is defined when %%i reaches 5, but there is one more iteration of the for /l loop to execute the else clause of the if statement and 6 is the final output.

The above snippet is suitable for alphanumeric output, but if you want special characters, the code becomes much more complicated and looks like what I call “caret soup.” See reverse a string without goto for an example. :twisted:

As for the null character, why not create a file consisting of a single NUL and append it to the output file using type as and when needed? Or am I missing something? :?:

And again, hex to binary shouldn’t be a problem. Simply use certutil, or try Dave Benham’s hex2str and str2hex subroutines for maximum compatibility… or am I missing something?

Anyways, hope this helps! ;-)

- SB

Aacini
Expert
Posts: 1670
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: Multi-process Advanced Encryption Standard (AES)

#33 Post by Aacini » 12 Nov 2013 13:58

I read again the AES documentation on "4.2.1 Multiplication by x". Although there is an explanation with several figures, I just did not understood it the first time; I thought that this result was obtained by xor'ing the result of the multiplication with 0x011B, but this give wrong results. After multiple tests I finally realized how this result is obtained! In my opinion, the documentation should include a description appropriate for non-matemathic oriented people, but computer people; something similar to this one:

AES documentation should wrote:This operation can be implemented at the byte level by successive shift left the value of b(x) and xor'ing it by 11B if b8 = 1, and then xor'ing this intermediate result with the final result if the corresponding bit of x is 1. This way multiplication by any constant can be implemented, but you must note that the encrypt procedure just use the constants 2 and 3 so only one intermediate result is required, and the decrypt procedure use the constants 9, 0b, 0d and 0e so it only needs three intermediate results.

I then wrote a Batch program to generate the multiplying tables using this method. Here it is:

Code: Select all

rem Definition of multiplication tables over the Rijndael Galois finite field
rem multiplied by 2, 3, 9, B (11), D (13) and E (14)

rem        Factor:                  2         3       9          B          D          E
for /L %%x in (0,1,255) do (
   rem     bit 0 [1]:               0         1       1          1          1          0
   set /A "x=%%x,                            G3[%%x]=G9[%%x]=   GB[%%x]=   GD[%%x]=x"
   rem     bit 1 [2]:               1         1       0          1          0          1
   set /A "x<<=1, x^=(x>>8)*0x11B, G2[%%x]=x,G3[%%x]^=x,        GB[%%x]^=x,           GE[%%x]=x"
   rem     bit 2 [4]:               0         0       0          0          1          1
   set /A "x<<=1, x^=(x>>8)*0x11B,                                         GD[%%x]^=x,GE[%%x]^=x"
   rem     bit 3 [8]:               0         0       1          1          1          1
   set /A "x<<=1, x^=(x>>8)*0x11B,                   G9[%%x]^=x,GB[%%x]^=x,GD[%%x]^=x,GE[%%x]^=x"
)

However, when I compared these results with Magialisk's original constants there were many differences. :shock: For example, these are the differences in G3 table:

Code: Select all

Magialisk: 03 x 03 = 09,   Aacini: 03 x 03 = 05
Magialisk: 03 x 06 = 12,   Aacini: 03 x 06 = 0a
Magialisk: 03 x 07 = 15,   Aacini: 03 x 07 = 09
Magialisk: 03 x 0b = 21,   Aacini: 03 x 0b = 1d
Magialisk: 03 x 0c = 24,   Aacini: 03 x 0c = 14
Magialisk: 03 x 0d = 27,   Aacini: 03 x 0d = 17
Magialisk: 03 x 0e = 2a,   Aacini: 03 x 0e = 12
Magialisk: 03 x 0f = 2d,   Aacini: 03 x 0f = 11
Magialisk: 03 x 13 = 39,   Aacini: 03 x 13 = 35
Magialisk: 03 x 16 = 42,   Aacini: 03 x 16 = 3a
Magialisk: 03 x 17 = 45,   Aacini: 03 x 17 = 39
Magialisk: 03 x 18 = 48,   Aacini: 03 x 18 = 28
Magialisk: 03 x 19 = 4b,   Aacini: 03 x 19 = 2b
Magialisk: 03 x 1a = 4e,   Aacini: 03 x 1a = 2e
Magialisk: 03 x 1b = 51,   Aacini: 03 x 1b = 2d
Magialisk: 03 x 1c = 54,   Aacini: 03 x 1c = 24
Magialisk: 03 x 1d = 57,   Aacini: 03 x 1d = 27
Magialisk: 03 x 1e = 5a,   Aacini: 03 x 1e = 22
Magialisk: 03 x 1f = 5d,   Aacini: 03 x 1f = 21
Magialisk: 03 x 23 = 69,   Aacini: 03 x 23 = 65
Magialisk: 03 x 26 = 72,   Aacini: 03 x 26 = 6a
Magialisk: 03 x 27 = 75,   Aacini: 03 x 27 = 69
Magialisk: 03 x 2b = 81,   Aacini: 03 x 2b = 7d
Magialisk: 03 x 2c = 84,   Aacini: 03 x 2c = 74
Magialisk: 03 x 2d = 87,   Aacini: 03 x 2d = 77
Magialisk: 03 x 2e = 8a,   Aacini: 03 x 2e = 72
Magialisk: 03 x 2f = 8d,   Aacini: 03 x 2f = 71
Magialisk: 03 x 30 = 90,   Aacini: 03 x 30 = 50
Magialisk: 03 x 31 = 93,   Aacini: 03 x 31 = 53
Magialisk: 03 x 32 = 96,   Aacini: 03 x 32 = 56
Magialisk: 03 x 33 = 99,   Aacini: 03 x 33 = 55
Magialisk: 03 x 34 = 9c,   Aacini: 03 x 34 = 5c
Magialisk: 03 x 35 = 9f,   Aacini: 03 x 35 = 5f
Magialisk: 03 x 36 = a2,   Aacini: 03 x 36 = 5a
Magialisk: 03 x 37 = a5,   Aacini: 03 x 37 = 59
Magialisk: 03 x 38 = a8,   Aacini: 03 x 38 = 48
Magialisk: 03 x 39 = ab,   Aacini: 03 x 39 = 4b
Magialisk: 03 x 3a = ae,   Aacini: 03 x 3a = 4e
Magialisk: 03 x 3b = b1,   Aacini: 03 x 3b = 4d
Magialisk: 03 x 3c = b4,   Aacini: 03 x 3c = 44
Magialisk: 03 x 3d = b7,   Aacini: 03 x 3d = 47
Magialisk: 03 x 3e = ba,   Aacini: 03 x 3e = 42
Magialisk: 03 x 3f = bd,   Aacini: 03 x 3f = 41
Magialisk: 03 x 43 = c9,   Aacini: 03 x 43 = c5
Magialisk: 03 x 46 = d2,   Aacini: 03 x 46 = ca
Magialisk: 03 x 47 = d5,   Aacini: 03 x 47 = c9
Magialisk: 03 x 4b = e1,   Aacini: 03 x 4b = dd
Magialisk: 03 x 4c = e4,   Aacini: 03 x 4c = d4
Magialisk: 03 x 4d = e7,   Aacini: 03 x 4d = d7
Magialisk: 03 x 4e = ea,   Aacini: 03 x 4e = d2
Magialisk: 03 x 4f = ed,   Aacini: 03 x 4f = d1
Magialisk: 03 x 53 = f9,   Aacini: 03 x 53 = f5

I checked the 03 x 13 case (Magialisk = 39, Aacini = 35) following the same procedure of AES documentation:

Code: Select all

{03} • {02} = xtime({03}) = {06} 
{03} • {04} = xtime({06}) = {0c}
{03} • {08} = xtime({0c}) = {18}
{03} • {10} = xtime({18}) = {30},
thus,
{03} • {13} = {03} • ({01} o {02} o {10})
            = {03} o {06} o {30}
            = {35}


:?: :?: :?:

Antonio

Magialisk
Posts: 104
Joined: 25 Jul 2013 19:00

Re: Multi-process Advanced Encryption Standard (AES)

#34 Post by Magialisk » 13 Nov 2013 02:42

Aacini where are you pulling my constants from? I just checked my code and they seem to agree with your results. For example here is the first part of my G3 matrix:

Code: Select all

set /a G3[0]=0x00, G3[1]=0x03, G3[2]=0x06, G3[3]=0x05, G3[4]=0x0C, G3[5]=0x0F, G3[6]=0x0A, G3[7]=0x09, G3[8]=0x18, G3[9]=0x1B, G3[10]=0x1E, G3[11]=0x1D, G3[12]=0x14, G3[13]=0x17, G3[14]=0x12, G3[15]=0x11, G3[16]=0x30, G3[17]=0x33, G3[18]=0x36, G3[19]=0x35, G3[20]=0x3C, G3[21]=0x3F, G3[22]=0x3A, G3[23]=0x39, G3[24]=0x28, G3[25]=0x2B, G3[26]=0x2E, G3[27]=0x2D, G3[28]=0x24, G3[29]=0x27, G3[30]=0x22, G3[31]=0x21, 

So 3x3 is 5, not 9, 3x6 is A, not 12, 3x7 is 9, not 15. It sure looks like my constants and your constants are one and the same, so where did the "Magialisk" column in your post come from?

Aacini
Expert
Posts: 1670
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: Multi-process Advanced Encryption Standard (AES)

#35 Post by Aacini » 13 Nov 2013 10:59

Oops! MY MISTAKE! :? My Mistake :( my mistake :cry: I really sorry... :oops:

I didn't converted the first 1/3 part of your G3 table in my code; the whole table is in the Batch file below. I also include a timing comparison of the generation of G2 and G3 tables via my formula vs. conversion of predefined constants. Although in this case the timing is similar, note that the first 128 constants of G2 table are not converted, but calculated. I am sure that the generation of 9, B (11), D (13) and E (14) tables will be faster via formula, because in this case the four tables are created at the same time in just one FOR loop.

Code: Select all

@echo off
setlocal EnableDelayedExpansion

rem Define a decimal-to-hexadecimal conversion table for one byte
set i=0
for %%a in (0 1 2 3 4 5 6 7 8 9 a b c d e f) do (
   for %%b in (0 1 2 3 4 5 6 7 8 9 a b c d e f) do (
      set "H[!i!]=%%a%%b"
      set /A i+=1
   )
)

echo %time% - Start definition of G2 and G3 tables via formula

rem Definition of multiplication tables over the Rijndael Galois finite field
rem multiplied by 2 and 3

rem        Factor:                  2         3
for /L %%x in (0,1,255) do (
   rem     bit 0 [1]:               0         1
   set /A "x=%%x,                            G3[%%x]=x"
   rem     bit 1 [2]:               1         1
   set /A "x<<=1, x^=(x>>8)*0x11B, G2[%%x]=x,G3[%%x]^=x"
)

echo %time% - Start definition of G2 table via converting pre-defined constants

rem Multiplication by 2:
for /L %%i in (0,1,127) do set /A G2%%i=2*%%i
set i=128
for %%s in ("1B191F1D131117150B090F0D03010705"
            "3B393F3D333137352B292F2D23212725"
            "5B595F5D535157554B494F4D43414745"
            "7B797F7D737177756B696F6D63616765"
            "9B999F9D939197958B898F8D83818785"
            "BBB9BFBDB3B1B7B5ABA9AFADA3A1A7A5"
            "DBD9DFDDD3D1D7D5CBC9CFCDC3C1C7C5"
            "FBF9FFFDF3F1F7F5EBE9EFEDE3E1E7E5"
           ) do (
   set "s=%%~s"
   for /L %%j in (0,2,30) do set /A "G2!i!=0x!s:~%%j,2!, i+=1"
)

echo %time% - Start definition of G3 table via converting pre-defined constants

rem Multiplication by 3:
set i=0
for %%s in ("000306050C0F0A09181B1E1D14171211"
            "303336353C3F3A39282B2E2D24272221"
            "606366656C6F6A69787B7E7D74777271"
            "505356555C5F5A59484B4E4D44474241"
            "C0C3C6C5CCCFCAC9D8DBDEDDD4D7D2D1"
            "F0F3F6F5FCFFFAF9E8EBEEEDE4E7E2E1"
            "A0A3A6A5ACAFAAA9B8BBBEBDB4B7B2B1"
            "909396959C9F9A99888B8E8D84878281"
            "9B989D9E97949192838085868F8C898A"
            "ABA8ADAEA7A4A1A2B3B0B5B6BFBCB9BA"
            "FBF8FDFEF7F4F1F2E3E0E5E6EFECE9EA"
            "CBC8CDCEC7C4C1C2D3D0D5D6DFDCD9DA"
            "5B585D5E57545152434045464F4C494A"
            "6B686D6E67646162737075767F7C797A"
            "3B383D3E37343132232025262F2C292A"
            "0B080D0E07040102131015161F1C191A"
           ) do (
   set "s=%%~s"
   for /L %%j in (0,2,30) do set /A "G3!i!=0x!s:~%%j,2!, i+=1"
)

echo %time%

rem Compare both tables and show differences

for %%a in (2 3) do (
   for /L %%x in (0,1,255) do (
      if "!G%%a%%x!" neq "!G%%a[%%x]!" (
         for /F "tokens=1-3" %%A in ("!H[%%x]! !G%%a%%x! !G%%a[%%x]!") do (
            echo Magialisk: 0%%a x %%A = !H[%%B]!,   Aacini: 0%%a x %%A = !H[%%C]!
         )
      )
   )
)


Output:

Code: Select all

C:>GaloisMultiplication.bat
10:17:17.12 - Start definition of G2 and G3 tables via formula
10:17:17.69 - Start definition of G2 table via converting pre-defined constants
10:17:17.91 - Start definition of G3 table via converting pre-defined constants
10:17:18.26


Antonio

Magialisk
Posts: 104
Joined: 25 Jul 2013 19:00

Re: Multi-process Advanced Encryption Standard (AES)

#36 Post by Magialisk » 13 Nov 2013 16:04

Hey there's absolutely nothing to be sorry about, I'm just glad you figured it out :D You had me wondering if somehow I'd posted bad values in an earlier version of code, but I couldn't figure out how that would be possible...

It looks like you're really coming along well on your rewrite of the algorithm! I've been busy the last couple days with a minor crisis at work, and I have a school project due at the end of the week, so I've had to leave my AES code alone for now. I can't wait to see how yours turns out and jump back into mine. I really want to overhaul my PID handler again because the latest time tests against you were an embarrassment... :oops:

The "conversion" technique you're using to parse a long string of constants with a FOR loop to create a table was actually suggested to me by jeb back in the original development thread. Although his approach was to run the FOR in "real time" when a constant was needed (not pre-compute the table up-front). I ran some timing tests doing it his way vs. my way back then and it showed his was a bit slower by about 150ms, though the savings in code was huge, since there were no lookup tables at all.

Your "conversion" seems to be a hybrid, where you still create the table up-front for later lookups, but you do it via FOR instead of "set X=, Y=, Z=, ...". I would have guessed that doing a set inside a FOR would be slow, like my original "set X= & set Y= &..." approach, but I guess I was wrong about that too?
It usually doesn't make sense to me why technique A is slower or faster for the CMD parser than technique B, and frequently (as here) it's counterintuitive...

Either way your new generator algorithm looks to be faster than the conversion, and much faster than my crappy algorithm (posted earlier). Originally I had a hard time getting an algorithm to work based on only reading the FIPS doc, so I ended up implementing a modified "Peasant's algorithm" to generate my Galois constants. http://en.wikipedia.org/wiki/Multiplication_algorithm#Peasant_or_binary_multiplication. Obviously mine works, but it's not "purpose built" just for Galois like yours is because I didn't understand the standard well enough at the time. I understand the standard a bit better now and should have probably gone back to rewrite the algorithm like you did, but thank you for doing it! It's so exciting to see this level of interest and cooperation!

Magialisk
Posts: 104
Joined: 25 Jul 2013 19:00

Re: Multi-process Advanced Encryption Standard (AES)

#37 Post by Magialisk » 13 Nov 2013 16:12

Sponge Belly wrote:Hi Magialisk! :-)
I read your plea for help on how to stop a for /l loop in its tracks. It is possible, but at the expense of starting a subshell. Here's a trivial example:
I just realized that I never stopped to thank you for this, so THANK YOU! :D

I've been a bit tied up the last couple days so haven't had a chance to test it in my Galois algorithm, but even besides that this could be an invaluable technique for breaking FOR loops in general, so I'm excited to try it out. Now that Aacini has really dug in to the standards my old "Peasant's Algorithm" Galois generator is basically OBE, and I'll likely re-write a generator that more closely resembles the specific AES standard, like Aacini did. As I said though, breaking a FOR could certainly come in handy for other endeavors, so I very much appreciate the help.

Aacini
Expert
Posts: 1670
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: Multi-process Advanced Encryption Standard (AES)

#38 Post by Aacini » 19 Nov 2013 00:17

@Magialisk:

Yes, all operations are simpler if pre-computed constants are used and they allows some interesting tricks, like these ones:

Code: Select all

Core part of MixColumns():

            for %%m in ("0=G2 G3 + +" "1=+ G2 G3 +" "2=+ + G2 G3" "3=G3 + + G2") do (

Core part of InvMixColummns():

            for %%m in ("0=Ge Gb Gd G9" "1=G9 Ge Gb Gd" "2=Gd G9 Ge Gb" "3=Gb Gd G9 Ge") do (


Let's pass to the multi-process feature. The base idea of this method is that each parallel process be executed in a different CPU core, so there will be as many active processes as CPU cores. The controller process takes one CPU core, so there will be CPU-1 parallel encryption processes; lets name this number N. In the description below, assume N=3 and numRecords=9 in the file.

Note that the "controller process" is BinToHex subroutine itself so this is not a CPU core wasted just as controller, but used both as hexadecimal conversion and multi-process controller.

My first idea was to split the hexadecimal file in N parts in the BinToHex subroutine and pass each part to a different encrypt process as soon as it be ready. This method just needs a START command each time that numRecords/N records are converted as the "multi-process controller".

Code: Select all

file.ext -> BinToHex -> filePart#.hex -> Encrypt# -> filePart#.aes -> JoinParts -> file.aes

1111     ->    \
2222     ->     \
3333     ->   --->   -> filePart1.hex -> Encrypt1 -> filePart1.aes -> ren filePart1.aes file.aes
4444     ->    \
5555     ->     \
6666     ->   --->   -> filePart2.hex -> Encrypt2 -> filePart2.aes -> type filePart2.aes >> file.aes
7777     ->    \
8888     ->     \
9999     ->   --->   -> filePart3.hex -> Encrypt3 -> filePart3.aes -> type filePart3.aes >> file.aes


However, in this method the last available CPU core will be used until the last hexa part (that is, the whole file) be generated, and when the first encryption process ends its CPU core will be no longer used, so this method waste the available CPU cores a lot.

A more efficient method consist in pass each hexadecimal converted record to the next encryption process in turn, so all processes will be active as soon as N records be converted. That is:

Code: Select all

file.ext -> BinToHex -> filePart#.hex -> Encrypt# -> file.aes

1111     ->          >> filePart1.hex -> Encrypt1 -> ae11
2222     ->          >> filePart2.hex -> Encrypt2 -> ae22
3333     ->          >> filePart3.hex -> Encrypt3 -> ae33
4444     ->          >> filePart1.hex -> Encrypt1 -> ae44
5555     ->          >> filePart2.hex -> Encrypt2 -> ae55
6666     ->          >> filePart3.hex -> Encrypt3 -> ae66
7777     ->          >> filePart1.hex -> Encrypt1 -> ae77
8888     ->          >> filePart2.hex -> Encrypt2 -> ae88
9999     ->          >> filePart3.hex -> Encrypt3 -> ae99


The simplest way to coordinate this scheme is via a "pseudo-pipe" file: the BinToHex routine append records to the next file part in turn, and each encryption routine just read them via a redirected SET /P command. The key for this to work is that always be an available record in each "pipe" file when its encrypt process read it (that the pipe buffer never empties); this is achieved if the time required to convert N records in BinToHex subroutine is less or equal than the time each encryption subroutine requires to encrypt one record. If a given computer have too many CPU cores, so the first core requires more time to convert N records than the time each core requires to encrypt one record, then some CPU cores must be necessarily wasted: it is not possible to encrypt a file in less time than the time required to convert it! The maximum number of CPU cores that this method may use may be obtained via separate timing tests for convert and encrypt routines followed by timeEncrypt/timeConvert operation.

The Batch code below is an example of this method; it may be used as general format for any application that follows this scheme: process a file via a fast part followed by a slow one that may be divided/repeated in several parallel processes using "multi pipe files". In the example below the time that write part (convert) and read part (encrypt) takes to process one record is simulated via a ping command, so you may review how the behavior of the method changes as these times are adjusted. In this case the choosen ping values for write and read parts, 2 and 4 respectively, produce a perfect synchronization: each record is read (encrypted) as soon as it was wrote (converted), so the encryption process ends just instants after the conversion process ends. Of course, in a real application you can not adjust the time each routine takes, just change the number N of parallel processes.

EDIT: The following paragraph and code modification was made after Magialisk's comment about "this part can not be optional".

The generation of the final file is comprised by the output of each encrypted record redirected to the same output file. This may cause problems if two parallel processes tries to write to the output file at same time. This is solved via a "semaphore file" called "outputTurn.#" that is renamed to the next output turn in each parallel process.

Code: Select all

@echo off
setlocal EnableDelayedExpansion

rem Multi-process dispatcher:
set "param=%~1"
if "!param:~0,1!" equ ":" (
   shift
   goto !param:~1!
)

set N=3

echo %time% - Write side running > output.txt
rem Write side (fast):
del pipefile?.txt 2>NUL
echo X >outputTurn.0
for /L %%i in (0,1,14) do (
   echo !time! - Write: Record %%i >> output.txt
   set /A nextTurn=%%i %% N
   echo Record %%i >> pipefile!nextTurn!.txt
   ping -n 2 localhost >NUL
   rem Multi-process starter:
   if %%i lss %N% start "" /B "%~F0" :ReadSide !nextTurn!
)
echo %time% - Write records end >> output.txt
:waitForReadSides
if exist pipefile?.txt goto waitForReadSides
echo %time% - Write side end >> output.txt
del outputTurn.*
type output.txt
goto :EOF

rem Read side (slow):
:ReadSide turn
echo %time% - Read side turn %1 - started >> output.txt
call :nextRead %1 < pipefile%1.txt
echo %time% - Read side turn %1 - end >> output.txt
del pipeFile%1.txt
exit

:nextRead turn
   set "line="
   set /P line=
   if not defined line goto endRead
   if not exist outputTurn.%1 call :waitMyTurn %1
   echo !time! - Read turn %1: %line% >> output.txt
   set /A nextTurn=(%1+1) %% N
   ren outputTurn.%1 outputTurn.%nextTurn%
   ping -n 4 localhost >NUL
goto nextRead
:endRead
exit /B

:waitMyTurn
if not exist outputTurn.%1 goto waitMyTurn
exit /B


I included this method in the last edit of my AES FIPS-197 encryption program above (excepting the part about "outputTurn.#" file mentioned in the edit); however, I did not fully tested it because my computer have just 2 CPU cores.

Antonio
Last edited by Aacini on 24 Nov 2013 13:16, edited 1 time in total.

Magialisk
Posts: 104
Joined: 25 Jul 2013 19:00

Re: Multi-process Advanced Encryption Standard (AES)

#39 Post by Magialisk » 19 Nov 2013 15:52

Antonio your timing couldn't be better. Yesterday I had some free time to work on my program and I made number of tweaks to improve performance. I haven't gotten to upload the modified version yet, maybe in a day or two. My main focus yesterday was on minimizing the size of the CMD environment, which had a larger than expected impact on the execution time of each child. I also borrowed your algorithm for defining the Galois constants in my main control program. I never really liked having 1500 constant definitions up front, but my Galois algorithm was too slow to use in real time. I studied yours, compared it to mine, and in effect they were doing the same basic thing (as should be expected!) but yours was so much cleaner and more efficient I just had to integrate it with my code! I can't believe I was using 8 loop iterations to do what you managed in a few XORs... that's what I get for not taking the time to fully understand the FIPS standard before I started coding :!:

I say your timing couldn't be better because I believe I've tweaked the main processing code about as well as I'll be able to for now, so I've turned my attention back to the PID controller. I've completely rewritten it again to use CPU affinity masks instead of vague PID counts, and it works well, it's just not any faster than the original method was... No matter what I do to my PID handling logic, I keep coming out around 37s to encode a 10kB test file.

I've been coming to the realization that the only chance I'm going to have for a large speedup is to break up the input file and hand pieces to the children, I just didn't want to go there because it's a big rewrite to my code. I'd have to integrate the hex2bin function with the PID dispatcher, instead of doing it all in one lump up front, etc. I just assumed an entirely different input structure when I first wrote this, so it will be a pain to change. On the bright side, I'm already using a "mutex/semaphore" file, and I had been contemplating dual-purposing that same file as an input pipe. Seeing how you've gone in a very similar direction at least validates that concept, so maybe I'll give it a whirl the next time I'm bored...

By the way, I thought I'd point out that even though you're starting multiple children in sequence and they're all running the same code, you can never assume they're going to finish in order. Unless you wait for child 'n' to finish before starting child 'n+1' (which defeats the whole purpose) you're at the mercy of the OS when it will schedule CPU time for each one. It can and will interrupt child 2 when you start child 3, or 4, or 5, and there's no telling who will end up finishing first out of the mess of children. I did a lot of testing on both my quad and 12 thread machines and you absolutely have to have a mutex/semaphore or some other mechanism to keep the output in order. So the semaphore approach you proposed is viable, I just wanted to stress that it's generally not "optional" :D Also, in my mutex code I built in a suicide timer early on. It's especially handy when developing/testing, so if something goes wrong you don't have 100 orphaned children stuck in an infinite "IF NOT myturn GOTO" loop. The children timeout and commit suicide after a few seconds of being stuck. Definitely not a mandatory feature, but I got tired of cleaning up after my mistakes and bad code when developing the mutual exclusion code :lol:

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)
Contact:

Re: Multi-process Advanced Encryption Standard (AES)

#40 Post by Ed Dyreen » 19 Nov 2013 18:02

Aacini wrote:I read again the AES documentation on "4.2.1 Multiplication by x".
I don't know about AES but I do know about RSA. This algo works with public/private keys and of course prime numbers. When public keys are shared it is used for encrypting and when private keys are shared it is used for authentication. The lack of BigInteger support in batch I see as the main problem but the algo is simple safe and straightforward. All this AES table and row multiplication talk confused me so I had a look at the documentation http://www.thestudymaterial.com/present ... ption.html. What I read there is that it like RSA depends on at least one key which has to remain secret. This means your batch must know about it which immediately raises a question. If your script/s use a secret key to do the de-/encoding then how are you going to prevent others from searching for this secret key in your script and reverse engineer it ?

Magialisk
Posts: 104
Joined: 25 Jul 2013 19:00

Re: Multi-process Advanced Encryption Standard (AES)

#41 Post by Magialisk » 20 Nov 2013 02:26

The key is an input to the script, just as with any encryption program. As you say it can never be stored in the script itself, that would defeat the purpose. When calling the AES program you give it a filename to work on, and a key to use. If you give it the wrong key it still runs perfectly fine, but the data it produces will be random garbage.

As far as what key you choose to use and how you choose to protect it that's up to the user. As with all encryption, by definition, you've only shifted the responsibility from protecting the (large) secret data to protecting the (short) secret key. I don't expect anyone wants to memorize a 32-character random hexadecimal value (a49b 3fe6 ... ... dc17) but there are lots of other ways to obscure a secret key, or just write it down somewhere :D .

Here's an example someone might come up with: You choose an inconspicuous file in the directory with the script (among several other files), and you choose a favorite number (114 for sake of argument). As the key you use the 16, 24 or 32 bytes of that file starting at byte 114, or maybe start at byte 114^2, or...? The point is the value of the key doesn't matter, as long as you can remember how to retrieve it and an attacker can't guess what you're up to.

Granted, the above approach wouldn't be remotely up to government standards, but all along this project was just an educational proof of concept to teach myself the AES algorithm. Since nobody will ever use AES in batch for practical purposes, it doesn't much matter how they would protect their keys :D Good discussion point though, and it's just as relevant whether you're calling a C++ crypto library, logging in to your "full disk encrypted" Dell, or fiddling in a batch.

Magialisk
Posts: 104
Joined: 25 Jul 2013 19:00

Re: Multi-process Advanced Encryption Standard (AES)

#42 Post by Magialisk » 21 Nov 2013 01:22

Boy I wish I would have gone back a page and looked at Antonio's updated code when he posted it. It sure could have saved me a lot of time debugging my new pipes :oops: .

The good news is I've converted my code to use pipe files in a strikingly similar way to how you did. Since mine was already in two batch files I don't have to have the main batch call itself and shift, instead I tweaked my child 'aescore' batch to work in either the original input mode or piped mode, depending on how it is called. Aside from that, I put both a PID# and the data block into the pipe where you're writing only the data, and I'm using a mutex to guarantee proper result ordering, but otherwise the nature of our pipe code is pretty much identical.

That said my piped encryption side is tested and working now, and almost twice as fast as my previous versions with either affinity assignments or a PID manager. It was definitely worth the code re-organization :!: Unfortunately I'm not quite ready to post the new code as I'm chasing an extremely nasty bug in my decrypt side. I don't know how it's happening but the decrypt side occasionally works alright (maybe 1/10 times) but most of the time it gets to the end and prints a nonsense message that looks like "c:\<dir path>\aes\* Are you sure? [Y/N]". It doesn't even ask a yes/no question, but if you answer 'Y' it deletes every file in the directory with the AES script, including the script itself. Yeah, good thing I had backups :evil:

Anyway, I intend to chase down the bug tomorrow and should have fresh new code posted before the weekend. It has to be related to cleaning up temporary files at the end of the script, but I haven't figured out how a statement like "del %tempfile%" can turn into a directory-wiping *.* mask, or how the same code run 10 times can do one thing 1 time, and a different thing the other 9 times. Perhaps I have a leaky pipe :D

foxidrive
Expert
Posts: 6033
Joined: 10 Feb 2012 02:20

Re: Multi-process Advanced Encryption Standard (AES)

#43 Post by foxidrive » 21 Nov 2013 05:07

del %tempfile% can do odd things when the variable is changed from what you expect.

Magialisk
Posts: 104
Joined: 25 Jul 2013 19:00

Re: Multi-process Advanced Encryption Standard (AES)

#44 Post by Magialisk » 25 Nov 2013 17:40

Unfortunately I did not make my "before the weekend" target as it took me a couple of days to track down the very nasty bug I had in my decryption routine. All signs were pointing in a completely different direction from where the bug was. It looked like a timing/sync problem as with some added ping delays to slow everything down I could sometimes get the program to work right, etc.

In the end it ended up being that I had two labels with the same name, one in the encrypt routine and one in the decrypt. :oops: That wasn't the only bug, but it was the most severe, the other was a simple logic error. I still haven't figured out how this caused the batch to wipe my entire directory on a couple occasions, or some of the other weird errors that it presented, but as foxidrive alluded to once it was jumping to the wrong label all bets were off in terms of variable contents...

In any case, the first several posts of this thread have been updated with the new code which uses file pipes very similar to Antonio's. My aescore.bat file can be used in piped mode or the original "PID controller" mode, but otherwise a pipe is a pipe so our implementations came out pretty similar.

Thanks again to everyone who's offered suggestions in this thread, most of all Antonio. I'm pretty much ready to call this project complete, but if anyone sees a stupid mistake or obvious enhancement please let me know.

einstein1969
Expert
Posts: 776
Joined: 15 Jun 2012 13:16
Location: Italy, Rome

Re: Multi-process Advanced Encryption Standard (AES)

#45 Post by einstein1969 » 26 Nov 2013 07:26

hi majalisk,

thanks fo updating.

I will test.

There is an error of division for zero in the case run on monoprocessor
In this case the calculated %numCPU%=0

Code: Select all

set /a pipe=!pid! %% %numCPU%


Einstein1969

Post Reply