DosTips.com

A Forum all about DOS Batch
All times are UTC-06:00




PostPosted: 18 Jul 2016 08:51 
Online

Joined: 02 May 2016 18:20
Posts: 222
Did some quick profiling for 3dworld2.bat

300 iterations, no user interaction:
Code: Select all
Running normally: 17s
Running without calling cmdgfx.exe: 0s
Running with cmdgfx.exe, compiled to return immediately: 1s
Running with cmdgfx.exe, not calling the Windows function WriteConsoleOutput: 8s

How slow is WriteConsoleOutput!!?

It's one single call, and all it has to do is take the 180x110 character text buffer and show it. This takes 53% of the total time.

Much more than the 41% it takes to read an obj file, parse it, read 3 texture bitmaps, rotate 1016 points in 3d, draw (up to) 762 polygons with texture maps, then read another obj file, read another bitmap, rotate 640 points, and draw (up to) 160 more polygons!

Why, Microsoft, why? :cry:

(Note that the rendering in this example is fairly heavy; for simpler scenes it's common for WriteConsoleOutput to take 75% or more of the total time.)

PostPosted: 18 Jul 2016 14:40 
Offline

Joined: 23 Jun 2013 06:15
Posts: 1226
Location: Germany
misol101 wrote:
How slow is WriteConsoleOutput!!?
It should be much faster than the monitor:
http://stackoverflow.com/questions/14295570/why-is-console-animation-so-slow-on-windows-and-is-there-a-way-to-improve-spee

But to be honest, maybe WriteConsoleOutput is somehow initialized internally on first use (who knows), or the cmdgfx C source is suboptimal (is the source available?).
So you could avoid calling cmdgfx repeatedly (for example using "batch1.bat | cmdgfx", if possible), or optimize cmdgfx.
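
To illustrate the "batch1.bat | cmdgfx" idea: the executable would be started once and would read one drawing command per line from standard input, so the per-frame cost of launching a process goes away. This is only a minimal sketch. The names serve_frames, frame_fn and count_frame, and the one-line-per-frame protocol, are assumptions made for the example and not how cmdgfx actually works.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: draws one frame described by a command line. */
typedef void (*frame_fn)(const char *command, void *user);

/* Example callback: only counts frames; a real renderer would draw here. */
static void count_frame(const char *command, void *user) {
    (void)command;
    ++*(int *)user;
}

/* Reads one command per line from `in` until EOF and hands each
   non-empty line to `render`. Returns the number of frames handled.
   Lines longer than the buffer are delivered in several pieces. */
int serve_frames(FILE *in, frame_fn render, void *user) {
    char line[4096];
    int frames = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;                          /* skip blank lines */
        render(line, user);
        frames++;
    }
    return frames;
}
```

In a real program, main would call serve_frames(stdin, ...) with a callback that renders, and the batch file would echo one command per frame into the pipe.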


penpen


PostPosted: 18 Jul 2016 14:56 
Offline

Joined: 20 Aug 2010 13:57
Posts: 427
Location: Chile
Yes, WriteConsoleOutput can be slow, but it is the lowest-level API function available.
Every time you call it, it allocates memory and does calculations. The same goes for ReadConsoleOutput; because of that, it can fail if you have big screen buffer sizes, like 300x200.
Also, great program.


PostPosted: 18 Jul 2016 17:55 
Online

Joined: 02 May 2016 18:20
Posts: 222
penpen wrote:
misol101 wrote:
How slow is WriteConsoleOutput!!?
It should be much faster than the monitor:
http://stackoverflow.com/questions/14295570/why-is-console-animation-so-slow-on-windows-and-is-there-a-way-to-improve-spee


That was more of a rhetorical question, actually :mrgreen:

I don't know who came up with the statement "much faster than the monitor", but it is absolutely *not* true. As soon as you start fiddling around with the attributes (i.e. changing colors of the text), WriteConsoleOutput slows to a crawl.

Here is the shortest way I could think of to demonstrate this:

Code: Select all
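#include <windows.h>
#include <stdio.h>
#include <conio.h>

#define     COLS        160
#define     ROWS        55
#define     ITERATIONS  100

CHAR_INFO disp[ROWS][COLS];
HANDLE console;
COORD size = { COLS, ROWS };
COORD src = { 0, 0 };
/* Right and Bottom of a SMALL_RECT are inclusive, hence the -1. */
SMALL_RECT  dest = { 0, 0, COLS - 1, ROWS - 1 };

/* Fills the buffer with characters 1..charrange and attributes
   1..colrange, then writes it to the console in one call. */
void write_screen3(int colrange, int charrange, int k) {
    int i, j;

    for (i = 0; i < ROWS; ++i) {
        for (j = 0; j < COLS; ++j) {
             disp[i][j].Char.AsciiChar = k % charrange + 1;
             disp[i][j].Attributes = k % colrange + 1;
             k++;
        }
    }
    /* The A variant matches the AsciiChar member even if UNICODE is defined. */
    WriteConsoleOutputA(console, (CHAR_INFO *)disp, size, src, &dest);
}

/* Runs ITERATIONS full-screen writes and prints the elapsed whole seconds. */
void runTest(int col_range, int char_range) {
    int i;
    DWORD startT;   /* GetTickCount returns milliseconds as a DWORD */

    startT = GetTickCount();

    for (i = 0; i < ITERATIONS; i++) {
      write_screen3(col_range, char_range, i);
    }

    printf("%lu s\n", (unsigned long)((GetTickCount() - startT) / 1000));
    _getch();   /* wait for a key press before the next test */
}

int main(void) {
    console = GetStdHandle(STD_OUTPUT_HANDLE);

    runTest(1,2);       /* test 1: two characters, one color */
    runTest(2,1);       /* test 2: one character, two colors */
    runTest(250,200);   /* test 3: many characters and colors */

    return 0;
}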


Before running this test, run "mode 160,55" so the console is 160 columns by 55 rows.

All 3 tests fill the whole screen with characters and run 100 times each.

Test 1 switches between two characters, but keeps the color the same (blue text, black bg).
This reports 0 seconds. Nice and fast...

Test 2 writes only one type of character, but switches between blue and green text.
Oops, 6 seconds on my machine. That's 100/6=17 FPS... a lot slower than the screen refresh!

Test 3 writes a wide range of both characters and colors.
Again, it takes about 6 seconds.
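
The test prints whole seconds, which is why test 1 shows up as 0. To compare the runs as frame rates instead, the elapsed milliseconds can be converted directly. A small sketch; the helper name frames_per_second is made up for this example, and the millisecond value would come from the GetTickCount difference:

```c
/* Frames per second from a frame count and elapsed milliseconds.
   Returns 0.0 when no time was measured, to avoid dividing by zero. */
double frames_per_second(unsigned long frames, unsigned long elapsed_ms) {
    if (elapsed_ms == 0)
        return 0.0;
    return frames * 1000.0 / elapsed_ms;
}
```

With the numbers above, frames_per_second(100, 6000) gives roughly 16.7, which matches the 17 FPS estimate for test 2.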


And the reason the "Game of Life" from the link you posted runs so fast is precisely that it never changes the colors at all. The more color switches, the slower it gets.
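
One way to put a number on "color switches" is to count how often the attribute differs from the previous cell in a row. This is just a sketch of such a metric, and using it as a proxy for WriteConsoleOutput cost is an assumption based on the timings above. Cell is a hypothetical stand-in for CHAR_INFO so that it compiles without windows.h, and count_attr_changes is a made-up name.

```c
#include <stddef.h>

/* Hypothetical stand-in for the Windows CHAR_INFO struct:
   one character plus one color attribute. */
typedef struct {
    unsigned short ch;
    unsigned short attr;
} Cell;

/* Counts how often the color attribute differs from the previous
   cell, scanning each row left to right. Rows are treated
   independently, so the first cell of a row never counts. */
size_t count_attr_changes(const Cell *buf, size_t rows, size_t cols) {
    size_t changes = 0;
    size_t r, c;

    for (r = 0; r < rows; ++r) {
        const Cell *row = buf + r * cols;
        for (c = 1; c < cols; ++c) {
            if (row[c].attr != row[c - 1].attr)
                ++changes;
        }
    }
    return changes;
}
```

A buffer like the one in test 1 yields 0 changes, while the alternating colors of test 2 change on every cell after the first in each row.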


PostPosted: 18 Jul 2016 18:16 
Online

Joined: 02 May 2016 18:20
Posts: 222
carlos wrote:
Also, great program.


Thanks carlos! As you can see I'm using your bg program to set the fonts for my example batch files! Thanks!

I'll reply to your PM shortly.


Anyway, the extended profiling was kind of interesting (though depressing, considering how much time WriteConsoleOutput takes):

Breakdown, 300 runs, 17s:

1s: batch logic, startup overhead of exe file (6%)
1s: rotation, drawing (6%)
6s: opening object file, reading textures, parsing (35%)
9s: WriteConsoleOutput (53%)

I was very surprised to see how fast my rendering code is :) Basically, there is no bottleneck there at all for this size of project.

I was also surprised that the file reading/parsing took so long! It turns out that sscanf is horribly slow at parsing floats (atof, strtof etc. are almost as bad)! So I replaced it with my own float converter. Then I saw that my pcx reader was sub-optimal too and improved that. Together with some other optimizations in the reader, that cut 3 seconds!
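
For readers wondering what such a float converter might look like, here is a minimal sketch. It is not the code used in cmdgfx (that is not shown in this thread); the name parse_simple_float is made up, and it assumes plain decimal input such as the coordinates in an obj file, with no exponents, inf/nan or locale handling.

```c
#include <stddef.h>

/* Parses an optional sign, digits, and an optional fractional part
   (for example "-12.375"). Leading spaces and tabs are skipped.
   Stores the position after the number in *end when end is not NULL. */
float parse_simple_float(const char *s, const char **end) {
    double value = 0.0;   /* accumulate in double to limit rounding error */
    double scale = 0.1;
    int negative = 0;

    while (*s == ' ' || *s == '\t')
        ++s;
    if (*s == '-' || *s == '+') {
        negative = (*s == '-');
        ++s;
    }
    while (*s >= '0' && *s <= '9') {
        value = value * 10.0 + (*s - '0');
        ++s;
    }
    if (*s == '.') {
        ++s;
        while (*s >= '0' && *s <= '9') {
            value += (*s - '0') * scale;
            scale *= 0.1;
            ++s;
        }
    }
    if (end != NULL)
        *end = s;
    return (float)(negative ? -value : value);
}
```

Because the end pointer is returned, several numbers on one line can be read by calling the function repeatedly, passing the previous end position as the next start.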

So now it's 300 runs, 14s:
1s: batch logic, startup overhead of exe file (7%)
1s: rotation, drawing (7%)
3s: opening object file, reading textures, parsing (21%)
9s: WriteConsoleOutput (64%)

In the end it just makes WriteConsoleOutput look even more silly, but hey, at least the demo is faster as a whole :)


PostPosted: 18 Jul 2016 18:21 
Online

Joined: 02 May 2016 18:20
Posts: 222
penpen wrote:
So you may avoid calling cmdgfx repeatedly (for example using "batch1.bat | cmdgfx" if possible)


I have tried this and posted about it in my CmdRunner thread (viewtopic.php?f=3&t=7225). I'd be very happy if you could show me how to improve speed that way; so far my tests have not been successful. The CmdRunner archive already contains a version called "cmdrunner_t.bat" which uses the above setup. It is not faster, though; it's slower. If you could figure out why, that would be great!


See my other reply above to the first part of your post regarding WriteConsoleOutput


Last edited by misol101 on 18 Jul 2016 18:49, edited 1 time in total.

PostPosted: 18 Jul 2016 18:28 
Online

Joined: 02 May 2016 18:20
Posts: 222
Archive was updated with a version which loads/parses files more efficiently.

This affects many of the examples, and small speedups can be seen in many places. The most radical improvement is perhaps for the HULK 3d model in gfxtest5.bat. Still not very fast, but much better.


PostPosted: 18 Jul 2016 19:36 
Offline

Joined: 23 Jun 2013 06:15
Posts: 1226
Location: Germany
misol101 wrote:
I don't know who came up with the statement "much faster than the monitor", but it is absolutely *not* true.

It could be seen here (the answer with the green checkmark):
Quote:
Just for reference, here's a version of John Conway's Game of Life written for the Windows console. For timing purposes, it just generates a random starting screen and runs for 2000 generations, then stops. On my machine, it does 2000 generations in about 2 seconds, or around 1000 frames per second (useless, since a typical monitor can only update at around 60-120 Hz). Under 32-bit Windows with a full-screen console, it can roughly double that (again, at least on my machine). I'm pretty sure with a little work, this could be sped up some more, but I've never seen any reason to bother.
The source is right below this.

Some time ago I tested that code on my Win XP machine, and (after switching from a debug to a release build) I could confirm that speed.
At first glance (it is a little late here) the two programs seem very similar; I may test whether it is still that fast on Win 10 (but surely not today, maybe tomorrow).


penpen


PostPosted: 19 Jul 2016 01:13 
Online

Joined: 02 May 2016 18:20
Posts: 222
penpen wrote:
misol101 wrote:
I don't know who came up with the statement "much faster than the monitor", but it is absolutely *not* true.

It could be seen here (the answer with the green checkmark):


:?:

Thanks penpen, but umm did you actually read my answer? :wink:

Yes, the Game Of Life example is fast, because everything uses the same color. I specifically mentioned this, and gave C code that shows where this is not true. If you have complex use of color, WriteConsoleOutput is cripplingly slow. Please try the code I posted.


edit: further runs of the code I posted show that, at least on my machine (Win7), changing the color attribute for each char compared to writing with a single color makes WriteConsoleOutput a whopping 36.6 times slower :shock: (3560% ...). Unbelievable, but true.


PostPosted: 19 Jul 2016 04:57 
Offline

Joined: 23 Jun 2013 06:15
Posts: 1226
Location: Germany
(Edit: I was wrong: WriteConsoleOutput never was a simple "memcopy"; Sorry @Microsoft.)
Yes, I read your answer, but if Microsoft hasn't changed the implementation of WriteConsoleOutput (since Visual Studio 6.0), then it simply "memcopies" the selected region line by line.
I find it hard to believe that MS has changed the implementation in such a way that the speed with which a field is copied depends on the value stored in that field, but who knows, they've done crazy things before.

I have to install Visual Studio (Community) to see the implementation and test both sources, but the download speed here is currently rather low, so it will take some time to download the ~40 GB.


penpen


Last edited by penpen on 20 Jul 2016 12:00, edited 1 time in total.

PostPosted: 19 Jul 2016 05:19 
Online

Joined: 02 May 2016 18:20
Posts: 222
penpen wrote:
I find it hard to believe that MS has changed the implementation in such a way that the speed with which a field is copied depends on the value stored in that field

I have to install Visual Studio (Community).
penpen


Yes, it really is unbelievable... I hope to be wrong...

Visual Studio? I compiled my example and the Game of Life code with mingw/gcc. But it would be interesting to see if you get the same results with Microsoft's own compiler, so thanks!


PostPosted: 19 Jul 2016 06:29 
Offline
Expert

Joined: 22 Jan 2010 18:01
Posts: 2524
Location: Germany
You guys forgot that the console window is actually text-based and definitely not made for graphics :wink: Thus, console functions are not optimized to perform game interfaces :lol:

When I'm back home in a few hours I'll check the possibilities of multithreading for this purpose...

Regards
aGerman


PostPosted: 19 Jul 2016 08:26 
Online

Joined: 02 May 2016 18:20
Posts: 222
aGerman wrote:
You guys forgot that the console window is actually text-based and definitely not made for graphics :wink: Thus, console functions are not optimized to perform game interfaces :lol:

When I'm back home in a few hours I'll check the possibilities of multithreading for this purpose...

Regards
aGerman



Haha, definitely true. This whole project of mine is really rather silly with such a bottleneck :) However, it's still frustrating because there seems to be no good reason for it to be so slow. Like penpen said, it should be more or less a simple memcopy. And why in the world should it matter whether the color changes or not? (Yet it makes a huge difference in speed.) It's like the incompetent MS developer decided that the whole text "engine" needs to be restarted every time the color changes... :)


PostPosted: 19 Jul 2016 12:43 
Offline
Expert

Joined: 22 Jan 2010 18:01
Posts: 2524
Location: Germany
Either I'm unable to find my mistake, or concurrent access to a console window is locked internally by some kind of mutex. The unthreaded and threaded trials take approximately the same time. I tried running one thread for each row to write.
Code: Select all
#include <windows.h>
#include <stdio.h>
#include <conio.h>
#include <process.h>

#define     COLS        160
#define     ROWS        55
#define     ITERATIONS  100

CHAR_INFO disp[ROWS][COLS];
HANDLE console;
COORD size = { COLS, ROWS };
COORD src = { 0, 0 };
/* Right and Bottom of a SMALL_RECT are inclusive, hence the -1. */
SMALL_RECT  dest = { 0, 0, COLS - 1, ROWS - 1 };

/* One row of cells plus the screen row it belongs to. */
typedef struct tag_THREADDATA
{
  CHAR_INFO rowdata[COLS];
  unsigned rownum;
} THREADDATA, *PTHREADDATA;


/* Thread entry point: writes a single row to the console. */
unsigned __stdcall write_line(void *arg)
{
  static COORD sz = {COLS, 1},
               start = {0, 0};
  PTHREADDATA data = (PTHREADDATA)arg;
  SMALL_RECT  dst = {0, (SHORT)data->rownum, COLS - 1, (SHORT)data->rownum};

  WriteConsoleOutputA(console, data->rowdata, sz, start, &dst);
  return 0;   /* returning ends the thread; no explicit _endthreadex needed */
}

/* Fills all rows, then starts one thread per row and waits for all of them. */
void write_screen_threaded(int colrange, int charrange, int k) {
    int i, j;
    HANDLE threadhandles[ROWS];   /* ROWS must stay below MAXIMUM_WAIT_OBJECTS (64) */
    THREADDATA rows[ROWS];

    for (i = 0; i < ROWS; ++i) {
        rows[i].rownum = i;
        for (j = 0; j < COLS; ++j) {
             rows[i].rowdata[j].Char.AsciiChar = k % charrange + 1;
             rows[i].rowdata[j].Attributes = k % colrange + 1;
             k++;
        }
    }

    for (i = 0; i < ROWS; ++i)
        threadhandles[i] = (HANDLE)_beginthreadex(NULL, 0, write_line, &rows[i], 0, NULL);

    WaitForMultipleObjects(ROWS, threadhandles, TRUE, INFINITE);

    for (i = 0; i < ROWS; ++i)
        CloseHandle(threadhandles[i]);

}

/* Fills the whole buffer and writes it with a single call. */
void write_screen_unthreaded(int colrange, int charrange, int k) {
    int i, j;

    for (i = 0; i < ROWS; ++i) {
        for (j = 0; j < COLS; ++j) {
             disp[i][j].Char.AsciiChar = k % charrange + 1;
             disp[i][j].Attributes = k % colrange + 1;
             k++;
        }
    }
    WriteConsoleOutputA(console, (CHAR_INFO *)disp, size, src, &dest);
}

/* Times the unthreaded and the threaded variant, in whole seconds. */
void runTest(int col_range, int char_range) {
    int i;
    DWORD startT;   /* GetTickCount returns milliseconds as a DWORD */

    startT = GetTickCount();

    for (i = 0; i < ITERATIONS; i++) {
      write_screen_unthreaded(col_range, char_range, i);
    }

    printf("unthreaded %lu s\n", (unsigned long)((GetTickCount() - startT) / 1000));
    _getch();

    startT = GetTickCount();

    for (i = 0; i < ITERATIONS; i++) {
      write_screen_threaded(col_range, char_range, i);
    }

    printf("threaded %lu s\n", (unsigned long)((GetTickCount() - startT) / 1000));
    _getch();
}

int main(void) {
    console = GetStdHandle(STD_OUTPUT_HANDLE);

    runTest(250,200);

    return 0;
}


Regards
aGerman


PostPosted: 19 Jul 2016 13:47 
Online

Joined: 02 May 2016 18:20
Posts: 222
aGerman: Interesting approach, too bad it didn't work out. I looked at the CPU load in Task Manager while running it, and it was roughly the same for threaded and unthreaded. So I guess that could indicate that a mutex lets only one thread at a time write to the console. Or perhaps it could indicate that Windows does not give more CPU time to the console process, regardless of threads?

