ShuRugal Posted December 21, 2015

I remember several recent statements from ED that it had been determined that multicore optimization would not provide a significant increase in performance. I was wondering, what are the technical reasons for this? Granted, I am no expert on distributed computing, but it seems logical to me that taking major functions which tend to hog CPU time (looking at you, AI and projectile computations) and breaking them off onto separate cores would be massively beneficial.

In my head, I visualize this working best as a client-server architecture (think network computing, but without the network):

One core process would handle input/output, track and report positions of individual assets, and be responsible for determining the result of interactions between those assets. This thread could also handle simulation of the player aircraft.

A second thread would do nothing but manage AI decisions and movement, and would report the results (positions) to the core process.

A third thread would do nothing but determine the movement of projectiles (spawned by a report from the core or AI thread) and would report results to the core thread.

We already have audio on its own thread, so this would fill the fourth core of a typical quad-core processor.

Now, perhaps I just have such a poor knowledge of how multithreaded computing works that I am missing something critical here, but I don't see what that is. Could someone give me a technical explanation as to why the above solution would NOT result in improved engine responsiveness versus having all of those functions run sequentially, as is currently the case?
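To make the proposal concrete, here is a minimal sketch of that decomposition using plain std::thread and a mutex-protected queue. It is not how DCS is structured; the Update type, the worker workloads, and the 16 ms frame pacing are all invented for illustration.

```cpp
// Sketch of the "client-server on one machine" idea: worker threads (AI,
// ballistics) push position updates into a queue that the core thread drains
// once per frame. All type and function names here are hypothetical.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Update { int entityId; double x, y, z; };

std::mutex queueMutex;
std::queue<Update> updates;          // worker -> core thread mailbox
std::atomic<bool> running{true};

void aiThread() {                    // stands in for AI decisions/movement
    while (running) {
        Update u{100, 1.0, 2.0, 3.0}; // pretend result of an AI movement step
        { std::lock_guard<std::mutex> lk(queueMutex); updates.push(u); }
        std::this_thread::sleep_for(std::chrono::milliseconds(16));
    }
}

void ballisticsThread() {            // stands in for projectile integration
    while (running) {
        Update u{200, 4.0, 5.0, 6.0};
        { std::lock_guard<std::mutex> lk(queueMutex); updates.push(u); }
        std::this_thread::sleep_for(std::chrono::milliseconds(16));
    }
}

int main() {
    std::thread ai(aiThread), ballistics(ballisticsThread);
    for (int frame = 0; frame < 60; ++frame) {       // "core" thread loop
        std::vector<Update> batch;
        {
            std::lock_guard<std::mutex> lk(queueMutex);
            while (!updates.empty()) { batch.push_back(updates.front()); updates.pop(); }
        }
        std::printf("frame %d: %zu updates applied\n", frame, batch.size());
        std::this_thread::sleep_for(std::chrono::milliseconds(16));
    }
    running = false;
    ai.join(); ballistics.join();
}
```

The queue itself is the easy part; the hard part, as the replies below point out, is that the core thread still has to resolve interactions between entities whose states were computed at slightly different times.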
King_Hrothgar Posted December 21, 2015

It is not difficult to create a new program to use multiple cores from the start. It's actually pretty easy tbh, you just have to do it. But going back and trying to modify an existing program with millions of lines of code to use more than one core is another matter entirely. That is a huge undertaking, so huge it borders on rewriting the program from scratch.

As for the payoff of doing that, it would allow us to have bigger battles or allow for higher-fidelity AI flight and damage modeling. The former isn't a serious concern; the current i5s can run a fairly big mission just fine as is, so there isn't much to gain there. The latter creates its own issues, as flight modeling is the single most time-consuming part of something like an FC3-level aircraft. That means that not only would they have to rewrite DCS's foundation, they'd then have to create dozens of new FMs and not get paid directly for it. It would have some monetary benefit in the role of keeping DCS fresher, much like the new graphics engine, but it's hard to take a screenshot of an AI flight model, and thus harder to advertise.
Nac33 Posted December 21, 2015

@ShuRugal... Amen brother, you're preaching to the choir, I want to know too ;-)
cichlidfan Posted December 21, 2015

I would say KH has hit the high points fairly well. The bottom line, as much as we might not like it, is that the cost/benefit ratio is far too high.
nervousenergy Posted December 21, 2015

Anything directly related to the core engine, sure, but I think this overlooks a big potential gain: AI aircraft. We already take a big fidelity hit with AI aircraft using a simplified flight and damage model to save on processor time, but the multiplayer architecture seems to offer a way to branch them off. The game engine obviously supports external inputs to how the models fly... you see that every time you play against other humans.

I'm not a coder, and I'm sure there are likely reasons this won't work in DCS, but absent knowing those reasons it seems not that difficult to write the AI routines as completely separate bots that interact with the game engine exactly like human players do now. You could have as many high-fidelity opponents as you had spare cores, either on the same hardware or on other machines. One issue could be that AI bots were well known as tools of cheating, going all the way back to Quake, but that seems less than likely in our small community.
Wolf Rider Posted December 21, 2015

Which is why, in our NVidia Control Panel, we should set the Autothreading Optimisation option to OFF instead of leaving it at its default setting of AUTO.
OnlyforDCS Posted December 21, 2015

Quote (ShuRugal): Now, perhaps I just have such a poor knowledge of how multithreaded computing works that I am missing something critical here, but I don't see what that is. Could someone give me a technical explanation as to why the above solution would NOT result in improved engine responsiveness versus having all of those functions run sequentially, as is currently the case?

I'll try and explain as simply as possible. Imagine you are running two processes on separate cores. Simple enough, right? So far so good. Now imagine that those two processes need to do complex calculations on the fly, but they also need to interact with each other, sharing the results of their computations between the cores; we are, after all, running a single application, albeit on two cores. Sounds pretty straightforward, but this is where things get complicated. In a well-coded multithreaded application both threads would be optimized to run in sync and share their results with the least amount of wait time between the cores. However, this requires very complicated predictive coding to prevent one core waiting for the other and creating a bottleneck in the process.

When you have a very complex simulation application like DCS, which has many different systems and has been optimized to run as fast as possible on a single core, splitting the code into two processes would only result in the aforementioned bottleneck, most probably creating very little benefit in terms of speed. So you can't simply split the code; you would need to do some serious rewriting of the really low-level stuff. ED has determined that the investment of time and money in doing this is simply not worth it, since it would probably involve a complete rewrite of the simulation engine from scratch.

This is a really basic explanation. Indeed, maybe there are specific physics computations that simply can't be done in parallel, at least not in any way that makes the benefits readily obvious. Multithreaded coding is probably some of the most difficult programming known to man; there are shortcuts in the form of APIs etc., but they will only take you so far, especially in a complex simulation environment that was developed in the infancy of said technology.
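To picture the wait-time problem, here is a toy example (not DCS code) of two threads that must rendezvous at the end of every frame before either may continue. The workloads and timings are invented; the point is that the fast thread spends most of each frame blocked on the slow one, so the second core buys far less than you might hope.

```cpp
// Two workers that must exchange results every "frame". The faster worker
// spends most of its time blocked waiting for the slower one, which is the
// bottleneck described above.
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
int arrived = 0;      // how many workers have finished the current frame
int generation = 0;   // frame counter used to release both workers together

void frameBarrier() {                     // simple reusable two-thread barrier
    std::unique_lock<std::mutex> lk(m);
    int gen = generation;
    if (++arrived == 2) { arrived = 0; ++generation; cv.notify_all(); }
    else cv.wait(lk, [gen] { return gen != generation; });
}

void worker(const char* name, int workMs) {
    using clock = std::chrono::steady_clock;
    for (int frame = 0; frame < 5; ++frame) {
        std::this_thread::sleep_for(std::chrono::milliseconds(workMs)); // "compute"
        auto t0 = clock::now();
        frameBarrier();                   // cannot advance until the other is done
        auto waited = std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - t0);
        std::printf("%s frame %d: waited %lld ms\n", name, frame, (long long)waited.count());
    }
}

int main() {
    std::thread fast(worker, "physics", 2);   // light work
    std::thread slow(worker, "ai", 20);       // heavy work dominates the frame
    fast.join(); slow.join();
}
```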
SkateZilla Posted December 21, 2015

Quote (Wolf Rider): Which is why, in our NVidia Control Panel, we should set the Autothreading Optimisation option to OFF instead of leaving it at its default setting of AUTO.

Display-driver/DirectX auto-threading and the application's processing model are different things. DCS can run as a single serial processing thread, but DX11 can still divide draw-call and GPU-driver processing among several threads to keep one thread from being overloaded (as happens with DirectX 9).

DirectX 11 tasking being multithreaded relieves the CPU overhead of the draw calls being dropped onto ONE core: instead of one core trying to process 200K draw calls, two cores can process 100K each. Which is another reason why DX11 has reduced CPU overhead and is able to render more objects, and render them faster.
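The DX11 mechanism behind this is deferred contexts and command lists: worker threads record GPU commands, and the main thread submits them on the immediate context. Below is a bare-bones sketch of that API, not anything from DCS's renderer; a real engine would record thousands of actual draw calls per thread rather than the single trivial state change shown here. (Windows only, plain Direct3D 11.)

```cpp
// Worker thread records into a deferred context; main thread executes the
// resulting command list on the immediate context.
#include <d3d11.h>
#include <thread>
#include <cstdio>
#pragma comment(lib, "d3d11.lib")

int main() {
    ID3D11Device* device = nullptr;
    ID3D11DeviceContext* immediate = nullptr;
    if (FAILED(D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                                 nullptr, 0, D3D11_SDK_VERSION,
                                 &device, nullptr, &immediate))) {
        std::printf("no D3D11 device available\n");
        return 1;
    }

    ID3D11CommandList* commandList = nullptr;
    std::thread recorder([&] {
        ID3D11DeviceContext* deferred = nullptr;
        device->CreateDeferredContext(0, &deferred);   // per-thread recording context
        D3D11_VIEWPORT vp{0.0f, 0.0f, 1920.0f, 1080.0f, 0.0f, 1.0f};
        deferred->RSSetViewports(1, &vp);              // stand-in for many draw calls
        deferred->FinishCommandList(FALSE, &commandList);
        deferred->Release();
    });
    recorder.join();

    if (commandList) {
        immediate->ExecuteCommandList(commandList, FALSE); // main thread submits the work
        std::printf("command list executed\n");
        commandList->Release();
    }
    immediate->Release();
    device->Release();
}
```

Note this is a driver/runtime feature for spreading draw-call submission cost; it does not by itself parallelize the simulation code that generates those draw calls.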
ShuRugal Posted December 21, 2015

Quote (King_Hrothgar): It is not difficult to create a new program to use multiple cores from the start... But going back and trying to modify an existing program with millions of lines of code to use more than one core is another matter entirely. That is a huge undertaking, so huge it borders on rewriting the program from scratch.

So, the problem is not that there would be no performance gain, it's that ED can't afford the man-hours to rewrite the engine from the ground up, which is essentially what would need to happen?

Quote (OnlyforDCS): ... In a well-coded multithreaded application both threads would be optimized to run in sync and share their results with the least amount of wait time between the cores. However, this requires very complicated predictive coding to prevent one core waiting for the other and creating a bottleneck in the process.

In commo, we solve this problem by feeding in a sync bit at predetermined intervals. For a less data-heavy solution (the cores don't need to share full details of their individual threads, just the end results of each step), have each thread output a timestamped progress log to an area where any thread may read it on demand (sketched in the example after this post). This way, even if you have one thread hung for a few cycles waiting on the most recent output from another, you don't have the entire app hung, which is what happens when a subprocess of a single-threaded application skips a beat.

Quote (OnlyforDCS): When you have a very complex simulation application like DCS... splitting the code into two processes would only result in the aforementioned bottleneck, most probably creating very little benefit in terms of speed. So you can't simply split the code; you would need to do some serious rewriting of the really low-level stuff. ED has determined that the investment of time and money in doing this is simply not worth it, since it would probably involve a complete rewrite of the simulation engine from scratch.

Well, sure, if all you do is split it down the middle and say "okay, core 1, you execute all the even lines, core 2, all the odd lines, go!" you have that problem. But that's not even remotely what I am asking.
Quote (OnlyforDCS): This is a really basic explanation. Indeed, maybe there are specific physics computations that simply can't be done in parallel... Multithreaded coding is probably some of the most difficult programming known to man; there are shortcuts in the form of APIs etc., but they will only take you so far, especially in a complex simulation environment that was developed in the infancy of said technology.

Except that there already exists a running mode where all the physics calculations are handled in parallel: multiplayer. The type of multithreading I am suggesting would operate almost exactly like a multiplayer match; instead of each client controlling a single player aircraft and his projectiles, each client would be responsible for one type of entity. But, as KH already pointed out, that would most likely require a ground-floor rewrite of the engine.
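Here is a minimal sketch of the "timestamped progress log" idea from the post above, with invented types and timings: the producer publishes its latest result with a frame stamp, and readers take whatever is newest instead of blocking for the current frame. The catch, which later replies get at, is that the reader is then acting on data that may be a frame or two stale.

```cpp
// A writer publishes its latest result with a frame stamp; consumers read the
// newest available snapshot on demand rather than waiting for this frame's.
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>

struct Snapshot { long frame; double x, y, z; };

std::mutex snapMutex;
Snapshot latest{-1, 0.0, 0.0, 0.0};    // last published AI result

void publish(const Snapshot& s) {      // called by the producing thread
    std::lock_guard<std::mutex> lk(snapMutex);
    latest = s;
}

Snapshot readLatest() {                // called by any consumer; never waits long
    std::lock_guard<std::mutex> lk(snapMutex);
    return latest;                     // may be a frame or two old; that's the point
}

int main() {
    std::thread producer([] {
        for (long f = 0; f < 50; ++f) {
            publish({f, f * 0.1, 0.0, 0.0});
            std::this_thread::sleep_for(std::chrono::milliseconds(20)); // slow producer
        }
    });
    for (int frame = 0; frame < 100; ++frame) {       // faster consumer keeps running
        Snapshot s = readLatest();
        std::printf("consumer frame %d sees producer frame %ld\n", frame, s.frame);
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    producer.join();
}
```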
SkateZilla Posted December 21, 2015

Multiplayer doesn't mean multithreaded or parallel processing.
ShuRugal Posted December 22, 2015

Quote (SkateZilla): Multiplayer doesn't mean multithreaded or parallel processing.

True, but I would be interested in a technical description of why the same general precepts do not apply.
GGTharos Posted December 22, 2015

Data synchronization is much more complex than most people believe. You have to design it, and you have to do that well from the top. It isn't trivial. In some cases it can actually take away future expandability or flexibility, too.

The point is always cost/benefit. What do you really gain from using more threads? I can think of a few things, can you? And leave AI FM out of this; it is truly the least useful of the possibilities :-)
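For a sense of why shared state across threads is not free, here is the textbook demonstration: two threads incrementing the same plain integer lose updates, while the std::atomic version does not. (Strictly speaking the unsynchronized increment is a data race and therefore undefined behaviour; it is shown only to make the point.)

```cpp
// Unsynchronized shared state silently loses work; every shared value in a
// multithreaded engine needs an answer to this problem.
#include <atomic>
#include <cstdio>
#include <thread>

int main() {
    int plainCount = 0;                 // no synchronization: data race
    std::atomic<int> atomicCount{0};    // synchronized increment
    auto work = [&] {
        for (int i = 0; i < 1000000; ++i) { ++plainCount; ++atomicCount; }
    };
    std::thread a(work), b(work);
    a.join(); b.join();
    std::printf("plain:  %d (updates lost)\n", plainCount);
    std::printf("atomic: %d\n", atomicCount.load());
}
```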
cichlidfan Posted December 22, 2015

Quote (GGTharos): The point is always cost/benefit. What do you really gain from using more threads?

Quote (cichlidfan, earlier): I would say KH has hit the high points fairly well. The bottom line, as much as we might not like it, is that the cost/benefit ratio is far too high.

:smartass:
Wolf Rider Posted December 22, 2015

Quote (SkateZilla): Display-driver/DirectX auto-threading and the application's processing model are different things... DirectX 11 tasking being multithreaded relieves the CPU overhead of the draw calls being dropped onto ONE core: instead of one core trying to process 200K draw calls, two cores can process 100K each.

Perhaps, if DX was left alone (without outside interference) to do its thing... but then you end up with third-party VGA witchery trying to bump TIR onto one core, AA onto another, and so on, causing all sorts of hiccups in the process.

If the sim/game itself isn't "multicored" (in the true sense), turn the Autothreading to OFF, don't leave it set to AUTO. Don't believe me? Try it, and try it honestly, before reacting.
ShuRugal Posted December 22, 2015

Quote (GGTharos): The point is always cost/benefit. What do you really gain from using more threads? I can think of a few things, can you?

Off the top of my head? Reduction/elimination of cluster-munition frame-kills, a higher cap on the number and intelligence of AI units, more dynamic weather effects, simulation of airflow/turbulence around other aircraft, buildings, and mountains.

If we dedicate one virtual core to each of those things (AI, munitions, weather, atmospheric fluid dynamics), that leaves your average i5/i7 quad four more threads to handle I/O, sound, and whatever else is required.

As far as cost/benefit goes... I'd pay to have weather and airflow around obstacles dynamically simulated.
GGTharos Posted December 22, 2015

Quote (ShuRugal): Off the top of my head? Reduction/elimination of cluster-munition frame-kills,

AFAIK that was some sort of shader issue, so, no.

Quote (ShuRugal): a higher cap on the number and intelligence of AI units, more dynamic weather effects, simulation of airflow/turbulence around other aircraft, buildings, and mountains.

Of these, only the higher cap on AI units is really valid. Weather changes are slow and don't require that much computation (and more to the point, a lot of the flow fields can probably just be pre-computed; a lookup is usually massively more efficient in terms of speed compared to real-time computation if it doesn't need frequent updates). Wake turbulence could be modeled as a 3D object.

Quote (ShuRugal): If we dedicate one virtual core to each of those things (AI, munitions, weather, atmospheric fluid dynamics), that leaves your average i5/i7 quad four more threads to handle I/O, sound, and whatever else is required.

Sound is already running in a separate thread. AI is about the only thing here that could potentially net you massive gains, but even that isn't certain, as it actually requires developing the AI further.

Quote (ShuRugal): As far as cost/benefit goes... I'd pay to have weather and airflow around obstacles dynamically simulated.

The question is how much. If you paid ED 5 mil to focus on that alone, they just might do it. :)
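As a sketch of the pre-compute-then-look-up idea: fill a coarse 3D grid of wind vectors once, and the per-frame cost of sampling it is just an index calculation. Grid resolution, extent, and the stand-in "expensive" flow model are all invented for illustration.

```cpp
// Build a lookup table offline, then sampling at run time avoids re-running
// the flow computation every frame.
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec3 { float x, y, z; };

constexpr int   N      = 32;       // grid resolution per axis
constexpr float EXTENT = 1000.0f;  // metres covered per axis

Vec3 expensiveFlowModel(float x, float y, float z) {    // run once per cell, offline
    return { std::sin(y * 0.01f), 0.0f, std::cos((x + z) * 0.01f) };
}

std::vector<Vec3> buildTable() {
    std::vector<Vec3> t(N * N * N);
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            for (int k = 0; k < N; ++k)
                t[(i * N + j) * N + k] =
                    expensiveFlowModel(i * EXTENT / N, j * EXTENT / N, k * EXTENT / N);
    return t;
}

Vec3 sample(const std::vector<Vec3>& t, float x, float y, float z) {
    auto cell = [](float v) {                       // clamp to nearest grid cell
        int c = static_cast<int>(v / EXTENT * N);
        return c < 0 ? 0 : (c >= N ? N - 1 : c);
    };
    return t[(cell(x) * N + cell(y)) * N + cell(z)];
}

int main() {
    std::vector<Vec3> table = buildTable();         // the one-off cost
    Vec3 w = sample(table, 512.0f, 80.0f, 300.0f);  // per-frame cost: an array index
    std::printf("wind at sample point: %.2f %.2f %.2f\n", w.x, w.y, w.z);
}
```

A production table would interpolate between cells and be rebuilt only when the weather state changes, but the cost profile is the same: pay once up front, then reads are cheap.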
King_Hrothgar Posted December 22, 2015

Quote (GGTharos): And leave AI FM out of this; it is truly the least useful of the possibilities :-)

For supersonic jet fighters, I agree. For WW2, it's a deal-breaker for me. I will not purchase any additional WW2 modules until the AI FM and DM are of similar or the same fidelity as player aircraft. Obviously, upping the detail there requires true multicore support. I also don't see that happening any time soon, for the reasons I already mentioned. Fortunately (in a sense), I'm totally burnt out on WW2 anyway, so I just don't care. But if I did care, I'd have to burn you at the stake for such a heretical comment. :P
GGTharos Posted December 22, 2015

Quote (King_Hrothgar): I will not purchase any additional WW2 modules until the AI FM and DM are of similar or the same fidelity as player aircraft.

... as soon as similar or same fidelity of AI as players is available.

Quote (King_Hrothgar): Obviously, upping the detail there requires true multicore support. I also don't see that happening any time soon, for the reasons I already mentioned. Fortunately (in a sense), I'm totally burnt out on WW2 anyway, so I just don't care. But if I did care, I'd have to burn you at the stake for such a heretical comment. :P

Your rage stems from misunderstanding :P
fltsimbuff Posted December 22, 2015

The quote might well be taken out of context. People were asking for multithread support to help with the high CPU usage, which was limiting frame rate. As expected, it seems that streamlining the graphics engine took a ton of load off the CPU, and made the limitation once again the GPU. In the context of an engine upgrade, without changes to AI capabilities, the statement that adding additional threads wouldn't help was completely true. They updated the engine, and with the same AI capabilities it made no sense to multi-thread it more than it was already. That doesn't necessarily rule out that they may add more threads in the future with certain increases in functionality. That additional functionality may have just been out of scope of the engine rewrite.

Now that we're on the new engine, and once it is out of beta, the possibility still exists that ED could add additional AI capabilities and move them to different threads if that is necessary. Only the ED devs are qualified to make the decisions on whether or not that is beneficial, though.
ShuRugal Posted December 22, 2015

Quote (GGTharos): If you paid ED 5 mil to focus on that alone, they just might do it. :)

So, if we can get 100k supporters for DCS: Slope Soaring Physics, we can't get this off the ground? :p

EDIT: I just had a complete out-of-the-box epiphany with regards to offloading support for AI PFM... What if a separate application, dedicated to giving the AI the ability to control a PFM, were developed, and it literally operated as a multiplayer client, only connected to localhost instead of an online server? Would that even work?
GGTharos Posted December 22, 2015

Why not pitch it and see if they'll agree :)
Buzzles Posted December 31, 2015

Quote (ShuRugal): EDIT: I just had a complete out-of-the-box epiphany with regards to offloading support for AI PFM... What if a separate application, dedicated to giving the AI the ability to control a PFM, were developed, and it literally operated as a multiplayer client, only connected to localhost instead of an online server? Would that even work?

That idea is quite old :) I first read about that local-client design pattern in Game Coding Complete (Third Ed., 2009) by McShaffry, and I know he didn't come up with the idea :) It solves some problems, but creates others, as you now have to deal with a lot more data-sync issues (it's essentially the netcode problem!), plus it doesn't get around the fact that resources per machine are finite. IIRC, it's basically how the id Tech 3 (Quake 3) engine worked as well, and that is an ancient engine. I think DCS already works in a similar fashion.

Now, if you were talking about spinning some of those clients up on, say, Azure or Amazon Cloud or a local cluster, then you have a way to avoid the finite-local-resources problem :)
theGozr Posted December 31, 2015

When choosing the future of your own product, it's better to start with (or port to) a good engine that supports all the goodies, not to get a new "old" engine.
sobek Posted January 1, 2016

Quote (theGozr): When choosing the future of your own product, it's better to start with (or port to) a good engine that supports all the goodies, not to get a new "old" engine.

Unless the off-the-shelf engines don't cater to your product. Good, fast, cheap. Choose any two.