_picon

Members
  • Posts

    13

About _picon

  • Birthday 08/31/2020

Personal Information

  • Flight Simulators
    DCS, IL2
  • Location
    Sweden

  1. When it works, Nsight Graphics: https://developer.nvidia.com/nsight-graphics . But I've found that attaching doesn't always work (no clue why, though).
  2. No, I haven't really looked into it much, to be honest, as I was able to use Nsight (with the crash reporting option turned off).
  3. Not really, but I have played around with the cascade distances to improve the quality of in-cockpit shadows (essentially, adjusting the range of the highest-resolution cascade so that it "just" covers the inside of the cockpit, about 1 meter or so, instead of something longer). But that is kind of orthogonal to the issue at hand.
  4. Hi, This may be too big a change, given that Vulkan behaves quite differently in this area (and/or that this wouldn't be applicable at all when doing ray-traced shadows). This is partly speculation on my side as well, since I can't know in detail what is going on. What we are looking at here is a screenshot of the timing of the shadow rendering. From what I can tell, DCS is using a cascaded shadow map technique with 4 shadow maps (during rendering one of them seems to be chosen, though some artifacts are visible at the boundaries here and there). The rendering (closest focus is rendered first) is done in the DSV boxes below. But between these rendering passes there is a whole bunch of compute dispatches that don't run in parallel (since they update the same UAV). The result is that these compute dispatches do not use the full width of the GPU (i.e., they don't scale with GPU size) and are therefore very inefficient. Would it be possible to use the existing vendor extensions (if the dispatches don't write to the same locations)? https://docs.nvidia.com/gameworks/content/gameworkslibrary/coresdk/nvapi/group__dx.html#gaeb78a97e256f3c6c511451dded3994e5 https://gpuopen-librariesandsdks.github.io/ags/group__dx11_u_a_v_overlap.html Or, alternatively, at the cost of some additional video memory, use different UAVs for the different shadow maps and launch the dispatches in an interleaved fashion (i.e., launch a dispatch for shadow map 0, then for shadow map 1, etc.)? The idea is that each set of 4 dispatches would then run in parallel, instead of just a single one. Perhaps even draw all the shadows in a single pass using multiple views. Since the shadow map rendering doesn't scale with screen resolution, its cost limits the benefit we see from DLSS/FSR.
  5. Thanks, will take a look. I'm currently having issues with RenderDoc (it doesn't attach to the app properly, no API detected).
  6. Thanks. Perhaps it makes sense to just confirm that the same thing doesn't pop up in the new Vulkan-based engine.
  7. I can provide more details if needed, like the entire capture (but it is fairly big!), but this is a screenshot of my capture. The "Copy/Clear Pass #1" (expanded) contains a number of ClearUnorderedAccessViewUint calls. If one looks at the resources involved, most are cleared twice, with the same data. (While one is at it, it might be worth checking whether the clears are necessary at all, or whether the data is completely overwritten later by compute shaders.)
  8. Hi, I noticed through RenderDoc that the frame rendering starts out with, roughly, a bunch of ClearUnorderedAccessViewUint calls. They actually take some GPU time, on the order of 0.4 ms on a 3060 Ti. But it seems half of them are redundant: each buffer is cleared twice with the same value. It would make sense to clear them only once, or so I think... Should hopefully be a small/easy fix, with a measurable GPU performance improvement. My settings were "High", and the latest Stable release (2.8) was used. Best Regards.
  9. It actually looks like DLSS can be used with DX11 though (https://developer.nvidia.com/rtx/dlss/get-started). But it would make sense for any integration to be done on a newer graphics engine, which indeed hopefully means that Vulkan is not that far off... I do wonder if we are talking about DLSS 3 (with frame generation) on very recent GPUs? That could help in CPU-limited situations as well... Hopefully the Streamline API is used, which might help with integrating other hardware vendors' technology. Potentially even just DLAA (for high-quality antialiasing)? DLSS implies the use/generation of motion vectors. Does this mean that the current multisampled deferred shading (using the edge detection and multipass algorithm) is going to be replaced by some TAA (temporal anti-aliasing), even when DLSS is not used?
  10. One thing to note is that rendering today is effectively already done on two threads, as the DirectX 11 driver spins off a separate thread that does the heavy lifting. The engine makes the DirectX 11 API calls on its own thread, and the driver handles all API calls (in order) on a separate thread. It seems quite prudent to (as a first iteration) keep a single rendering thread that still uses DirectX 11, while spinning off more threads to handle other things (yet prepare for more scaling). It would be quite interesting to hear if that is indeed the plan for the first release... A naïve, single-threaded Vulkan (or Direct3D 12) implementation will (almost) always behave even worse in terms of CPU usage than DirectX 11, as the driver will not (normally) spin off a separate thread to split the work. This is all known, as one of the primary motivators for using Vulkan/DX12 in the first place is the ability to use multiple cores for parallel recording of graphics commands.
  11. I've had similar things happening when the TrackIR app is misbehaving for some reason. In those cases I recovered by killing it (TrackIR) and restarting it. (probably not what happens here, but it does match the behaviour, so just in case...).
  12. Another reference: Spudknocker video at 5:36. A quick analysis indicates that the mirror rendering is performed to a single-sampled render target (1024x1024 at High settings) even when using supersampling. In contrast, when MSAA is enabled, this render target is multisampled. This would explain why the quality is higher with MSAA. When using supersampling, it seems reasonable that the image quality should be higher. For best quality, one suggestion would be an MSAA target with per-sample shading enabled, to get the rotated grid sample pattern. Don't know if this also affects the rendering of other screens.
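
As a rough illustration of the cascade range adjustment in post 3, here is a minimal sketch of why shrinking the first cascade helps. It assumes the cascade's footprint is about as wide as the range it covers; the 4096 shadow map size and the 10 m comparison range are hypothetical numbers, not values taken from DCS.

```python
def texels_per_meter(shadow_map_size: int, cascade_range_m: float) -> float:
    """Approximate shadow texel density for one cascade.

    Assumes (simplification) that the cascade covers a square roughly
    cascade_range_m wide, mapped onto shadow_map_size texels per side.
    """
    if cascade_range_m <= 0:
        raise ValueError("cascade range must be positive")
    return shadow_map_size / cascade_range_m


# Hypothetical numbers: a 4096-texel map over 10 m versus over 1 m.
wide = texels_per_meter(4096, 10.0)
tight = texels_per_meter(4096, 1.0)
print(f"10 m cascade: {wide:.1f} texels/m, 1 m cascade: {tight:.1f} texels/m")
```

Under these assumptions, a ten times shorter range gives ten times the texel density inside the cockpit, at the cost of pushing everything else into the coarser cascades.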
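
The interleaving idea in post 4 can be sketched with a toy model. This is not how any driver actually schedules work; it only assumes that a dispatch must wait whenever it targets a UAV already used by a dispatch in the current overlapping group. The UAV names and the count of 3 compute passes per cascade are hypothetical.

```python
def count_serial_groups(dispatch_uavs):
    """Count groups of consecutive dispatches that could overlap.

    Toy assumption: a new group (i.e. a barrier) starts whenever a
    dispatch targets a UAV already written within the current group.
    """
    groups = 0
    in_flight = set()
    for uav in dispatch_uavs:
        if not in_flight or uav in in_flight:
            groups += 1
            in_flight = set()
        in_flight.add(uav)
    return groups


CASCADES = 4
PASSES = 3  # hypothetical number of compute passes per cascade

# Every dispatch writes the same UAV: nothing can overlap.
shared = ["shared"] * (CASCADES * PASSES)

# One UAV per cascade, dispatches issued cascade 0, 1, 2, 3, then repeat.
interleaved = [f"uav{c}" for _ in range(PASSES) for c in range(CASCADES)]

print(count_serial_groups(shared))       # 12 serialized steps
print(count_serial_groups(interleaved))  # 3 steps of 4 overlapping dispatches
```

In this model the interleaved order matters as much as the separate UAVs: issuing all passes for cascade 0 before moving on to cascade 1 would still serialize most of the work.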
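
The double clears described in posts 7 and 8 could be found mechanically in an exported event list. The sketch below assumes a made-up event format (kind, resource name, clear value); it is not RenderDoc's export format or API, and the resource names are placeholders.

```python
from typing import NamedTuple


class Event(NamedTuple):
    kind: str          # "clear" or "write" (hypothetical labels)
    resource: str      # resource name as shown in the capture
    value: tuple = ()  # clear value; unused for writes


def find_redundant_clears(events):
    """Return indices of clears that repeat the previous clear of the same
    resource with the same value, with no write to it in between."""
    last_clear = {}  # resource -> clear value still known to be in effect
    redundant = []
    for i, ev in enumerate(events):
        if ev.kind == "clear":
            if last_clear.get(ev.resource) == ev.value:
                redundant.append(i)
            last_clear[ev.resource] = ev.value
        elif ev.kind == "write":
            # After a write the contents are unknown, so a later clear counts.
            last_clear.pop(ev.resource, None)
    return redundant


events = [
    Event("clear", "bufA", (0, 0, 0, 0)),
    Event("clear", "bufB", (0, 0, 0, 0)),
    Event("clear", "bufA", (0, 0, 0, 0)),  # same value, no write in between
    Event("write", "bufB"),
    Event("clear", "bufB", (0, 0, 0, 0)),  # needed: bufB was written
]
print(find_redundant_clears(events))  # [2]
```

This only catches clears that repeat an earlier clear. Whether a clear is needed at all (because a compute shader overwrites everything later) would require knowing what each shader writes.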
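
The parallel command recording mentioned in post 10 has roughly the following shape: worker threads each record their own command list, and a single thread submits the lists in a fixed order. This is a structural sketch only, with stand-in functions and pass names that are assumptions; real recording would go through Vulkan or Direct3D 12 command buffers, and Python threads will not show a real speedup here.

```python
from concurrent.futures import ThreadPoolExecutor


def record_commands(pass_name, draw_count):
    """Stand-in for recording one command list for a render pass."""
    return [f"{pass_name}:draw{i}" for i in range(draw_count)]


def record_frame(passes, workers=4):
    """Record each pass on a worker thread, then submit in pass order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, regardless of which
        # worker finishes first, so submission order stays deterministic.
        command_lists = list(pool.map(lambda p: record_commands(*p), passes))
    # Single-threaded, in-order "submission" of the recorded lists.
    return [cmd for commands in command_lists for cmd in commands]


frame = record_frame([("shadow", 2), ("main", 1)])
print(frame)  # ['shadow:draw0', 'shadow:draw1', 'main:draw0']
```

The point of the structure is that recording is spread across cores while the order seen by the GPU queue is unchanged.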