Windows 8.1 brings an incremental update to the driver model, WDDM 1.3, which enables new GPU computing functionality. One of the important pieces is the ability to "map default buffers" (which I will call MDB), which should be particularly interesting for compute shaders running on APUs/SoCs that combine the CPU and GPU on a single chip.

The feature works as follows. On a typical discrete card, the GPU has its own onboard graphics memory. The application allocates buffers in GPU memory, and the shaders read and write data in those buffers. Buffers allocated in GPU memory are called "default buffers" in Direct3D parlance. Suppose a GPU shader has written some output that you want to read on the CPU. Currently this is done in multiple stages. First, the application allocates a "staging buffer", which the Direct3D driver places in a special area of system memory so that the GPU can efficiently transfer data between default buffers and staging buffers over the PCI Express bus. The GPU then copies the data from the default buffer to the staging buffer. Finally, the CPU issues a "map" command that lets it read from or write to the staging buffer. This multi-stage process is inefficient on APUs/SoCs, where the GPU shares physical memory with the CPU. In Direct3D 11.2, the staging buffer and the extra copy operation will no longer be required on supported hardware, and the CPU will be able to access the GPU buffers directly. MDB should therefore be a big win for many GPU computing scenarios on APUs/SoCs because of the reduced copy overhead.
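To make the difference between the two readback paths concrete, here is a minimal toy model in portable C++. It is only a sketch: plain host memory stands in for both the GPU default buffer and the staging buffer, and the names `Buffer`, `ReadbackStats`, `run_shader`, `read_via_staging` and `read_via_direct_map` are hypothetical, not part of Direct3D. In real Direct3D 11 code the first path corresponds roughly to creating a `D3D11_USAGE_STAGING` buffer, calling `CopyResource`, and then calling `Map` on the staging buffer; as far as I know, Direct3D 11.2 reports support for the second path through the `MapOnDefaultBuffers` member of `D3D11_FEATURE_DATA_D3D11_OPTIONS1`.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Toy model only: ordinary host memory stands in for GPU memory.
// None of these types or functions are Direct3D APIs.
struct Buffer {
    std::vector<float> data;
};

struct ReadbackStats {
    std::size_t bytes_copied = 0;  // bytes moved between buffers before the CPU could read
    double checksum = 0.0;         // sum of the values the CPU actually read
};

// Stand-in for a compute shader writing its output into a default buffer.
void run_shader(Buffer& gpu_default) {
    for (std::size_t i = 0; i < gpu_default.data.size(); ++i) {
        gpu_default.data[i] = static_cast<float>(i);
    }
}

// Stand-in for the CPU consuming mapped memory.
double cpu_sum(const float* mapped, std::size_t count) {
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total += mapped[i];
    }
    return total;
}

// Classic path: allocate a staging buffer, copy default -> staging,
// then "map" the staging buffer and read it on the CPU.
ReadbackStats read_via_staging(const Buffer& gpu_default) {
    Buffer staging;
    staging.data.resize(gpu_default.data.size());

    const std::size_t bytes = gpu_default.data.size() * sizeof(float);
    if (bytes > 0) {
        std::memcpy(staging.data.data(), gpu_default.data.data(), bytes);
    }

    ReadbackStats stats;
    stats.bytes_copied = bytes;
    stats.checksum = cpu_sum(staging.data.data(), staging.data.size());
    return stats;
}

// MDB-style path: "map" the default buffer itself, so no staging
// allocation and no intermediate copy are needed.
ReadbackStats read_via_direct_map(const Buffer& gpu_default) {
    ReadbackStats stats;
    stats.checksum = cpu_sum(gpu_default.data.data(), gpu_default.data.size());
    return stats;
}
```

Both functions hand the CPU the same data, but only the staging path pays for an extra allocation and a full copy of the buffer. On a shared-memory APU/SoC that copy moves data from system memory to system memory, which is exactly the overhead MDB is meant to remove.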

Intel recently rolled out its own extension, called InstantAccess, for Haswell. My understanding is that InstantAccess is a bit more general than MDB: it allows mapping of textures as well as buffers, whereas D3D 11.2 only allows mapping of default buffers, not textures. Extensions similar to MDB are also common in OpenCL. Both Intel and AMD allow the CPU to read from and write to OpenCL GPU buffers. In addition, Intel exposes some ability for the GPU to read from and write to preallocated CPU memory, which as far as I know is not yet allowed in Direct3D. How efficient these different solutions are is still largely unknown. For example, AMD's OpenCL extension allows the CPU to access GPU memory on Llano, but CPU reads from GPU memory are very slow, while writes are still fairly fast.

UPDATE: Intel confirmed support for MDB on Ivy Bridge onwards.

At this time, there is no official confirmation about which hardware will support MDB. My expectation is that MDB will likely be available on all recent single-chip CPU/GPU systems, such as AMD's Trinity and Kabini as well as Intel's Haswell and Ivy Bridge. AMD has already rolled out WDDM 1.3 drivers, but curiously those do not work on Llano and Zacate APUs, so I am a little pessimistic about whether those APUs will support this new feature. Microsoft, for its part, has only stated that it expects the feature to be "broadly available" once WDDM 1.3 drivers are rolled out. I will update the article when we get official word from the vendors about hardware support.

Apart from MDB, Microsoft has also added support for runtime shader linking. This will be quite useful for both compute and graphics shaders. The idea is that one can precompile shader functions beforehand and ship the compiled code, while linking is done at runtime. Separate compilation and linking are also available in CUDA 5 and OpenCL 1.2. Runtime shader linking is a software feature and will be available on all hardware on Windows 8.1.

C++ AMP, Microsoft's C++ extension for GPU computing, has also been updated in the upcoming VS2013. I think the biggest new feature is that C++ AMP programs will gain shared memory support on APUs/SoCs, where the compiler and runtime will be able to eliminate extra data copies between the CPU and GPU. This feature will only be available on Windows 8.1, and it is likely built on top of "map default buffer", as Microsoft's AMP implementation uses Direct3D under the hood. C++ AMP also brings some other nice additions, including enhanced texture support and better debugging abilities.

In addition to compute, Microsoft also introduced a number of graphics updates, such as tiled resources, but we will likely cover those separately. More information about the Direct3D changes can be found in the preliminary docs for D3D 11.2 and in a talk at BUILD.

 


16 Comments


  • BryanC - Tuesday, July 02, 2013 - link

    Do you know what the coherency model is for MDB? For example, when is the GPU guaranteed to see CPU writes, and when is the CPU guaranteed to see GPU writes? At kernel boundaries?
  • codedivine - Tuesday, July 02, 2013 - link

    Author here. Yes, I am assuming kernel boundaries.
  • zdw - Tuesday, July 02, 2013 - link

    Does OpenGL have similar functionality to MDB?

    I'd assume it does, as unified memory goes back on SGI's platforms for quite a while - the SGI O2 had unified CPU/GPU memory like 15 years ago.
  • Krysto - Wednesday, July 03, 2013 - link

    When is Microsoft going to support OpenGL by default themselves? Are they really going to use Google's ANGLE to translate DirectX to OpenGL for WebGL in IE11, just so they don't have to do that?

    They need to support OpenGL in Windows Phone and Windows, otherwise most of iOS and Android mobile games will never arrive on those platforms.
  • ananduser - Wednesday, July 03, 2013 - link

    Maybe ANGLE is the solution to WebGL's inherent security problem, potentially allowing kernel access. If such is the case, why not ? Do we know that they are using ANGLE or ANGLE-like translation for WebGL in IE11 ?
  • BaronMatrix - Monday, July 08, 2013 - link

    Is it just me or is anyone else upset that that VISUALS on Win8 look like crayon drawings compared to Win7 AERO...?

    We actually went backwards... Even with multi-mon... And they STILL HAVE CRAP for tablet share..

    Ballmer's going to destroy MS... They broke HyperV with USB3 controllers, Modern apps have no entry in the volume mixer AND they jump around between monitors... Flash in IE is a NIGHTMARE... I had to search for an hour to find the right Flash package...

    I had more blue screens in one month than when I worked at MS testing Windows XP...

    I'M TOTALLY PISSED. I want my "Modern" VISUALS... But as I said, they did it because Intel IGPs at 100W can barely do AERO, so of course ATOM without PowerVR will get them sued AGAIN...

    Intel, holding back graphics for 20 years...

    That's a cool catch phrase...
