What Does DirectCompute Really Mean For Gamers?

What We Tested: Other Apps And Test Config

DiRT 3

DiRT 3 employs DirectCompute for its high-definition ambient occlusion (HDAO) effect. Unfortunately, there is no equivalent effect in the game based on pixel shading, so we can’t compare the two directly. However, the game offers three AO options. The first of these, “Low,” does offer occlusion rendered under cars via pixel shading. For the purposes of DirectCompute, this is effectively an “off” setting. The “High” setting employs HDAO at half-resolution, as discussed earlier. “Ultra” runs HDAO at full resolution. All other graphics options were set to either Low or, when possible, Off.

Metro 2033

The advanced depth of field (DOF) effect in Metro 2033 needs three rendering passes. Two of these employ pixel shading, while the third uses DirectCompute. The original sharp image is combined with a blurred copy to create a more realistic, cinematic look, one that adds a sense of depth without the cost and (sometimes literal) headache of stereoscopic 3D. The DOF feature is enabled or disabled via a checkbox in the game's benchmarking tool.
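To make the "sharp image combined with a blurred image" idea concrete, here is a minimal CPU-side sketch in Python. It is not Metro 2033's three-pass implementation: the box blur, the linear falloff, and the names `box_blur`, `blur_weight`, `focus_depth`, and `focus_range` are all assumptions chosen for illustration.

```python
def box_blur(image, radius=1):
    """Blur a 2D grayscale image (list of rows) with a simple box filter."""
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += image[ny][nx]
                        count += 1
            blurred[y][x] = total / count
    return blurred


def blur_weight(depth, focus_depth, focus_range):
    """Hypothetical linear falloff: 0.0 is fully sharp, 1.0 is fully blurred."""
    return min(1.0, abs(depth - focus_depth) / focus_range)


def depth_of_field(sharp, depth, focus_depth, focus_range):
    """Blend each sharp pixel with its blurred copy based on distance from focus."""
    blurred = box_blur(sharp)
    result = []
    for y, row in enumerate(sharp):
        out_row = []
        for x, sharp_px in enumerate(row):
            w = blur_weight(depth[y][x], focus_depth, focus_range)
            out_row.append((1.0 - w) * sharp_px + w * blurred[y][x])
        result.append(out_row)
    return result
```

Pixels at the focus depth keep their original value, while pixels far from it take the blurred value. The per-pixel independence of the blend is what makes this kind of work a natural fit for a GPU.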

Civilization 5

Civilization 5 uses DirectX 11 and DirectCompute to leverage a variable bit rate texture codec algorithm. The algorithm is so efficient that 2 GB of leader textures compress down to less than 150 MB of disk storage.

“Textures on disk are stored in a Discrete Cosine Transform-like format,” explains AMD’s Neal Robison. “In a nutshell, this means that texture data is stored in frequency space as it allows optimal compression, similar to the JPEG format. A shader is used to rapidly decompress this data in real-time and recompress it on-the-fly into a DXT texture format, which the GPU directly supports. This allows significant reductions in the texture storage requirements. More importantly, it allows significantly faster loading of high-quality and high-resolution textures into video memory compared to what a pure CPU implementation could provide. This is why the superb-looking leader scenes appear almost instantly in the game, regardless of how many civilizations are present.”
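As a rough illustration of what storing data "in frequency space" buys you, the Python sketch below runs a one-dimensional discrete cosine transform over a row of pixel values, keeps only the lowest-frequency coefficients, and reconstructs an approximation. This is not Civilization 5's codec: the truncation scheme, the function names, and the sample row are assumptions, and the GPU-side recompression step is left out entirely.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a 1D list of samples."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct(): rebuild the samples from frequency coefficients."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        samples.append(total)
    return samples


def compress(signal, keep):
    """Hypothetical lossy step: store only the `keep` lowest frequencies."""
    return dct(signal)[:keep]


def decompress(coeffs, length):
    """Pad the dropped high frequencies with zeros and invert the transform."""
    return idct(coeffs + [0.0] * (length - len(coeffs)))


row = [52, 55, 61, 66, 70, 61, 64, 73]   # made-up pixel values
packed = compress(row, keep=4)           # half as many numbers to store
restored = decompress(packed, len(row))
max_error = max(abs(a - b) for a, b in zip(row, restored))
```

Smooth image regions concentrate their energy in a few low-frequency coefficients, so dropping (or coarsely quantizing) the rest costs little visible detail. Each block can be decoded independently, which is why the decode step maps well onto many GPU threads.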

Civ5 contains an integrated benchmarking module designed to test this DirectCompute-based compression/decompression performance. You can find the tool’s usage detailed in the Civilization 5 benchmark modes.doc file within the Civilization 5 folder. By modifying a custom shortcut to the program, we used the following parameters: DX11_executable_filename -LeaderBenchmark -duration 90 -norendering. This loaded the benchmarking tool, ran it for 90 seconds, and did not allow for graphical output via compute. So, while the test is running, you can hear it, but the screen remains blank. To add DirectCompute and visual output back in, remove the -norendering parameter from the shortcut.
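For anyone scripting repeated runs, the shortcut parameters above can be assembled programmatically. The Python sketch below only builds the argument list; it does not launch anything. `DX11_executable_filename` is the article's own placeholder and is kept as-is, the helper name `leader_benchmark_args` is hypothetical, and writing the flags with plain hyphens is an assumption about how the game parses them.

```python
def leader_benchmark_args(executable, duration_s=90, rendering=False):
    """Build the leader-benchmark argument list described in the text."""
    args = [executable, "-LeaderBenchmark", "-duration", str(duration_s)]
    if not rendering:
        # Blank-screen run; omit this flag to get visual output back.
        args.append("-norendering")
    return args


EXECUTABLE = "DX11_executable_filename"  # placeholder from the article, not a real file name

print(" ".join(leader_benchmark_args(EXECUTABLE)))
print(" ".join(leader_benchmark_args(EXECUTABLE, rendering=True)))
```

The resulting list could be handed to a process launcher once the placeholder is replaced with the actual DirectX 11 executable in your install.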

We kept our desktop test unchanged from the prior APU article, save for an update to the Asus Crosshair V Formula’s BIOS.

Test Hardware

Test System 1
Processor: AMD FX-8150 (Zambezi) 3.6 GHz, Socket AM3+, 8 MB Shared L3 Cache, Turbo Core enabled, 125 W
Motherboard: Asus Crosshair V Formula (Socket AM3+), AMD 990FX/SB950
Memory: 8 GB (2 x 4 GB) AMD Performance Memory AE34G1609U2 (1600 MT/s, 8-9-8-24)
SSD: 240 GB Patriot Wildfire SATA 6Gb/s
Graphics: AMD Radeon HD 7970 3 GB; AMD Radeon HD 5870 1 GB
Power Supply: PC Power & Cooling Turbo-Cool 860 W
Operating System: Windows 7 Professional, 64-bit

Test System 2
Processor: AMD A8-3850 (Llano) 2.9 GHz, Socket FM1, 4 MB L2 Cache, 100 W, Radeon HD 6550D Graphics
Motherboard: Gigabyte A75-UD4H (Socket FM1), AMD A75 FCH
Memory: 8 GB (2 x 4 GB) AMD Performance Memory AE34G1609U2 (1600 MT/s, 8-9-8-24)
SSD: 240 GB Patriot Wildfire SATA 6Gb/s
Graphics: AMD Radeon HD 7970 3 GB; AMD Radeon HD 5870 1 GB
Power Supply: PC Power & Cooling Turbo-Cool 860 W
Operating System: Windows 7 Professional, 64-bit

37 comments
  • hotsacoman
    Ha. Are those HL2 screenshots on page 3 lol?
  • Khimera2000
    so... how fast is AMD's next chip??? :) a clue??? anything?
  • de5_Roy
    would pcie 3.0 and 2x pcie 3.0 cards in cfx/sli improve direct compute performance for gaming?
  • hunshiki
    hotsacoman wrote: Ha. Are those HL2 screenshots on page 3 lol?

    THAT. F.... FENCE. :D

    Every, single, time. With every, single Source game. HL2, CSS, MODS, CSGO. It's everywhere.
  • hunshiki wrote: THAT. F.... FENCE. Every, single, time. With every, single Source game. HL2, CSS, MODS, CSGO. It's everywhere.

    Ha. Seriously! The source engine is what I like to call a polished turd. Somehow even though its ugly as f%$#, they still make it look acceptable...except for the fence XD
  • theuniquegamer
    Developers need to improve the compatibility of the API for the GPUs. Consoles use very low-power, outdated GPUs, yet they can play the latest games at good FPS. Our PCs have top-notch hardware, but the games run at almost the same quality as on the consoles. The GPUs in our PCs have a lot of horsepower, but we can't utilize even half of it (I don't know what our PC GPUs are capable of).
  • marraco
    I hate depth of field. Really hate it. I hate Metro 2033 with its DirectCompute-based depth of field filter.

    It’s unnecessary for games to emulate camera flaws, and depth of field is a limitation of cameras. The human eye is able to focus everywhere, and free to do that. Depth of field does not allow to focus where the user wants to focus, so is just an annoyance, and worse, it costs FPS.

    This chart is great. Thanks for showing it.

    It shows something left out of many video card reviews: the 7970 frequently falls under 50, 40, and even 20 FPS. That ruins the user experience. While it is hard to tell the difference between 70 and 80 FPS, it is easy to spot those moments when the card falls under 20 FPS. It’s a show stopper, and an utter annoyance to spend a lot of money on the most expensive cards and then see those 20 FPS moments.

    That’s why I prefer TechPowerup.com reviews. They show frame by frame benchmarks, and not just a meaningless FPS. TechPowerup.com is a floor over TomsHardware because of this.

    Yet that way of showing GPU performance is hard for humans to understand, so the data needs to be sorted to make it easily understandable, like this figure shows:

    Both charts show the same data, but the lower one has the data sorted.

    Here we see that Card B has higher lags and higher FPS, and Card A is more consistent even though it has lower FPS.
    It shows on how many frames Card B is worse than Card A, and is more intuitive and readable than the bar charts, which lose a lot of information.

    Unfortunately, no web site offers this kind of analysis for GPUs, so there is a way to get an advantage over competition.
  • hunshiki
    I don't think you owned a modern console Theuniquegamer. Games that run fast there, would run fast on PCs (if not blazing fast), hence PCs are faster. Consoles are quite limited by hardware. Games that are demanding and slow... or they just got awesome graphics (BF3 for example), are slow on consoles too. They can rarely squeeze out 20-25 FPS usually. This happened with Crysis too. On PC? We benchmark FullHD graphics, and go for 91 fps. NINETY-ONE. Not 20. Not 25. Not even 30. And FullHD. Not 1280x720 like XBOX. (Also, on PC you have a tons of other visual improvements, that you can turn on/off. Unlike consoles.)

    So .. in short: Consoles are cheap and easy to use. You pop in the CD, you play your game. You won't be a professional FPS gamer (hence the stick), or it won't amaze you, hence the graphics. But it's easy and simple.
  • kettu
    marraco wrote: I hate depth of field. Really hate it. I hate Metro 2033 with its DirectCompute-based depth of field filter. It’s unnecessary for games to emulate camera flaws, and depth of field is a limitation of cameras. The human eye is able to focus everywhere, and free to do that. Depth of field does not allow to focus where the user wants to focus, so is just an annoyance, and worse, it costs FPS.

    'Hate' is a bit of a strong word, but you do have a point there. It's much more natural to focus my eyes on certain game objects rather than my hand (i.e. turn the camera with my mouse). And you're right that it's unnecessary because I already get the depth of field effect for free with my eyes when they're focused on a point on the screen.
  • npyrhone
    Somehow I don't find it plausible that Tom's Hardware has *literally* been bugging AMD for years - to any end (no pun intended). Figuratively, perhaps?
  • xenol
    There's one thing I hate about current implementations of AO: it's too coarse. An object that's no more than say two feet behind something manages to receive some AO treatment. I want to say it's a shadow, but it's clearly not in the direction of the light source.
  • gtguy257
    The human eye cannot focus everywhere at once. In fact it has a very limited depth of field. But, the human eye can focus so quickly that you rarely notice unless you focus from up close to far away. The effect isn't a flaw in camera systems it is a function of how any optic works. Whether that is your eye or a camera.
  • TeraMedia
    @xenol:

    Looking at the cartoonish pic above (last page), I would have to agree with you. It looks like they turned up the effect too strongly because they wanted to make it easily visible. The real world doesn't look like that at all. Look at the corner of your room, and you can see a faint darkening in the very corner that gradually brightens as you move away a few inches. But in the above pic, the darkening is strong enough to almost black out the pixels. To be more realistic, I think it needs to be more subtle.
  • TeraMedia
    It would have been great to see a lower-end GCN card such as a 7750 so that we could see the frame rate impact of this feature when the card is already stretching - but is still capable. The inclusion of the 5870 kind of approximates this, I suppose, but the 7750 would have been a decent add-on match for an A8-equipped computer.
  • gamerk316
    Rather than continuing to throw resources at approximating the Rendering equation, can we PLEASE move to Ray Tracing already? All these little problems that are hard to implement in terms of the rendering equation are a natural outcome of Ray Tracing.
  • bloc97
    Why is Vsync enabled in the Dirt 3 Benchmark Screenshot?
  • bloc97
    gtguy257 wrote: The human eye cannot focus everywhere at once. In fact it has a very limited depth of field. But, the human eye can focus so quickly that you rarely notice unless you focus from up close to far away. The effect isn't a flaw in camera systems it is a function of how any optic works. Whether that is your eye or a camera.

    But it is an annoyance since when something isn't focused on the screen, you cannot see it until you turn your camera to it...
    Just like when you look something at your side, you don't need to turn your head...
  • bloc97
    gamerk316 wrote: Rather than continuing to throw resources at approximating the Rendering equation, can we PLEASE move to Ray Tracing already? All these little problems that are hard to implement in terms of the rendering equation are a natural outcome of Ray Tracing.

    Really? Ray Tracing would make the game 0.3 FPS unless you have a quad Crossfire of HD 7990's, and even then you will only get 4 FPS...
  • shin0bi272
    I dont know if its because I just woke up or if Im right but in those 3 BF3 screens showing the no ao, ss ao, and hbao, look identical. I dont see any difference what so ever save for some small shadows around the edges of the boxes and pallets.

    I would have also liked to have seen them include an nvidia card in their benchmarks to see the fps you got doing the same test with the competitors products.
  • nebun
    Ambient Occlusion is nothing new....nVidia has had it for a very long time, a lot of people just don't know about it
  • phuzi0n
    I'm really confused as to what the point of this article is other than to state an obvious fact, fast GPU's are fast at GPU computing. The article focuses on AMD's push for GPU computing and throws in APU benchmarks but COMPLETELY MISSES THE POINT OF AMD'S APU COMPUTING PUSH which is to use the APU to do GPU computing along with a more powerful GPU doing the rendering work. I want to see benchmarks of games using the APU for computing while a variety of other GPU's do rendering versus the GPU's doing compute and rendering without the APU.
  • ehicks05
    On page Ambient Occlusion, Continued, 2nd to last paragraph,

    Should it be "borne out" instead of "born out"?
  • A Bad Day
    "If you have a quad-core host processor and a graphics engine with 2000 ALUs, are there any guesses as to which approach has more potential to make efficient use of available compute resources?"

    Obviously the quad-core! It has higher clock rate!

    /sarcasm

    On serious note, I wonder what clock rate would a quad-core CPU need to run at to match a low end GPU. God forbid if its a single core CPU.
  • atikkur
    gpgpu in games? only SAO? a bit too late talked about this,, nvidia had been long toying this, just name it, bokeh-filter, dof, cuda-water-sim-effect in just cause 2 (the best water ive seen), fluid, smoke, particle, hair, fur, cloth, destruction, rigid-body.... and everything physx you can think of, just ready to be discovered by developer for free.