Futuremark Releases 3DMark Time Spy DirectX 12 Benchmark
by Daniel Williams on July 14, 2016 1:00 PM EST
Posted in: GPUs, Futuremark, 3DMark, Benchmarks, DirectX 12
Today Futuremark is pulling the covers off of their new Time Spy benchmark, which is being released for all Windows editions of 3DMark. A showcase of sorts for the last decade or so of 3DMark benchmarks, Time Spy is a modern DirectX 12 benchmark implementing a number of the API's important features. All of this comes together in a demanding test for those who think their GPU hasn't earned its keep yet.
DirectX 12 support in game engines has been coming along for a few months now. To join the fray, Futuremark has written the Time Spy benchmark on top of a pure DirectX 12 engine, which brings features such as asynchronous compute, explicit multi-adapter, and of course multi-threaded/multi-core work submission improvements. The result is a benchmark that is not only visually interesting, but that also borrows a large number of assets from 3DMark benchmarks past.
For those who have been following the 3DMark franchise for more than a decade, portions of the prior benchmarks are showcased as shrunken museum exhibits. These exhibits come to life as the titular Time Spy wanders the hall, offering a throwback to past demos. I must admit a bit of fun was had watching to see what I recognized. I personally couldn't spot anything older than 3DMark 2005, but I would be interested in hearing about anything I missed.
Unlike many of the benchmarks exhibited in this museum, the entirety of this benchmark takes place in a single environment. Fortunately, the large variety of eye candy present gives a varied backdrop for the tests. Adding to the story, a crystalline ivy entangles the entire museum, and in parts of the exhibit, bodies in orange hazmat suits show signs of a previous struggle. Meanwhile, the Time Spy examines the museum with a handheld time portal, through which she can view a bright and clean museum and bustling air traffic outside. I'll not spoil the whole brief story here, but the benchmark does a good job of providing both eye candy for newcomers and tributes for the enthusiasts who will spend ample time watching the events unfold.
From a technical perspective, this benchmark is, as you might imagine, designed to be the successor to Fire Strike. The system requirements are higher than ever, and while Fire Strike Ultra could run at 4K, 1440p is enough to bring even the latest cards to their knees with Time Spy.
Under the hood, the engine only makes use of FL 11_0 features, which means it can run on video cards as far back as GeForce GTX 680 and Radeon HD 7970. At the same time it doesn't use any of the features from the newer feature levels, so while it ensures a consistent test between all cards, it doesn't push the very newest graphics features such as conservative rasterization.
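As a rough illustration of what that baseline means in practice, below is a minimal sketch (my own, not Futuremark's code) of creating a DirectX 12 device at feature level 11_0; any adapter that can't satisfy this request simply can't run Time Spy.

```cpp
// Minimal sketch: create a DX12 device at feature level 11_0, the baseline
// Time Spy targets. Illustrative only, not Futuremark's engine code.
#include <d3d12.h>
#include <dxgi1_4.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

ComPtr<ID3D12Device> CreateFL110Device()
{
    ComPtr<IDXGIFactory4> factory;
    CreateDXGIFactory1(IID_PPV_ARGS(&factory));

    ComPtr<IDXGIAdapter1> adapter;
    for (UINT i = 0; factory->EnumAdapters1(i, &adapter) != DXGI_ERROR_NOT_FOUND; ++i)
    {
        ComPtr<ID3D12Device> device;
        // Request FL 11_0 only; newer feature levels (and features such as
        // conservative rasterization) are deliberately left on the table.
        if (SUCCEEDED(D3D12CreateDevice(adapter.Get(), D3D_FEATURE_LEVEL_11_0,
                                        IID_PPV_ARGS(&device))))
            return device;  // e.g. a GTX 680 or HD 7970 passes this check
    }
    return nullptr;  // no FL 11_0 capable adapter found
}
```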
That said, Futuremark has definitely set out to make full use of FL 11_0. Futuremark has published an excellent technical guide for the benchmark, which should go live at the same time as this article, so I won't recap it verbatim. But in brief, everything from asynchronous compute to resource heaps gets used. In the case of async compute, Futuremark is using it to overlap rendering passes, though they do note that "the asynchronous compute workload per frame varies between 10-20%." On the work submission front, they're making full use of multi-threaded command queue submission, noting that every logical core in a system is used to submit work.
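To make the async compute idea concrete, here is a hedged sketch of the generic DX12 pattern involved; assume it is representative rather than Futuremark's actual engine code. A second, compute-only queue lets compute passes execute concurrently with the graphics queue, filling execution bubbles:

```cpp
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

// Generic async compute setup: one DIRECT (graphics) queue and one COMPUTE
// queue on the same device. Work on the two queues may overlap on the GPU.
void CreateAsyncComputeQueues(ID3D12Device* device,
                              ComPtr<ID3D12CommandQueue>& graphicsQueue,
                              ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // accepts graphics, compute, copy
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&graphicsQueue));

    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute and copy only
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
    // The queues run independently; ID3D12Fence objects mark the points where
    // a rendering pass must wait on async compute results (and vice versa).
}
```

The multi-threaded submission side works in the same spirit: each CPU thread records its own ID3D12GraphicsCommandList, and the finished lists are handed to a queue in a batch via ExecuteCommandLists.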
Meanwhile on the multi-GPU front, Time Spy is also mGPU capable. Futuremark is essentially meeting the GPUs half-way here, using DX12 explicit multi-adapter's linked-node mode. Linked-node mode is designed for matching GPUs, so there are no Ashes-style heterogeneous configurations supported here; it trades off some of the fine-grained power of explicit multi-adapter for the simplicity of matching GPUs, along with features that can only be done with matching GPUs, such as cross-node resource sharing. For their mGPU implementation Futuremark is using otherwise common AFR, which for a non-interactive demo should offer the best performance.
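For the curious, here's a similarly hedged sketch of how linked-node mode surfaces in the API; again, a generic illustration rather than Time Spy's implementation. Linked, matching GPUs appear as a single device with multiple nodes, and queues and resources are pinned to a particular GPU via node masks:

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#include <vector>

using Microsoft::WRL::ComPtr;

// Illustrative linked-node setup: one command queue per physical GPU (node).
std::vector<ComPtr<ID3D12CommandQueue>> CreatePerNodeQueues(ID3D12Device* device)
{
    const UINT nodeCount = device->GetNodeCount();  // e.g. 2 for a matched pair
    std::vector<ComPtr<ID3D12CommandQueue>> queues(nodeCount);
    for (UINT node = 0; node < nodeCount; ++node)
    {
        D3D12_COMMAND_QUEUE_DESC desc = {};
        desc.Type     = D3D12_COMMAND_LIST_TYPE_DIRECT;
        desc.NodeMask = 1u << node;  // bind this queue to one physical GPU
        device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queues[node]));
    }
    // For AFR, frame N is recorded and executed on node (N % nodeCount); the
    // finished frame is handed to the presenting GPU via cross-node sharing.
    return queues;
}
```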
To take a quick look at the benchmark, we ran the full test on a small number of cards at the default 1440p setting. In our previous testing, AMD's RX 480 and R9 390 traded blows with each other and NVIDIA's GTX 970. Here though, the RX 480 pulls a small lead over the R9 390, and both open a somewhat larger gap over the GTX 970, only for the GeForce GTX 1070 to appropriately zip past the lot of them.
The graphics tests scale similarly to the overall score, and if these tests were a real game, anything less than the GTX 1070 would deliver a poor gameplay experience, with framerates under 30 fps. While we didn't get any 4K numbers off our test bench, I ran a GTX 1080 in my personal rig (i7-2600K @ 4.2GHz) and saw 4K scores that were about half of my 1440p scores. Synthetic test or not, the graphical demands this benchmark can place on a system will provide a hefty workload for anyone seeking one out.
Meanwhile, the Advanced and Professional versions of the benchmark offer an interesting ability to run it with async compute disabled. Since this is one of the only pieces of software out right now that can use async on Pascal GPUs, I went ahead and quickly ran the graphics test on the GTX 1070 and RX 480. It's not an apples-to-apples comparison in that the two cards have very different performance levels, but for now it's the best look we can take at async on Pascal.
Both cards pick up 300-400 points in score. On a relative basis this is a 10.8% gain for the RX 480, and a 5.4% gain for the GTX 1070. Whenever we're working with async, though, I should note that the primary performance benefit as implemented in Time Spy comes from concurrency, so everything here depends on a game having additional work to submit and a GPU having execution bubbles to fill.
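For reference, the relative gains quoted here follow the standard percentage-change definition, computed from the graphics test scores with async compute on and off:

$$\text{relative gain} = \frac{S_{\text{async on}} - S_{\text{async off}}}{S_{\text{async off}}} \times 100\%$$

So, as a purely hypothetical example, a card scoring 3,700 with async off and 4,100 with it on (a 400 point pickup) would show a gain of roughly 10.8%.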
The new Time Spy test is available today to Windows users of 3DMark. This walk down memory lane not only puts demands on the latest gaming hardware but also provides another showcase of the benefits DX12 can bring to our games. Anyone who's found Fire Strike too easy a benchmark should give Time Spy a look.
75 Comments
powerarmour - Thursday, July 14, 2016
I don't get your point about OpenGL, just simply look at the Vulkan numbers for AMD and Nvidia, and the order in performance between the cards.
FORTHEWIND - Thursday, July 14, 2016
Um no. Vulkan is based on Mantle. Get your facts checked, please.
Scali - Friday, July 15, 2016
How so? The DOOM Vulkan FAQ states that async compute was not enabled on nVidia cards yet. So you can't compare it to the async results of this benchmark.
bobacdigital - Friday, July 15, 2016
Yes you can compare the results... AMD only gets async compute when TSAA is the AA of choice (the developers stated that the other AA options will turn off async compute until they add support for them). Most of the early benchmarks were using SMAA as the choice of anti-aliasing. So if AMD and Nvidia both use SMAA then you are comparing apples to apples. Even then AMD was still getting a 20 frame boost (async was adding an additional 10 frames on top of that).
It is true that OpenGL blew for AMD... but the interesting point is that Vulkan runs substantially better on AMD hardware without async, and with async the 480 is within 10% or less of the 1080... The Fury X (last gen card) is beating the 1070 and trading blows with the 1080.
bobacdigital - Friday, July 15, 2016
Sorry, within 10% of the 1070 (not the 1080).
Scali - Friday, July 15, 2016
Another reason why DOOM comparisons are difficult is that DOOM uses AMD's intrinsic shader extension, but nothing equivalent on nVidia. So the gains on AMD hardware are partly Vulkan, partly AMD-specific shader optimizations, and partly async compute.
On the nVidia side, you purely see OpenGL vs Vulkan. All gains come from better API efficiency.
bluesoul - Saturday, July 16, 2016
So Pascal didn't gain from async compute?
Scali - Saturday, July 16, 2016
We don't know yet. The DOOM Vulkan FAQ says async compute is not enabled on nVidia hardware yet, and they're still working on this. So we should expect a future update which enables it, with gains.
But, well, you just want to hear that Pascal can't handle it, right? Even though Time Spy already proves that it does.
Looking at your other comments here, you throw around some half-truths, and some fancy buzzwords with no sources whatsoever, trying to build up some theory that NV/Pascal can't do async compute. It's interesting how many people like you have been active on various forums, with similar propaganda. Makes you wonder if AMD is paying shills/trolls again...
Sadly, I'm an actual dev, with CUDA and DX11/DX12 experience, so I actually know what this stuff is all about. And I'm not fooled by the lies that these people spread.
Sadly, not enough is done to shut these people up.
Yojimbo - Thursday, July 14, 2016
It's a conspiracy, I tell you. Hmm, wait, maybe not. Maybe their facts are just sexist.
tcnasc - Thursday, July 14, 2016
Daniel, could you please share some results for your rig (i7-2600K + GTX 1080)? I'm curious to see how the older CPUs handle the 1080. I have an i5-2500K @ 4.4GHz now and I will get a GTX 1080 soon. Will it be a significant bottleneck?