At the launch of Intel's LGA-2011 based Sandy Bridge E CPU we finally had a platform capable of supporting PCI Express 3.0, but we lacked GPUs to test it with. That all changed this past week as we worked on our review of the Radeon HD 7970, the world's first 28nm GPU with support for PCIe 3.0.

The move to PCIe 3.0 roughly doubles per-lane bandwidth, from 500MB/s to about 1GB/s in each direction. For an x16 slot that means going from 8GB/s under PCIe 2.1 to roughly 16GB/s with PCIe 3.0. As we've seen in earlier reviews and our own internal tests, there's hardly any difference between PCIe 2.1 x8 and x16 for modern-day GPUs. The extra bandwidth of PCIe 3.0 wasn't expected to make any tangible difference in gaming performance, and in our 7970 tests it didn't.
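For readers who want to check the arithmetic, here is a minimal sketch of where those figures come from. It assumes the standard signaling rates and line encodings (2.5GT/s and 5GT/s with 8b/10b for PCIe 1.x and 2.x, 8GT/s with 128b/130b for PCIe 3.0) and ignores packet and protocol overhead, so real-world throughput will be lower. The function names are our own, chosen for illustration.

```python
# Signaling rate (GT/s) and line-encoding efficiency per PCIe generation.
# "2.x" covers both PCIe 2.0 and 2.1, which share the same link speed.
PCIE_GENERATIONS = {
    "1.x": (2.5, 8 / 10),      # 8b/10b encoding
    "2.x": (5.0, 8 / 10),      # 8b/10b encoding
    "3.0": (8.0, 128 / 130),   # 128b/130b encoding
}


def lane_bandwidth_mb_s(gen: str) -> float:
    """Bandwidth of a single lane in one direction, in MB/s (1 MB = 10^6 bytes)."""
    gt_per_s, efficiency = PCIE_GENERATIONS[gen]
    # GT/s * efficiency gives payload Gbit/s; /8 converts to GB/s; *1000 to MB/s.
    return gt_per_s * efficiency * 1000 / 8


def link_bandwidth_gb_s(gen: str, lanes: int) -> float:
    """Bandwidth of a link with the given lane count, one direction, in GB/s."""
    return lane_bandwidth_mb_s(gen) * lanes / 1000


if __name__ == "__main__":
    for gen in PCIE_GENERATIONS:
        for lanes in (8, 16):
            bw = link_bandwidth_gb_s(gen, lanes)
            print(f"PCIe {gen} x{lanes}: {bw:.2f} GB/s per direction")
```

Under these assumptions a PCIe 3.0 lane works out to slightly under 1GB/s (about 985MB/s), which is why the x16 figure is usually rounded to 16GB/s. The same calculation shows a PCIe 2.x x8 link matching a PCIe 1.x x16 link at 4GB/s.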

Why implement PCIe 3.0 at all then? For GPU compute. Higher bandwidth and lower latency between the CPU and the GPU are both key to building a high performance heterogeneous computing solution. While good GPU compute benchmarks on the desktop are still hard to come by, we did find one that showed a real improvement from PCIe 3.0 support on the 7970: AMD's AES Encrypt/Decrypt sample application.

Simply enabling PCIe 3.0 on our EVGA X79 SLI motherboard (EVGA provided us with a BIOS that allowed us to toggle PCIe 3.0 mode on/off) resulted in a 9% increase in performance on the Radeon HD 7970. This tells us two things: 1) You can indeed get PCIe 3.0 working on SNB-E/X79, at least with a Radeon HD 7970, and 2) PCIe 3.0 will likely be useful for GPU compute applications, although not so much for gaming anytime soon.


  • palladium - Thursday, December 22, 2011 - link

    I doubt there would be a significant difference in gaming between PCIe gen 1 and gen 3. For now anyway.
  • descendency - Thursday, December 22, 2011 - link

    I may be wrong, but wouldn't a PCIe 2.1 x8 be the same as a PCIe 1.0 x16?
  • Kevin G - Thursday, December 22, 2011 - link

    In terms of bandwidth, PCI-E 2.1 with 8 lanes would be the same as PCI-E 1.0 at 16 lanes. However, due to the higher clock of PCI-E 2.1, there should be latency improvements to give it a small edge.
  • tw99 - Thursday, December 22, 2011 - link

    I just wanted to say thank you for including the 8800 GT in your benchmark charts. Even though it's dated hardware, including it in your comparisons illustrates the punch that the newer hardware has and helps people like me who are looking to upgrade from their current setup.
  • Drazick - Thursday, December 22, 2011 - link

    Hi Anand,
    Why wouldn't you use MATLAB and MATLAB + JACKET for GPGPU testing?

    Moreover, it would be great if you could add MATLAB for your application performance test bed.

  • Cairista - Friday, December 23, 2011 - link

    I second this! (registered just to say this)
  • MySchizoBuddy - Sunday, December 25, 2011 - link

    Both MATLAB PCT and Jacket are built on top of CUDA, so they won't help with testing AMD cards.
    ViennaCL or ArrayFire, which are both OpenCL-based, can be used for both Nvidia and AMD.
  • Ryan Smith - Sunday, December 25, 2011 - link

    Actually we picked up ArrayFire, but it's one of our failed benchmarks as the ArrayFire benchmark collection executes too quickly and was proving to be rather insensitive to GPU performance. Not that it couldn't be the basis of an interesting benchmark, but the included benchmarks don't cut the mustard for our needs. We don't write our own benchmarks (technical skill aside, in-house benchmarks raise questions of independent verification), so we can only work with what we have.

    And actually I did some research into Accelereyes' products ahead of our review, and I noticed that Jacket 2.0 supports OpenCL. We don't have that or MATLAB, so I can't speak to its performance, but I will leave the door open. I don't have any experience with MATLAB (it's not heavily used in pure CompSci), but if any of you do and are willing to lend us your expertise, I definitely agree it could make for an interesting benchmark.
  • koarl0815 - Monday, January 2, 2012 - link

    Hi Ryan,
    my attention was just drawn to this thread. I'm the head of ViennaCL and I look forward to assisting you with setting up an OpenCL benchmark suite. Feel free to contact me for details (my email address can be found on the ViennaCL webpage).
  • twotwotwo - Thursday, December 22, 2011 - link

    That looks like 64MB in under a third of a second, or more than 192MB/s. Looks like my Neal Stephenson-esque highly-secure crypto-tastic data-haven-under-a-mountain will need a sweet gaming rig in it.
