LucidLogix Virtu MVP Technology and HyperFormance

While not specifically a feature of the chipset, Z77 will be one of the first chipsets to use this remarkable new technology. LucidLogix was the brains behind the Hydra chip, a combined hardware/software solution that allows GPUs from different manufacturers to work together (we reviewed the last iteration on the ECS P62H2-A). Lucid was also behind the original Virtu software, designed to let a discrete GPU remain idle until needed while the integrated GPU deals with the video output (as we reviewed with the ASUS P8Z68-V Pro). This time, we get to see Virtu MVP, a new technology designed to increase gaming performance.

To explain how Virtu MVP works, I am going to borrow liberally from, and condense, Lucid's whitepaper on Virtu MVP. Everyone is free to read the full ten pages, which are rather interesting.

The basic concept behind Virtu MVP is the relationship between how many frames per second the discrete GPU can calculate and what is actually shown on the screen to the user, with the aim of improving the 'immersive experience'.

Each monitor has a refresh rate, typically 60 Hz or 75 Hz, or 120 Hz for 3D monitors (Hz = Hertz, or times ‘per second’). At 60 Hz, the system pulls out what is in the frame buffer (the part of the output path that holds what the GPU has computed) 60 times per second and displays it on the screen.

With standard V-Sync, the system will only pull out what is in the buffer at certain intervals, namely at factors of the base frequency (e.g. 60, 30, 20, 15, 12, 10, 6, 5, 4, 3, 2, or 1 for 60 Hz), depending on the monitor being used. The issue is what happens when the GPU is much faster (or slower) than the refresh rate.
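As a rough illustration of the intervals listed above, the sketch below enumerates the whole-number rates that divide a refresh rate evenly, then picks the highest one a GPU can sustain. The function names are made up for this example, and the "step down to the highest sustainable factor" rule is a simplification of the behaviour described here, not a statement of how any particular driver implements V-Sync.

```python
def vsync_rates(refresh_hz: int) -> list[int]:
    """Whole-number rates that divide the refresh rate evenly, highest first."""
    return [rate for rate in range(refresh_hz, 0, -1) if refresh_hz % rate == 0]


def effective_vsync_rate(refresh_hz: int, gpu_fps: float) -> int:
    """Highest V-Sync rate the GPU can keep up with (simplified model).

    Returns 0 if the GPU cannot even manage one frame per second.
    """
    for rate in vsync_rates(refresh_hz):
        if rate <= gpu_fps:
            return rate
    return 0


print(vsync_rates(60))
print(effective_vsync_rate(60, 47))  # a 47 fps GPU drops to the next factor below 60
```

Under this model, a GPU capable of 47 fps on a 60 Hz display is held at 30 fps, which is why a GPU that only just misses the refresh rate loses so much with V-Sync enabled.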

The key tenet of Lucid’s new technology is the term responsiveness. Responsiveness is a wide-ranging term which could mean many things. Lucid distils it into two key areas:

a) How many frames per second can the human eye see?
b) How many frames per second can the human hand respond to?

To clarify, these are NOT the same questions as:

i) How many frames per second do I need to make the motion look fluid?
ii) How many frames per second makes a movie stop flickering?
iii) What is the fastest frame (shortest time) a human eye would notice?

If the display refreshes at 60 Hz, and the game runs at 50 fps, would this need to be synchronized? Would a divisor of 60 Hz be better? Alternatively, if you were at 100 fps, would 60 fps be better? The other part of responsiveness is how a person deals with hand-to-eye coordination, and whether the human mind can correctly interpolate between a screen's refresh rate and the output of the GPU. While a ~25 Hz rate may be applicable for a human eye, the human hand can be as sensitive as 1000 Hz, and so the correlation between hand movement and the eye is all-important for 'immersive' gaming.

Take the following scenarios:

Scenario 1: GPU is faster than Refresh Rate, V-Sync Off

Refresh rate: 60 Hz
GPU: 87 fps
Mouse/Keyboard responsiveness is 1-2 frames, or ~11.5 to 23 milliseconds
Effective responsiveness makes the game feel like it is between 42 and 85 FPS

In this case, the GPU is 45% faster than the screen. This means that the GPU will usually be part-way through a frame when the display dumps the buffer contents on screen, so the buffer holds parts of both the old frame and the new frame.

This is a phenomenon known as tearing (which many of you are likely familiar with). Depending on the scenario you are in, tearing may be something you ignore, notice occasionally, or find rather annoying.

So the question becomes, was it worth computing that small amount of frame N+1 or N+3?
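To make the Scenario 1 numbers concrete, here is a minimal sketch of what the buffer holds at each display refresh. It assumes a deliberately simple model that is not taken from the Lucid whitepaper: the GPU starts at time zero, draws at a perfectly constant 87 fps straight into the buffer the display reads, and a refresh is 'clean' only if it lands exactly on a frame boundary. The function name is invented for the example.

```python
from fractions import Fraction


def buffer_state_at_refresh(gpu_fps: int, refresh_hz: int, refresh_index: int):
    """Return (frames_completed, fraction_of_next_frame_drawn) at a given refresh.

    Exact fractions are used so that frame boundaries are not blurred by
    floating-point rounding.
    """
    progress = Fraction(gpu_fps, refresh_hz) * refresh_index
    completed = progress.numerator // progress.denominator
    return completed, progress - completed


for k in range(1, 5):
    done, partial = buffer_state_at_refresh(87, 60, k)
    state = "tear" if partial else "clean"
    print(f"refresh {k}: {done} frames complete, next frame {float(partial):.0%} drawn ({state})")
```

In this model the GPU has finished two frames by the second refresh but four by the third, so one frame was computed in full without ever being the one on screen at a refresh; this is the kind of work the question above is asking about.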

Scenario 2: GPU is slower than Refresh Rate, V-Sync Off

Refresh rate: 60 Hz
GPU: 47 fps
Mouse/Keyboard responsiveness is 1-2 frames, or ~21.3 to 43 milliseconds
Effective responsiveness makes the game feel like it is between 25 and 47 FPS

In this case, the GPU is ~22% slower than the screen (47 fps against 60 Hz). This means that the GPU fills the frame buffer more slowly than the screen requests it, and it will usually be part-way through a frame when the display dumps the buffer contents on screen, so the buffer still holds parts of both the old frame and the new frame.
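The responsiveness figures quoted for Scenarios 1 and 2 follow directly from the frame time: one frame at N fps lasts 1000/N milliseconds, and the input is assumed to take one to two frames to show up. The short sketch below reproduces that arithmetic; the function names are illustrative only.

```python
def frame_time_ms(fps: float) -> float:
    """Time the GPU spends on one frame, in milliseconds."""
    return 1000.0 / fps


def input_latency_window_ms(fps: float, min_frames: int = 1, max_frames: int = 2) -> tuple[float, float]:
    """Latency range for an input that takes min_frames..max_frames to reach the screen."""
    t = frame_time_ms(fps)
    return min_frames * t, max_frames * t


for fps in (87, 47):
    low, high = input_latency_window_ms(fps)
    print(f"{fps} fps: {low:.1f} to {high:.1f} ms")
# prints:
# 87 fps: 11.5 to 23.0 ms
# 47 fps: 21.3 to 42.6 ms
```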

So does this mean that for a better experience, computing frame N+1 was not needed, and N+2 should have been the focus of computation?

Scenario 3: GPU can handle the refresh rate, V-Sync On

This setting allows the GPU to synchronize to every frame. Now all elements of the system are synchronized to 60 Hz—CPU, application, GPU and display will aim for 60 Hz, but also at lower intervals (30, 20, etc.) as required.

While this produces the best visual experience with clean images, the input devices for haptic feedback are limited to the V-Sync rate. So while the GPU could enable more performance, this artificial setting is capping all input and output.

Result:

If the GPU is slower or faster than the display, there is no guarantee that the frame buffer drawn on the display holds a complete frame. A GPU has multiple frames in its pipeline, but only a few are ever displayed when the GPU is fast, and the display catches it in between frames when the GPU is slow. When the system is set to a software limit such as V-Sync, responsiveness decreases. Is there a way to take advantage of the increased power of systems while working with a limited refresh rate, and is there a way to ignore these redundant tasks to provide a more 'immersive' experience?

LucidLogix apparently has the answer…

The answer from Lucid is Virtu MVP. Back in September 2011, Ryan gave his analysis on the principles of the solution. Due to patents, we are still restricted to the same high-level explanation that Ryan was back then. Nevertheless, it all boils down to the following image:

Situation (A) determines whether a rendering task/frame should be processed by the GPU, and situation (B) decides which frames should go to the display. (B) helps with tearing, while (A) better utilizes the GPU. Nevertheless, the GPU is doing multiple tasks—snooping to determine which frames are required, rendering the desired frame, and outputting to a display. Lucid is using hybrid systems (those with an integrated GPU and a discrete GPU) to overcome this.

Situation (B) is what Lucid calls its Virtual V-Sync, an adaptive V-Sync technology currently in Virtu. Situation (A) is an extension of this, called HyperFormance, designed to reduce input lag by only sending required work to the GPU rather than redundant tasks.

Within the hybrid system, the integrated GPU takes over two of the tasks for the GPU—snooping for required frames, and display output. This requires a system to run in i-Mode, where the display is connected to the integrated GPU. Users of Virtu on Z68 may remember this: back then it caused a 10% decrease in output FPS. This generation of drivers and tools should alleviate some of this decrease.

What this means for Joe Public

Lucid’s goal is to improve the 'immersive experience' by removing redundant rendering tasks, making the GPU synchronize with the refresh rate of the connected display, and reducing input lag.

Virtual V-Sync and HyperFormance both work by introducing a level of middleware that intercepts rendering calls and decides whether a frame should be rendered and then delivered to the display. However, the FPS counter within a title counts frame calls, not completed frames. So as the software intercepts a call, the frame rate counter is increased, whether the frame is rendered or not. This could lead to many unrendered frames and an artificially high FPS number, when in reality the software is merely optimizing the sequence of rendering tasks rather than increasing FPS.

If it helps the 'immersion factor' of a game (less tearing, more responsiveness), then it could be highly beneficial to gamers. Currently, Lucid has validated around 100 titles as working the way it intends. We spoke to Lucid (see next page), and they say that the technology should work with most, if not all, titles. Users will have to add programs manually to take advantage of the technology if the software is not in the list. The reason for only 100 titles being validated is that each game has to be validated with a lot of settings, on lots of different kit, making the validation matrix huge (for example, 100 games x 12 different settings x 48 different system hardware configurations = time and lots of it).

Virtu MVP also causes many issues when it comes to benchmarking and comparing systems. The usual way of telling the performance of systems apart has been the FPS value. With this new technology, the FPS value is almost meaningless, as it also counts the frames that are not rendered. This has consequences for benchmarking companies like Futuremark and for overclockers who like to compare systems (Futuremark have released a statement about this). Technically, all you would need to do to increase your score/FPS (if we understand the software correctly) would be to reduce the refresh rate of your monitor.
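The counter behaviour described above can be sketched with a toy model. This is emphatically not Lucid's algorithm, which is not public; it simply assumes that in each second the middleware lets one frame per display refresh through to the GPU at a cost of render_ms, and that every other intercepted call is skipped at a much smaller cost of skip_ms while still ticking the in-game counter. The function name and both cost values are hypothetical numbers chosen for illustration.

```python
def reported_vs_rendered(refresh_hz: int, render_ms: int, skip_ms: int, window_ms: int = 1000) -> dict:
    """Toy accounting of frame calls over one window (default: one second)."""
    # At most one rendered frame per refresh, and no more than the GPU can manage.
    rendered = min(refresh_hz, window_ms // render_ms)
    # Whatever time is left over is filled with cheap, skipped frame calls.
    remaining_ms = window_ms - rendered * render_ms
    skipped = remaining_ms // skip_ms
    # The in-game counter ticks on every call, rendered or not.
    return {"rendered": rendered, "skipped": skipped, "reported_fps": rendered + skipped}


for hz in (60, 30):
    print(hz, "Hz:", reported_vs_rendered(hz, render_ms=10, skip_ms=2))
# prints:
# 60 Hz: {'rendered': 60, 'skipped': 200, 'reported_fps': 260}
# 30 Hz: {'rendered': 30, 'skipped': 350, 'reported_fps': 380}
```

With these made-up costs, halving the refresh rate halves the frames actually rendered but raises the reported number from 260 to 380, which is the benchmarking problem in miniature.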

Since this article was started, we have had an opportunity to speak to Lucid regarding these technologies, and they have pointed out several usage scenarios that have perhaps been neglected in earlier reviews of this technology. On the next page, we will discuss what Lucid considers ‘normal’ usage.


145 Comments


  • Iketh - Sunday, April 8, 2012 - link

    "handling input in a game engine" means nothing here. What matters is when your input is reflected in a rendered image and displayed on your monitor. That involves the entire package. Lucid basically prevents GPUs from rendering an image that won't get displayed in its entirety, allowing the GPU to begin work on the next image, effectively narrowing the gap from your input to the screen.
  • extide - Tuesday, April 10, 2012 - link

    I am sure he knows that. He was just giving a bit of detail as to his exact experience, of which I would bet is far more than most people on here. You have to be very aware of things such as latency and delay when you are handling input in a game engine. I agree with the OP and am skeptical also. The bit that makes me most curious is the transfer of the fully rendered screens from one framebuffer to the other, that has to add some latency, and probably enough to make the entire process worthless. It's not like Lucid has a good track record on stuff like this, I mean we all know how their cross platform SLI/CF took off and worked so well....
  • Iketh - Wednesday, April 11, 2012 - link

    Why would you need to physically copy framebuffers?? I'm sure pointers are used...

    I have no idea if this has tangible benefits, but theoretically it does. None of us know until we can test it. I'm more inclined to discredit the people already discrediting Lucid, despite Lucid's track record. That's what you call hating.
  • Iketh - Wednesday, April 11, 2012 - link

    excuse me, you're right... it has to copy the frame from gpu to igpu... what kind of crap tech is this???
  • ssj3gohan - Sunday, April 8, 2012 - link

    Personally, I'm absolutely uninterested in anything 'high-performance', especially fancy gaming stuff. Not to say that I don't think that's a valid market niche, but I see other possibilities.

    I'm really looking forward to new thin ITX boards with built-in DC-DC converter (i.e. running directly off a 19V brick), and I am especially wondering whether Intel (or Zotac, possibly) is going to build a golden board this time around. Last time, they made DH61AG which was a nice board, but lacked an msata port (kind of a must for a truly thin computer) and 'only' had an H61 chipset.

    With H77, I expect it will be possible to make a thin ITX board with USB 3.0 and a fast on-board SSD option, combining this with an HD 4000 equipped processor would enable users to build a truly thin (sub-4 inch thick) computer that fits on the back of their monitor but still provides ample computing power.
  • Senti - Sunday, April 8, 2012 - link

    It sounds to me that Lucid Virtual V-Sync is just glorified triple buffering with a lot of marketing and a bit of overhead for transferring frames and powering two video cards instead of one. I'm very skeptical on the HyperFormance too.
  • Cavalcade - Sunday, April 8, 2012 - link

    It seems a bit more involved than triple buffering, more like having 2 buffers where the back buffer is not flipped until it is fully rendered. Seems like this would lead to more stuttering, and given the number of times they asked Mr. Cutress to reiterate that this would be a bug, it may be something they are seriously concerned with.

    Thinking about it a little more, I'm not sure what advantages this system would have over a system with separated input and rendering modules. The academic side of me is extremely interested and hopeful, but the practical developer side of me is going to require a lot more to be brought on board.
  • Iketh - Sunday, April 8, 2012 - link

    Separate input and rendering modules, as I stated in an earlier post, means nothing. They allow for a responsive mouse cursor, for instance. But, when you actually provide input that alters the RENDERED WORLD, you have to wait for that input to reflect on screen. It doesn't matter how perfectly the software solution is architected, you still have to wait for the rendering of the image after your input.

    Lucid simply prevents renders that never get displayed in their entirety, allowing the GPU to work on the NEXT image, shortening the time from your input to the screen.
  • Cavalcade - Monday, April 9, 2012 - link

    The comment was to indicate that while I have experience writing input systems, rendering is still relatively new to me; simply a qualifier of my impression and opinion.

    The way I am understanding Lucid, it is attempting to preempt displaying a frame that is not fully rendered in time for the next screen refresh. By presenting a virtual interface to both the GPU and the application, the application believes the frame has been rendered (displaying user input at that time) and proceeds to render the next frame. Thinking more about it, would this reduce the time interval between input reflected in frame one (which was preempted) and frame two (which will be displayed) so that rather than having input sampled at a fixed rate (say 60Hz) and displayed at a variable rate, input would be more closely tied to the frame for which it is intended.

    My interest is rising, but it still seems like a rather complex solution to a problem that I either haven't experienced, or which doesn't really bother me.
  • Iketh - Tuesday, April 10, 2012 - link

    it's not preemtively doing anything, except determining if a frame added to the queue will finish rendering in time... if not, it >>>>DOESNT LET THE GPU RENDER IT<<<< and places the previously rendered image in its place, allowing the GPU to immediately begin work on the FOLLOWING frame... that's it... it cuts unneeded frames from queues

    as for your input sampling rate question, that's entirely based on how the application is coded to handle input, lucid has nothing to do with this...
