Adventures in NVIDIA GRID K1 and Horizon View
Just had a really interesting week helping a customer with a proof of concept environment using Horizon View 6 and an NVIDIA GRID K1 card, so I thought I’d blog about it. You know, like you do. Anyway, this customer already had a View 5.2 environment running on vSphere 5.1 on a Cisco UCS back end. They use 10ZiG V-1200 zero clients on the desktop to provide virtual desktops to end users. As they are an education customer providing tuition in 3D rendering and computer games production, they wanted to see how far they could push a K1 card and still get acceptable performance for end users. Ultimately, the customer’s goal is to move away from fat clients as much as possible and over to thin or zero clients.
I have to say that in my opinion, there is not a great deal of content out there about K1 cards and Horizon View. One of the best articles I’ve seen is by my good friend Steve Dunne, where he conducted a PoC with a customer using vDGA (dedicated graphics acceleration). In our case, we were testing vSGA (virtual shared graphics acceleration), as the TCO of dedicated graphics would be far too high and impossible to justify to the cheque signers. This PoC involved a Cisco UCS C240-M3 with an NVIDIA GRID K1 card pre-installed. The K1 card has four GPUs with 16GB of RAM on board in total (4GB per GPU) and takes up two PCIe slots.
I’m not going to produce a highly structured and technical test plan with performance metrics as this really isn’t the way we ran the initial testing. It was really a bit more ad hoc than that. We ran the following basic tests:-
– Official 720p “Furious 7” trailer from YouTube (replete with hopelessly unrealistic stunts)
– Official 1080p “Fury” trailer from YouTube (replete with hopelessly unrealistic acting)
– Manipulating a pre-assembled dune buggy type unit in LEGO Digital Designer
– Manipulating objects in 3DS Max
When we started off on View 5.2, we found that video playback was very choppy, lip sync was out on the trailers and 3D objects would just hang on the screen. Not the most auspicious of starts, so I ensured the NVIDIA VIB for ESXi 5.1 was the latest version (it was) and that ESXi was at 5.1 U1. We made a single change at a time and went back over the test list above to see what difference it made. We initially configured the test pool with the 3D renderer set to Hardware and 256MB of video RAM. We ran the gpuvm command on the host to ensure the test VMs were indeed assigned to the GRID K1 card, and we used nvidia-smi -l to monitor GPU usage during the testing process.
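For anyone wanting to repeat those checks, the verification steps above look roughly like this from the ESXi shell (gpuvm is installed alongside the NVIDIA VIB; exact VIB names vary by driver release):

```shell
# Run from the ESXi shell (SSH) on the host containing the GRID K1 card.

# Confirm the NVIDIA VIB is installed and note its version
esxcli software vib list | grep -i nvidia

# Confirm the driver can see the K1's GPUs
nvidia-smi

# List which running VMs are currently assigned to a GPU
gpuvm

# Watch GPU utilisation while the tests run, refreshing every 5 seconds
nvidia-smi -l 5
```

These commands only make sense on the ESXi host itself, so there is nothing to run locally; the -l flag simply loops nvidia-smi at the given interval.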
Remember that when you configure video RAM in a desktop pool, half of the RAM is assigned from host memory and the other half is from the GRID K1 card, so keep this in mind when sizing your environment. The goal of the customer is to provide enough capacity for 50 concurrent heavy 3D users per host, so two GRID K1 cards per host is what they’re looking at, pending testing and based on the K120Q profile.
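As a back-of-envelope check of that sizing, the sketch below works through the numbers. Note the K120Q figures (512MB of framebuffer per user, 4GB per GPU, four GPUs per K1 board) are assumptions taken from NVIDIA’s published GRID K1 specifications rather than anything we measured in this PoC, and the half-and-half video RAM split is the vSGA behaviour described above:

```shell
#!/bin/sh
# Back-of-envelope GRID K1 sizing sketch. K120Q figures are assumed
# from NVIDIA's published specs, not measured in this PoC.

# vSGA pool video RAM split: half from host memory, half from the card
POOL_VRAM_MB=256
HOST_SHARE_MB=$((POOL_VRAM_MB / 2))   # 128MB from host RAM
CARD_SHARE_MB=$((POOL_VRAM_MB / 2))   # 128MB from the GRID K1

# Users per card at the K120Q profile (512MB framebuffer each)
FB_PER_GPU_MB=4096
K120Q_FB_MB=512
GPUS_PER_CARD=4
USERS_PER_GPU=$((FB_PER_GPU_MB / K120Q_FB_MB))     # 8 per GPU
USERS_PER_CARD=$((USERS_PER_GPU * GPUS_PER_CARD))  # 32 per card

# Cards needed for 50 concurrent heavy 3D users (round up)
TARGET_USERS=50
CARDS_NEEDED=$(( (TARGET_USERS + USERS_PER_CARD - 1) / USERS_PER_CARD ))

echo "Per-desktop split: ${HOST_SHARE_MB}MB host / ${CARD_SHARE_MB}MB card"
echo "K120Q users per K1 card: ${USERS_PER_CARD}"
echo "Cards needed for ${TARGET_USERS} users: ${CARDS_NEEDED}"
```

On those assumptions, two cards per host comfortably cover the 50-user target, which tallies with the customer’s plan above.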
So after performing some Google-Fu, the next change we decided to make was to add the Teradici audio driver to the base image. The main reason for this was that there is apparently a known issue with 10ZiG devices and audio sync, so we thought we’d give it a try. Although audio quality and sync did improve, it still didn’t really give us the results we were looking for.
Having gone back to the drawing board (and the forums), the next change we made was to upgrade the View agent on the virtual desktop from 5.2 to 5.3. Some customers in the VMware Communities forums had observed fairly major performance improvements doing this, without the need to upgrade the rest of the View infrastructure to 5.3. We did this and boy did things improve! It was like night and day: the video barely flickered at all and the audio was perfect. At this point we decided that View 5.2 was obviously not going to give us the performance we needed to put this environment in front of power users for UAT.
The decision was taken to upgrade the identical but unused parallel environment at another site from vSphere 5.1 and View 5.2 to vSphere 5.5 U2 and View 6.0.1. The reasoning behind this was that I knew PCoIP had improved markedly in 6.0.1 from the 5.x days, plus it meant we were testing on the latest platform available. Once we upgraded the environment and re-ran the tests, we saw further improvement without major pool or image changes. We updated VMware Tools and the View Agent to the latest versions as part of the upgrade, and the customer was really impressed with the results.
In fact, as we were watching the 720p and 1080p videos, we remarked that you’d never know you were watching them on a zero client, basically streamed from a data centre. That remark is quite telling: if a bunch of grizzled techies say that, end users are likely to be even more chirpy! We also performed more rudimentary testing with LEGO Digital Designer and 3DS Max, with improved results. The PoC kit has now been left with the customer, as we really need subject matter experts to test whether or not this solution provides acceptable end user performance to totally replace fat clients.
What is the takeaway from this?
The takeaway is that in my opinion, VDI is following a similar path to the one datacentre virtualisation followed a few years back. First you take the quick and easy wins such as web servers and file servers, and then you get more ambitious and virtualise database servers and messaging servers once confidence in the platform has been established.
VDI started with the lighter needs of the “knowledge user”, who uses a web browser and some Office applications at a basic level. Now that this target has been proven and conquered, we’re moving up the stack to concentrate on users who need more grunt out of their virtual desktop. With the improvements in Horizon View and ESXi, and now with the support of hardware vendors such as NVIDIA and Teradici, near-native levels of performance for some heavy use cases can be achieved at a sensible cost.
That being said, running a PoC will also help find where the performance tipping point is and what realistic expectations are. In our case, early testing has shown that video, audio and smaller scale 3D object manipulation are more than feasible and a realistic goal for production. However, much heavier tools such as the Unreal Development Kit may still be best suited to dedicated gaming hardware rather than a virtual desktop environment. The one thing I haven’t mentioned, of course, is the fact that the customer has GigE to the desktop and a 10GigE backbone between their two sites. This makes a huge difference; I doubt we’d have seen the same results on a 100Mbps/1Gbps equivalent environment.
The customer will be testing the PoC until the end of December; hopefully I can share the results nearer the time as to whether or not they proceed and, if so, how they do it. Hopefully, for anyone researching or testing NVIDIA GRID cards in a VDI environment, this has been some help.