In Theory: Is this how OnLive works?
Digital Foundry ponders the technology behind the cloud gaming company's big splash
A Big Splash
After its sensational - and controversial - debut at GDC last year, little has been heard publicly about the OnLive cloud gaming system. During the summer, the firm kicked off its promised beta sign-up phase, but in a world where footage is leaked from betas within hours of their debut, the lack of any tangible feedback on the system from testers was telling. Was OnLive still on schedule? In the run-up to Christmas, momentum picked up: many confirmed OnLive beta testers finally broke cover and a mammoth 48-minute presentation from company front man Steve Perlman was released.
In a small, intimate venue, the Columbia University alumnus, equipped with the OnLive browser plug-in and microconsole, presented what amounted to a more informal re-run of the original GDC presentation - mostly the same tech, showcasing the same games. More details emerged about the make-up of the system, and Perlman produced a mouth-watering demonstration of Crysis running via OnLive on the iPhone. The core concerns raised by many commentators (latency and video compression) were also addressed, albeit in an extremely vague manner.
So, the big questions remain: in particular, just how does OnLive compress video? Perlman suggests that OnLive has created a new video compressor divorced from the central convention of normal video encoding: the so-called group of pictures (GOP). GOP encoding is all about retaining and re-using as much video information as possible to reconstruct the current frame. Picture elements can be brought in from past and future frames to ensure the highest possible compression. But OnLive has a problem: taking elements from future frames would require buffering them, introducing lag that would sit on top of the time taken to beam information across the internet as well as the inherent latency in the game itself.
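To see why, consider a minimal sketch - our own illustration in Python, not anything from OnLive's code - of the buffering a conventional GOP imposes. Any frame that references the future cannot be processed until that future frame has been captured, and at 60 frames per second even a couple of buffered frames adds tens of milliseconds of lag, which is why low-latency streaming encoders generally avoid future references altogether.

    # Our illustration, not OnLive's code: why future-frame references force buffering.
    # In a classic GOP pattern such as IBBP, a run of B-frames cannot be encoded
    # until the I- or P-frame after it has arrived from the game's framebuffer.

    def reorder_delay(gop_pattern):
        """Worst-case number of frames the encoder must hold back for a GOP pattern."""
        delay, run = 0, 0
        for frame_type in gop_pattern:
            if frame_type == 'B':
                run += 1
                delay = max(delay, run + 1)   # the B-run plus the future frame it references
            else:
                run = 0
        return delay

    print(reorder_delay('IBBPBBP'))   # -> 3 frames buffered: roughly 50ms at 60fps
    print(reorder_delay('IPPPPPPP'))  # -> 0: no future references, no reorder delay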
The Mystery Tech
Perlman says that OnLive doesn't use GOP, relying instead on a proprietary compression system. Jason Garrett-Glaser - one of the key developers of x264, the open source h264 encoder used industry-wide, and thus extremely well-connected - claims otherwise.
"As far as I know, OnLive is just using h264, so this doesn't really go in the 'new and alternative' category," he wrote on the Doom9 forum under his online alias, Dark Shikari. "Their 'new idea' is splitting the stream into 16 rectangular slices, each of which gets its own encoder. This brilliant idea massively reduces compression on the edges between slices when the scene is in motion and lets them brag about latency 16 times lower than what they actually have."
The process of cutting the picture into pieces and parallelising the encoding of the whole image is actually supported in the h264 spec. For many-threaded decoders such as that within the PS3, the use of slices makes for far faster playback of challenging content (for example the WipEout HD 1080p60 videos). However, Garrett-Glaser suggests that OnLive is physically cutting the screen into 16 pieces and sending them to 16 different independent encoders.
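As a rough sketch of the distinction - using a stand-in encode_slice function of our own rather than any real encoder API - slicing splits a frame's macroblock rows into bands that can be processed on separate threads, yet everything still lands in a single stream whose slices all share the same reference frame:

    # Illustrative only: encode_slice is a hypothetical stand-in, not a real x264 call.
    from concurrent.futures import ThreadPoolExecutor

    MB = 16                      # macroblock size in pixels
    MB_ROWS = 720 // MB          # 45 macroblock rows in a 720p frame

    def encode_slice(frame, prev_frame, first_row, last_row):
        # A real encoder would motion-search every macroblock in these rows against
        # any part of prev_frame, then entropy-code the result.
        return bytes(last_row - first_row + 1)

    def encode_frame(frame, prev_frame, n_slices=4):
        # Split the macroblock rows into contiguous bands, one band per slice.
        bounds = [round(i * MB_ROWS / n_slices) for i in range(n_slices + 1)]
        with ThreadPoolExecutor(max_workers=n_slices) as pool:
            parts = [pool.submit(encode_slice, frame, prev_frame, bounds[i], bounds[i + 1] - 1)
                     for i in range(n_slices)]
        # One bitstream containing several slices: a many-threaded decoder such as
        # the PS3's can chew through them in parallel, but they all reference the
        # same previous frame, so motion across band boundaries is no problem.
        return b''.join(p.result() for p in parts)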
"With slices, each slice can reference data anywhere in the previous frame," says Garrett-Glaser. "This means that if something moves from slice A to slice B, there's no problem: slice A can point to it with a motion vector just as if it didn't cross the edge. But OnLive isn't using slices: they're encoding the video as a bunch of separate streams. These streams are completely separate, and so each stream cannot reference data from the other streams. So, if something crosses the edge of a slice, it cannot be referenced properly! This effect is normally rather small, as it only affects the edges of the frame, but with OnLive's method, it affects about eight times the number of macroblocks as it otherwise would, because it affects those on both sides of each slice boundary as opposed to merely the frame edges."