Tech Focus: Kinect - Year One
Digital Foundry on Kinect tech evolution and Microsoft's support strategy
Last week's release of Forza Motorsport 4 is the first stage in Microsoft's new strategy of adding Kinect functionality to the entirety of its first party core game output. Almost one year after the release of the depth-cam hardware, the question is, just how far have developers managed to push the system and does the mandatory core game support help or hinder the platform?
By now the raw capabilities of the Kinect platform are a matter of public record: there's a VGA webcam offering 640x480 RGB resolution, paired with an infra-red depth sensor of the same resolution. On top of that is an advanced multi-array mic that is adept at working out the position of incoming audio - a neat trick.
In essence, what we have here is a full production version of the original PrimeSense reference camera: contrary to rumour-mongering of the time, there are no hardware downgrades, there was no removal of the main processing chip - indeed, with the addition of the multi-array mic and the motorised base, Kinect is actually more advanced than the prototype, perhaps explaining the somewhat high launch price.
Where confusion remains is in just how much access the core Xbox 360 hardware has to this raw spec. Connect up Kinect to a PC and you can stream both camera feeds with full resolution at 30 frames per second. However, initial specs for Kinect on 360 saw depth resolution dropped to quarter res: 320x240. Why?
USB bandwidth is at a premium on the Xbox 360 - the controller chip is significantly below spec compared to a PC and the available bandwidth has to service a whole range of devices including USB flash drives (which can run game installs), Xbox controllers and plastic musical instrument facsimiles. So while a PC can easily get around 30MB/s from the USB port, the Xbox 360 is limited to something like 15MB/s.
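A rough back-of-envelope calculation suggests why a quarter-res depth feed would fit the 360's budget where the full-res one would not. The sketch below assumes uncompressed streams at 2 bytes per depth pixel and 1 byte per RGB pixel (raw Bayer data). Those per-pixel sizes and the helper name are illustrative assumptions rather than published figures; only the resolutions, the 30 frames per second rate and the approximate 30MB/s and 15MB/s ceilings come from the text above.

```python
# Back-of-envelope USB bandwidth estimate for the two Kinect camera feeds.
# ASSUMPTIONS (not from the article): uncompressed streams, 2 bytes per
# depth pixel, 1 byte per RGB pixel (raw Bayer), 1 MB = 1,000,000 bytes.

FPS = 30
DEPTH_BYTES_PER_PIXEL = 2  # assumed
RGB_BYTES_PER_PIXEL = 1    # assumed

PC_BUDGET_MB_S = 30.0      # approximate figure quoted in the article
XBOX_BUDGET_MB_S = 15.0    # approximate figure quoted in the article


def stream_mb_per_s(width, height, bytes_per_pixel, fps=FPS):
    """Return the raw data rate of one video stream in MB/s."""
    return width * height * bytes_per_pixel * fps / 1_000_000


rgb = stream_mb_per_s(640, 480, RGB_BYTES_PER_PIXEL)
depth_full = stream_mb_per_s(640, 480, DEPTH_BYTES_PER_PIXEL)
depth_quarter = stream_mb_per_s(320, 240, DEPTH_BYTES_PER_PIXEL)

print(f"RGB + full-res depth:    {rgb + depth_full:.1f} MB/s")
print(f"RGB + quarter-res depth: {rgb + depth_quarter:.1f} MB/s")
```

Under these assumptions the full-resolution pair comes to roughly 27.6MB/s, close to the PC ceiling and well beyond the 360's, while dropping depth to 320x240 brings the total to roughly 13.8MB/s. That still leaves very little headroom for the other USB devices sharing the bus.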
So to date, we haven't seen any Kinect titles running the full frame-rate RGB camera feed simultaneously with the depth scan. And while well-sourced news stories indicate that Microsoft has restored the full resolution of the depth camera to its 640x480 spec, we've yet to see any Kinect titles translate that enormous boost into any kind of material gameplay advantage over the launch games. Perhaps forthcoming first party titles like Kinect Sports: Season Two and Dance Central 2 will change that.
Some elements of the Kinect tools for developers have been improved significantly over the last year: the core libraries that Microsoft provides to game creators have been refined immensely. For example, it's well known that seated gameplay could not be robustly supported by the skeletal tracking system. While tracking movement of the legs while seated is still an issue, the upper body can now be accurately scanned. Kinect's skeletal tracking works by comparing depth data with pre-stored images, and the accuracy of this procedure has improved a great deal - motions that could only be tracked with borderline precision are now significantly more accurate.
However, some limitations of the system still prove to be far more difficult to deal with - and with the Xbox 360 generation of hardware now coming to an end, it's unlikely that these will ever be seriously addressed. For example, Microsoft's libraries only allow the skeletal data to be remapped onto an Avatar, so if a developer wants a player's motion to be mapped onto another object, custom libraries or middleware will be required.
Similarly, much as Microsoft may deny or ignore it, Kinect has a clear latency problem that generally sees motion-based input processed at a minimum of 200ms - that's a good 50ms slower than OnLive running in optimal conditions, and combined with display lag can see response dulled to a quarter of a second. Microsoft has held numerous developer briefings on best-case programming practices that can reduce lag and theoretically take input latency down to 100ms - but the fact is that the CPU and GPU scheduling techniques being suggested do not fall into line with the way modern games are being made.
Part of the problem is the nature of the USB interface itself. It's believed that the simple process of taking Kinect inputs and beaming them across to the USB port takes around 60-70ms. In comparison, the wireless joypad technology used by Microsoft takes a mere 8ms. Secondly, the whole nature of bodily movement is that it generally takes significantly longer to shake an arm or a leg than it does to press a button. Even factoring out processing latency completely, achieving the same function is going to take longer.
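To put those figures side by side, here is a minimal latency-budget sketch. All of the millisecond values are the ones quoted above (the 150ms OnLive figure is simply 200ms minus the stated 50ms gap); the 30 and 60 frames per second conversions and the helper name are our own illustrative assumptions.

```python
# Rough latency budget built from the figures quoted in the article.
# ASSUMPTION: frame rates of 30fps and 60fps are used purely to express
# the latencies in frames; they are not tied to any specific game.

KINECT_MIN_MS = 200                # typical minimum motion-input latency
ONLIVE_MS = KINECT_MIN_MS - 50     # "a good 50ms slower than OnLive"
BEST_CASE_MS = 100                 # theoretical target from the briefings
USB_TRANSFER_MS = (60, 70)         # believed sensor-to-console transfer time
WIRELESS_PAD_MS = 8                # wireless joypad figure


def ms_to_frames(ms, fps):
    """Convert a latency in milliseconds to a number of rendered frames."""
    return ms * fps / 1000


for fps in (30, 60):
    print(f"{KINECT_MIN_MS}ms at {fps}fps = "
          f"{ms_to_frames(KINECT_MIN_MS, fps):.0f} frames")

# Time left for tracking, game logic and rendering once the USB transfer
# is paid for, if the 100ms best case is ever to be reached.
remaining_best_case = [BEST_CASE_MS - t for t in USB_TRANSFER_MS]
print(f"Left after USB in the best case: "
      f"{min(remaining_best_case)}-{max(remaining_best_case)}ms")
```

On these numbers the USB transfer alone would consume most of the 100ms best-case budget, leaving only 30-40ms for everything else, which goes some way to explaining why that target looks so hard to hit in practice.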
As it is, Microsoft's strategy of force-fitting Kinect functionality into its core titles (with several third parties following suit) may well end up being counter-productive in that it serves to highlight the weaknesses of the platform rather than the strengths.
Let's take a look at the most recent release with Kinect functionality: Forza Motorsport 4. The basic interface mechanic is conceptually no different from what we've seen before in launch title Joyride: acceleration and braking are entirely automatic, with Kinect tracking hand motion in order to simulate steering. Turn 10's reasoning is that this is the most direct, intuitive interface for non-gamers - but then we are left with the uncomfortable mix of a hardcore game proposition combined with a distinctly "lite" way of playing it. It's an unwieldy fit.