Digital Foundry's guide to games media assets
Maximising the impact of screens and trailers
If you're in the process of creating a game trailer, this approach has other significant benefits. In an industry where every man-hour counts, using the same captures for screenshots and trailers cuts down on duplication of effort: one capture session provides the raw materials for both sets of assets. It's a lesson we learned in the games media back when both screenshots and coverdisc video were required - eliminating the duplicated effort saves an enormous amount of time, which is better spent elsewhere.
Several approaches to creating video game trailers have come to the fore in recent years. The basic CG approach sets a mood and teases the content of the game without showing it directly. Sometimes - as in the case of the recent Dead Island video - this can give a game a certain buzz, but the audience is always left wondering what the actual product is going to look like, and if there is a massive difference between the content of the teaser and the final game, they may well feel cheated or short-changed.
Gameplay trailers are the natural progression, but even then, different levels of coverage suit different audiences. A cross-format project with a PC SKU gives you an instant head-start (assuming the PC version is at the same stage of development): it allows you to capture excellent quality 720p video at maximum settings, and with a software-based tool such as FRAPS you don't even need dedicated capture hardware. This makes it very easy to show the game looking its best - you can even create a 1080p asset, just as DICE did with its recent Battlefield 3 trailers.
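Bear in mind that capture at this level brings serious storage demands. As a rough sketch - the arithmetic below uses illustrative uncompressed figures, not the actual output of FRAPS's own codec - the raw data rate scales quickly with resolution and frame rate:

    # Rough estimate of raw capture data rates. These are illustrative
    # uncompressed figures, not the output of FRAPS's own codec.

    def raw_rate_mb_per_sec(width, height, fps, bytes_per_pixel=3):
        """Uncompressed data rate in megabytes per second."""
        return width * height * bytes_per_pixel * fps / (1024 * 1024)

    for label, (w, h) in [("720p", (1280, 720)), ("1080p", (1920, 1080))]:
        rate = raw_rate_mb_per_sec(w, h, 60)
        print(f"{label} @ 60fps: ~{rate:.0f}MB/s, ~{rate * 60:.0f}MB per minute")

FRAPS applies its own lightweight compression so the real numbers come in lower, but the captures remain large enough that a fast, dedicated capture drive is a sensible precaution.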
However, while this approach works well for general videos aimed at a broad audience, the fact is that enthusiast players want to see the product running on their own platform, and when they don't see it, they begin to worry about the quality of the game. DICE's Battlefield 3 trailers look nothing short of sensational, but pretty much the only question Digital Foundry readers are asking us is how the game will look on console, shorn of its 1080p base resolution and minus the cutting-edge DirectX 11 effects. I can well imagine that unveiling the console versions is a planned element of the ongoing marketing campaign.
Crytek employed an intriguing strategy along these lines: mainstream trailers were generated using the PC version of Crysis 2, but standalone extended gameplay segments from both Xbox 360 and PlayStation 3 SKUs were distributed too - marketing initiatives aimed squarely at hardcore gamers. Had Crytek not released these videos, the chances are that games media outlets would have made their own; by getting there first, the developer/publisher remains in control of the media assets being created for the game while answering the audience's questions about how the title looks on their platform.
Digital Foundry has created a number of gameplay trailers, and our hardware has been used by developers on countless others. Our tips for getting the best-looking assets out there are fairly straightforward. As we see it, there are three major technical elements to bear in mind when it comes to creating video assets such as gameplay captures or trailers.
First up, it's worth bearing in mind that internet video plays out in a very different colour space to the output of the video game consoles. HDMI output uses 24-bit RGB, but most capture cards immediately downscale this to 16-bit YPrPb (YUY2), resulting in an instant downsampling of the chroma data (something we set out to avoid with our own hardware). This occurs because YUY2 is the favoured output format of the HDMI ports on the camcorders most capture cards were designed to work with: game capture is a very niche market.
The core data then gets squeezed down again when encoding internet video, this time into the 12-bit YV12 format, which quarters the chroma resolution relative to the RGB source. The chances are that your video will not be viewed at 720p either, so chroma takes a further hit when it is re-encoded into standard-def formats, compounding the compromises.
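To put some numbers on that chain, here's a short sketch - the sampling ratios are standard for these formats, while the 720p frame size is simply an example:

    # How colour information shrinks along the capture/encode chain.
    # RGB24 stores full chroma per pixel; YUY2 is 4:2:2 (chroma halved
    # horizontally); YV12 is 4:2:0 (chroma halved in both dimensions).

    WIDTH, HEIGHT = 1280, 720  # example 720p frame

    formats = {
        "RGB24 (HDMI output)": (24, WIDTH,      HEIGHT),       # full chroma
        "YUY2  (capture)":     (16, WIDTH // 2, HEIGHT),       # 4:2:2
        "YV12  (web encode)":  (12, WIDTH // 2, HEIGHT // 2),  # 4:2:0
    }

    for name, (bpp, cw, ch) in formats.items():
        frame_kb = WIDTH * HEIGHT * bpp / 8 / 1024
        print(f"{name}: {bpp} bits/pixel, {frame_kb:.0f}KB/frame, chroma {cw}x{ch}")

By the time the video reaches the viewer, each 2x2 block of pixels shares a single colour sample - which is precisely why saturated reds and blues suffer.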
All you can do here is be aware of the issue and be selective about the clips you use - any footage featuring pure reds or blues can look pretty poor, but you won't know how poor until the final asset is encoded. Persistent HUD elements in red or blue will show this artefact clearly and should be avoided.
The next element to factor in is video compression. Capture and editing typically take place using an intermediate codec such as Apple's ProRes or the superior PC/Mac cross-platform alternative, CineForm HD. However, when exporting your final video, the usual route is to use the editing system's standard h.264 compressor - with Final Cut Pro, this is QuickTime. Apple's implementation of h.264 is very limited compared to the full potential of the spec, so for improved picture quality at the same level of bandwidth, and for more encoding options, it's highly recommended that you install the free x264 QuickTime encoder on a Final Cut Pro system. This open source h.264 encoder is swiftly becoming the industry standard (it's used by YouTube and Gaikai, to name but two), and it's not only faster than Apple's encoder but offers visibly superior results and better compression.
Sticking with QuickTime is fine so long as you give your edit enough bandwidth to retain as much quality as possible - but how much bandwidth you require varies according to the material you are encoding. A slow-moving game with a muted colour scheme, such as The Chronicles of Riddick or Alan Wake, will require far less bandwidth than something like the colourful, action-packed Bayonetta.
So what's the solution? A quality-based encode (CRF in x264, for example) allows you to specify a set quality level that every frame will adhere to, ensuring you get the result you want. In the CRF range, where lower values mean higher quality, 23 is the lowest quality we'd recommend for a source asset, while anything below 17 will be a waste of bandwidth - the visual refinements will go unnoticed by the human eye.
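As a minimal sketch of what a quality-based encode can look like outside the editing suite - assuming the x264 command-line tool is installed, and with placeholder file names - the same CRF principle applies:

    import subprocess

    # Quality-based (CRF) encode using the x264 command-line tool.
    # File names are placeholders; CRF 18 sits comfortably within the
    # 17-23 range suggested above (lower values mean higher quality).
    subprocess.run([
        "x264",
        "--crf", "18",         # constant quality target for every frame
        "--preset", "slow",    # trade encoding speed for compression efficiency
        "--output", "trailer_master.264",
        "trailer_master.y4m",  # intermediate export from the edit
    ], check=True)

Because the encoder spends bits only where the material demands them, the same settings deliver consistent quality whether the footage is Alan Wake or Bayonetta.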
Once media outlets have their hands on your video, you are effectively at the mercy of their encoders (many of which are painfully poor), but the rule of thumb is that the better the quality of the source video, the better the result of a second generation encode.