The Continuing Evolution of Video Codecs
The current dominant codec for Internet video is H.264, but there are influential companies dedicated to alternatives. With the advent of 4K (aka Ultra HD), new codecs will be useful. H.265 (aka HEVC) and VP9 could split the market, and another approach from Mozilla, called Daala, may prove to be a wild card.
When discussing delivery of video over the Internet, three aspects of the format are important: protocol, container, and codec. Protocol and container are discussion topics in their own right. In this article we will concentrate on emerging video codecs and their potential impact on mobile networks.
Media delivery has undergone continual evolution, driven by technological advances and consumer demand for higher quality and lower cost. Audio has developed from the low-fidelity phonograph through many generations of physical media and digital formats. Video has followed the same trend. Each step change in quality or cost point is often accompanied by a ‘format war’, for example VHS versus Betamax or Blu-ray versus HD-DVD. Often there is a clear victor that dominates the market segment, and it is not always the superior format.
State of Play
H.264, otherwise known as Advanced Video Coding (AVC), remains the most widely used video codec. The only real challenger to its crown within this generation of video codecs has been VP8, which is used in the WebM format created by Google. VP8 was developed by On2 Technologies, which was acquired by Google in February 2010.
With WebM, Google attempted to exploit the chink in the AVC armor: that it is a patented technology. WebM is a combination of the VP8 video codec, OGG Vorbis audio, and the Matroska container format, all of which are open source (though VP8 makes use of some patented technologies).
Another exploitable opportunity was the developing HTML5 standard. Traditionally, internet video has either been played by a “break out player” outside of the browser context, or by a plugin. The dominant technology in the browser plugin space has been Adobe (Shockwave) Flash.
HTML5 introduced the ‘video’ tag that allows the browser to play video natively, embedded in the web page, without the need for a plugin player. Crucially, HTML5 does NOT mandate supported video formats, and so each browser vendor is free to choose. The vendor’s choice is influenced by factors such as the desire to promote open source (Firefox), or having a stake in a particular technology (Chrome).
AVC became a pragmatic choice, given the breadth of content and device support. Eventually, all of the major browser vendors introduced support for AVC.
Importantly, at the time of writing, support for VP8 and VP9 on iOS Safari is a definite ‘no’. Apple has consistently supported AVC while ignoring VP8, and this is another important factor given the popularity of iDevices.
The failure of WebM/VP8 to make a significant impact within the current generation of video codecs can largely be attributed to the following:
- AVC was already very well established, with mature codec implementations.
- Content providers had already built a huge catalogue of AVC content, with the tools and workflows in place to do so.
- Hardware acceleration for AVC decoding is commonplace. This is particularly important for mobile devices, as hardware decoding implementations drain the battery less than pure software implementations using a general purpose CPU.
- Apple refused to add VP8 support to their devices, ensuring that to provide adequate coverage of devices the content provider would need to supply an AVC variant. Why then also provide a VP8 alternative?
AVC licensing is administered by MPEG-LA. In a nutshell, there are two enforcement points for the license. One license is required to implement and distribute the codec based on the standard, and another to distribute ‘for-pay’ content. Content that is free to users incurs no license fee until “at least” 2015.
Will we see MPEG-LA maintain this stance on ‘free’ content, or will we see royalty payments introduced? Note that it is not the content producer who is liable for royalty payments, but rather the ‘apparent provider’. Google, through YouTube, is one of the largest “free” video providers on the Internet, and these issues are the main drivers behind its VP8/VP9 initiatives.
VP8 itself uses patented video compression techniques that are held in the MPEG-LA patent pool. In order to maintain the position that users would be free of royalty payments, Google licensed the technology from MPEG-LA in March 2013.
The Next Generation
The network demands of streaming high quality video, in conjunction with the availability of mass market higher resolution devices, have created the need for a new generation of video codecs to provide higher quality video at lower bit rates. Most commercial content is now produced at 1080p, and many smartphones are now capable of recording at that resolution. Ultra-HD resolution content such as 4K video is emerging, with television leading the way. Netflix has introduced a 4K video service, with content limited to recent Hollywood blockbusters and popular series such as “House of Cards”. The library can only grow over time.
Ultra-HD formats at 4K or 8K resolution make sense for very large screens, such as cinema. However, in the home, math tells us that at a viewing distance of 10 feet the human eye cannot distinguish individual pixels on a 720p screen of up to roughly 51 inches diagonal (assuming the eye resolves about one arcminute, the same assumption that yields the 1080p figure below). On a screen of that size or smaller, there is no gain in viewing quality for any resolution above 720p.
1080p is now commonplace, and at a viewing distance of 10 feet the screen would need to be at least 77 inches diagonal before the viewer could distinguish individual pixels. In the short term, Ultra-HD television will be a niche market. Displays are still very expensive and with limited available content, it remains to be seen if Ultra-HD Television will become truly mass market.
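The screen-size thresholds above can be reproduced with a short calculation. This is a minimal sketch, assuming the eye resolves roughly one arcminute (a common rule of thumb for 20/20 vision, not a figure stated by any standard quoted here); the function name and constant are illustrative.

```python
import math

# Assumption: the eye can separate details about one arcminute apart.
ARCMINUTE_RAD = math.radians(1 / 60)

def max_diagonal_inches(width_px, height_px, distance_inches):
    """Largest screen diagonal at which adjacent pixels still blend together."""
    # Smallest pixel pitch the eye can resolve at this viewing distance.
    pixel_pitch = distance_inches * math.tan(ARCMINUTE_RAD)
    # Scale the diagonal (measured in pixels) by that pitch.
    return math.hypot(width_px, height_px) * pixel_pitch

distance = 10 * 12  # 10 feet, in inches
diag_1080p = max_diagonal_inches(1920, 1080, distance)
diag_720p = max_diagonal_inches(1280, 720, distance)
print(f"1080p: {diag_1080p:.0f} in, 720p: {diag_720p:.0f} in")
```

Under this assumption the 1080p threshold comes out at about 77 inches and the 720p threshold at about 51 inches. Halving the viewing distance halves both thresholds, which is why higher resolutions matter more on devices held close to the eye.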
Counterintuitively, 4K+ resolution may have more of an impact on the video experience on laptops, mobile devices, and tablets due to the closer proximity of the viewer to the screen. 4K laptops are already available, with 10.1” tablet devices in the works.
Increasing video resolution brings unavoidable costs into the ecosystem:
- For the content provider
  - Higher processing cost to encode
  - Higher storage requirements for video on demand (VoD) applications
  - Higher bandwidth required to serve the content
- For the consumer
  - Higher processing cost to decode, which can affect battery life
  - Higher bandwidth required to play the content without stalling and buffering
  - More bytes transferred, which impacts mobile data plans
- For the network
  - Higher bandwidth demands
  - User dissatisfaction if video cannot be played smoothly
The increased processing cost can be absorbed by the continual evolution of hardware to provide greater processing capabilities. Processing will eventually be supported by hardware accelerated chipsets for that codec. For VoD services, cloud instances of video encoders can be spun up to encode high volumes of video data in a short time. This is a tactic used by YouTube when a new format is introduced.
The more difficult issue is the effect on the network due to the bandwidth required to sustain playback of the content at higher bit rates. This has necessitated a new generation of video codec, to provide high quality video at a lower bit rate. The two major players in this space are the next generation of the AVC and VP8 codecs: High Efficiency Video Coding (HEVC, otherwise known as H.265) and VP9 respectively.
Both codecs produce content at the same perceptible quality level as their predecessors, but at a lower bit rate. The trade-off is that both require more processing to do so. There are many published comparisons available online. However, the codec implementations are immature, with development ongoing, and both encoding speed and quality are constantly improving. Each comparison is therefore a snapshot of two particular codec implementations at a point in time.
One such study can be found here.
A summary table from the study is reproduced below. For each column, the ‘-’ entry is the reference, and the % values are how much the bit rate needed to be increased or decreased in order to maintain the same quality (measured using PSNR).
The conclusion of the study states:
“The typical encoding times of the VP9 encoder are around 130 times higher than those measured for the x264 encoder. On the other hand, when compared to the H.265/MPEG HEVC reference encoder implementation, the VP9 encoding times are lower by a factor of 7.35, on average.
According to the experimental results, the coding efficiency of VP9 was shown to be inferior to both H.264/MPEG AVC and H.265/MPEG HEVC with an average bit rate overhead at the same objective quality of 8.4% and 79.4%, respectively.”
In a nutshell, for coding efficiency HEVC > AVC > VP9. For encoding speed AVC > VP9 > HEVC. These are early codec implementations, and these profiles will change over time.
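To make the quoted overheads concrete, the sketch below converts them into equivalent bit rates. It assumes the two averages can be chained (VP9 relative to HEVC, then AVC relative to VP9), which is only an approximation since each figure is an average over different test sequences. The 1000 kbps starting point is a hypothetical value chosen for illustration.

```python
# Average bit rate overheads quoted in the study's conclusion.
VP9_OVER_AVC = 0.084   # VP9 needs 8.4% more bits than H.264/AVC
VP9_OVER_HEVC = 0.794  # VP9 needs 79.4% more bits than HEVC

def equivalent_bitrates(hevc_kbps):
    """Approximate bit rates giving the same PSNR as an HEVC stream."""
    vp9 = hevc_kbps * (1 + VP9_OVER_HEVC)
    # Chaining the two averages is an approximation, not a study result.
    avc = vp9 / (1 + VP9_OVER_AVC)
    return {"HEVC": hevc_kbps, "AVC": avc, "VP9": vp9}

rates = equivalent_bitrates(1000)  # hypothetical 1000 kbps HEVC stream
for codec, kbps in rates.items():
    print(f"{codec}: {kbps:.0f} kbps")

# Implied saving of HEVC relative to AVC under the same approximation.
hevc_saving = 1 - rates["HEVC"] / rates["AVC"]
print(f"HEVC vs AVC: about {hevc_saving:.0%} fewer bits")
```

Read this way, the study's figures imply that HEVC needs roughly 40 percent fewer bits than AVC for the same objective quality, with VP9 slightly behind AVC.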
On paper, HEVC has 10 times the encoding complexity of AVC and twice the decoding complexity. This means that modern laptops and high-end mobile devices are already capable of decoding in software. The cost is a heavy battery drain, which can be addressed with hardware acceleration.
YouTube is currently providing VP9 content as a DASH variant. Netflix has chosen to use an HEVC encoding for its 4K TV service. This has punished the early adopters of 4K television, as the first models contained hardware support for AVC decoding only. This makes the early release 4K equipment incompatible with the Netflix service.
Impact on Mobile Networks
All of this has ramifications for mobile networks as well as fixed-line networks. Ultra HD is unlikely to be delivered to devices on mobile networks in the short to medium term. 4K displays will appear only on larger tablet screens, and so are not applicable to the majority of mobile devices.
In addition, even using the new generation codecs, the bandwidth cost is simply too high. Even today on smartphone and tablet devices, content providers such as YouTube will serve a lower resolution to the same device on 3G than on Wi-Fi due to the bandwidth demands.
However, Ultra-HD will drive adoption of the new generation video codecs in the video publishing space. Once the codec implementations and content publishing tools reach a level of maturity with HEVC/VP9, content publishers will have the ability to easily produce lower resolution versions of the content targeted at mobile devices using the next generation codecs. The saving will be a 40 to 50 percent decrease in the bandwidth costs. Hardware acceleration is then desirable on mobile devices to minimize the increased battery drain.
The battle between VP9 and HEVC will be a closer fight than the one between VP8 and AVC, and there may be room in the market for both, at least in the short term. YouTube already publishes VP9 content as variants in its DASH format, with no current support for HEVC. However, if Apple release a 4K-capable device using HEVC and maintain their stance of not supporting VP9, YouTube will be forced to either:
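For a sense of what a 40 to 50 percent bit rate reduction means for a mobile data plan, the sketch below computes the bytes transferred for an hour of video. The 2000 kbps AVC bit rate is a hypothetical figure for a mobile rendition, not one taken from any provider, and the calculation ignores container and protocol overhead.

```python
def megabytes_transferred(bitrate_kbps, duration_s):
    """Size of a stream's payload in megabytes (1 MB = 10**6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000

avc_kbps = 2000  # hypothetical AVC bit rate for a mobile rendition
hour = 3600      # seconds

avc_mb = megabytes_transferred(avc_kbps, hour)
for saving in (0.40, 0.50):
    # Same content at the reduced bit rate of a next generation codec.
    new_mb = megabytes_transferred(avc_kbps * (1 - saving), hour)
    print(f"{saving:.0%} saving: {avc_mb:.0f} MB -> {new_mb:.0f} MB per hour")
```

Because the size scales linearly with bit rate, the percentage saving in bandwidth carries over directly to the consumer's data usage.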
- Encode 4K using AVC. This is unlikely, as the bandwidth costs would be prohibitive.
- Not provide 4K to those users. Again, an unlikely scenario.
- Provide HEVC variants. The most likely outcome.
At this point, comparative studies show HEVC to be the superior format. As described earlier, this may change over time as the codecs mature. Also in question is the “patent free” status of VP9. Given that Google had to license patents from the MPEG-LA patent pool for VP8, it is likely that VP9 also infringes patents in the MPEG-LA patent pool (which has grown to include HEVC technology). Will Google need to license the technology from MPEG-LA? Will MPEG-LA be willing to grant the license, or will they see it as an opportunity to hurt a competing format?
We are likely to see a lot of movement in this space over the coming year. As more 4K+ devices become available, more 4K content will become available. A clearer picture should then emerge on the adoption of HEVC and VP9, and use will filter down into lower resolution video. It is then that we will see the real impact on the mobile market.
A brief note on delivery protocol and container. The industry trend has been a move from traditional HTTP progressive download formats such as MP4, 3GPP, and Flash Video to adaptive streaming formats such as HTTP Live Streaming (HLS) or MPEG-DASH.
With yet another set of resolutions and codecs being made available, this trend will continue. MPEG-DASH is codec agnostic, whereas HLS supports only AVC. I would expect Apple to revise HLS to also support HEVC. These adaptive streaming formats allow the content publisher to advertise all of the available variant formats of a video, and for the device to choose the variant to be delivered based on factors such as supported codec, network bearer, and bandwidth.
Are there any challengers to HEVC and VP9 in the next generation of video codec? The Mozilla Foundation is working on a truly royalty free next generation codec named ‘Daala’. The project is in very early development. However, its goal is:
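The variant selection described above can be sketched in a few lines. This is a simplified illustration, not how any particular HLS or DASH player works: the `Variant` type, the `choose_variant` function, and the manifest contents are all hypothetical, and real clients also weigh buffer state, screen size, and measured throughput over time.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    codec: str         # e.g. "avc", "hevc", "vp9"
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # advertised bit rate

def choose_variant(variants, supported_codecs, bandwidth_kbps):
    """Pick the highest bit rate variant the device can decode and sustain."""
    playable = [
        v for v in variants
        if v.codec in supported_codecs and v.bitrate_kbps <= bandwidth_kbps
    ]
    if not playable:
        return None  # nothing fits; a real player would fall back or fail
    return max(playable, key=lambda v: v.bitrate_kbps)

# Hypothetical set of variants advertised by a publisher.
manifest = [
    Variant("avc", 360, 800),
    Variant("avc", 720, 2500),
    Variant("hevc", 720, 1500),
    Variant("vp9", 1080, 3000),
]

# A device that decodes AVC and HEVC on a link sustaining about 2000 kbps.
choice = choose_variant(manifest, {"avc", "hevc"}, 2000)
print(choice)
```

In this example the device ends up with the 720p HEVC variant: the 720p AVC variant exceeds the available bandwidth, and the VP9 variant is not decodable. This is the mechanism by which a more efficient codec delivers a higher resolution over the same connection.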
“… to provide a free to implement, use and distribute digital media format and reference implementation with technical performance superior to h.265.”
This will be an interesting project to watch.
AVC will likely remain the dominant video codec for the mobile market in the short term. Content providers will not provide HEVC or VP9 variants for mobile until a critical mass of mobile devices provide support. The mobile market will not have critical mass until hardware accelerated chipsets are available, and these devices permeate the market.
It is hard to predict a clear winner between HEVC and VP9, and there may be room in the market for both. Apple has shown no sign of adding support for VPx codecs, and has maintained its support for the MPEG standards (AVC and HEVC). Given the popularity of Apple devices, HEVC is all but guaranteed a place in the ecosystem. The first 4K device from Apple should validate this. The success of VP9 depends on support from clients and content providers, and in Google the owner of the standard has a large share of the client and content market.