Update: Please check our latest blog for the latest developments in 360 video encoding.

I recently compared the encoding settings used by several leading VR distribution platforms, including YouTube, Jaunt, Littlstar and Samsung VR. The conclusion: apart from YouTube, all of these major platforms still rely on the 14-year-old (!!) H.264/AVC codec instead of the more recent and much more efficient H.265/HEVC or VP9 codecs, resulting in larger file sizes and lower image quality. It’s about time to shed some light on the pros and cons of these and other advanced codecs, and explore what the future of video encoding looks like. Let’s go!

What’s wrong with H.264?

We have to start this journey by exploring why we believe H.264 has seen its best days. After all, what is there not to like? H.264 video runs smoothly on nearly any device imaginable (even iPhones), because it has been around for ages and so hardware support is ubiquitous. It is also very quick to encode videos to H.264, saving you hours of precious render time.

However, the reason it is so fast to encode videos to H.264 is also the reason why these videos have a HUGE file size: this 14-year-old codec is not that complex, which makes it fast, but also not so efficient at compressing video!

On top of that, H.264 does not support resolutions over 4K in practice (4096×2304 pixels is the maximum implemented in many pieces of video software, although it’s not a hard limit of the codec). This is problematic even for the current generation of stereoscopic 360° workflows, where 4096×4096 files are common. So 6K and 8K resolutions, which the next generation of VR cameras promise to deliver, are definitely out of the question with H.264.
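To see where that 4096×2304 figure comes from, here is a minimal sketch of the frame-size arithmetic. It assumes that H.264 levels 5.1 and 5.2 cap a frame at 36,864 macroblocks of 16×16 pixels, which is my reading of the level table in the H.264 spec, so double-check that number before relying on it. The function names are made up for illustration.

```python
# Assumption: H.264 levels 5.1/5.2 allow at most 36,864 macroblocks per frame
# (verify against the level table in the spec before relying on this number).
MACROBLOCK_SIZE = 16
MAX_MACROBLOCKS_LEVEL_5_1 = 36_864


def macroblocks(width: int, height: int) -> int:
    """Count the 16x16 macroblocks needed to cover a frame, rounding up."""
    mb_wide = -(-width // MACROBLOCK_SIZE)   # ceiling division
    mb_high = -(-height // MACROBLOCK_SIZE)
    return mb_wide * mb_high


def fits_level_5_1(width: int, height: int) -> bool:
    """True if the frame stays within the assumed level 5.1 frame-size cap."""
    return macroblocks(width, height) <= MAX_MACROBLOCKS_LEVEL_5_1


for width, height in [(3840, 2160), (4096, 2304), (4096, 4096)]:
    print(f"{width}x{height}: {macroblocks(width, height)} macroblocks, "
          f"fits: {fits_level_5_1(width, height)}")
```

Under that assumption, 4096×2304 lands exactly on the cap, while a 4096×4096 stereoscopic frame needs 65,536 macroblocks and does not fit.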

Finally, and most importantly, the image quality of H.264 is just not very good. Black often ends up looking gray, and blocky encoding artefacts are very common when the bitrate is a tad too low.

So now that we’ve bashed this golden oldie, let’s look at what some of the modern alternatives have to offer.

H.264 + 1 = H.265

H.264’s younger brother, H.265, often called HEVC for High Efficiency Video Coding, offers a massive improvement in compression efficiency. In fact, HEVC files are generally 50% (!!) smaller in file size than the same video encoded as H.264, with the same visual quality! Besides, colors look better, artefacts are less pronounced, and it supports resolutions up to 8K (8192×4320).
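To put that 50% into numbers, here is a quick back-of-the-envelope sketch. The 30 Mbit/s H.264 bitrate and the 5-minute duration are made-up example values, not measurements, and the calculation ignores audio and container overhead.

```python
def stream_size_mb(bitrate_mbit_s: float, duration_s: float) -> float:
    """Approximate size of a video stream in megabytes (8 Mbit = 1 MB)."""
    return bitrate_mbit_s * duration_s / 8


DURATION_S = 5 * 60                # hypothetical 5-minute clip
H264_BITRATE = 30.0                # hypothetical H.264 bitrate in Mbit/s
HEVC_BITRATE = H264_BITRATE * 0.5  # the "generally 50% smaller" claim

h264_mb = stream_size_mb(H264_BITRATE, DURATION_S)
hevc_mb = stream_size_mb(HEVC_BITRATE, DURATION_S)
print(f"H.264: {h264_mb} MB, HEVC: {hevc_mb} MB")
```

Since file size scales linearly with bitrate, halving the bitrate at the same visual quality halves both your storage and your streaming bandwidth.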

This should be enough reason for everyone to make the switch to HEVC, right?! Well, not so fast...

Despite the incredible compression of HEVC, this codec is not widely supported yet, apart from relatively new Android devices (which have supported it since Android 5.0). That is why many 360° video producers use HEVC to play back their content on the Samsung Gear VR headset.

HEVC is not supported on iOS devices and it plays terribly on most PCs, which makes it nearly impossible to review your Gear VR export until it’s copied to your Samsung. On top of that, you can’t upload an HEVC encoded video to YouTube, because it’s not supported as an input format, and chip makers are only now coming out with HEVC enabled hardware, even though the codec was already published in 2013.

So why this slow adoption of this otherwise brilliant codec? Two related words: money and patents.

The license fees for using the HEVC codec in hardware and/or software “are many times higher than what licensees paid for H.264 for the past decade”, and few companies dared to make this humongous investment before they were 100% sure HEVC would become the new industry standard.

These high license fees are partially explained by the fact that H.264 has one patent pool, while HEVC has two patent pools that want to see money. Having an additional patent pool not only costs more money, it also significantly increases the risk of parties suing each other for patent infringement, which is yet another reason for companies to avoid HEVC.

Another thing to consider is that new codecs like HEVC usually make a trade-off between compression and complexity, resulting in much smaller file sizes at the cost of slower encoding and more power required for playback.

Luckily, chip makers like Nvidia are releasing SDKs which allow hardware accelerated encoding and decoding of HEVC video, which promises to significantly speed things up, but only for devices with an Nvidia GPU in them. I have yet to come across a phone sporting an Nvidia 1080 Ti, so for now these performance improvements are limited to PCs. The Nvidia hardware encoding SDK still has some bugs as well: it does not encode the height of a video correctly when you use hardware accelerated encoding in combination with scaling the resolution, as I mentioned in my FFmpeg cheat sheet for 360 video.
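For reference, this is roughly what a hardware accelerated HEVC encode with scaling looks like when you build the FFmpeg command from Python. Treat it as a sketch: it assumes an FFmpeg build that includes the `hevc_nvenc` encoder, the file names and the helper function are placeholders, and the software `scale` filter is just one way to resize, not necessarily the exact combination that triggers the bug mentioned above.

```python
import shlex


def nvenc_hevc_command(src: str, dst: str, width: int, height: int,
                       bitrate_kbit: int) -> list:
    """Build (but do not run) an FFmpeg command for NVENC HEVC encoding."""
    return [
        "ffmpeg",
        "-i", src,
        "-vf", f"scale={width}:{height}",  # resize before encoding
        "-c:v", "hevc_nvenc",              # Nvidia hardware HEVC encoder
        "-b:v", f"{bitrate_kbit}k",        # target video bitrate
        "-c:a", "copy",                    # leave the audio untouched
        dst,
    ]


# Placeholder file names; pass the list to subprocess.run() to execute it.
cmd = nvenc_hevc_command("input.mp4", "output.mp4", 3840, 2160, 15000)
print(shlex.join(cmd))
```

Given the height bug, it is worth checking the resolution of the output file (for example with ffprobe) before shipping it.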

In short, HEVC is an amazing codec which is unfortunately not yet widely supported, and which has several things working against it becoming the new industry standard (read: costs and patents).

Google to the rescue

Google owns YouTube, and since YouTube has a ton of videos on it, it would quickly become extremely expensive to either:

  1. Stream all these videos as large H.264 files
  2. Or to pay the enormous license fees of HEVC

So Google thought: “We have some smart people working here and we have loads of cash, so let’s buy a little compression company and build a royalty-free codec ourselves!” And this is exactly what happened when Google purchased On2, together with their VP8 codec, for $106 million back in 2009.

Google immediately open-sourced the codec and was hoping it would be an H.264 killer. Unfortunately, H.264 had already become the industry standard, and VP8 was simply not enough of an improvement to revolutionize the industry.

But the codec kept evolving and resulted in VP8’s younger and way more powerful sibling, VP9. When Netflix tested VP9 versus HEVC, their conclusion was that these two codecs had fairly similar performance and compression rates, but that VP9 wins in compressing videos larger than HD, which is why VP9 might be extremely suitable for high-res VR videos (8K@120fps anyone?).

On top of that, Android devices have supported VP9 since Android 4.4 (HEVC since 5.0), the codec is open-source and royalty-free, and it plays so smoothly on PCs that we were able to play a 4096×4096@60fps VP9 video without any hiccups in our Chrome browser… on a laptop (!!), while a 3840×2160@30fps HEVC video wouldn’t play on that same device at all.
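The playback support described so far can be written down as a tiny lookup. This is only a sketch of the reasoning in this post: the function name and the platform strings are invented, it simply prefers VP9 wherever we saw it play back, and it falls back to H.264 everywhere else.

```python
def pick_codec(platform: str, os_version: tuple = (0, 0)) -> str:
    """Pick the most efficient codec the device is assumed to play back."""
    if platform == "android":
        if os_version >= (4, 4):
            return "vp9"    # supported since Android 4.4
        return "h264"       # older Android: stick with the safe choice
    if platform == "pc":
        return "vp9"        # played smoothly in Chrome in our test
    return "h264"           # iOS and anything unknown: the safe fallback


print(pick_codec("android", (5, 0)))
print(pick_codec("android", (4, 1)))
print(pick_codec("ios", (10, 3)))
```

Comparing version tuples like `(4, 4)` avoids the classic string comparison trap where "10.0" sorts before "4.4".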

The what-bitrates-do-we-use box

To get an optimal quality-to-size ratio, it is recommended to encode your videos using a variable bitrate. But what bitrate is ideal?

After countless tests, we determined that for a 3840×2160@30fps VP9 or HEVC video, a target bitrate of 15 Mbit/s works very well. As a general rule of thumb, we double this bitrate if we double the resolution (the number of pixels) and up it by 50% if we double the frames per second.

This results in the following list:

  • 3840×2160@30: 15 Mbit/s
  • 3840×2160@60: 22 Mbit/s
  • 4096×4096@30: 30 Mbit/s
  • 4096×4096@60: 45 Mbit/s

Because we use a variable bitrate, the actual bitrate of the output video is usually slightly higher or lower than the target bitrate you set.
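The rule of thumb above is easy to turn into a small helper. This is a sketch, not our actual export pipeline: the function names are made up, the rule is applied per doubling exactly as described, and the FFmpeg arguments assume a build that includes the `libvpx-vp9` encoder.

```python
BASE_BITRATE_MBIT = 15.0  # 3840x2160 @ 30 fps, from the list above


def target_bitrate_mbit(resolution_doublings: int, fps_doublings: int) -> float:
    """Double per resolution doubling, add 50% per frame rate doubling."""
    return BASE_BITRATE_MBIT * 2 ** resolution_doublings * 1.5 ** fps_doublings


def vp9_command(src: str, dst: str, bitrate_mbit: float) -> list:
    """Build (but do not run) an FFmpeg command for a VP9 encode."""
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", "libvpx-vp9",
        "-b:v", f"{round(bitrate_mbit * 1000)}k",  # target (average) bitrate
        dst,
    ]


# (resolution doublings, fps doublings) for the four entries in the list
for res, fps in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(res, fps, target_bitrate_mbit(res, fps))

# Placeholder file names for a 4096x4096 @ 60 fps export
print(" ".join(vp9_command("input.mp4", "output.webm", target_bitrate_mbit(1, 1))))
```

Note that the rule gives 22.5 for 3840×2160@60, which the list rounds down to 22.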

VP9 even supports transparent videos, which is how we were able to create the effect in the video below:

So if VP9 is so amazing, why isn’t everyone using it then, except for YouTube and Headjack? Honestly, I don’t understand this at all. The only reason I can think of is that everyone knows H.264, and so H.265/HEVC might seem like the logical successor. I think that besides Netflix, not many companies have given VP9 an honest chance.

Of course VP9 has its downsides. For example, some video players, like VLC, have trouble playing back VP9 smoothly, while FFplay-based players like MPC-HC have no trouble with it at all. Encoding a video to VP9 is also fairly slow, even compared to HEVC, since hardware accelerated encoding is not really available for it yet. Hardware decoding is a different story: both Intel and ARM chips have it built in, which is why VP9 videos play so smoothly on both PCs and mobile devices (except for the prissy iPhone again, of course).

One final little quirk with this codec is that you are currently able to upload VP9 video to YouTube, but you’re not able to add 360 metadata to a VP9 file yet, so you can’t use it to upload a 360 video to YouTube… very silly. I already opened an issue for this with Google, so hopefully they will address this soon.

To summarize, VP9 offers compression and performance similar to HEVC, but plays back smoothly on both mobile and PC, which is the main reason we switched Headjack from H.264 & HEVC to VP9. Best of all, this codec is FREE to use, so no license fees or patent quibbles, which is why I hope that hardware accelerated encoding will arrive soon.

Why won’t they switch?

We just covered the downsides of the old but widely used H.264 codec, and how more advanced codecs like HEVC and VP9 offer better image quality at lower bitrates, allowing even people with relatively slow internet speeds to stream 4K video. So with all these benefits, why aren’t more parties switching over to these advanced codecs? I’d say because H.264 is easy! You don’t have to worry about device support, it just works. It is also cheap (although VP9 is free), and fast to encode.

Both HEVC and VP9 are amazing codecs, but they do have their downsides. For example, since advanced codecs require more processing power to decode, you’re more likely to overheat your Samsung S6 by playing back an HEVC video than an H.264 video. However, even though the current generation of advanced codecs is still fighting for a share of the market, next generation codecs are already being developed in full force! What do these futuristic codecs promise to deliver and when can we expect them to arrive?

The future of video codecs

When we look into the crystal ball to see what the future of video encoding looks like, we see quite a few contenders:

  • VP10
    Successor of the open-source VP9 codec, which promises to cut VP9 file size in half
  • Daala
    Open-source codec from Mozilla and Xiph (the guys behind FLAC, Theora, Vorbis, Opus and Ogg)
  • Thor
    Open-source codec from Cisco
  • AV1
    Amazon, Cisco, Google, Intel, Microsoft, Mozilla and Netflix formed the Alliance for Open Media to develop a royalty-free codec called AV1
  • NETVC
    A standardization project by the Internet Engineering Task Force (IETF) for a next-generation royalty-free codec

Who can spot the pattern here? It seems everyone despises the insane licensing fees of the HEVC codec and believes that the internet is better off with a royalty-free, open-source codec instead. I personally can’t wait for the day the MPEG crew sees its money river dry up and we can finally have a free-to-use, ubiquitous, superior video codec that runs on all devices and works in all players.

In the list above, NETVC is in our opinion most likely to create the new industry standard (which will hopefully support spatial audio natively), because it is run by the IETF, the main standards organization of the internet. NETVC will likely take the best parts of the other codecs in the list to create one best-in-class, open standard. So when can we expect this next-gen codec to arrive?

According to the NETVC website, they are submitting the storage format and codec specifications to the Internet Engineering Steering Group (IESG) in May 2017, so right now! If everything goes according to plan, test results should be in by December of this year. All in all, I wouldn’t expect to see any of these next-gen codecs appear in your Adobe Premiere export window until the end of 2018 at the earliest. Until then, VP9 is our weapon of choice!

UPDATE: As Tsahi mentioned, the Alliance for Open Media is working on the AV1 codec, which is a consolidation “of three potentially competitive open source codecs; Cisco’s Thor, Google’s VP10, and Mozilla’s Daala…most of the code will come from VP10”.