
Posted by Matthew Shotton

We're open-sourcing some of our initial work exploring the challenges of creating object-based video experiences in the browser.

The ambitions of object-based media bring a whole new range of challenges when it comes to the composition and rendering of media. We've done a number of experiments to show how pushing the final rendering of the media to a client device can create some really novel user experiences. To date, most of the experiments we've released have focused on audio; this is in part due to the Web Audio API, which provides a really rich set of functionality for creating audio-based experiences in the browser. Recently we've been exploring the challenges associated with creating object-based video experiences in the browser.

I'm really happy to say we're open-sourcing some of our initial work in this area: the HTML5-Video-Compositor. This is an experimental JavaScript library for playing back dynamically modifiable edit decision lists (EDLs) and applying WebGL shader-based effects.
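To give a flavour of what this looks like, here's a minimal sketch of playing back a two-clip EDL on a single track. The playlist shape and the property names (tracks, src, start, duration) are illustrative assumptions rather than the library's documented API; see the project's own documentation for the real interface.

```javascript
// A minimal sketch of driving the compositor with an EDL. The playlist
// property names here are illustrative assumptions, not a definitive
// reference to the library's API.
var canvas = document.getElementById('player-canvas');
var compositor = new VideoCompositor(canvas);

// Two clips cut back-to-back on a single track.
compositor.playlist = {
    tracks: [
        [
            { type: 'video', src: 'clip-a.mp4', start: 0, duration: 4 },
            { type: 'video', src: 'clip-b.mp4', start: 4, duration: 6 }
        ]
    ]
};

compositor.play();
```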

Looking forward

Internally we've been using this library to prototype user experiences which we envision might be available anywhere between one and fifteen years in the future. To allow this, we've focused the development of the library heavily on modern and evolving web technologies, which lets us concentrate first on creating compelling user experiences and then use those use cases to drive technology development. The unfortunate side effect is that the library is unlikely to work on the full range of devices and browsers currently in the wild.

Using this library we've built prototypes for variable-length news stories, text-based A/V editing tools, object-based weather reports, and a range of other experiments.

Timing and synchronisation

I have this theory that 90% of the problems in broadcast engineering are timing- and synchronisation-related. The compositing of media in the browser is no exception. The core functionality we've tried to capture in the HTML5 video compositor is the timing and synchronisation of the playback of videos and other media sources. The library allows you to cut together two separate videos near-seamlessly, synchronise the playback of multiple videos in parallel, and manage the pre-loading of videos in a just-in-time fashion, alongside a range of other functionality.
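As an illustration of the just-in-time preloading idea, a sketch of the general technique (not the library's actual code) might buffer the next clip shortly before the current one ends:

```javascript
// Sketch of just-in-time preloading: shortly before the current clip
// ends, create and buffer a video element for the next clip so the cut
// between the two is as close to seamless as the browser allows.
// An illustration of the technique, not the library's implementation.
function preloadNextClip(src, onReady) {
    var video = document.createElement('video');
    video.preload = 'auto';
    video.src = src;
    // 'canplaythrough' fires once the browser estimates it can play to
    // the end without stalling for further buffering.
    video.addEventListener('canplaythrough', function handler() {
        video.removeEventListener('canplaythrough', handler);
        onReady(video);
    });
    video.load();
}

function scheduleJustInTime(currentVideo, nextSrc, leadTimeSeconds, onReady) {
    currentVideo.addEventListener('timeupdate', function check() {
        var remaining = currentVideo.duration - currentVideo.currentTime;
        if (remaining <= leadTimeSeconds) {
            currentVideo.removeEventListener('timeupdate', check);
            preloadNextClip(nextSrc, onReady);
        }
    });
}
```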

Although the video compositor renders its output to a single canvas, under the hood an HTML5 video element is created for each clip in the EDL. Because there is currently no implemented standard for synchronising multiple videos, timing is potentially problematic.
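One pragmatic way to keep parallel elements together, sketched below under the assumption of a master-clock approach (not necessarily what the library does internally), is to treat one element as the reference and reseek any other that drifts beyond a tolerance:

```javascript
// Sketch of keeping two video elements in sync in the absence of a
// standard mechanism: pick one element as the master clock and reseek
// a slave that drifts beyond a tolerance. An illustration of the
// general technique, not the library's implementation.
function syncToMaster(master, slave, toleranceSeconds) {
    function correct() {
        var drift = slave.currentTime - master.currentTime;
        if (Math.abs(drift) > toleranceSeconds) {
            // Seeking is coarse and can itself cause a stall, which is
            // one reason frame accuracy can't be guaranteed.
            slave.currentTime = master.currentTime;
        }
        if (!master.paused) {
            window.requestAnimationFrame(correct);
        }
    }
    window.requestAnimationFrame(correct);
}
```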

This library takes a pragmatic approach to the issues of timing and sync: we attempt to do the best we can with what current state-of-the-art browsers provide, without guaranteeing frame accuracy. This is done in the hope that the technology will improve in time to fill any gaps in the quality of the rendered output. That said, the subjective results of the output with modern web browsers have been very promising.

Effects & render graphs

Although the initial focus of this work was on the timing and synchronisation of the playback of videos, we also wanted to explore what was possible in terms of compositing and effects.

Typically, per-pixel operations on images and videos can be highly CPU-intensive, especially when you might want to perform a range of operations on various sources before they are finally composited (cross-fades, chroma-keying, overlays, etc.). Fortunately most modern browsers now support WebGL, which provides a mechanism for offloading these intensive but often easily parallelisable tasks to the GPU, which is much better suited to performing them.
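A cross-fade is a good example of how naturally these operations map onto a fragment shader, which the GPU runs once per output pixel. The sketch below is illustrative; the uniform names are assumptions, not the shader interface the library defines.

```javascript
// A GLSL fragment shader for a cross-fade between two video sources,
// executed once per output pixel on the GPU. The uniform names are
// illustrative, not the interface the library actually uses.
var crossFadeFragmentShader = [
    'precision mediump float;',
    'uniform sampler2D u_sourceA;',   // first video frame, as a texture
    'uniform sampler2D u_sourceB;',   // second video frame, as a texture
    'uniform float u_mix;',           // 0.0 = all A, 1.0 = all B
    'varying vec2 v_texCoord;',
    'void main() {',
    '    vec4 a = texture2D(u_sourceA, v_texCoord);',
    '    vec4 b = texture2D(u_sourceB, v_texCoord);',
    '    gl_FragColor = mix(a, b, u_mix);',
    '}'
].join('\n');
```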

We've implemented a simple way to apply effects to videos played through the video compositor. This currently has a number of limitations, such as only allowing one effect per video source. We're exploring how to develop this further into a much more general-purpose render-graph-based approach, taking inspiration from the Web Audio API and existing libraries like seriously.js.
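Purely to make the render-graph idea concrete, an API in that style might eventually look something like the speculative sketch below, modelled on the Web Audio API's node-and-connect pattern. None of these calls exist in the current library; they only illustrate the direction of travel.

```javascript
// Purely speculative sketch of a render-graph style API, modelled on
// the Web Audio API's node-and-connect pattern. These calls do not
// exist in the current library.
var videoNode = compositor.createVideoSourceNode('clip-a.mp4');
var keyNode = compositor.createEffectNode(chromaKeyShader);
var fadeNode = compositor.createEffectNode(crossFadeShader);

// Chain effects, then route the result to the output canvas -- the
// one-effect-per-source limit disappears once effects are graph nodes.
videoNode.connect(keyNode);
keyNode.connect(fadeNode);
fadeNode.connect(compositor.destination);
```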

Next steps

This library is one small step on the path towards an object-based media future. As we continue our work exploring the impact of object-based experiences on client-side rendering we look forward to continuing to contribute our work back to the wider open source community for people to experiment, prototype and build on top of.
