We’ve had the HTML <video> element for over a decade. Yet, everyone still defaults to embedding YouTube frames instead of hosting their own videos. The underlying problem is that the <video> element isn’t suitable for embedding short video files on webpages.
The <video> element works great for large streaming platforms and tube sites. However, video is nowhere near as simple to use as other adaptive embedded media, such as responsive images.
You need about half an hour of learning to get started with responsive images. All you need to do is include a specially formatted list of the same image in different sizes and file formats in your HTML document. The web browser uses the list to pick a format it supports at the right dimensions for the visitor’s device. Here’s a slightly simplified example:
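A minimal sketch of such a list could look like the following. The file names, widths, and the 700 px breakpoint are assumptions made up for illustration.

```html
<picture>
  <!-- Newer format first; browsers without AVIF support skip this source -->
  <source type="image/avif"
          srcset="photo-480.avif 480w, photo-960.avif 960w"
          sizes="(min-width: 700px) 960px, 100vw">
  <!-- Widely supported fallback format, offered in the same two widths -->
  <img src="photo-960.jpg"
       srcset="photo-480.jpg 480w, photo-960.jpg 960w"
       sizes="(min-width: 700px) 960px, 100vw"
       alt="Description of the photo">
</picture>
```

The browser walks the list from top to bottom, takes the first format it can decode, and then picks the width that best matches the layout and the screen density.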
Neat, right? It’s a bit more complicated than that, but those are the essentials. Given some familiarity with the basics of HTML, you can guess what it does and quickly learn the syntax. Using media queries, you can even respect user preferences like reducing motion, saving data, and dark mode.
The story isn’t the same when it comes to video. The HTML <video> element is similar to the <picture> element. It even uses the same <source> element to list videos in different codecs/formats, and you place fallback content inside it the same way.
However, the <source> element does not currently support either the srcset or sizes attributes for <video>. You can only set a single source (src), and its container and codecs information through the type attribute.
HTML doesn’t give web authors any way to send a high-resolution video to a desktop or tablet and a lower-resolution one to a mobile phone. You can send an oversized video to mobile devices, but at potentially high data and battery costs. Or you can send an undersized video to desktops and have it scaled up, with ugly upscaling artifacts. A 720 px wide (720×405 px) video suitable for desktops and tablets contains 2.25 times as many pixels (and roughly 2.1 times as much data) as a 480 px wide (480×270 px) video file for mobile.
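The pixel ratio follows directly from the two frame sizes. The data ratio depends on the encoder and its settings, so it isn’t derived here.

$$
\frac{720 \times 405}{480 \times 270} = \frac{291\,600}{129\,600} = 2.25
$$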
You also have to spend time learning and integrating a complicated new library into your documents. Serving video is still relatively expensive, so you might also need a separate library to reduce the hosting costs (e.g. WebTorrent). If you’re planning on publishing many videos, it might be worth it. However, it’s too much overhead just to add a few minutes of video to a blog post every once in a while.
Neither HLS (HTTP Live Streaming) nor DASH (Dynamic Adaptive Streaming over HTTP) is suitable when you “just need to add a simple video to a webpage”. They’re too complicated and too powerful for such a simple use case. The HTML standard has left this gap unfilled for a decade. That might help explain why everyone just defaults to embedding a frame hosted by YouTube when they want video on their websites. HTML video is too much work even if you’re motivated to host it yourself.
Maybe I’m asking for a faster horse here, but I do believe the HTML standard needs to address this issue. There needs to be a simpler way to embed a video on a page and have the web browser pick a file with dimensions appropriate to the device. The default web browser multimedia player also needs to add a control to let the viewer override the quality picked by the browser.
Scott Jehl kickstarted the discussion about this in January 2021 with his call to add the media attribute to the video source element. It’s supported in Safari, and was part of the HTML standard a decade ago. It was removed from the standard, but … Safari isn’t known for keeping up with the times when it comes to web standards.
The proposal enables web authors to specify different video sources for different screen resolutions. It wouldn’t let the user override the choice, and it’s unclear how switching to full-screen would be handled. It’s currently being discussed in the Web Hypertext Application Technology Working Group (WHATWG), the organization that currently maintains the HTML standard.
Here’s an example using the capabilities proposed by Scott Jehl. In this example, screens of 700 px or larger get a large video file, and smaller screens get a small one instead. You can go more granular than this, but the example below would already get 90 % of the job done.
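A sketch of what that could look like, assuming two pre-encoded MP4 files named large.mp4 and small.mp4 and a media attribute on the video source element as proposed:

```html
<video controls>
  <!-- Picked only when the viewport is at least 700 px wide -->
  <source src="large.mp4" type="video/mp4" media="(min-width: 700px)">
  <!-- No media attribute: used whenever the source above doesn't match -->
  <source src="small.mp4" type="video/mp4">
  <!-- Fallback content for browsers without video support -->
  <a href="small.mp4">Download the video</a>
</video>
```

Sources are evaluated in document order, so the most specific media query goes first and the unconditional source goes last.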
This would be an improvement over the status quo. The media query lets the browser pick a more appropriate source, but it’s left up to the document author to decide what’s best for different devices. The syntax doesn’t give the web browser any information about how the video sources differ, so the browser can’t make an informed decision about which one is best. Without this information, it would also be impossible for the browser to display a control that lets viewers choose their preferred playback quality.
I believe that a better solution would be to use the sizes (and possibly even srcset) attributes instead of abusing media queries. Just for a minute, forget how this attribute is used on source elements descending from a picture element. Instead, think of how it’s used for picking favicons. Web authors can include multiple favicon files and the browser looks at the sizes attribute to pick an appropriate size. Let me try to explain it with another example:
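The sketch below puts the existing favicon syntax next to a hypothetical video equivalent. The sizes attribute on a video source is not valid HTML today. It is an assumption that borrows the width-by-height notation from favicons, and the file names are made up.

```html
<!-- Existing syntax: the browser picks the icon whose size fits best -->
<link rel="icon" href="icon-32.png" type="image/png" sizes="32x32">
<link rel="icon" href="icon-192.png" type="image/png" sizes="192x192">

<!-- Hypothetical syntax: each source declares the dimensions of its file -->
<video controls>
  <source src="large.mp4" type="video/mp4" sizes="720x405">
  <source src="small.mp4" type="video/mp4" sizes="480x270">
</video>
```

Unlike a media query, the declared dimensions describe the files themselves, which is the information a browser would need to choose between them.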
The browser could then look at the intrinsic size of the video element, the screen resolution, and network conditions, and pick the most appropriate source. It could even display a button to let users switch between the available video resolutions.
There are still issues with this approach, but it would make responsive videos on the web just as simple as responsive images. For example, what happens if you’re watching small.mp4, and switch to full-screen mode? Surely, you expect it to switch to large.mp4 instead and continue playback at the same time position. What if the two video files are of different durations? There’s a hornet’s nest of potential issues, but I’d take the occasional stings over the status quo any day of the week.
There are still unresolved questions, however. For example, the poster attribute lets you specify a placeholder poster image. Do we need a new posterset attribute to provide a set of responsive images at different resolutions? Then what about image formats? Or using a keyframe from the video file? Should posters be moved inside the video element as another descendant source element with a kind="poster" attribute? I don’t know.
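To make the last question concrete, here is the current poster syntax next to one possible shape of the source-based alternative. The kind attribute on source does not exist in HTML. It and the file names are purely hypothetical.

```html
<!-- Today: a single poster image in one size and one format -->
<video controls poster="poster.jpg">
  <source src="video.mp4" type="video/mp4">
</video>

<!-- Hypothetical: posters listed as sources, so formats can be negotiated -->
<video controls>
  <source kind="poster" src="poster.avif" type="image/avif">
  <source kind="poster" src="poster.jpg" type="image/jpeg">
  <source src="video.mp4" type="video/mp4">
</video>
```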
One thing’s for sure: either we need much cheaper and faster smartphones with virtually free data plans, or HTML video needs to be overhauled to allow for responsive videos. Or we could ignore the problem and continue outsourcing it to YouTube. It works pretty well if you don’t mind centralization and Google injecting ads into your videos.