Inside of WebRTC adaptive bitrate streaming algorithm from Softvelum

The streaming media industry has been adopting ultra-low latency technologies for years, and WebRTC has definitely taken the lion’s share. As adaptive bitrate (ABR) became a crucial part of content delivery, WebRTC vendors had to follow and come up with their own solutions.

While traditional HTTP-based ABR protocols like HLS and DASH remain widely used, they struggle to meet low-latency demands: their latency typically ranges from 2 to 10 seconds at best. WebRTC ABR enables sub-second latencies (often under 500 ms), making it ideal for use cases like online gambling, auctions, live sports, and betting. As the need for lower latency grows, WebRTC ABR algorithms offer a powerful alternative for delivering high-quality live content with the lowest possible delay.

Comparing adaptive bitrate: HLS vs. WebRTC

In traditional adaptive bitrate playback, like HLS or MPEG-DASH, the client side typically selects a video rendition based on network conditions, available bandwidth and buffer status. The server provides multiple renditions, and the client dynamically switches between them by downloading segments of different quality to ensure smooth playback. This client-side approach is suitable for non-interactive streaming but introduces higher latency due to buffering.

In WebRTC, the server plays a more active role in adjusting the stream quality. WebRTC relies on real-time peer-to-peer connections where the server dynamically adjusts bitrate and video resolution based on instantaneous feedback from the client. Unlike HLS, WebRTC does not use buffering for latency management. Instead, it continuously adapts to network conditions in real time to deliver a nearly instant viewing experience, making it ideal for interactive use cases.

Bandwidth estimation (BWE) can be implemented in a number of ways. The existing challenges and potential solutions are outlined in the outstanding Bandwidth Estimation (BWE) and Janus article by Lorenzo Miniero.

Nimble Streamer media server utilizes an ABR BWE approach based on a new predictive algorithm that uses Round-Trip Time (RTT) data from Transport-Wide Congestion Control (TWCC) feedback. This algorithm estimates clients’ network conditions and distance more accurately by analyzing RTT fluctuations and building its own metrics around them. This allows serving the greatest number of WebRTC clients with the optimal renditions they are capable of receiving, while saving Nimble Streamer hardware and network resources. Nimble WebRTC output is streamed using WHEP signaling with the help of the Pion framework.

In this article we describe this ABR algorithm in detail.

ABR renditions preparation

In WebRTC ABR streaming, the client communicates its codec capabilities and preferences to the server through the initial SDP (Session Description Protocol) offer. This offer includes a list of supported codecs, with the client’s preferred codec listed first. The server reviews this list and responds with an SDP answer that confirms the selected codec based on compatibility. This codec negotiation ensures that both client and server can optimize streaming quality while dynamically adapting to network conditions. Hence it’s important to note that the greater the diversity of the provided renditions in terms of codecs, bitrates, and frame rates, the more devices and applications will be capable of playing the WebRTC ABR stream.

The server won’t drop a client with poor bandwidth; such a client will receive the lowest rendition instead. We recommend keeping such clients in mind and including the lowest possible bitrate and resolution in the ABR encoding ladder for them. In our experience, it’s better not to cancel or black out such sessions (i.e. send black frames instead of video). So start the encoding ladder from 360p or even 240p.

Please also refer to our WebRTC adaptive bitrate WHEP article to get familiar with specifics of Nimble Streamer setup for such scenarios in case you’d like to try it.

Difference with other algorithms

All available WebRTC ABR algorithms have some issues to consider.

Older algorithms like Google Congestion Control decrease the resolution for a client when packet loss is detected. That’s not always correct, because packet loss may be high even when RTT is low, as in a LAN over Wi-Fi or in cellular networks. A higher resolution could still be delivered in such cases.

Another issue is that increasing the outgoing bitrate can lead to network exhaustion. Bandwidth is fixed and limited in the real world. When some implementations discover that network conditions are good, they increase the resolution for all sessions. Imagine a regular 720p rendition at 4 Mbps provided to all the clients: a 1 Gbps uplink can serve only 250 such sessions.
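The arithmetic behind that figure is easy to reproduce. The snippet below is only a back-of-the-envelope sketch: the function name is hypothetical, and it ignores audio, protocol overhead and retransmissions, so real capacity would be lower.

```python
def max_sessions(uplink_mbps: int, rendition_mbps: int) -> int:
    """Upper bound on simultaneous sessions when every client gets one rendition.

    Assumption: only video payload is counted; audio, RTP/SRTP overhead
    and retransmissions are ignored.
    """
    if rendition_mbps <= 0:
        raise ValueError("rendition bitrate must be positive")
    return uplink_mbps // rendition_mbps


# 720p at 4 Mbps on a 1 Gbps uplink, as in the example above.
print(max_sessions(1000, 4))  # 250
# The same uplink with a hypothetical 1 Mbps low rendition.
print(max_sessions(1000, 1))  # 1000
```

Upgrading every session at once multiplies the outgoing traffic, which is why the number of sessions that get a higher rendition has to be limited.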

Some algorithms don’t take other sessions into account: they provide the same increased or decreased rendition to all the clients regardless of their respective network conditions.

Unlike other algorithms, Nimble Streamer “keeps in mind” all the clients getting the stream and knows all required metrics about each session. This allows calculating the best rendition to provide for a specific set of sessions. Let’s see how it’s achieved.

Notice: for the sake of simplicity, we’ll call an instance of Softvelum Nimble Streamer media server just “Nimble” going forward in this article.

The algorithm

When a session has just connected, it does not yet report any stats about itself and cannot take part in the calculations for choosing a resolution. That’s why Nimble initially sends the lowest rendition only. This also reduces the zapping time (start-up time) of the stream, making the start-up nearly instantaneous.

Over time, all active clients will send session data to Nimble, allowing it to determine whether to provide an improved rendition to each client or not. If a session fails to send metrics, or Nimble does not receive them, that session is cut after a specific period.

The Swarm

The algorithm relies on several key metrics derived from client feedback delivered via TWCC, which guide server decisions for managing specific sessions. These metrics include:

RTT Index which reflects the client’s network latency,
Bitrate Index which indicates the client’s current rendition and
Penalty Time used for managing sessions which failed to achieve a better rendition and have a high packet loss ratio.

Nimble uses these metrics to adjust the rendition of each individual session or to keep current streaming quality.

Nimble creates a list called the Swarm to manage accepted sessions, including their RTT Index, Bitrate Index and Penalty Time.

For each session, Nimble calculates an RTT Index based on its RTT value returned via TWCC for a specific time interval. The lowest value on this interval is picked as the RTT Index. Clients with the lowest RTT Index, indicating favorable network conditions, are selected for resolution upgrades and placed in the Acceleration Swarm. This process of resolution upgrade is called Acceleration.
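The “lowest value on the interval” rule can be sketched as follows. This is a minimal illustration, not Nimble’s actual code: the window length, the sample format and the function name are assumptions, and the sketch treats the RTT Index simply as the minimum RTT in milliseconds.

```python
from typing import Optional, Sequence, Tuple

WINDOW_SECONDS = 5.0  # assumption: the real interval length is not published


def rtt_index(samples: Sequence[Tuple[float, float]], now: float) -> Optional[float]:
    """Return the lowest RTT (ms) observed within the window, or None.

    `samples` holds (timestamp_seconds, rtt_ms) pairs derived from TWCC
    feedback. None means the session has not reported anything in the
    window, so it has no RTT Index yet.
    """
    recent = [rtt for ts, rtt in samples if now - WINDOW_SECONDS <= ts <= now]
    if not recent:
        return None
    return min(recent)


samples = [(0.0, 180.0), (7.0, 95.0), (8.5, 42.0), (9.0, 130.0)]
print(rtt_index(samples, now=10.0))  # 42.0 (the 180 ms sample is outside the window)
print(rtt_index([], now=10.0))       # None
```

Taking the minimum rather than the average is a common way to filter out transient queuing spikes, so the value stays close to the base delay of the path.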

However, not all sessions qualify for this Swarm. Sessions with an RTT exceeding 600ms (the highest possible RTT Index), or those with a Penalty Time are excluded. In addition to that, even if a session has the minimal RTT Index, it will not be placed in the Swarm if it does not require an upgrade to a new rendition.

The Acceleration Swarm also has a limit on the number of sessions it can accommodate. Priority is given to sessions with the lowest RTT Index and minimal packet loss for acceleration.

To sum it up: sessions will have a chance to increase their resolution (i.e. to be Accelerated) if they meet the following criteria:

There is an available place for a session in the Acceleration Swarm.
The session has a calculated RTT Index and it’s low, indicating good network conditions.
The session has not yet reached the highest rendition (as determined by its Bitrate Index).
The session has priority over others in the Swarm.
The session does not have any Penalty Timeout assigned.
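The criteria above can be expressed as a simple per-session check. The sketch below is illustrative only: the `Session` fields and the `can_accelerate` function are hypothetical names, and the priority criterion is left out because it depends on comparing the session with the others in the Swarm.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RTT_MS = 600  # sessions with an RTT above this value are excluded


@dataclass
class Session:
    rtt_index: Optional[float]  # None until the client has reported stats
    bitrate_index: int          # index of the rendition currently served
    penalty_until: float = 0.0  # time at which the Penalty Timeout expires


def can_accelerate(s: Session, top_bitrate_index: int,
                   now: float, free_slots: int) -> bool:
    """Check every criterion except relative priority within the Swarm."""
    if free_slots <= 0:
        return False            # no place left in the Acceleration Swarm
    if s.rtt_index is None or s.rtt_index > MAX_RTT_MS:
        return False            # no RTT Index yet, or the network is too slow
    if s.bitrate_index >= top_bitrate_index:
        return False            # already at the highest rendition
    if now < s.penalty_until:
        return False            # still serving a Penalty Timeout
    return True


print(can_accelerate(Session(40.0, 1), top_bitrate_index=3, now=100.0, free_slots=2))
print(can_accelerate(Session(40.0, 1, penalty_until=150.0), 3, now=100.0, free_slots=2))
```

The first call prints `True`; the second prints `False` because the session is still penalized at `now=100.0`.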

Penalty Time

If a session is being accelerated and it cannot achieve a higher rendition, for example due to network issues, it will be assigned a Penalty Timeout before it’s allowed to be accelerated again.

If this session fails to accelerate to the next rendition on its next attempt, its Penalty Timeout is doubled. However, further failed attempts won’t increase the Penalty Timeout as significantly as the first two did; they will add to the penalty in small increments.

This Penalty Timeout won’t be increased indefinitely. Once it reaches a pre-defined maximum value, it stays at that value.

If the session finally achieves some higher rendition, the Penalty Timeout is reset for it.
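The schedule described in this section (base penalty, doubling on the second failure, small increments afterwards, a cap, and a reset on success) can be sketched as below. All three constants and the function name are assumptions chosen for illustration; the article does not publish the real values.

```python
BASE_PENALTY_S = 10.0  # assumption: penalty after the first failed attempt
INCREMENT_S = 2.0      # assumption: small step added from the third failure on
MAX_PENALTY_S = 30.0   # assumption: pre-defined maximum


def penalty_timeout(failed_attempts: int) -> float:
    """Penalty Timeout in seconds after N consecutive failed accelerations."""
    if failed_attempts <= 0:
        return 0.0                         # never failed, or reset after success
    if failed_attempts == 1:
        return BASE_PENALTY_S
    doubled = 2 * BASE_PENALTY_S           # the second failure doubles the penalty
    extra = (failed_attempts - 2) * INCREMENT_S
    return min(doubled + extra, MAX_PENALTY_S)


print([penalty_timeout(n) for n in range(9)])
# [0.0, 10.0, 20.0, 22.0, 24.0, 26.0, 28.0, 30.0, 30.0]
```

A successful upgrade would correspond to resetting the failure counter to zero, which brings the timeout back to `0.0`.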

What the Bitrate Index is and how it relates to ABR

Bitrate Index is another element that’s included in calculations.

It’s a property of a stream and reflects the renditions available for that stream as a calculated index.

Bitrate Index (Renditions Index) is a space divided into blocks of 500 kbps, with an ascending number assigned to each block. E.g. for streams with renditions of 200 kbps, 500 kbps, and 1Mbps, the index is 0, 1, and 2 respectively.
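The mapping in this example is consistent with integer division by the block size. The sketch below assumes exactly that; the function name is hypothetical, and the handling of values on a block boundary is inferred from the 500 kbps and 1 Mbps cases above.

```python
BLOCK_KBPS = 500  # block size of the Bitrate Index space


def bitrate_index(bitrate_kbps: int) -> int:
    """Map a bitrate in kbps to the number of its 500 kbps block."""
    if bitrate_kbps < 0:
        raise ValueError("bitrate must not be negative")
    return bitrate_kbps // BLOCK_KBPS


ladder_kbps = [200, 500, 1000]
print([bitrate_index(b) for b in ladder_kbps])  # [0, 1, 2]
```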

Streams have variable bitrates during their lifespan, so Nimble constantly recalculates the Bitrate Index for the stream’s renditions at a fixed interval. The stream’s highest bitrate within this interval is considered its current bitrate, and the Bitrate Index is assigned accordingly.

When a session requests a stream, it’s also assigned a Bitrate Index, which reflects the rendition it has reached at that moment. A session strives for the highest Bitrate Index available.

The sum of the RTT Index and Bitrate Index of a session defines a value by which the session is sorted in the Acceleration Swarm. The Swarm is sorted in descending order.

The goal is to give a chance to change resolution to a session which has a low RTT (which means good network conditions) but has not yet reached the maximum rendition (i.e. has a low Bitrate Index). Between two sessions with the same Bitrate Index, the one with the lowest RTT Index will be accelerated first. In other words, clients that are closer to the server will be accelerated faster until they reach the highest bitrate. Thus Nimble provides the best quality to the sessions which have confirmed the quality of their network rather than providing it to any random session.
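One way to read this ordering is sketched below. It is an interpretation, not Nimble’s actual code: the small integer scale of the RTT Index, the class and function names, and especially the choice to take the next candidate from the tail of the descending list are assumptions made so that the session with the lowest sum goes first.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    rtt_index: int      # lower means better network conditions (scale assumed)
    bitrate_index: int  # lower means further away from the top rendition


def order_swarm(candidates: List[Candidate]) -> List[Candidate]:
    """Sort in descending order of RTT Index + Bitrate Index."""
    return sorted(candidates,
                  key=lambda c: c.rtt_index + c.bitrate_index,
                  reverse=True)


swarm = order_swarm([
    Candidate("far-low", rtt_index=9, bitrate_index=0),
    Candidate("near-low", rtt_index=1, bitrate_index=0),
    Candidate("near-high", rtt_index=1, bitrate_index=3),
])
print([c.name for c in swarm])  # ['far-low', 'near-high', 'near-low']

# Assumption: the lowest sum sits at the tail and is accelerated first.
print(swarm[-1].name)           # near-low
```

With these sample values, the nearby session on a low rendition is picked ahead of both the distant session and the nearby session that is already close to the top of the ladder.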

Warming session with Null Padding for bitrate upgrade

Every time a session gets into the Swarm, Nimble considers that this session requires a bitrate change to the next Bitrate Index value, regardless of its current Bitrate Index value.

Nimble doesn’t switch the session immediately after the bitrate change is requested.

Instead, it “warms” the session: it tries to estimate whether upgrading the bitrate (and resolution) would hurt the network and the client’s viewing experience.

The trick is to send null-padded packets instead of packets with actual frame data. These null packets emulate the bitrate switch for a session that requires a bitrate upgrade. If the client confirms that the packets are received for a certain period of time, the client is considered “heated” and is allowed to change resolution and bitrate. Nimble then takes some time for additional checks, e.g. to see that packet loss is not significantly higher than it was a few moments before.

Once the checks are completed successfully and the padding index is high, the session is awarded a higher bitrate and removed from the Acceleration Swarm.
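The warm-up decision can be sketched as a check over the padding probe results. Everything here is hypothetical: the thresholds, the function name and the use of an acknowledged-padding ratio as a stand-in for the “padding index” are assumptions, since the article does not define how that index is computed.

```python
MIN_ACKED_RATIO = 0.95    # assumption: share of padding that must be confirmed
MAX_LOSS_INCREASE = 0.02  # assumption: tolerated growth of the packet loss ratio


def warmup_passed(padding_sent: int, padding_acked: int,
                  loss_before: float, loss_during: float) -> bool:
    """Decide whether a warmed session may switch to the next rendition.

    padding_sent / padding_acked count null-padded packets sent during the
    probe and confirmed by the client via feedback. loss_before and
    loss_during are packet loss ratios (0.0 to 1.0) before and during it.
    """
    if padding_sent <= 0:
        return False                      # nothing was probed yet
    acked_ratio = padding_acked / padding_sent
    loss_ok = (loss_during - loss_before) <= MAX_LOSS_INCREASE
    return acked_ratio >= MIN_ACKED_RATIO and loss_ok


print(warmup_passed(1000, 990, loss_before=0.01, loss_during=0.015))  # True
print(warmup_passed(1000, 990, loss_before=0.01, loss_during=0.10))   # False
```

Because the padding carries no frame data, a failed probe costs the viewer nothing: the session simply keeps its current rendition.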

Summary

This WebRTC adaptive bitrate algorithm is tuned to process a pool of viewers’ sessions according to their network conditions. For a small number of simultaneous viewers, Nimble will give the best possible quality within the currently available bandwidth until it reaches the sessions’ network limits. If the number of viewers goes up and the bandwidth capacity is exceeded, Nimble will gradually reduce the quality of some of the existing sessions, picking those with worse network conditions. This way our algorithm allows serving all the viewers and giving them the best level of customer satisfaction possible.

Please feel free to try our WebRTC implementation in action, and we are looking forward to getting your feedback.

Special thanks

We’d like to say special thanks to Lorenzo Miniero for the outstanding research and work on ABR WebRTC, and Sean DuBois and all Pion contributors for creating and maintaining such an amazing framework.