Low Latency, Bufferbloat, AQM and L4S
Recently on the Preseem podcast, our hosts welcomed guest Bjorn Ivar Teigen, Head of Research at latency management specialists Domos, to talk about low latency, Bufferbloat, and queue management.
Domos works with ISPs and application developers to improve internet latency. Based in Oslo, Norway, Bjorn also contributes to standardization work in the Broadband Forum and the Internet Engineering Task Force (IETF). He also has a Ph.D. in Latency Modeling and Optimization.
What follows is a recap of the episode, but you can listen to the full wide-ranging discussion here. The podcast is hosted by Dan Siemon, Chief Product Officer, and Jeremy Austin, Senior Product Manager at Preseem.
What Causes Network Latency?
The panel began by chatting about the causes of latency in networking, and its relationship to subscriber quality of experience. Dan began by pointing out that hardware and networks in general have traditionally been over-optimized for throughput. And, though it’s true you can get higher throughput with deeper buffers, much of the customer experience is actually latency-driven.
In fact, the over-focus on throughput has led to the troublesome Bufferbloat problem. It’s also led to the subscriber experience deteriorating when links get busy. For example, a link at 90% utilization feels fine, but at 95% the experience suffers greatly. Fundamentally, this is caused by too much buffering, which negatively impacts network latency and causes poor QoE.
Dan also mentioned that there’s sometimes an initial misconception that network latency is strictly caused by propagation delay, i.e. how long it takes a signal to get from A to B. However, this is rarely a concern for access networks. The speed of light in fiber and the propagation speed of signals in copper are fast enough that you really don’t need to worry about them being too slow 🙂
The reality is that too much buffering (i.e. packets queued up waiting to be transmitted) is what causes network latency. You do need some buffering to maintain good link utilization. Beyond that point, however, extra queued packets just add latency without adding throughput. Measuring latency, identifying when links are saturated, and actively managing traffic so queues don’t build up is ultimately how to achieve low latency and keep QoE optimal, even under load.
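A quick back-of-the-envelope calculation (with illustrative, hypothetical numbers) shows why a deep buffer on a saturated link hurts: the delay it adds is simply the buffered bytes divided by the link rate.

```python
# Queueing delay added by a standing buffer: once the link is already
# saturated, every buffered byte adds wait time but no throughput.
# Numbers below are illustrative, not from any particular device.

def queueing_delay_ms(buffer_bytes: float, link_mbps: float) -> float:
    """Delay (ms) a packet waits behind a full buffer on a link."""
    link_bytes_per_ms = link_mbps * 1e6 / 8 / 1000
    return buffer_bytes / link_bytes_per_ms

# A 1 MB buffer on a 10 Mbps access link:
print(queueing_delay_ms(1_000_000, 10))  # 800.0 -- ms of added latency
```

At access-network speeds, a buffer sized for a high-throughput lab test can easily add hundreds of milliseconds under load, which is exactly the Bufferbloat symptom described above.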
Maximizing for Throughput Can Cause Bufferbloat
Bjorn explained that one of the reasons why maximizing for throughput ends up causing Bufferbloat is that internet transport protocols are really sensitive to packet loss. As a result, there’s an incentive to avoid packet loss when you’re building a box or a router in the network and testing for maximum throughput. Packet loss will impact your test scores when you’re evaluating whether you’re hitting your targets, so the incentive is to make the buffers so large that loss becomes unlikely.
The tests also don’t measure latency, so oversized buffers end up producing better scores. In addition, many tests use a single fat flow, which is very different from real internet traffic in terms of how much buffering is needed. As Bjorn outlined, one of the big differences is that when many competing flows come from different sources, the link appears variable from the perspective of each flow.
That’s because you’re potentially sharing it with many other flows. With a single flow, the link is stable, and the sender can observe it over time and learn how it behaves. On the real internet, however, you’re sharing the link with other people, and their behavior is essentially random.
Dan also mentioned that TCP constantly probes to see how much bandwidth is available. When the available capacity suddenly shifts, buffers overflow and packets are lost, triggering retransmissions. As a result, the application perceives latency while it waits for the lost segments to arrive. TCP is always trying to model the network internally for each flow, and the easiest way to make a single flow see no loss is to give it a lot of buffering. That just doesn’t play well once you have real traffic.
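The probing behavior Dan describes is the classic AIMD (additive-increase, multiplicative-decrease) shape of TCP congestion control. The toy loop below sketches it with made-up units; it is not a real TCP stack, just the sawtooth pattern.

```python
# Toy AIMD loop: the congestion window grows by one unit per round
# until it overshoots capacity (a loss event), then halves.
# Capacity and window units are arbitrary and illustrative.

def aimd(capacity: float, rounds: int, cwnd: float = 1.0) -> list:
    history = []
    for _ in range(rounds):
        if cwnd > capacity:      # buffer overflows -> packet loss
            cwnd = cwnd / 2      # multiplicative decrease
        else:
            cwnd += 1.0          # additive increase per round trip
        history.append(cwnd)
    return history

# The sawtooth: probe up, overshoot, halve, repeat.
print(aimd(capacity=10, rounds=15))
```

Deep buffers hide the overshoot from a single flow, which is why a lone test flow looks great while real mixed traffic suffers.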
Latency and Loss
Loss in the network doesn’t itself create latency. However, loss in a reliable transport protocol creates latency for the application, because the application can’t deliver the bytes until the loss is repaired. Applications really care about loss because they perceive that entire recovery round trip as latency. So the difference between network latency and application latency can be very stark.
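As a rough sketch of that point (the one-extra-RTT-per-loss model and all numbers here are simplifying assumptions), the latency the application sees grows with the loss rate even though the network’s round-trip time hasn’t changed:

```python
# Why loss becomes latency for a reliable transport: later bytes can't
# be delivered until a lost segment is retransmitted, so each loss
# costs roughly one extra round trip. A deliberately crude model.

def app_latency_ms(rtt_ms: float, loss_prob: float,
                   retransmit_rtts: float = 1.0) -> float:
    """Expected delivery latency when each loss adds ~one extra RTT."""
    return rtt_ms * (1 + loss_prob * retransmit_rtts)

# 20 ms network RTT with 5% loss: the average creeps up, and the
# occasional flow stalls for a full extra RTT (40 ms total).
print(round(app_latency_ms(20, 0.05), 1))  # 21.0
```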
You might ask (as Jeremy did): if we know latency causes pain, why don’t most end users, other than gamers, know this? Why is this not more prominent in discussions about internet speed?
Bjorn replied that it’s hard to measure the problems associated with latency. Average latency can be great but the user’s experience can be bad, so it’s not always a one-to-one relationship. Also, ISPs tend not to market on latency or to compete on it, so it’s not like you can choose to go to the ISP with the best latency.
Dan added that it’s much easier to measure throughput than latency. To measure latency properly, you really need continuous round-trip-level information. Customers also don’t understand latency targets: they buy plans based on Mbps, not peak latency numbers.
What users care about is whether or not their applications will work well. This typically comes down to responsiveness, e.g. how quickly things download or how reliably the frames of a video conference are delivered. It really comes down to timings and delays, and the probability that those delays are too large.
The Value of AQM
The panel also discussed the benefits of active queue management (AQM) in providing low network latency. As Dan mentioned, every link should have some form of good AQM on it. Sometimes fair queueing and AQM are conflated: technically, AQM (something like CoDel) operates on an individual queue, while fair queueing gives each flow its own queue, and the two can be combined by running AQM on every per-flow queue.
AQM intelligently drops traffic when necessary so that transport protocols back off. Maybe even more importantly, it provides some isolation between things and also isolation between subscribers. If one subscriber can hurt another, that’s a problem. If one flow can hurt another, that’s a problem. So driving that isolation property all the way to the bottom does a really good job of improving QoE.
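The "intelligent dropping" idea can be sketched with a highly simplified CoDel-style check (an assumption-laden toy, not the real algorithm, which RFC 8289 specifies in full): drop only when packets’ queueing ("sojourn") time has stayed above a small target for longer than an interval.

```python
# Simplified CoDel-flavored drop decision. Real CoDel also shortens
# the interval while dropping, among other details; this sketch keeps
# only the core idea of reacting to *standing* queue delay.

TARGET_MS = 5.0      # acceptable standing queue delay
INTERVAL_MS = 100.0  # how long delay may exceed target before dropping

class CoDelLite:
    def __init__(self):
        self.above_since = None  # time sojourn first exceeded target

    def should_drop(self, now_ms: float, sojourn_ms: float) -> bool:
        if sojourn_ms < TARGET_MS:
            self.above_since = None  # queue drained; reset
            return False
        if self.above_since is None:
            self.above_since = now_ms
        return now_ms - self.above_since >= INTERVAL_MS

q = CoDelLite()
print(q.should_drop(0.0, 8.0))    # False: delay only just exceeded target
print(q.should_drop(120.0, 8.0))  # True: standing queue for a full interval
```

Reacting to sustained delay rather than instantaneous queue length is what lets AQM tolerate brief bursts while still signaling transports to back off before Bufferbloat sets in.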
If you have a subscriber network, you’re going to have subscribers with different plans. As a result, you need to deliver a prorated share of the link, and will need some kind of subscriber awareness within those traffic envelopes. It’s very hard to push subscriber awareness down to every device in the network, however.
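The prorated-share idea can be illustrated with a tiny weighted split (a sketch of the concept only, not any particular shaper’s algorithm; the numbers are hypothetical):

```python
# Splitting a congested link across subscribers in proportion to
# their plan speeds -- the "subscriber awareness" idea in miniature.
# Assumes all subscribers are fully active; idle capacity would be
# redistributed in a real shaper.

def prorated_shares(link_mbps: float, plans_mbps: list) -> list:
    """Each active subscriber gets a share weighted by plan speed."""
    total = sum(plans_mbps)
    return [round(link_mbps * p / total, 1) for p in plans_mbps]

# 100 Mbps link, subscribers on 100/50/50 Mbps plans, all active:
print(prorated_shares(100, [100, 50, 50]))  # [50.0, 25.0, 25.0]
```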
What is L4S?
L4S stands for Low Latency, Low Loss, and Scalable Throughput, and is a new standard for congestion control on the internet. Bjorn said that legacy AQM systems all signal congestion by either dropping packets or tagging (ECN-marking) packets and assuming the TCP senders will interpret that tag the same way they interpret a dropped packet.
One of the benefits of L4S is that the feedback signal indicating congestion is more nuanced, which allows the sender to react less dramatically, so throughput doesn’t drop as far. L4S-compatible flows signal that they’re L4S-compatible and then expect to be placed in an L4S-compatible queue. If there is no such queue, the flow falls back to behaving like legacy TCP, the old version of the algorithm. You get less Bufferbloat if the links are stable.
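The "react less dramatically" contrast can be sketched as follows. The proportional response shown is in the spirit of the scalable congestion controls L4S builds on (e.g. DCTCP-style marking, per RFC 9331); the numbers are illustrative.

```python
# Classic TCP treats any congestion signal like loss and halves the
# window; a scalable (L4S-style) sender cuts in proportion to the
# fraction of ECN-marked packets, so light congestion means a light cut.

def classic_response(cwnd: float) -> float:
    return cwnd / 2                          # halve on loss or mark

def scalable_response(cwnd: float, marked_fraction: float) -> float:
    return cwnd * (1 - marked_fraction / 2)  # gentle, proportional cut

cwnd = 100.0
print(classic_response(cwnd))                       # 50.0
print(round(scalable_response(cwnd, 0.1), 1))       # 95.0 -- 10% marked
```

Because the scalable sender never overshoots as far, the queue it builds stays shallow, which is where the low-latency part of L4S comes from.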
Dan pointed out that deployment is a problem, however, because you now need to have this dual queue setup in equipment and that split needs to exist wherever there’s congestion, otherwise it just falls back and acts like normal TCP. As a result, it could be a while before it’s widely deployed.
As good hosts, we gave Bjorn the last word and he said that essentially all your network issues show up in your latency measurements. L4S is solving a problem that’s important to solve and getting better congestion feedback is valuable. He agreed that deployment will take time but will ultimately be worthwhile.
Follow the Preseem podcast and listen to all of our episodes here.