Internet Congestion Collapse - Computerphile

Computerphile
Education · 3 min read · 21 min video
Mar 4, 2022 · 96,067 views

Key Moments

TL;DR

Internet congestion collapse in 1986 led to TCP's vital congestion control algorithms.

Key Insights

1

In 1986, the internet experienced a significant 'congestion collapse' where bandwidth dropped drastically, impacting its usability.

2

This period of severe degradation highlighted the need for more robust network protocols.

3

The congestion collapse spurred crucial developments in Transmission Control Protocol (TCP), particularly its reliability and control mechanisms.

4

Van Jacobson and Michael Karels's 1988 paper, 'Congestion Avoidance and Control,' introduced foundational algorithms still in use today.

5

TCP employs sequence numbers and acknowledgments for reliable data transfer.

6

Flow control, managed by receive windows, prevents overwhelming the receiving host.

7

The core innovation for congestion control is 'Additive Increase, Multiplicative Decrease' (AIMD), balancing probing for bandwidth with aggressive backing off upon congestion.

8

Slow Start is a related mechanism that rapidly increases the transmission window at the beginning of a connection to quickly find available bandwidth.

9

These algorithms transformed the internet from a fragile network prone to collapse into a more stable and scalable system.

THE 1986 CONGESTION COLLAPSE

Around 1986, the internet, then used by tens of thousands of machines for file transfers and news exchange, suffered a critical event known as congestion collapse. Bandwidth, already limited, dropped dramatically, in some cases to as low as 40 bits per second, nearly a thousand times less than the expected 32 kilobits per second. This severe degradation lasted for a significant period, representing one of the few times the internet's performance was fundamentally undermined in its early days.

EMERGENCE OF TCP AND RELIABILITY MECHANISMS

The crisis of congestion collapse led to significant refinements in core internet protocols, most notably the Transmission Control Protocol (TCP). TCP's primary role is to ensure reliable data transmission between hosts. It achieves this through mechanisms like sequence numbers, which identify individual packets, and acknowledgments (ACKs), sent back by the receiver to confirm receipt of packets. If an ACK for a particular packet isn't received within a set timeout period, TCP assumes the packet is lost and resends it, thereby guaranteeing delivery.
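The retransmit-on-timeout loop can be sketched in a few lines of Python. This is a toy simulation, not real TCP: it sends one packet at a time, and the timeout is modeled as a random drop (the function name and loss rate are illustrative assumptions):

```python
import random

def send_reliably(data_packets, loss_rate=0.3, seed=42):
    """Toy sketch of TCP-style reliability: each packet carries a
    sequence number, and the sender retransmits until an ACK for that
    number arrives. (Real TCP numbers bytes, not packets, and keeps
    many packets in flight at once.)"""
    rng = random.Random(seed)
    delivered = []
    for seq, payload in enumerate(data_packets):
        acked = False
        while not acked:                    # resend on (simulated) timeout
            if rng.random() >= loss_rate:   # packet survived the network
                delivered.append((seq, payload))
                acked = True                # receiver returns ACK for seq
            # else: no ACK before the timeout fires -> loop and retransmit
    return delivered

received = send_reliably(["a", "b", "c"])
```

However many simulated drops occur, every packet eventually arrives exactly once and in order, which is the guarantee the sequence-number/ACK scheme provides.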

THE ROLE OF FLOW CONTROL

Before the widespread issue of congestion, TCP implemented flow control to manage data transmission rates. This mechanism involves the receiving host advertising a 'receive window,' indicating how much buffer space it has available for incoming packets. The sender respects this window, limiting the number of packets 'in flight' (sent but not yet acknowledged) to avoid overwhelming the receiver's capacity. This ensures that a slower receiving computer doesn't get flooded with data it cannot process.
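The in-flight limit imposed by the receive window can be sketched as a sliding window. This is a minimal model (the function name and window size are illustrative, and ACKs are assumed to arrive in order):

```python
from collections import deque

def flow_controlled_send(packets, recv_window=3):
    """Toy flow control: the sender never holds more than `recv_window`
    unacknowledged packets in flight, mirroring TCP's advertised
    receive window. Returns the peak number of packets in flight."""
    in_flight = deque()
    peak = 0
    for pkt in packets:
        if len(in_flight) == recv_window:  # window full: must wait for an ACK
            in_flight.popleft()            # oldest in-flight packet is ACKed
        in_flight.append(pkt)              # now this packet may be sent
        peak = max(peak, len(in_flight))
    return peak

peak = flow_controlled_send(list(range(10)), recv_window=3)  # -> 3
```

No matter how many packets the sender has queued, the receiver's advertised window caps how many are outstanding at once.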

THE PROBLEM OF NETWORK CONGESTION

While flow control prevents overwhelming the receiver, the internet's core challenge was network congestion – the bottleneck created by numerous hosts sending data across shared network infrastructure. Routers, acting as intermediaries, have limited buffer space. When too many packets arrive simultaneously, these buffers fill up, forcing routers to drop packets. This reality, distinct from receiver capacity issues, was the root cause of the 1986 congestion collapse.
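The router-side bottleneck can be sketched as a drop-tail queue. This is a toy model, not any real router's behavior: buffer size, drain rate, and the per-tick arrival pattern are made-up parameters:

```python
from collections import deque

def router_forward(arrivals, buffer_size=4, drain_per_tick=1):
    """Toy drop-tail router: each tick, up to `drain_per_tick` queued
    packets are forwarded onto the outgoing link, then the tick's new
    arrivals are buffered; anything beyond `buffer_size` is dropped."""
    queue = deque()
    dropped = 0
    for batch in arrivals:
        for _ in range(min(drain_per_tick, len(queue))):
            queue.popleft()                # forward onto the next link
        for pkt in range(batch):
            if len(queue) < buffer_size:
                queue.append(pkt)          # buffered for later forwarding
            else:
                dropped += 1               # buffer full: packet is lost
    return dropped

dropped = router_forward([3, 3, 3])  # arrivals outpace the drain -> drops
```

When arrivals persistently exceed the drain rate, the buffer fills and every excess packet is lost, exactly the loss signal that congestion control later exploits.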

CONGESTION CONTROL: ADDITIVE INCREASE, MULTIPLICATIVE DECREASE

The groundbreaking solution, detailed in Van Jacobson and Michael Karels's 1988 paper, was the development of congestion control. Central to this is the 'congestion window,' which dynamically adjusts the number of packets a sender can have in flight based on network conditions. The algorithm employs 'Additive Increase, Multiplicative Decrease' (AIMD): the window grows linearly, by roughly one segment per round trip of successful acknowledgments, probing for spare bandwidth. Upon detecting congestion (via packet loss or timeouts), the window is aggressively halved, drastically reducing the sending rate.
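The AIMD rule itself is small enough to state directly in code. A minimal sketch, with the window measured in whole segments and the event stream as an assumed simplification of real ACK/loss signaling:

```python
def aimd(events, initial_window=1):
    """Additive Increase, Multiplicative Decrease:
    'ack'  -> grow the congestion window by one segment (per round trip);
    'loss' -> halve the window, never dropping below one segment."""
    cwnd = initial_window
    trace = [cwnd]
    for event in events:
        if event == "ack":
            cwnd += 1                  # additive increase: probe for bandwidth
        elif event == "loss":
            cwnd = max(1, cwnd // 2)   # multiplicative decrease: back off hard
        trace.append(cwnd)
    return trace

trace = aimd(["ack", "ack", "ack", "loss", "ack"])
# window evolves 1 -> 2 -> 3 -> 4 -> 2 -> 3
```

The asymmetry is the point: growth is gentle so the network is probed cautiously, but the reaction to loss is severe, which is what keeps many competing senders from re-collapsing the network.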

SLOW START FOR RAPID WARM-UP

To expedite the initial phase of a connection and quickly discover available bandwidth, TCP also utilizes a 'Slow Start' mechanism. Despite its name, Slow Start is aggressive: it begins with a very small window and doubles it with each successful round-trip acknowledgment. This exponential increase allows the sender to rapidly ramp up data transmission until it approaches the network's capacity or congestion is detected, at which stage the linear AIMD regime takes over. This combination of strategies fundamentally stabilized the internet.
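The handover from exponential Slow Start to linear growth can be sketched as follows. The threshold name echoes TCP's `ssthresh`, but the specific values and the single-loss handling here are illustrative assumptions, not any particular TCP variant:

```python
def slow_start_then_aimd(rounds, ssthresh=16, loss_at=None):
    """Toy window evolution over `rounds` round trips: the window
    doubles each round trip while below `ssthresh` (Slow Start), then
    grows by one segment per round trip (additive increase); a loss at
    round `loss_at` halves it (multiplicative decrease)."""
    cwnd = 1
    trace = [cwnd]
    for r in range(rounds):
        if loss_at is not None and r == loss_at:
            cwnd = max(1, cwnd // 2)   # multiplicative decrease on loss
        elif cwnd < ssthresh:
            cwnd *= 2                  # exponential growth: Slow Start
        else:
            cwnd += 1                  # linear growth: congestion avoidance
        trace.append(cwnd)
    return trace

trace = slow_start_then_aimd(6)
# window evolves 1 -> 2 -> 4 -> 8 -> 16 -> 17 -> 18
```

Doubling reaches a 16-segment window in four round trips, where pure additive increase would need fifteen; that is the "rapid warm-up" the mechanism exists for.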

Navigating Internet Congestion: Key Principles

Practical takeaways from this episode

Do This

Use TCP for reliable data transfer with sequence numbers and acknowledgements.
Implement flow control to prevent overwhelming the receiving computer.
Employ congestion control to adapt to network conditions and avoid collapse.
Utilize Additive Increase, Multiplicative Decrease (AIMD) to balance efficiency and stability.
Employ Slow Start at connection initiation to quickly probe available bandwidth.

Avoid This

Send packets blindly without tracking acknowledgements (leads to unmanageable slowness).
Ignore the network's capacity; sending too fast causes congestion and packet loss.
Assume consistent bandwidth; network traffic is highly variable.
Rely solely on flow control; it doesn't address network-wide congestion.

Common Questions

What was the 1986 congestion collapse?

In 1986, the internet experienced a major breakdown known as congestion collapse, where network bandwidth plummeted to extremely low levels, sometimes as low as 40 bits per second, severely degrading performance for tens of thousands of users.

Topics

Mentioned in this video

personVan Jacobson

One of the two researchers who wrote the influential paper on congestion avoidance and control.

personMichael Karels

One of the two researchers who wrote the influential paper on congestion avoidance and control.

conceptIn-flight Packet

A packet that has been sent but has not yet received an acknowledgement, indicating it is currently in transit on the network.

conceptReceive Window

An indicator within TCP acknowledgements that tells the sender how much buffer space the receiver has available for incoming packets.

conceptCongestion Collapse

An event in internet history around 1986 where network bandwidth dropped significantly, causing severe degradation.

softwareTransmission Control Protocol

A core internet protocol (TCP) that sits at the end hosts to ensure reliable delivery of packets, using sequence numbers and acknowledgements.

conceptTimeout

A timer setting in TCP that, if a packet's acknowledgement is not received within a certain period (e.g., 400ms), triggers a retransmission.

conceptCongestion Window

A dynamic budget of traffic allowed to be 'in flight' on the network, managed by the sender to account for congestion.

softwareLog4j

Mentioned at the very end as an example of something pervasive, like milk or water.

conceptCongestion

A state where network routers are overwhelmed with too many packets, leading to delays and packet loss.

conceptCongestion Control

A network protocol mechanism designed to prevent and manage congestion by adjusting the rate of data transmission based on network feedback.

bookCongestion Avoidance and Control

A highly influential 1988 paper by Van Jacobson and Michael Karels detailing fixes for internet congestion, considered seminal for internet protocols.

softwareTCP

Transmission Control Protocol, used to ensure reliable packet delivery through mechanisms like sequence numbers and acknowledgements.

conceptACK

Acknowledgement, a message sent back to confirm receipt of a packet, crucial for TCP's reliability.

toolChecksum

A mechanism for detecting errors in transmitted data, mentioned only implicitly as part of TCP's reliability.

conceptFlow Control

An early mechanism in TCP that limits the rate of data transmission based on the receiving computer's buffer capacity (receive window) to prevent overwhelming it.

conceptBandwidth

The maximum rate at which data can be transferred over a network connection.

conceptSlow Start

A TCP algorithm that probes bandwidth aggressively at the beginning of a connection by doubling the window size exponentially to find the optimal rate quickly.
