Arrays vs Linked Lists - Computerphile

ComputerphileComputerphile
Education4 min read30 min video
Jul 11, 2017|503,664 views|10,741|925
Save to Pod

Key Moments

TL;DR

Arrays vs. Linked Lists performance depends on CPU architecture and cache, not a universal answer.

Key Insights

1

Arrays store data contiguously in memory, while linked lists use nodes with pointers scattered across memory.

2

Accessing a specific element in an array is O(1) due to direct indexing, whereas in a linked list it's O(n) as traversal is required.

3

Performance benchmarks on different hardware (Atari, Raspberry Pi, iMac) showed varying results for array vs. linked list operations.

4

The presence and configuration of CPU caches significantly impact performance, favoring arrays when data is accessed sequentially.

5

Compiler optimizations play a crucial role, with advanced compilers potentially unrolling loops for arrays to boost performance.

6

Reverse traversal of arrays on the Atari was faster due to specific CPU instructions, demonstrating architecture-specific optimizations.

FUNDAMENTAL DATA STRUCTURE DIFFERENCES

Arrays and linked lists are two fundamental ways to store collections of data. An array stores elements contiguously in a single block of memory, allowing direct access via an index. Each element can be referenced directly, starting from index zero. In contrast, a linked list stores elements in nodes, with each node containing the data and a pointer to the next node. These nodes can be located anywhere in memory, and the list is navigated by following these pointers from a starting point (often called 'head' or 'first'). This fundamental difference in memory organization is key to understanding their performance characteristics.

OPERATION SCENARIOS AND TESTING METHODOLOGY

To compare the performance, a common operation was chosen: iterating through all elements and summing a specific integer value ('p') within each element's structure. This operation was implemented in C for both arrays and linked lists, creating 125,000 elements. Two C programs were written, nearly identical except for the data structure implementation. The process involved generating random data, then running the summing function 100 times on each structure to calculate an average execution time, recorded using the system's clock function.

INITIAL BENCHMARKS ON THE ATARI ST

The first set of tests was conducted on an Atari ST. The linked list traversal took approximately 166 clock ticks, while the array traversal took around 178.5 clock ticks. This initial result suggested that, on this particular machine, the linked list was slightly faster for this type of operation. Interestingly, when the array traversal was reversed (from the end to the beginning), the time dropped to 114 ticks, making it even faster than the linked list. This deviation from initial expectations prompted a deeper investigation into the underlying hardware.

IMPACT OF CACHING ON MODERN ARCHITECTURES

Moving to more modern machines like the Raspberry Pi and iMac revealed a different performance landscape. On the Raspberry Pi, the array was roughly twice as fast as the linked list. When the array traversal was reversed, it became slower than the forward traversal. On the iMac, the array was significantly faster, about five times quicker than the linked list, regardless of traversal direction. The key difference highlighted is the presence of CPU caches. Modern CPUs use caches to store frequently accessed data and instructions, dramatically speeding up access compared to fetching directly from main memory.

THE ROLE OF CACHE LINES AND MEMORY ACCESS

The disparity in performance on machines with caches is largely due to cache lines and memory access patterns. When a CPU accesses data, it fetches a block (a cache line, e.g., 128 bytes) into the cache. For arrays, sequential access means most of the data needed is likely already in the cache or fetched in contiguous blocks. Linked lists, however, require two memory accesses per element: one for the data and another for the pointer to the next node. This sequential access pattern, combined with caches, makes arrays more efficient on modern architectures where CPUs are much faster than memory.

COMPILER OPTIMIZATIONS AND ARCHITECTURE SPECIFICITY

Compiler optimizations also play a vital role. The iMac tests, using a different compiler (Clang), showed much faster array performance, attributed to loop unrolling where the compiler processes multiple array elements in a single loop iteration. Furthermore, specific CPU instructions, like those on the Atari's 68000 processor for decrementing and branching, can optimize reverse array traversal. Ultimately, the performance comparison between arrays and linked lists is not universal; it's highly dependent on the specific hardware architecture, its caching mechanisms, available CPU instructions, and the effectiveness of the compiler.

Arrays vs. Linked Lists: Performance Considerations

Practical takeaways from this episode

Do This

Arrays offer faster random access for retrieving individual elements.
Linked lists are more flexible for insertions and deletions, especially in the middle.
Test performance on the target architecture; results vary significantly (e.g., with/without caches).
Consider compiler optimizations, as they can drastically alter performance.
When traversing an array forwards, modern CPUs with caches tend to favor arrays.
When traversing an array backwards on older architectures, specific instructions can make it faster.

Avoid This

Do not assume arrays are always faster than linked lists, or vice-versa.
Do not compare 'ticks' across different machines without understanding their clock speeds.
Do not rely on theoretical performance; real-world testing is crucial.
Avoid assuming identical performance on similar architectures (e.g., Atari ST vs. Falcon) without testing.

Performance Test Results (Summing 'P' values for 125,000 elements)

Data extracted from this episode

MachineOperationResult (Ticks/Time)
Atari STLinked List166 ticks
Atari STArray (Forward)178.5 ticks
Atari STArray (Backward)114 ticks
iMacLinked List29.54 (iMac ticks)
iMacArray (Forward)43.44 (iMac ticks)
iMacArray (Backward)47.46 (iMac ticks)
Raspberry PiArray (Forward)96575 (time)
Raspberry PiLinked List1858.61 (time)
Raspberry PiArray (Backward)1011.55 (time)
Atari FalconArray46 ticks
Atari FalconLinked List58.5 ticks

Common Questions

It's not a simple yes or no. Arrays offer faster random access, but linked lists can be faster for certain operations or on specific hardware configurations, especially when considering memory access patterns and CPU caches. Real-world testing on target architecture is key.

Topics

Mentioned in this video

More from Computerphile

View all 82 summaries

Found this useful? Build your knowledge library

Get AI-powered summaries of any YouTube video, podcast, or article in seconds. Save them to your personal pods and access them anytime.

Try Summify free