When should I use an array instead of a linked list?

Arrays are generally preferred when you need frequent and fast access to elements at arbitrary positions, and when the size of the data is known or doesn't change drastically. They can also be faster on modern systems with caches due to sequential memory access.

When should I use a linked list instead of an array?

Linked lists are better when you frequently insert or delete elements, especially in the middle of the data structure, as this is more efficient than with arrays. They are also useful when the size of the data is unknown or highly dynamic.

Does the order of traversal (forward vs. backward) affect performance?

Yes, it can significantly impact performance. On older systems without caches, the order might matter less. However, on modern systems with caches, traversing backwards on an array can sometimes be slower than forwards if specific CPU instructions aren't leveraged, though compiler optimizations can also play a huge role.

How do CPU caches affect array vs. linked list performance?

CPUs use caches to store frequently accessed data and instructions for faster retrieval. On systems with caches, arrays often perform better because their contiguous memory layout allows for efficient cache line filling and instruction fetching. Linked lists, with scattered memory locations, make less efficient use of caches.

Can compiler optimizations change which data structure is faster?

Absolutely. Compilers can perform advanced optimizations, like loop unrolling or instruction reordering, that can dramatically alter the performance of arrays and linked lists, sometimes to the point where it reverses the expected outcome.

Why did the linked list perform better than the array on the Atari?

On the Atari ST, which lacks a CPU cache, memory access was the bottleneck. The slightly different instruction timings or perhaps memory allocation patterns might have favored the linked list in this specific test scenario, whereas on systems with caches, arrays generally pull ahead.

Key Moments

Arrays vs Linked Lists - Computerphile

Computerphile

Education4 min read30 min video

Jul 11, 2017|503,722 views|10,742|925

computers computerphile computer science array linked list speed of linked list speed of array array vs linked list linked list vs array

Save to Pod

Key Moments

TL;DR

Arrays vs. Linked Lists performance depends on CPU architecture and cache, not a universal answer.

Key Insights

Arrays store data contiguously in memory, while linked lists use nodes with pointers scattered across memory.

Accessing a specific element in an array is O(1) due to direct indexing, whereas in a linked list it's O(n) as traversal is required.

Performance benchmarks on different hardware (Atari, Raspberry Pi, iMac) showed varying results for array vs. linked list operations.

The presence and configuration of CPU caches significantly impact performance, favoring arrays when data is accessed sequentially.

Compiler optimizations play a crucial role, with advanced compilers potentially unrolling loops for arrays to boost performance.

Reverse traversal of arrays on the Atari was faster due to specific CPU instructions, demonstrating architecture-specific optimizations.

FUNDAMENTAL DATA STRUCTURE DIFFERENCES

Arrays and linked lists are two fundamental ways to store collections of data. An array stores elements contiguously in a single block of memory, allowing direct access via an index. Each element can be referenced directly, starting from index zero. In contrast, a linked list stores elements in nodes, with each node containing the data and a pointer to the next node. These nodes can be located anywhere in memory, and the list is navigated by following these pointers from a starting point (often called 'head' or 'first'). This fundamental difference in memory organization is key to understanding their performance characteristics.

OPERATION SCENARIOS AND TESTING METHODOLOGY

To compare the performance, a common operation was chosen: iterating through all elements and summing a specific integer value ('p') within each element's structure. This operation was implemented in C for both arrays and linked lists, creating 125,000 elements. Two C programs were written, nearly identical except for the data structure implementation. The process involved generating random data, then running the summing function 100 times on each structure to calculate an average execution time, recorded using the system's clock function.

INITIAL BENCHMARKS ON THE ATARI ST

The first set of tests was conducted on an Atari ST. The linked list traversal took approximately 166 clock ticks, while the array traversal took around 178.5 clock ticks. This initial result suggested that, on this particular machine, the linked list was slightly faster for this type of operation. Interestingly, when the array traversal was reversed (from the end to the beginning), the time dropped to 114 ticks, making it even faster than the linked list. This deviation from initial expectations prompted a deeper investigation into the underlying hardware.

IMPACT OF CACHING ON MODERN ARCHITECTURES

Moving to more modern machines like the Raspberry Pi and iMac revealed a different performance landscape. On the Raspberry Pi, the array was roughly twice as fast as the linked list. When the array traversal was reversed, it became slower than the forward traversal. On the iMac, the array was significantly faster, about five times quicker than the linked list, regardless of traversal direction. The key difference highlighted is the presence of CPU caches. Modern CPUs use caches to store frequently accessed data and instructions, dramatically speeding up access compared to fetching directly from main memory.

THE ROLE OF CACHE LINES AND MEMORY ACCESS

The disparity in performance on machines with caches is largely due to cache lines and memory access patterns. When a CPU accesses data, it fetches a block (a cache line, e.g., 128 bytes) into the cache. For arrays, sequential access means most of the data needed is likely already in the cache or fetched in contiguous blocks. Linked lists, however, require two memory accesses per element: one for the data and another for the pointer to the next node. This sequential access pattern, combined with caches, makes arrays more efficient on modern architectures where CPUs are much faster than memory.

COMPILER OPTIMIZATIONS AND ARCHITECTURE SPECIFICITY

Compiler optimizations also play a vital role. The iMac tests, using a different compiler (Clang), showed much faster array performance, attributed to loop unrolling where the compiler processes multiple array elements in a single loop iteration. Furthermore, specific CPU instructions, like those on the Atari's 68000 processor for decrementing and branching, can optimize reverse array traversal. Ultimately, the performance comparison between arrays and linked lists is not universal; it's highly dependent on the specific hardware architecture, its caching mechanisms, available CPU instructions, and the effectiveness of the compiler.

Mentioned in This Episode

●Software & Apps

●Concepts

Arrays vs. Linked Lists: Performance Considerations

Practical takeaways from this episode

Do This

Arrays offer faster random access for retrieving individual elements.

Linked lists are more flexible for insertions and deletions, especially in the middle.

Test performance on the target architecture; results vary significantly (e.g., with/without caches).

Consider compiler optimizations, as they can drastically alter performance.

When traversing an array forwards, modern CPUs with caches tend to favor arrays.

When traversing an array backwards on older architectures, specific instructions can make it faster.

Avoid This

Do not assume arrays are always faster than linked lists, or vice-versa.

Do not compare 'ticks' across different machines without understanding their clock speeds.

Do not rely on theoretical performance; real-world testing is crucial.

Avoid assuming identical performance on similar architectures (e.g., Atari ST vs. Falcon) without testing.

Performance Test Results (Summing 'P' values for 125,000 elements)

Data extracted from this episode

Machine	Operation	Result (Ticks/Time)
Atari ST	Linked List	166 ticks
Atari ST	Array (Forward)	178.5 ticks
Atari ST	Array (Backward)	114 ticks
iMac	Linked List	29.54 (iMac ticks)
iMac	Array (Forward)	43.44 (iMac ticks)
iMac	Array (Backward)	47.46 (iMac ticks)
Raspberry Pi	Array (Forward)	96575 (time)
Raspberry Pi	Linked List	1858.61 (time)
Raspberry Pi	Array (Backward)	1011.55 (time)
Atari Falcon	Array	46 ticks
Atari Falcon	Linked List	58.5 ticks

Common Questions

It's not a simple yes or no. Arrays offer faster random access, but linked lists can be faster for certain operations or on specific hardware configurations, especially when considering memory access patterns and CPU caches. Real-world testing on target architecture is key.