Under what circumstances are linked lists useful?
Linked lists are very flexible: by modifying a single pointer, you can make a sweeping structural change, where the same operation would be very inefficient in an array list.
They can be useful for concurrent data structures. (There is now a non-concurrent real-world usage example below - it would not be there if @Neil hadn't mentioned FORTRAN. ;-)
For example, `ConcurrentDictionary<TKey, TValue>` in .NET 4.0 RC uses linked lists to chain items that hash to the same bucket.

The underlying data structure of `ConcurrentStack<T>` is also a linked list.

`ConcurrentStack<T>` is one of the data structures that serve as the foundation for the new Thread Pool (with the local "queues" implemented as stacks, essentially). (The other main supporting structure being `ConcurrentQueue<T>`.)
The new Thread Pool in turn provides the basis for the work scheduling of the new Task Parallel Library.
So they can certainly be useful - a linked list is currently serving as one of the main supporting structures of at least one great new technology.
(A singly-linked list makes a compelling lock-free - but not wait-free - choice in these cases, because the main operations can be carried out with a single CAS (plus retries). In a modern garbage-collected environment - such as Java or .NET - the ABA problem is easy to avoid: just wrap the items you add in freshly created nodes and do not reuse those nodes - let the GC do its work. The page on the ABA problem also provides the implementation of a lock-free stack that actually works in .NET (and Java) with a (GC'd) Node holding the items.)
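For illustration, here is a minimal sketch of such a CAS-based (Treiber-style) stack in C# - assuming a GC'd runtime, and with `LockFreeStack<T>` and its members being made-up names, not the actual BCL implementation:

```csharp
using System.Threading;

public class LockFreeStack<T>
{
    private sealed class Node
    {
        public readonly T Item;
        public Node Next;
        public Node(T item) { Item = item; }
    }

    private Node _head; // top of the stack; null when empty

    public void Push(T item)
    {
        var node = new Node(item); // fresh node, never reused - so no ABA under GC
        while (true)
        {
            var oldHead = Volatile.Read(ref _head);
            node.Next = oldHead;
            // Single CAS attempts to swing the head; retry on contention.
            if (Interlocked.CompareExchange(ref _head, node, oldHead) == oldHead)
                return;
        }
    }

    public bool TryPop(out T item)
    {
        while (true)
        {
            var oldHead = Volatile.Read(ref _head);
            if (oldHead == null) { item = default(T); return false; }
            // One CAS swings the head past the popped node; retry on contention.
            if (Interlocked.CompareExchange(ref _head, oldHead.Next, oldHead) == oldHead)
            {
                item = oldHead.Item;
                return true;
            }
        }
    }
}
```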
Edit:

@Neil: actually, what you mentioned about FORTRAN reminded me that the same kind of linked lists can be found in probably the most used and abused data structure in .NET: the plain generic `Dictionary<TKey, TValue>`.
Not one, but many linked lists are stored in an array.
- It avoids doing many small (de)allocations on inserts/deletes.
- Initial loading of the hash table is pretty fast, because the array is filled sequentially (plays very nice with CPU cache).
- Not to mention that a chaining hash table is expensive in terms of memory - and this "trick" cuts "pointer sizes" in half on x64 (the chain links are 32-bit array indexes instead of 64-bit references).
Essentially, many linked lists are stored in an array (one for each bucket used). A free list of reusable nodes is "interwoven" between them (if there have been deletes). An array is allocated at the start/on rehash, and the nodes of the chains are kept in it. There is also a free pointer - an index into the array - that follows deletes. ;-) So - believe it or not - the FORTRAN technique still lives on (...and nowhere else than in one of the most commonly used .NET data structures ;-).
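For those curious, here is a simplified sketch of that technique - index-linked chains plus a free list, all in one array. The `IndexChainedMap` name, the fixed capacity, and the missing rehash/duplicate-key logic are simplifications of mine, not the real `Dictionary<TKey, TValue>` source:

```csharp
using System;
using System.Collections.Generic;

public class IndexChainedMap<TKey, TValue>
{
    private struct Entry
    {
        public int HashCode;  // cached hash; -1 marks a freed slot
        public int Next;      // index of next entry in the chain (or free list); -1 = end
        public TKey Key;
        public TValue Value;
    }

    private int[] _buckets;     // bucket -> index of first entry; -1 = empty
    private Entry[] _entries;   // all chain nodes live in this one array
    private int _count;         // next never-used slot
    private int _freeList = -1; // head of the free list left behind by deletes

    public IndexChainedMap(int capacity)
    {
        _buckets = new int[capacity];
        for (int i = 0; i < capacity; i++) _buckets[i] = -1;
        _entries = new Entry[capacity];
    }

    public void Add(TKey key, TValue value)
    {
        int hash = EqualityComparer<TKey>.Default.GetHashCode(key) & 0x7FFFFFFF;
        int bucket = hash % _buckets.Length;

        int index;
        if (_freeList >= 0)   // reuse a deleted slot first
        {
            index = _freeList;
            _freeList = _entries[index].Next;
        }
        else
        {
            if (_count == _entries.Length)
                throw new InvalidOperationException("full (no rehash in this sketch)");
            index = _count++; // sequential fill: cache-friendly initial load
        }

        _entries[index] = new Entry { HashCode = hash, Next = _buckets[bucket], Key = key, Value = value };
        _buckets[bucket] = index; // link new node at the head of the bucket's chain
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        int hash = EqualityComparer<TKey>.Default.GetHashCode(key) & 0x7FFFFFFF;
        // Walk the chain: Next holds int indexes, not references - half the size on x64.
        for (int i = _buckets[hash % _buckets.Length]; i >= 0; i = _entries[i].Next)
        {
            if (_entries[i].HashCode == hash && EqualityComparer<TKey>.Default.Equals(_entries[i].Key, key))
            {
                value = _entries[i].Value;
                return true;
            }
        }
        value = default(TValue);
        return false;
    }

    public bool Remove(TKey key)
    {
        int hash = EqualityComparer<TKey>.Default.GetHashCode(key) & 0x7FFFFFFF;
        int bucket = hash % _buckets.Length;
        for (int i = _buckets[bucket], prev = -1; i >= 0; prev = i, i = _entries[i].Next)
        {
            if (_entries[i].HashCode == hash && EqualityComparer<TKey>.Default.Equals(_entries[i].Key, key))
            {
                // Unlink from the chain, then push the slot onto the free list.
                if (prev < 0) _buckets[bucket] = _entries[i].Next;
                else _entries[prev].Next = _entries[i].Next;
                _entries[i] = new Entry { HashCode = -1, Next = _freeList };
                _freeList = i;
                return true;
            }
        }
        return false;
    }
}
```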
Linked lists are very useful when you need to do a lot of insertions and removals, but not too much searching, on a list of arbitrary (unknown at compile-time) length.
Splitting and joining (bidirectionally-linked) lists is very efficient.
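A quick sketch of why that is - both operations rewire a constant number of pointers, regardless of list length (`DNode` and the helper methods here are illustrative, not a library API):

```csharp
public class DNode<T>
{
    public T Value;
    public DNode<T> Prev, Next;
    public DNode(T value) { Value = value; }
}

public static class DListOps
{
    // Join: append the list starting at bHead after node aTail - O(1).
    public static void Join<T>(DNode<T> aTail, DNode<T> bHead)
    {
        aTail.Next = bHead;
        bHead.Prev = aTail;
    }

    // Split: cut the list after 'node', making what follows a separate list - O(1).
    public static DNode<T> SplitAfter<T>(DNode<T> node)
    {
        var secondHead = node.Next;
        node.Next = null;
        if (secondHead != null) secondHead.Prev = null;
        return secondHead;
    }
}
```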
You can also combine linked lists - e.g. tree structures can be implemented as "vertical" linked lists (parent/child relationships) connecting together "horizontal" linked lists (siblings), as in the sketch below.
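A small sketch of that representation - the classic first-child/next-sibling layout (`TreeNode<T>` here is illustrative):

```csharp
public class TreeNode<T>
{
    public T Value;
    public TreeNode<T> FirstChild;  // head of the "vertical" parent->child list
    public TreeNode<T> NextSibling; // "horizontal" list linking siblings

    public TreeNode(T value) { Value = value; }

    // O(1): splice a new child in at the head of the sibling list.
    public TreeNode<T> AddChild(T value)
    {
        var child = new TreeNode<T>(value) { NextSibling = FirstChild };
        FirstChild = child;
        return child;
    }
}
```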
Using an array based list for these purposes has severe limitations:
- Adding a new item means the array must be reallocated (or you must allocate more space than you need to allow for future growth and reduce the number of reallocations)
- Removing items leaves wasted space or requires a reallocation
- Inserting items anywhere except the end involves (possibly reallocating and) copying lots of the data up one position
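To make the contrast concrete, a tiny (illustrative) comparison using the standard `List<T>` and `LinkedList<T>`:

```csharp
using System.Collections.Generic;

class InsertCostDemo
{
    static void Main()
    {
        // Array-backed list: inserting at the front shifts every element
        // up one position (and may trigger a reallocation when full) - O(n).
        var arrayList = new List<int> { 2, 3, 4 };
        arrayList.Insert(0, 1);

        // Doubly linked list: inserting at the front just rewires a couple
        // of pointers - O(1), regardless of how long the list already is.
        var linkedList = new LinkedList<int>(new[] { 2, 3, 4 });
        linkedList.AddFirst(1);
    }
}
```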