This article is about the algorithm by R. W. Floyd for detecting and removing a loop in a linked list data structure. It is a classical question during IT job interview, generating a lot of badly written web pages on the topic.
Suppose to have a linked list data structure where each block has a pointer to the next one. The invariant is that the last element must point to NULL. Now suppose that, for some reason, this invariant is violated and this results in a loop. The image below represents the faulty linked list (boxes and arrows in black). We want an efficient way to identify the last node and restore its pointer to NULL.
Trivial solutions include memorizing the memory location of visited blocks.
Floyd’s algorithm can be used to detect and correct a loop like this in linear time and using a constant amount of memory. What is interesting it is not the algorithm itself but the mathematical reasoning behind it.
The algorithm works in two phases:
- Identify node ;
- initialize two pointers to the list head;
- move one block ahead, two blocks ahead until end is reached or .
- Identify node ;
- initialize two pointers, to the list head and to ;
- until , set and move one block ahead.
Why it works
We need to show the effectiveness of the algorithms for the two phases separately. But first a bit of notation:
|Symbol||Meaning||Value in example|
|block of loop beginning|
|distance between list head and||4|
|distance between and||1|
Let’s start considering phase 1; if there is no loop then eventually reaches the end before (because it is going double its speed). If there is a loop then eventually are gonna be inside it. Inside the loop, let be the distance between the two pointers at step . Then,
Eventually, hence and, therefore, and we reach (note we need just existence not uniqueness).
Phase 2 is about proving that moving one block at the time they meet in .
During phase 1, we moved for blocks and for blocks, where indicate the amount of complete loops for respectively. Since they finally met in and was moving twice the speed of we can say that
At the beginning of phase 2 is initialized to the list head and to and we move them one block ahed at the time. When reaches it has moved exactly blocks, which means that has moved blocks, and, since it started from distance from it has arrived exactly where the loop begins, at .