Iterator Pattern: Traverse Collections in OOP

TL;DR

The Iterator Pattern provides a way to access elements of a collection sequentially without exposing the underlying representation. It decouples iteration logic from the collection, allowing you to traverse different data structures using a uniform interface. This pattern is fundamental to modern programming and is built into most languages.

Prerequisites: Basic Python syntax, understanding of classes and objects, familiarity with lists and basic data structures, knowledge of interfaces/abstract classes

After this topic: Implement custom iterators for your own collections, recognize when to use the Iterator Pattern versus direct access, design iterable classes that work with Python’s for-loop syntax, and explain the benefits of separating iteration logic from collection structure in technical interviews.

Core Concept

What is the Iterator Pattern?

The Iterator Pattern is a behavioral design pattern that provides a standard way to traverse a collection without exposing its internal structure. Instead of accessing elements directly through indices or internal pointers, you use an iterator object that knows how to move through the collection one element at a time.

Why It Matters

Imagine you have different data structures: arrays, linked lists, trees, graphs. Each has a different internal structure, but you want to traverse them all using the same approach. The Iterator Pattern solves this by:

Hiding complexity: Users don’t need to know if your collection is an array, linked list, or custom structure
Single Responsibility: Iteration logic is separate from the collection’s main purpose
Multiple traversals: You can have multiple iterators traversing the same collection simultaneously
Uniform interface: All collections can be traversed the same way

Core Components

The pattern involves two main participants:

Iterator: An object that knows how to traverse the collection. It typically provides methods like next() to get the next element and has_next() to check if more elements exist.
Iterable/Collection: The data structure being traversed. It provides a method to create an iterator (often called iterator() or __iter__()).

In Python, this pattern is deeply integrated into the language through the iterator protocol: any object implementing __iter__() (returns an iterator) and __next__() (returns the next item) can be used in for-loops and other iteration contexts.

When to Use It

You need to traverse a complex data structure without exposing its internals
You want to provide multiple ways to traverse the same collection
You’re designing a collection class that should work with standard language features (like for-loops)
You need to support lazy evaluation or infinite sequences

Visual Guide

Iterator Pattern Structure

classDiagram
    class Iterator {
        <<interface>>
        +has_next() bool
        +next() Element
    }
    class ConcreteIterator {
        -collection: Collection
        -position: int
        +has_next() bool
        +next() Element
    }
    class Iterable {
        <<interface>>
        +create_iterator() Iterator
    }
    class ConcreteCollection {
        -items: List
        +create_iterator() Iterator
        +add(item)
    }
    
    Iterator <|.. ConcreteIterator
    Iterable <|.. ConcreteCollection
    ConcreteCollection --> ConcreteIterator : creates
    ConcreteIterator --> ConcreteCollection : traverses

The Iterator Pattern separates traversal logic (Iterator) from the collection structure (Iterable). The collection creates iterators on demand, and each iterator maintains its own traversal state.

Python Iterator Protocol Flow

sequenceDiagram
    participant Client
    participant Collection
    participant Iterator
    
    Client->>Collection: for item in collection
    Collection->>Collection: __iter__()
    Collection->>Iterator: create iterator
    Collection-->>Client: return iterator
    
    loop Until StopIteration
        Client->>Iterator: __next__()
        Iterator->>Iterator: get current item
        Iterator->>Iterator: advance position
        Iterator-->>Client: return item
    end
    
    Iterator-->>Client: raise StopIteration

When you use a for-loop in Python, it calls iter() to get an iterator, then repeatedly calls next() until StopIteration is raised.

Examples

Example 1: Basic Custom Iterator

Let’s create a simple collection that stores books and provides iteration:

class Book:
    def __init__(self, title, author):
        self.title = title
        self.author = author
    
    def __repr__(self):
        return f"Book('{self.title}' by {self.author})"


class BookCollection:
    def __init__(self):
        self._books = []
    
    def add_book(self, book):
        self._books.append(book)
    
    def __iter__(self):
        """Return an iterator for this collection."""
        return BookIterator(self._books)


class BookIterator:
    def __init__(self, books):
        self._books = books
        self._index = 0
    
    def __iter__(self):
        """Iterators must return themselves from __iter__."""
        return self
    
    def __next__(self):
        """Return the next book or raise StopIteration."""
        if self._index >= len(self._books):
            raise StopIteration
        
        book = self._books[self._index]
        self._index += 1
        return book


# Usage
library = BookCollection()
library.add_book(Book("1984", "Orwell"))
library.add_book(Book("Dune", "Herbert"))
library.add_book(Book("Foundation", "Asimov"))

# The iterator pattern allows us to use for-loops
for book in library:
    print(book)

# Output:
# Book('1984' by Orwell)
# Book('Dune' by Herbert)
# Book('Foundation' by Asimov)

Key Points:

BookCollection.__iter__() creates and returns a new iterator
BookIterator.__next__() returns items one at a time and raises StopIteration when done
The internal _books list is hidden from users
Multiple iterators can traverse the collection independently

Try it yourself: Add a reverse_iterator() method to BookCollection that returns an iterator traversing books in reverse order.

Example 2: Generator-Based Iterator (Pythonic Approach)

Python’s generators provide a simpler way to implement iterators:

class BookCollection:
    def __init__(self):
        self._books = []
    
    def add_book(self, book):
        self._books.append(book)
    
    def __iter__(self):
        """Use a generator for simpler iteration."""
        for book in self._books:
            yield book
    
    def by_author(self, author):
        """Custom iterator that filters by author."""
        for book in self._books:
            if book.author == author:
                yield book


# Usage
library = BookCollection()
library.add_book(Book("1984", "Orwell"))
library.add_book(Book("Animal Farm", "Orwell"))
library.add_book(Book("Dune", "Herbert"))

# Standard iteration
print("All books:")
for book in library:
    print(f"  {book}")

# Output:
# All books:
#   Book('1984' by Orwell)
#   Book('Animal Farm' by Orwell)
#   Book('Dune' by Herbert)

# Custom filtered iteration
print("\nOrwell's books:")
for book in library.by_author("Orwell"):
    print(f"  {book}")

# Output:
# Orwell's books:
#   Book('1984' by Orwell)
#   Book('Animal Farm' by Orwell)

Key Points:

The yield keyword makes a function a generator (automatic iterator)
No need to manually track position or raise StopIteration
Easy to create multiple iteration strategies (like by_author())
Generators are lazy: they produce values on-demand

Try it yourself: Add a by_title_length(min_length) method that yields books with titles longer than min_length characters.

Example 3: Custom Data Structure with Iterator

Let’s implement a binary tree with in-order traversal:

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


class BinaryTree:
    def __init__(self):
        self.root = None
    
    def insert(self, value):
        """Insert a value into the binary search tree."""
        if self.root is None:
            self.root = TreeNode(value)
        else:
            self._insert_recursive(self.root, value)
    
    def _insert_recursive(self, node, value):
        if value < node.value:
            if node.left is None:
                node.left = TreeNode(value)
            else:
                self._insert_recursive(node.left, value)
        else:
            if node.right is None:
                node.right = TreeNode(value)
            else:
                self._insert_recursive(node.right, value)
    
    def __iter__(self):
        """In-order traversal using a generator."""
        return self._inorder_traversal(self.root)
    
    def _inorder_traversal(self, node):
        """Recursively yield values in sorted order."""
        if node is not None:
            # Traverse left subtree
            yield from self._inorder_traversal(node.left)
            # Yield current node
            yield node.value
            # Traverse right subtree
            yield from self._inorder_traversal(node.right)


# Usage
tree = BinaryTree()
for value in [5, 3, 7, 1, 4, 6, 9]:
    tree.insert(value)

print("In-order traversal (sorted):")
for value in tree:
    print(value, end=" ")

# Output:
# In-order traversal (sorted):
# 1 3 4 5 6 7 9

Key Points:

The tree’s internal structure (nodes, pointers) is completely hidden
Users can traverse the tree without knowing it’s a tree
yield from delegates to another generator
The same pattern works for any tree traversal (pre-order, post-order, level-order)

Java/C++ Note: In Java, you’d implement the Iterator interface with hasNext() and next() methods. In C++, you’d typically use iterator classes with overloaded ++, *, and != operators to work with STL algorithms.

Try it yourself: Add a preorder() method that returns an iterator for pre-order traversal (root, left, right).

Example 4: Infinite Iterator

Iterators can represent infinite sequences:

class FibonacciIterator:
    """Generates Fibonacci numbers indefinitely."""
    def __init__(self):
        self.a = 0
        self.b = 1
    
    def __iter__(self):
        return self
    
    def __next__(self):
        current = self.a
        self.a, self.b = self.b, self.a + self.b
        return current


# Usage with limit
fib = FibonacciIterator()
print("First 10 Fibonacci numbers:")
for i, num in enumerate(fib):
    if i >= 10:
        break
    print(num, end=" ")

# Output:
# First 10 Fibonacci numbers:
# 0 1 1 2 3 5 8 13 21 34

# Using itertools for cleaner limiting
import itertools

fib = FibonacciIterator()
print("\nFirst 5 Fibonacci numbers:")
for num in itertools.islice(fib, 5):
    print(num, end=" ")

# Output:
# First 5 Fibonacci numbers:
# 0 1 1 2 3

Key Points:

Infinite iterators never raise StopIteration
Useful for mathematical sequences, data streams, or event loops
Must be manually limited (break, itertools.islice, etc.)
Memory-efficient: generates values on-demand

Try it yourself: Create a PrimeIterator that generates prime numbers indefinitely.

Common Mistakes

1. Forgetting to Raise StopIteration

# WRONG: Returns None instead of raising StopIteration
class BadIterator:
    def __init__(self, data):
        self.data = data
        self.index = 0
    
    def __next__(self):
        if self.index < len(self.data):
            item = self.data[self.index]
            self.index += 1
            return item
        return None  # WRONG! Should raise StopIteration

# This causes infinite loops with None values
for item in BadIterator([1, 2, 3]):
    print(item)  # Prints: 1, 2, 3, None, None, None...

# CORRECT:
class GoodIterator:
    def __init__(self, data):
        self.data = data
        self.index = 0
    
    def __next__(self):
        if self.index < len(self.data):
            item = self.data[self.index]
            self.index += 1
            return item
        raise StopIteration  # CORRECT!

Why it matters: Python’s for-loop relies on StopIteration to know when to stop. Returning None doesn’t signal completion.

2. Not Returning Self from iter

# WRONG: __iter__ doesn't return self
class BadIterator:
    def __init__(self, data):
        self.data = data
        self.index = 0
    
    def __iter__(self):
        pass  # WRONG! Should return self
    
    def __next__(self):
        # ... implementation
        pass

# This breaks the iterator protocol
it = BadIterator([1, 2, 3])
for item in it:  # TypeError: iter() returned non-iterator
    print(item)

# CORRECT:
class GoodIterator:
    def __iter__(self):
        return self  # CORRECT!

Why it matters: The iterator protocol requires __iter__() to return an object with a __next__() method. For iterator objects themselves, this should be self.

3. Reusing Exhausted Iterators

class NumberIterator:
    def __init__(self, data):
        self.data = data
        self.index = 0
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.index < len(self.data):
            item = self.data[self.index]
            self.index += 1
            return item
        raise StopIteration

# WRONG: Trying to reuse an exhausted iterator
iterator = NumberIterator([1, 2, 3])

for num in iterator:
    print(num)  # Prints: 1, 2, 3

for num in iterator:
    print(num)  # Prints nothing! Iterator is exhausted

# CORRECT: Create a new iterator or reset
class ReusableCollection:
    def __init__(self, data):
        self.data = data
    
    def __iter__(self):
        return NumberIterator(self.data)  # New iterator each time

collection = ReusableCollection([1, 2, 3])
for num in collection:
    print(num)  # Prints: 1, 2, 3
for num in collection:
    print(num)  # Prints: 1, 2, 3 (works!)

Why it matters: Iterators maintain state and can’t be “rewound”. Collections should create fresh iterators in __iter__().

4. Modifying Collection During Iteration

# WRONG: Modifying collection while iterating
class BadCollection:
    def __init__(self):
        self.items = []
    
    def add(self, item):
        self.items.append(item)
    
    def __iter__(self):
        for item in self.items:
            yield item

collection = BadCollection()
for i in range(5):
    collection.add(i)

# WRONG: Adding items during iteration
for item in collection:
    if item < 3:
        collection.add(item * 10)  # Modifying during iteration!
    print(item)
# Behavior is unpredictable and may cause infinite loops

# CORRECT: Collect changes, apply after iteration
to_add = []
for item in collection:
    if item < 3:
        to_add.append(item * 10)
    print(item)

for item in to_add:
    collection.add(item)

Why it matters: Modifying a collection during iteration can cause skipped elements, infinite loops, or crashes. Always collect changes and apply them after iteration completes.

5. Not Making Iterators Stateless When Needed

# WRONG: Iterator shares state between iterations
class SharedStateIterator:
    def __init__(self, data):
        self.data = data
        self.index = 0  # Shared across all iterations!
    
    def __iter__(self):
        return self  # Returns same object
    
    def __next__(self):
        if self.index < len(self.data):
            item = self.data[self.index]
            self.index += 1
            return item
        raise StopIteration

iterator = SharedStateIterator([1, 2, 3])
list1 = list(iterator)  # [1, 2, 3]
list2 = list(iterator)  # [] - exhausted!

# CORRECT: Create new iterator with fresh state
class FreshStateCollection:
    def __init__(self, data):
        self.data = data
    
    def __iter__(self):
        return iter(self.data)  # New iterator each time

collection = FreshStateCollection([1, 2, 3])
list1 = list(collection)  # [1, 2, 3]
list2 = list(collection)  # [1, 2, 3] - works!

Why it matters: If you want a collection to be iterable multiple times, __iter__() must return a new iterator with fresh state each time it’s called.

Interview Tips

What Interviewers Look For

1. Explain the pattern clearly in 30 seconds: “The Iterator Pattern provides a standard way to traverse a collection without exposing its internal structure. It separates the traversal logic from the collection itself, allowing different data structures to be accessed uniformly.”

2. Know when to use it vs. direct access: Interviewers may ask: “When would you use an iterator instead of just indexing into a list?”

When the internal structure should be hidden (encapsulation)
When you need multiple simultaneous traversals
When the collection doesn’t support random access (linked lists, trees)
When you want lazy evaluation or infinite sequences

3. Recognize built-in iterator usage: Be ready to identify iterator patterns in code:

# All of these use iterators under the hood:
for item in collection:  # Uses __iter__() and __next__()
list(collection)         # Consumes an iterator
sum(numbers)             # Iterates through numbers
max(values)              # Iterates to find maximum

4. Implement from scratch: Practice implementing a custom iterator without using generators:

class RangeIterator:
    def __init__(self, start, end):
        self.current = start
        self.end = end
    
    def __iter__(self):
        return self
    
    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

5. Discuss trade-offs:

Memory: Iterators can be memory-efficient (lazy evaluation) vs. materializing entire collections
Performance: Sequential access only vs. random access
Complexity: Additional abstraction layer vs. simpler direct access
Flexibility: Easy to add new traversal strategies vs. tight coupling

6. Connect to language features:

Python: __iter__(), __next__(), generators, itertools module
Java: Iterator interface, Iterable interface, enhanced for-loop
C++: Iterator categories (input, output, forward, bidirectional, random access)

7. Common interview questions:

“Implement an iterator for a custom data structure” (linked list, tree, graph)
“How would you iterate over two collections simultaneously?” (zip, multiple iterators)
“Design an iterator that filters elements” (conditional yielding)
“What’s the difference between an iterator and an iterable?” (iterable creates iterators, iterator does traversal)

8. Code review scenarios: Be prepared to spot issues:

# What's wrong with this code?
class MyIterator:
    def __init__(self, data):
        self.data = data
        self.index = 0
    
    def __next__(self):
        if self.index < len(self.data):
            return self.data[self.index]
            self.index += 1  # Never reached!
        raise StopIteration

(Answer: return exits the function before incrementing index)

9. Optimization discussions: “How would you optimize iteration over a large dataset?”

Use generators for lazy evaluation
Implement chunking for batch processing
Consider parallel iteration with multiple iterators
Use itertools for efficient iteration patterns

10. Real-world applications: Mention practical uses:

Database result sets (cursor pattern)
File reading (line-by-line iteration)
API pagination (fetching pages on-demand)
Stream processing (infinite data sources)
UI components (list views, data grids)

Key Takeaways

The Iterator Pattern separates traversal logic from collection structure, providing a uniform interface for accessing elements sequentially regardless of the underlying data structure
In Python, implement iterators using __iter__() (returns iterator) and __next__() (returns next item or raises StopIteration), or use generators with yield for simpler implementation
Iterators are single-use and maintain their own state; collections should create fresh iterators in __iter__() to support multiple iterations
The pattern enables lazy evaluation, memory efficiency with large datasets, multiple simultaneous traversals, and works seamlessly with language features like for-loops
Common pitfalls include forgetting to raise StopIteration, not returning self from __iter__(), modifying collections during iteration, and attempting to reuse exhausted iterators