Iterator Pattern: Traverse Collections in OOP
TL;DR
The Iterator Pattern provides a way to access elements of a collection sequentially without exposing the underlying representation. It decouples iteration logic from the collection, allowing you to traverse different data structures using a uniform interface. This pattern is fundamental to modern programming and is built into most languages.
Core Concept
What is the Iterator Pattern?
The Iterator Pattern is a behavioral design pattern that provides a standard way to traverse a collection without exposing its internal structure. Instead of accessing elements directly through indices or internal pointers, you use an iterator object that knows how to move through the collection one element at a time.
Why It Matters
Imagine you have different data structures: arrays, linked lists, trees, graphs. Each has a different internal structure, but you want to traverse them all using the same approach. The Iterator Pattern solves this by:
- Hiding complexity: Users don’t need to know if your collection is an array, linked list, or custom structure
- Single Responsibility: Iteration logic is separate from the collection’s main purpose
- Multiple traversals: You can have multiple iterators traversing the same collection simultaneously
- Uniform interface: All collections can be traversed the same way
Core Components
The pattern involves two main participants:
-
Iterator: An object that knows how to traverse the collection. It typically provides methods like
next()to get the next element andhas_next()to check if more elements exist. -
Iterable/Collection: The data structure being traversed. It provides a method to create an iterator (often called
iterator()or__iter__()).
In Python, this pattern is deeply integrated into the language through the iterator protocol: any object implementing __iter__() (returns an iterator) and __next__() (returns the next item) can be used in for-loops and other iteration contexts.
When to Use It
- You need to traverse a complex data structure without exposing its internals
- You want to provide multiple ways to traverse the same collection
- You’re designing a collection class that should work with standard language features (like for-loops)
- You need to support lazy evaluation or infinite sequences
Visual Guide
Iterator Pattern Structure
classDiagram
class Iterator {
<<interface>>
+has_next() bool
+next() Element
}
class ConcreteIterator {
-collection: Collection
-position: int
+has_next() bool
+next() Element
}
class Iterable {
<<interface>>
+create_iterator() Iterator
}
class ConcreteCollection {
-items: List
+create_iterator() Iterator
+add(item)
}
Iterator <|.. ConcreteIterator
Iterable <|.. ConcreteCollection
ConcreteCollection --> ConcreteIterator : creates
ConcreteIterator --> ConcreteCollection : traverses
The Iterator Pattern separates traversal logic (Iterator) from the collection structure (Iterable). The collection creates iterators on demand, and each iterator maintains its own traversal state.
Python Iterator Protocol Flow
sequenceDiagram
participant Client
participant Collection
participant Iterator
Client->>Collection: for item in collection
Collection->>Collection: __iter__()
Collection->>Iterator: create iterator
Collection-->>Client: return iterator
loop Until StopIteration
Client->>Iterator: __next__()
Iterator->>Iterator: get current item
Iterator->>Iterator: advance position
Iterator-->>Client: return item
end
Iterator-->>Client: raise StopIteration
When you use a for-loop in Python, it calls iter() to get an iterator, then repeatedly calls next() until StopIteration is raised.
Examples
Example 1: Basic Custom Iterator
Let’s create a simple collection that stores books and provides iteration:
class Book:
def __init__(self, title, author):
self.title = title
self.author = author
def __repr__(self):
return f"Book('{self.title}' by {self.author})"
class BookCollection:
def __init__(self):
self._books = []
def add_book(self, book):
self._books.append(book)
def __iter__(self):
"""Return an iterator for this collection."""
return BookIterator(self._books)
class BookIterator:
def __init__(self, books):
self._books = books
self._index = 0
def __iter__(self):
"""Iterators must return themselves from __iter__."""
return self
def __next__(self):
"""Return the next book or raise StopIteration."""
if self._index >= len(self._books):
raise StopIteration
book = self._books[self._index]
self._index += 1
return book
# Usage
library = BookCollection()
library.add_book(Book("1984", "Orwell"))
library.add_book(Book("Dune", "Herbert"))
library.add_book(Book("Foundation", "Asimov"))
# The iterator pattern allows us to use for-loops
for book in library:
print(book)
# Output:
# Book('1984' by Orwell)
# Book('Dune' by Herbert)
# Book('Foundation' by Asimov)
Key Points:
BookCollection.__iter__()creates and returns a new iteratorBookIterator.__next__()returns items one at a time and raisesStopIterationwhen done- The internal
_bookslist is hidden from users - Multiple iterators can traverse the collection independently
Try it yourself: Add a reverse_iterator() method to BookCollection that returns an iterator traversing books in reverse order.
Example 2: Generator-Based Iterator (Pythonic Approach)
Python’s generators provide a simpler way to implement iterators:
class BookCollection:
def __init__(self):
self._books = []
def add_book(self, book):
self._books.append(book)
def __iter__(self):
"""Use a generator for simpler iteration."""
for book in self._books:
yield book
def by_author(self, author):
"""Custom iterator that filters by author."""
for book in self._books:
if book.author == author:
yield book
# Usage
library = BookCollection()
library.add_book(Book("1984", "Orwell"))
library.add_book(Book("Animal Farm", "Orwell"))
library.add_book(Book("Dune", "Herbert"))
# Standard iteration
print("All books:")
for book in library:
print(f" {book}")
# Output:
# All books:
# Book('1984' by Orwell)
# Book('Animal Farm' by Orwell)
# Book('Dune' by Herbert)
# Custom filtered iteration
print("\nOrwell's books:")
for book in library.by_author("Orwell"):
print(f" {book}")
# Output:
# Orwell's books:
# Book('1984' by Orwell)
# Book('Animal Farm' by Orwell)
Key Points:
- The
yieldkeyword makes a function a generator (automatic iterator) - No need to manually track position or raise
StopIteration - Easy to create multiple iteration strategies (like
by_author()) - Generators are lazy: they produce values on-demand
Try it yourself: Add a by_title_length(min_length) method that yields books with titles longer than min_length characters.
Example 3: Custom Data Structure with Iterator
Let’s implement a binary tree with in-order traversal:
class TreeNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
class BinaryTree:
def __init__(self):
self.root = None
def insert(self, value):
"""Insert a value into the binary search tree."""
if self.root is None:
self.root = TreeNode(value)
else:
self._insert_recursive(self.root, value)
def _insert_recursive(self, node, value):
if value < node.value:
if node.left is None:
node.left = TreeNode(value)
else:
self._insert_recursive(node.left, value)
else:
if node.right is None:
node.right = TreeNode(value)
else:
self._insert_recursive(node.right, value)
def __iter__(self):
"""In-order traversal using a generator."""
return self._inorder_traversal(self.root)
def _inorder_traversal(self, node):
"""Recursively yield values in sorted order."""
if node is not None:
# Traverse left subtree
yield from self._inorder_traversal(node.left)
# Yield current node
yield node.value
# Traverse right subtree
yield from self._inorder_traversal(node.right)
# Usage
tree = BinaryTree()
for value in [5, 3, 7, 1, 4, 6, 9]:
tree.insert(value)
print("In-order traversal (sorted):")
for value in tree:
print(value, end=" ")
# Output:
# In-order traversal (sorted):
# 1 3 4 5 6 7 9
Key Points:
- The tree’s internal structure (nodes, pointers) is completely hidden
- Users can traverse the tree without knowing it’s a tree
yield fromdelegates to another generator- The same pattern works for any tree traversal (pre-order, post-order, level-order)
Java/C++ Note: In Java, you’d implement the Iterator interface with hasNext() and next() methods. In C++, you’d typically use iterator classes with overloaded ++, *, and != operators to work with STL algorithms.
Try it yourself: Add a preorder() method that returns an iterator for pre-order traversal (root, left, right).
Example 4: Infinite Iterator
Iterators can represent infinite sequences:
class FibonacciIterator:
"""Generates Fibonacci numbers indefinitely."""
def __init__(self):
self.a = 0
self.b = 1
def __iter__(self):
return self
def __next__(self):
current = self.a
self.a, self.b = self.b, self.a + self.b
return current
# Usage with limit
fib = FibonacciIterator()
print("First 10 Fibonacci numbers:")
for i, num in enumerate(fib):
if i >= 10:
break
print(num, end=" ")
# Output:
# First 10 Fibonacci numbers:
# 0 1 1 2 3 5 8 13 21 34
# Using itertools for cleaner limiting
import itertools
fib = FibonacciIterator()
print("\nFirst 5 Fibonacci numbers:")
for num in itertools.islice(fib, 5):
print(num, end=" ")
# Output:
# First 5 Fibonacci numbers:
# 0 1 1 2 3
Key Points:
- Infinite iterators never raise
StopIteration - Useful for mathematical sequences, data streams, or event loops
- Must be manually limited (break,
itertools.islice, etc.) - Memory-efficient: generates values on-demand
Try it yourself: Create a PrimeIterator that generates prime numbers indefinitely.
Common Mistakes
1. Forgetting to Raise StopIteration
# WRONG: Returns None instead of raising StopIteration
class BadIterator:
def __init__(self, data):
self.data = data
self.index = 0
def __next__(self):
if self.index < len(self.data):
item = self.data[self.index]
self.index += 1
return item
return None # WRONG! Should raise StopIteration
# This causes infinite loops with None values
for item in BadIterator([1, 2, 3]):
print(item) # Prints: 1, 2, 3, None, None, None...
# CORRECT:
class GoodIterator:
def __init__(self, data):
self.data = data
self.index = 0
def __next__(self):
if self.index < len(self.data):
item = self.data[self.index]
self.index += 1
return item
raise StopIteration # CORRECT!
Why it matters: Python’s for-loop relies on StopIteration to know when to stop. Returning None doesn’t signal completion.
2. Not Returning Self from iter
# WRONG: __iter__ doesn't return self
class BadIterator:
def __init__(self, data):
self.data = data
self.index = 0
def __iter__(self):
pass # WRONG! Should return self
def __next__(self):
# ... implementation
pass
# This breaks the iterator protocol
it = BadIterator([1, 2, 3])
for item in it: # TypeError: iter() returned non-iterator
print(item)
# CORRECT:
class GoodIterator:
def __iter__(self):
return self # CORRECT!
Why it matters: The iterator protocol requires __iter__() to return an object with a __next__() method. For iterator objects themselves, this should be self.
3. Reusing Exhausted Iterators
class NumberIterator:
def __init__(self, data):
self.data = data
self.index = 0
def __iter__(self):
return self
def __next__(self):
if self.index < len(self.data):
item = self.data[self.index]
self.index += 1
return item
raise StopIteration
# WRONG: Trying to reuse an exhausted iterator
iterator = NumberIterator([1, 2, 3])
for num in iterator:
print(num) # Prints: 1, 2, 3
for num in iterator:
print(num) # Prints nothing! Iterator is exhausted
# CORRECT: Create a new iterator or reset
class ReusableCollection:
def __init__(self, data):
self.data = data
def __iter__(self):
return NumberIterator(self.data) # New iterator each time
collection = ReusableCollection([1, 2, 3])
for num in collection:
print(num) # Prints: 1, 2, 3
for num in collection:
print(num) # Prints: 1, 2, 3 (works!)
Why it matters: Iterators maintain state and can’t be “rewound”. Collections should create fresh iterators in __iter__().
4. Modifying Collection During Iteration
# WRONG: Modifying collection while iterating
class BadCollection:
def __init__(self):
self.items = []
def add(self, item):
self.items.append(item)
def __iter__(self):
for item in self.items:
yield item
collection = BadCollection()
for i in range(5):
collection.add(i)
# WRONG: Adding items during iteration
for item in collection:
if item < 3:
collection.add(item * 10) # Modifying during iteration!
print(item)
# Behavior is unpredictable and may cause infinite loops
# CORRECT: Collect changes, apply after iteration
to_add = []
for item in collection:
if item < 3:
to_add.append(item * 10)
print(item)
for item in to_add:
collection.add(item)
Why it matters: Modifying a collection during iteration can cause skipped elements, infinite loops, or crashes. Always collect changes and apply them after iteration completes.
5. Not Making Iterators Stateless When Needed
# WRONG: Iterator shares state between iterations
class SharedStateIterator:
def __init__(self, data):
self.data = data
self.index = 0 # Shared across all iterations!
def __iter__(self):
return self # Returns same object
def __next__(self):
if self.index < len(self.data):
item = self.data[self.index]
self.index += 1
return item
raise StopIteration
iterator = SharedStateIterator([1, 2, 3])
list1 = list(iterator) # [1, 2, 3]
list2 = list(iterator) # [] - exhausted!
# CORRECT: Create new iterator with fresh state
class FreshStateCollection:
def __init__(self, data):
self.data = data
def __iter__(self):
return iter(self.data) # New iterator each time
collection = FreshStateCollection([1, 2, 3])
list1 = list(collection) # [1, 2, 3]
list2 = list(collection) # [1, 2, 3] - works!
Why it matters: If you want a collection to be iterable multiple times, __iter__() must return a new iterator with fresh state each time it’s called.
Interview Tips
What Interviewers Look For
1. Explain the pattern clearly in 30 seconds: “The Iterator Pattern provides a standard way to traverse a collection without exposing its internal structure. It separates the traversal logic from the collection itself, allowing different data structures to be accessed uniformly.”
2. Know when to use it vs. direct access: Interviewers may ask: “When would you use an iterator instead of just indexing into a list?”
- When the internal structure should be hidden (encapsulation)
- When you need multiple simultaneous traversals
- When the collection doesn’t support random access (linked lists, trees)
- When you want lazy evaluation or infinite sequences
3. Recognize built-in iterator usage: Be ready to identify iterator patterns in code:
# All of these use iterators under the hood:
for item in collection: # Uses __iter__() and __next__()
list(collection) # Consumes an iterator
sum(numbers) # Iterates through numbers
max(values) # Iterates to find maximum
4. Implement from scratch: Practice implementing a custom iterator without using generators:
class RangeIterator:
def __init__(self, start, end):
self.current = start
self.end = end
def __iter__(self):
return self
def __next__(self):
if self.current >= self.end:
raise StopIteration
value = self.current
self.current += 1
return value
5. Discuss trade-offs:
- Memory: Iterators can be memory-efficient (lazy evaluation) vs. materializing entire collections
- Performance: Sequential access only vs. random access
- Complexity: Additional abstraction layer vs. simpler direct access
- Flexibility: Easy to add new traversal strategies vs. tight coupling
6. Connect to language features:
- Python:
__iter__(),__next__(), generators,itertoolsmodule - Java:
Iteratorinterface,Iterableinterface, enhanced for-loop - C++: Iterator categories (input, output, forward, bidirectional, random access)
7. Common interview questions:
- “Implement an iterator for a custom data structure” (linked list, tree, graph)
- “How would you iterate over two collections simultaneously?” (zip, multiple iterators)
- “Design an iterator that filters elements” (conditional yielding)
- “What’s the difference between an iterator and an iterable?” (iterable creates iterators, iterator does traversal)
8. Code review scenarios: Be prepared to spot issues:
# What's wrong with this code?
class MyIterator:
def __init__(self, data):
self.data = data
self.index = 0
def __next__(self):
if self.index < len(self.data):
return self.data[self.index]
self.index += 1 # Never reached!
raise StopIteration
(Answer: return exits the function before incrementing index)
9. Optimization discussions: “How would you optimize iteration over a large dataset?”
- Use generators for lazy evaluation
- Implement chunking for batch processing
- Consider parallel iteration with multiple iterators
- Use
itertoolsfor efficient iteration patterns
10. Real-world applications: Mention practical uses:
- Database result sets (cursor pattern)
- File reading (line-by-line iteration)
- API pagination (fetching pages on-demand)
- Stream processing (infinite data sources)
- UI components (list views, data grids)
Key Takeaways
- The Iterator Pattern separates traversal logic from collection structure, providing a uniform interface for accessing elements sequentially regardless of the underlying data structure
- In Python, implement iterators using
__iter__()(returns iterator) and__next__()(returns next item or raisesStopIteration), or use generators withyieldfor simpler implementation - Iterators are single-use and maintain their own state; collections should create fresh iterators in
__iter__()to support multiple iterations - The pattern enables lazy evaluation, memory efficiency with large datasets, multiple simultaneous traversals, and works seamlessly with language features like for-loops
- Common pitfalls include forgetting to raise
StopIteration, not returningselffrom__iter__(), modifying collections during iteration, and attempting to reuse exhausted iterators