Interpreter Pattern: Parse Grammars in OOP
TL;DR
The Interpreter Pattern defines a grammatical representation for a language and provides an interpreter to process sentences in that language. It’s used for building domain-specific languages (DSLs), expression evaluators, and rule engines where you need to parse and execute custom syntax.
Core Concept
What is the Interpreter Pattern?
The Interpreter Pattern is a behavioral design pattern that defines a representation for a language’s grammar along with an interpreter that uses this representation to interpret sentences in the language. Each grammar rule becomes a class, and the interpreter traverses a tree structure (Abstract Syntax Tree) to evaluate expressions.
Core Components
Abstract Expression: Declares an abstract interpret() method that all concrete expressions must implement.
Terminal Expression: Implements the interpret operation for terminal symbols in the grammar (leaf nodes that don’t contain other expressions).
Non-Terminal Expression: Implements interpret for non-terminal symbols (nodes that contain other expressions). These typically represent operators or compound expressions.
Context: Contains global information needed during interpretation, such as variable values or state.
Client: Builds the abstract syntax tree (AST) representing a particular sentence in the language.
When to Use This Pattern
Use the Interpreter Pattern when:
- You need to interpret a simple language or grammar
- The grammar is relatively simple and stable
- Efficiency is not a critical concern
- You’re building calculators, query languages, configuration parsers, or rule engines
Why It Matters
The Interpreter Pattern provides a structured way to evaluate expressions without writing complex parsing logic from scratch. It’s particularly useful for DSLs where business rules need to be expressed in a more readable format than raw code. However, for complex grammars, consider using parser generators like ANTLR or PLY instead.
Key Insight
The pattern essentially converts a problem into a tree structure where each node knows how to evaluate itself. This recursive evaluation is the core mechanism that makes interpretation work.
Visual Guide
Interpreter Pattern Class Structure
classDiagram
class AbstractExpression {
<<interface>>
+interpret(context) Result
}
class TerminalExpression {
-value
+interpret(context) Result
}
class NonTerminalExpression {
-left: AbstractExpression
-right: AbstractExpression
+interpret(context) Result
}
class Context {
-variables: dict
+get_variable(name)
+set_variable(name, value)
}
class Client {
+build_syntax_tree()
}
AbstractExpression <|.. TerminalExpression
AbstractExpression <|.. NonTerminalExpression
NonTerminalExpression o-- AbstractExpression
Client ..> AbstractExpression
AbstractExpression ..> Context
The Interpreter Pattern uses a hierarchy of expression classes. Non-terminal expressions contain references to other expressions, forming a tree structure.
Expression Evaluation Flow
graph TD
A[Client builds AST] --> B[Root Expression]
B --> C{Non-Terminal?}
C -->|Yes| D[Interpret left child]
C -->|Yes| E[Interpret right child]
C -->|No| F[Return terminal value]
D --> G[Combine results]
E --> G
G --> H[Return result]
F --> H
Interpretation flows recursively through the tree. Non-terminal expressions evaluate their children first, then combine the results.
Examples
Example 1: Boolean Expression Interpreter
Let’s build an interpreter for simple boolean expressions like “true AND false OR true”.
from abc import ABC, abstractmethod
# Abstract Expression
class BooleanExpression(ABC):
@abstractmethod
def interpret(self, context):
pass
# Terminal Expression for boolean constants
class Constant(BooleanExpression):
def __init__(self, value):
self.value = value
def interpret(self, context):
return self.value
def __str__(self):
return str(self.value)
# Terminal Expression for variables
class Variable(BooleanExpression):
def __init__(self, name):
self.name = name
def interpret(self, context):
return context.get(self.name, False)
def __str__(self):
return self.name
# Non-Terminal Expression for AND
class AndExpression(BooleanExpression):
def __init__(self, left, right):
self.left = left
self.right = right
def interpret(self, context):
return self.left.interpret(context) and self.right.interpret(context)
def __str__(self):
return f"({self.left} AND {self.right})"
# Non-Terminal Expression for OR
class OrExpression(BooleanExpression):
def __init__(self, left, right):
self.left = left
self.right = right
def interpret(self, context):
return self.left.interpret(context) or self.right.interpret(context)
def __str__(self):
return f"({self.left} OR {self.right})"
# Non-Terminal Expression for NOT
class NotExpression(BooleanExpression):
def __init__(self, expression):
self.expression = expression
def interpret(self, context):
return not self.expression.interpret(context)
def __str__(self):
return f"NOT {self.expression}"
# Client code
if __name__ == "__main__":
# Build AST for: (x AND y) OR (NOT z)
x = Variable("x")
y = Variable("y")
z = Variable("z")
expression = OrExpression(
AndExpression(x, y),
NotExpression(z)
)
print(f"Expression: {expression}")
# Test with different contexts
context1 = {"x": True, "y": True, "z": False}
print(f"Context {context1}: {expression.interpret(context1)}")
# Output: True (because x AND y is True)
context2 = {"x": False, "y": True, "z": False}
print(f"Context {context2}: {expression.interpret(context2)}")
# Output: True (because NOT z is True)
context3 = {"x": False, "y": False, "z": True}
print(f"Context {context3}: {expression.interpret(context3)}")
# Output: False (both sides are False)
Expected Output:
Expression: ((x AND y) OR NOT z)
Context {'x': True, 'y': True, 'z': False}: True
Context {'x': False, 'y': True, 'z': False}: True
Context {'x': False, 'y': False, 'z': True}: False
Try it yourself: Add an XOR (exclusive OR) expression class that returns True only when exactly one operand is True.
Example 2: Arithmetic Expression Calculator
Let’s build a calculator that can evaluate expressions like “5 + 3 * 2”.
from abc import ABC, abstractmethod
# Abstract Expression
class Expression(ABC):
@abstractmethod
def interpret(self):
pass
# Terminal Expression for numbers
class Number(Expression):
def __init__(self, value):
self.value = value
def interpret(self):
return self.value
def __str__(self):
return str(self.value)
# Non-Terminal Expression for Addition
class Add(Expression):
def __init__(self, left, right):
self.left = left
self.right = right
def interpret(self):
return self.left.interpret() + self.right.interpret()
def __str__(self):
return f"({self.left} + {self.right})"
# Non-Terminal Expression for Subtraction
class Subtract(Expression):
def __init__(self, left, right):
self.left = left
self.right = right
def interpret(self):
return self.left.interpret() - self.right.interpret()
def __str__(self):
return f"({self.left} - {self.right})"
# Non-Terminal Expression for Multiplication
class Multiply(Expression):
def __init__(self, left, right):
self.left = left
self.right = right
def interpret(self):
return self.left.interpret() * self.right.interpret()
def __str__(self):
return f"({self.left} * {self.right})"
# Non-Terminal Expression for Division
class Divide(Expression):
def __init__(self, left, right):
self.left = left
self.right = right
def interpret(self):
right_value = self.right.interpret()
if right_value == 0:
raise ValueError("Division by zero")
return self.left.interpret() / right_value
def __str__(self):
return f"({self.left} / {self.right})"
# Simple parser to build AST from postfix notation
class Calculator:
def evaluate_postfix(self, expression):
"""Evaluate postfix expression like '5 3 2 * +' (5 + 3 * 2)"""
stack = []
tokens = expression.split()
for token in tokens:
if token in ['+', '-', '*', '/']:
if len(stack) < 2:
raise ValueError("Invalid expression")
right = stack.pop()
left = stack.pop()
if token == '+':
stack.append(Add(left, right))
elif token == '-':
stack.append(Subtract(left, right))
elif token == '*':
stack.append(Multiply(left, right))
elif token == '/':
stack.append(Divide(left, right))
else:
stack.append(Number(float(token)))
if len(stack) != 1:
raise ValueError("Invalid expression")
return stack[0]
# Client code
if __name__ == "__main__":
calc = Calculator()
# Expression: 5 + (3 * 2) in postfix: 5 3 2 * +
expr1 = calc.evaluate_postfix("5 3 2 * +")
print(f"Expression: {expr1}")
print(f"Result: {expr1.interpret()}")
# Output: 11.0
# Expression: (15 - 5) / 2 in postfix: 15 5 - 2 /
expr2 = calc.evaluate_postfix("15 5 - 2 /")
print(f"\nExpression: {expr2}")
print(f"Result: {expr2.interpret()}")
# Output: 5.0
# Expression: 10 + 20 - 5 * 2 in postfix: 10 20 + 5 2 * -
expr3 = calc.evaluate_postfix("10 20 + 5 2 * -")
print(f"\nExpression: {expr3}")
print(f"Result: {expr3.interpret()}")
# Output: 20.0
Expected Output:
Expression: (5.0 + (3.0 * 2.0))
Result: 11.0
Expression: ((15.0 - 5.0) / 2.0)
Result: 5.0
Expression: ((10.0 + 20.0) - (5.0 * 2.0))
Result: 20.0
Java/C++ Note: In Java, you’d use an interface for Expression and implement it in concrete classes. In C++, you’d use an abstract base class with pure virtual functions. Both languages require explicit memory management for the AST nodes (smart pointers in C++, garbage collection in Java).
Try it yourself: Extend the calculator to support parentheses in infix notation (e.g., “(5 + 3) * 2”) by implementing a simple infix-to-postfix converter.
Common Mistakes
1. Building Complex Grammars with Interpreter Pattern
Mistake: Using the Interpreter Pattern for complex languages with many grammar rules.
Why it’s wrong: The pattern creates one class per grammar rule. For complex languages, this leads to an explosion of classes that are hard to maintain. Performance also suffers because interpretation happens at runtime through many method calls.
Solution: Use the Interpreter Pattern only for simple grammars (typically fewer than 20 rules). For complex languages, use parser generators like ANTLR, Yacc, or PLY that can generate efficient parsers from grammar specifications.
2. Ignoring Operator Precedence
Mistake: Building the AST without considering operator precedence, leading to incorrect evaluation order.
# Wrong: treats "2 + 3 * 4" as "(2 + 3) * 4" = 20
expr = Multiply(Add(Number(2), Number(3)), Number(4))
# Correct: treats it as "2 + (3 * 4)" = 14
expr = Add(Number(2), Multiply(Number(3), Number(4)))
Solution: Implement a proper parser that respects operator precedence when building the AST. Use techniques like recursive descent parsing or the shunting-yard algorithm.
3. Mixing Parsing and Interpretation
Mistake: Putting parsing logic inside the interpret methods instead of separating concerns.
Why it’s wrong: The Interpreter Pattern assumes you already have an AST. Mixing parsing with interpretation makes the code harder to test and reuse.
Solution: Separate parsing (building the AST) from interpretation (evaluating the AST). The client or a dedicated parser class should build the tree, and expression classes should only handle evaluation.
4. Not Handling Context Properly
Mistake: Using global variables instead of passing a context object, or modifying context in terminal expressions.
# Wrong: using global state
global_vars = {}
class Variable(Expression):
def interpret(self):
return global_vars[self.name] # Tightly coupled to global state
Solution: Always pass context as a parameter to interpret(). Keep terminal expressions immutable and only allow context modifications at the client level.
5. Forgetting to Implement str for Debugging
Mistake: Not implementing string representation for expression classes, making debugging nearly impossible.
Why it matters: When your interpreter produces wrong results, you need to see the AST structure. Without str methods, you can’t visualize what tree was actually built.
Solution: Always implement str or repr methods that show the structure of the expression tree in a readable format.
Interview Tips
What Interviewers Look For
Pattern Recognition: Interviewers want to see if you can identify when the Interpreter Pattern is appropriate. If asked to “build a calculator” or “evaluate expressions,” mention the Interpreter Pattern as one option and discuss trade-offs.
AST Construction: Be prepared to explain how to build an Abstract Syntax Tree. Interviewers often ask: “How would you convert the string ‘2 + 3 * 4’ into an expression tree?” Know at least one parsing technique (recursive descent or postfix evaluation).
Composite Pattern Connection: The Interpreter Pattern heavily uses the Composite Pattern. Be ready to explain how terminal expressions are leaves and non-terminal expressions are composites.
Common Interview Questions
“When would you NOT use the Interpreter Pattern?” Good answer: “I wouldn’t use it for complex grammars with many rules because it creates too many classes and has poor performance. For production systems with complex languages, I’d use a parser generator like ANTLR. I’d also avoid it if the grammar changes frequently, as each change requires modifying class hierarchies.”
“How would you add a new operation to your interpreter?” Good answer: “I’d create a new non-terminal expression class that implements the Expression interface. For example, to add exponentiation, I’d create a Power class with left and right expressions. The key is that existing code doesn’t need to change—I’m just adding a new class.”
“What’s the difference between Interpreter and Strategy Pattern?” Good answer: “Strategy encapsulates algorithms that can be swapped at runtime, while Interpreter defines a grammar and evaluates sentences in that grammar. Interpreter builds a tree structure of expressions, whereas Strategy typically works with a single algorithm object. Interpreter is about language processing; Strategy is about algorithm selection.”
Code Interview Tips
-
Start simple: If asked to build an interpreter, start with just numbers and addition. Get that working, then add more operators. Don’t try to build everything at once.
-
Clarify the input format: Ask whether you need to parse infix notation (“2 + 3”) or if you can assume the AST is already built. Parsing is a separate problem from interpretation.
-
Test edge cases: Show you’re thinking about division by zero, invalid expressions, and empty inputs. Interviewers appreciate defensive programming.
-
Explain time complexity: The interpretation time is O(n) where n is the number of nodes in the AST. Building the AST from a string is typically O(m) where m is the string length.
Red Flags to Avoid
- Don’t confuse the Interpreter Pattern with the Visitor Pattern (both traverse trees, but Visitor separates operations from structure)
- Don’t try to use the Interpreter Pattern for everything—know its limitations
- Don’t forget to mention that modern applications often use parser generators instead of hand-coding interpreters
Key Takeaways
- The Interpreter Pattern defines a grammar as a class hierarchy where each grammar rule becomes a class, and interpretation happens through recursive method calls on an Abstract Syntax Tree.
- Use it for simple, stable grammars only—typically DSLs with fewer than 20 rules. For complex languages, use parser generators like ANTLR instead.
- Separate parsing from interpretation—the client builds the AST, and expression classes only handle evaluation. Don’t mix these concerns.
- The pattern heavily relies on the Composite Pattern—terminal expressions are leaves, non-terminal expressions are composites that contain other expressions.
- Context objects manage state—pass context as a parameter to interpret() rather than using global variables, keeping expressions immutable and reusable.