DRY Principle: Don't Repeat Yourself in Code

TL;DR

DRY (Don’t Repeat Yourself) is a fundamental design principle that states every piece of knowledge should have a single, unambiguous representation in your codebase. Instead of copying and pasting code, extract common functionality into reusable functions, classes, or modules. This reduces bugs, improves maintainability, and makes your code easier to change.

Prerequisites: Basic Python syntax, understanding of functions and classes, familiarity with code organization concepts like modules and imports.

After this topic: Identify code duplication in existing codebases, extract repeated logic into reusable functions and classes, apply the DRY principle to reduce maintenance burden and improve code quality, and recognize when NOT to apply DRY (avoiding premature abstraction).

Core Concept

What is DRY?

DRY (Don’t Repeat Yourself) is a software development principle coined by Andy Hunt and Dave Thomas in The Pragmatic Programmer. The core idea: every piece of knowledge or logic should exist in exactly one place in your system.

When you violate DRY, you create WET code (Write Everything Twice, or We Enjoy Typing). WET code means the same logic appears in multiple places. If you need to fix a bug or change behavior, you must update every copy — and you’ll inevitably miss one, creating inconsistencies.
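
A minimal sketch of how copies drift apart. The function names shipping_label_name and invoice_name are hypothetical and chosen only for illustration: the same formatting rule was pasted into two places, and a whitespace fix reached only one of them.

```python
# Hypothetical example: both functions started out as the same pasted line.

def shipping_label_name(first, last):
    # This copy received the bug fix: stray whitespace is trimmed.
    return f"{first.strip()} {last.strip()}"


def invoice_name(first, last):
    # This copy was missed, so it still keeps the stray whitespace.
    return f"{first} {last}"


label = shipping_label_name(" Ada ", "Lovelace")
invoice = invoice_name(" Ada ", "Lovelace")

print(repr(label))       # 'Ada Lovelace'
print(repr(invoice))     # ' Ada  Lovelace'
print(label == invoice)  # False: the two copies now disagree
```

Had the rule lived in one shared helper, the fix would have reached both call sites at once.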

Why DRY Matters

Maintainability: When logic exists in one place, changes happen once. You don’t hunt through files looking for all the places you copied code.

Bug Reduction: Fix a bug once, and it’s fixed everywhere. With duplicated code, you might fix it in three places but miss the fourth.

Readability: DRY code is more concise. Readers understand the logic once, then see it reused, rather than parsing the same code repeatedly.

Single Source of Truth: Business rules, calculations, and algorithms should have one authoritative implementation. This prevents contradictory versions from existing simultaneously.

When to Apply DRY

Apply DRY when you have:

Identical or nearly identical code blocks appearing multiple times
The same business logic implemented in different places
Repeated data transformations or validation rules
Copy-pasted code with minor variations

When NOT to Apply DRY

Accidental duplication isn’t always bad. Two pieces of code might look similar now but represent different concepts that will evolve independently. Premature abstraction can create tight coupling between unrelated parts of your system. The rule of thumb: wait until you see duplication three times before extracting (the “Rule of Three”).

Visual Guide

Code Duplication Problem

graph TD
    A[Bug Found in Logic] --> B[Fix in Function A]
    A --> C[Fix in Function B]
    A --> D[Fix in Function C]
    A --> E[Forgot Function D!]
    E --> F[Bug Still Exists]
    
    style E fill:#ff6b6b
    style F fill:#ff6b6b

With duplicated code, fixing bugs requires updating multiple locations. Missing one location leaves bugs in production.

DRY Solution

graph TD
    A[Bug Found in Logic] --> B[Fix in Shared Function]
    B --> C[Function A Uses It]
    B --> D[Function B Uses It]
    B --> E[Function C Uses It]
    B --> F[Function D Uses It]
    
    style B fill:#51cf66

With DRY, fixing the shared function automatically fixes all callers. One change fixes the bug everywhere.

Abstraction Levels

graph BT
    A[Specific Implementation 1] --> D[Shared Abstraction]
    B[Specific Implementation 2] --> D
    C[Specific Implementation 3] --> D
    D --> E[Core Logic - Single Source of Truth]
    
    style E fill:#4dabf7
    style D fill:#51cf66

Extract common logic into a shared abstraction. Specific implementations call the shared code with different parameters.

Examples

Example 1: Basic Function Extraction

Before (WET Code):

# Calculating discounts in multiple places
def process_order_standard(price, quantity):
    subtotal = price * quantity
    tax = subtotal * 0.08
    discount = subtotal * 0.10  # 10% discount
    total = subtotal + tax - discount
    return total

def process_order_premium(price, quantity):
    subtotal = price * quantity
    tax = subtotal * 0.08
    discount = subtotal * 0.20  # 20% discount
    total = subtotal + tax - discount
    return total

def process_order_vip(price, quantity):
    subtotal = price * quantity
    tax = subtotal * 0.08
    discount = subtotal * 0.30  # 30% discount
    total = subtotal + tax - discount
    return total

# Usage
print(process_order_standard(100, 2))   # Output: 196.0
print(process_order_premium(100, 2))    # Output: 176.0
print(process_order_vip(100, 2))        # Output: 156.0

Problem: The calculation logic is duplicated three times. If the tax rate changes from 8% to 9%, you must update all three functions.

After (DRY Code):

def calculate_order_total(price, quantity, discount_rate):
    """Single source of truth for order calculation."""
    subtotal = price * quantity
    tax = subtotal * 0.08
    discount = subtotal * discount_rate
    total = subtotal + tax - discount
    return total

def process_order_standard(price, quantity):
    return calculate_order_total(price, quantity, 0.10)

def process_order_premium(price, quantity):
    return calculate_order_total(price, quantity, 0.20)

def process_order_vip(price, quantity):
    return calculate_order_total(price, quantity, 0.30)

# Usage - same output, but maintainable
print(process_order_standard(100, 2))   # Output: 196.0
print(process_order_premium(100, 2))    # Output: 176.0
print(process_order_vip(100, 2))        # Output: 156.0

Benefit: Now changing the tax rate requires editing one line in one function. The calculation logic exists in exactly one place.
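
One possible next step, sketched under assumptions: the tax rate is still a literal inside the function, and the discount rates are spread across three wrappers. The names TAX_RATE, DISCOUNT_RATES, calculate_order_total_for_tier, and the tier argument are hypothetical and do not appear in the original example.

```python
# Hypothetical names: TAX_RATE, DISCOUNT_RATES, and the tier argument are
# illustrative assumptions layered on top of the example above.
TAX_RATE = 0.08

DISCOUNT_RATES = {
    "standard": 0.10,
    "premium": 0.20,
    "vip": 0.30,
}


def calculate_order_total_for_tier(price, quantity, tier):
    """Compute subtotal + tax - discount for a named customer tier."""
    if tier not in DISCOUNT_RATES:
        raise ValueError(f"Unknown tier: {tier!r}")
    subtotal = price * quantity
    tax = subtotal * TAX_RATE
    discount = subtotal * DISCOUNT_RATES[tier]
    return subtotal + tax - discount


for tier in DISCOUNT_RATES:
    print(tier, calculate_order_total_for_tier(100, 2, tier))
```

With this layout, a new tier is one dictionary entry and a tax change is one constant, at the cost of callers passing a tier name instead of calling a dedicated function.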

Try it yourself: Add a shipping fee calculation to the order total. Notice how you only need to change calculate_order_total().

Example 2: Class-Based Extraction

Before (WET Code):

class UserValidator:
    def validate_email(self, email):
        if not email or '@' not in email:
            return False
        if len(email) < 5:
            return False
        return True

class AdminValidator:
    def validate_email(self, email):
        if not email or '@' not in email:
            return False
        if len(email) < 5:
            return False
        return True

class GuestValidator:
    def validate_email(self, email):
        if not email or '@' not in email:
            return False
        if len(email) < 5:
            return False
        return True

# Usage
user_val = UserValidator()
admin_val = AdminValidator()
print(user_val.validate_email("test@example.com"))   # Output: True
print(admin_val.validate_email("bad"))               # Output: False

Problem: Email validation logic is copied across three classes. A bug fix requires three updates.

After (DRY Code):

class EmailValidator:
    """Single source of truth for email validation."""
    @staticmethod
    def is_valid(email):
        if not email or '@' not in email:
            return False
        if len(email) < 5:
            return False
        return True

class UserValidator:
    def validate_email(self, email):
        return EmailValidator.is_valid(email)

class AdminValidator:
    def validate_email(self, email):
        return EmailValidator.is_valid(email)

class GuestValidator:
    def validate_email(self, email):
        return EmailValidator.is_valid(email)

# Usage - same behavior, single source of truth
user_val = UserValidator()
admin_val = AdminValidator()
print(user_val.validate_email("test@example.com"))   # Output: True
print(admin_val.validate_email("bad"))               # Output: False

Better Yet - Use Composition:

class EmailValidator:
    @staticmethod
    def is_valid(email):
        if not email or '@' not in email:
            return False
        if len(email) < 5:
            return False
        return True

class UserValidator:
    def __init__(self):
        self.email_validator = EmailValidator()
    
    def validate_email(self, email):
        return self.email_validator.is_valid(email)

# Or even simpler - direct usage
class UserService:
    def register_user(self, email, password):
        if not EmailValidator.is_valid(email):
            raise ValueError("Invalid email")
        # ... registration logic

# Usage
service = UserService()
try:
    service.register_user("test@example.com", "pass123")
    print("User registered")  # Output: User registered
except ValueError as e:
    print(e)

Try it yourself: Add phone number validation. Create a PhoneValidator class and use it across multiple validator classes.

Example 3: Configuration and Constants

Before (WET Code):

class PaymentProcessor:
    def process_credit_card(self, amount):
        fee = amount * 0.029 + 0.30  # Credit card fee
        return amount + fee

class RefundProcessor:
    def process_refund(self, amount):
        fee = amount * 0.029 + 0.30  # Same fee calculation
        return amount - fee

class ReportGenerator:
    def calculate_fees(self, transactions):
        total_fees = 0
        for amount in transactions:
            fee = amount * 0.029 + 0.30  # Duplicated again!
            total_fees += fee
        return total_fees

# Usage
processor = PaymentProcessor()
print(processor.process_credit_card(100))  # Output: 103.2

Problem: The fee calculation (2.9% + $0.30) is hardcoded in three places. When the payment provider changes fees, you must find and update all occurrences.

After (DRY Code):

class PaymentConfig:
    """Single source of truth for payment configuration."""
    CREDIT_CARD_RATE = 0.029
    CREDIT_CARD_FIXED = 0.30
    
    @classmethod
    def calculate_credit_card_fee(cls, amount):
        return amount * cls.CREDIT_CARD_RATE + cls.CREDIT_CARD_FIXED

class PaymentProcessor:
    def process_credit_card(self, amount):
        fee = PaymentConfig.calculate_credit_card_fee(amount)
        return amount + fee

class RefundProcessor:
    def process_refund(self, amount):
        fee = PaymentConfig.calculate_credit_card_fee(amount)
        return amount - fee

class ReportGenerator:
    def calculate_fees(self, transactions):
        total_fees = sum(
            PaymentConfig.calculate_credit_card_fee(amount) 
            for amount in transactions
        )
        return total_fees

# Usage - same output, centralized configuration
processor = PaymentProcessor()
print(processor.process_credit_card(100))  # Output: 103.2

report = ReportGenerator()
print(round(report.calculate_fees([100, 200, 50]), 2))  # Output: 11.05 (3.20 + 6.10 + 1.75)

Java/C++ Note: In Java, you’d use a PaymentConfig class with public static final constants and a static method. In C++, use a namespace or class with static const members.

Try it yourself: Add a new payment method (PayPal) with different fees. Notice how easy it is to add to PaymentConfig without touching existing code.

Example 4: Data Validation Pattern

Before (WET Code):

def create_user(username, email, age):
    # Validation logic
    if not username or len(username) < 3:
        raise ValueError("Username too short")
    if not email or '@' not in email:
        raise ValueError("Invalid email")
    if age < 18:
        raise ValueError("Must be 18+")
    # Create user...
    return {"username": username, "email": email, "age": age}

def update_user(user_id, username, email, age):
    # Same validation logic duplicated!
    if not username or len(username) < 3:
        raise ValueError("Username too short")
    if not email or '@' not in email:
        raise ValueError("Invalid email")
    if age < 18:
        raise ValueError("Must be 18+")
    # Update user...
    return {"username": username, "email": email, "age": age}

# Usage
user = create_user("john_doe", "john@example.com", 25)
print(user)  # Output: {'username': 'john_doe', 'email': 'john@example.com', 'age': 25}

After (DRY Code):

class UserValidator:
    """Single source of truth for user validation rules."""
    @staticmethod
    def validate_username(username):
        if not username or len(username) < 3:
            raise ValueError("Username must be at least 3 characters")
    
    @staticmethod
    def validate_email(email):
        if not email or '@' not in email:
            raise ValueError("Invalid email format")
    
    @staticmethod
    def validate_age(age):
        if age < 18:
            raise ValueError("Must be 18 or older")
    
    @classmethod
    def validate_user_data(cls, username, email, age):
        """Validate all user fields at once."""
        cls.validate_username(username)
        cls.validate_email(email)
        cls.validate_age(age)

def create_user(username, email, age):
    UserValidator.validate_user_data(username, email, age)
    return {"username": username, "email": email, "age": age}

def update_user(user_id, username, email, age):
    UserValidator.validate_user_data(username, email, age)
    # Update logic...
    return {"username": username, "email": email, "age": age}

# Usage - same validation rules, now maintained in one place
user = create_user("john_doe", "john@example.com", 25)
print(user)  # Output: {'username': 'john_doe', 'email': 'john@example.com', 'age': 25}

try:
    create_user("ab", "invalid", 15)  # Multiple validation errors
except ValueError as e:
    print(e)  # Output: Username must be at least 3 characters

Try it yourself: Add password validation (minimum 8 characters, must contain a number). The rule itself lives only in UserValidator; create_user() and update_user() just need to accept the password and pass it along.
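
One way the exercise could turn out, as a sketch rather than a reference solution. The validator is trimmed to two fields to keep the example short, and the validate_password method and its error messages are assumptions based on the rule stated in the exercise.

```python
class UserValidator:
    """Trimmed version of the validator above, extended with a password rule."""

    @staticmethod
    def validate_username(username):
        if not username or len(username) < 3:
            raise ValueError("Username must be at least 3 characters")

    @staticmethod
    def validate_password(password):
        # Rule from the exercise: at least 8 characters and at least one digit.
        if not password or len(password) < 8:
            raise ValueError("Password must be at least 8 characters")
        if not any(ch.isdigit() for ch in password):
            raise ValueError("Password must contain a number")

    @classmethod
    def validate_user_data(cls, username, password):
        """Validate all fields; the only place that lists the rules."""
        cls.validate_username(username)
        cls.validate_password(password)


def create_user(username, password):
    UserValidator.validate_user_data(username, password)
    return {"username": username}


def update_user(user_id, username, password):
    UserValidator.validate_user_data(username, password)
    return {"id": user_id, "username": username}


print(create_user("john_doe", "secret123"))  # {'username': 'john_doe'}
```

Both entry points change only in their signature; the rule is written once.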

Common Mistakes

1. Over-Applying DRY (Premature Abstraction)

Mistake: Extracting code into a shared function the first time you see similarity, before understanding if the duplication is accidental or essential.

# Bad: Premature abstraction
def calculate_area(length, width):
    return length * width

# Used for both rectangle area AND unrelated price calculation
price, quantity = 20, 3  # sample values
rectangle_area = calculate_area(5, 10)
order_total = calculate_area(price, quantity)  # Confusing!

Why it’s wrong: These calculations happen to use the same formula now, but they represent different concepts. When the geometry code needs to support other shapes (a circle’s area uses a different formula) or order totals need tax (different logic), this abstraction breaks down.

Better approach: Wait for the Rule of Three — extract after you see duplication three times, when you’re confident the logic truly represents the same concept.
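
A short sketch of what keeping the two concepts apart might look like. The names calculate_rectangle_area and calculate_order_subtotal are illustrative assumptions; the point is only that each function is named for its concept and can change without touching the other.

```python
def calculate_rectangle_area(length, width):
    """Geometry concept: area of a rectangle."""
    return length * width


def calculate_order_subtotal(price, quantity):
    """Pricing concept: subtotal before tax or discounts."""
    return price * quantity


rectangle_area = calculate_rectangle_area(5, 10)
order_subtotal = calculate_order_subtotal(20, 3)

print(rectangle_area)  # 50
print(order_subtotal)  # 60
```

The two bodies are identical today, yet adding tax to the subtotal later would not disturb the geometry code.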

2. Creating Overly Generic Abstractions

Mistake: Making functions so generic they become hard to understand and use.

# Bad: Too generic
def process_data(data, operation, config, flags, options):
    # 50 lines of conditional logic
    if flags['mode'] == 'A':
        ...  # branch for one caller
    elif flags['mode'] == 'B':
        ...  # branch for another caller
    # Handles too many cases

Why it’s wrong: This function tries to do everything. It’s hard to test, hard to understand, and changes for one use case risk breaking others.

Better approach: Create specific functions for specific use cases. Shared logic should be extracted to small, focused helper functions.

# Good: Specific functions with shared helpers
def _validate_input(data):
    # Shared validation logic
    pass

def process_user_data(user_data):
    _validate_input(user_data)
    # User-specific logic

def process_order_data(order_data):
    _validate_input(order_data)
    # Order-specific logic

3. Ignoring the “Knowledge” Part of DRY

Mistake: Focusing only on code duplication while ignoring duplicated knowledge, business rules, or configuration.

# Bad: Business rule duplicated in multiple places
class OrderService:
    def can_apply_discount(self, user):
        return user.orders_count > 5  # Business rule: more than 5 orders

class EmailService:
    def send_loyalty_email(self, user):
        if user.orders_count > 5:  # Same rule, different place!
            pass  # Send email

Why it’s wrong: The business rule “loyalty customers have more than 5 orders” exists in two places. If marketing changes this to 10 orders, you must update both.

Better approach: Extract the business rule into a single location.

class User:
    LOYALTY_THRESHOLD = 5

    def __init__(self, orders_count=0):
        self.orders_count = orders_count

    def is_loyalty_customer(self):
        return self.orders_count > self.LOYALTY_THRESHOLD

class OrderService:
    def can_apply_discount(self, user):
        return user.is_loyalty_customer()

class EmailService:
    def send_loyalty_email(self, user):
        if user.is_loyalty_customer():
            pass  # Send email

4. Not Recognizing Incidental Duplication

Mistake: Treating code that looks similar as duplication, even when it represents different concepts that will evolve independently.

# These look similar but represent different concepts
def calculate_employee_bonus(salary):
    return salary * 0.10  # 10% of salary

def calculate_sales_tax(price):
    return price * 0.10  # 10% tax rate

# Bad: Combining them
def calculate_percentage(amount):
    return amount * 0.10

# Now they're coupled! When tax changes to 8%, bonus logic breaks

Why it’s wrong: Bonus calculation and tax calculation are different business concepts. They happen to use the same percentage today, but will change independently.

Better approach: Keep them separate. If the percentage becomes configurable later, extract it then.
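
A sketch of the separate version, assuming each rate gets its own named constant (EMPLOYEE_BONUS_RATE and SALES_TAX_RATE are hypothetical names). Here the tax rate has already moved to 8%, and the bonus calculation is unaffected.

```python
# Each concept owns its rate, so one can change without the other.
EMPLOYEE_BONUS_RATE = 0.10
SALES_TAX_RATE = 0.08  # changed from 0.10; the bonus code is untouched


def calculate_employee_bonus(salary):
    return salary * EMPLOYEE_BONUS_RATE


def calculate_sales_tax(price):
    return price * SALES_TAX_RATE


print(round(calculate_employee_bonus(50_000), 2))  # 5000.0
print(round(calculate_sales_tax(200), 2))          # 16.0
```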

5. Creating Hidden Dependencies

Mistake: Extracting code into a shared function that creates unexpected dependencies between unrelated parts of the system.

# Bad: Shared function with hidden side effects
def process_and_log(data, user_type):
    result = transform_data(data)
    # Hidden dependency: always logs to database
    log_to_database(user_type, result)
    return result

# Used in two places
def handle_user_request(data):
    return process_and_log(data, 'user')  # Logs to DB

def generate_report(data):
    return process_and_log(data, 'admin')  # Also logs to DB (unexpected!)

Why it’s wrong: The report generation now has an unexpected dependency on database logging. If the database is down, reports fail.

Better approach: Separate concerns. Extract the transformation logic, but let callers decide whether to log.

def transform_data(data):
    # Pure transformation, no side effects
    result = dict(data)  # placeholder for the real transformation
    return result

def handle_user_request(data):
    result = transform_data(data)
    log_to_database('user', result)  # Explicit logging
    return result

def generate_report(data):
    return transform_data(data)  # No logging needed

Interview Tips

What Interviewers Look For

1. Recognition of Duplication: Interviewers often present code with obvious duplication to see if you notice. When reviewing code (yours or theirs), explicitly call out repeated logic: “I notice this validation logic appears in three places. We should extract it into a shared function.”

2. Appropriate Abstraction Level: Don’t immediately jump to complex abstractions. Start simple: “I’d extract this into a helper function first. If we see more patterns emerge, we can create a class.” This shows you understand progressive refinement.

3. Balancing DRY with Readability: Sometimes a little duplication is clearer than a complex abstraction. Show judgment: “These two functions look similar, but they represent different business concepts. I’d keep them separate for now and revisit if they truly need to evolve together.”

Common Interview Scenarios

Scenario 1: Code Review Question

Interviewer shows you code with duplication and asks: “What would you improve?”

Strong answer: “I see the email validation logic is duplicated in three methods. I’d extract it into a separate EmailValidator class with a static is_valid() method. This gives us a single source of truth — if we need to add domain blacklist checking later, we change it once. I’d also add unit tests specifically for the validator.”

Weak answer: “I’d make it DRY.” (Too vague — show HOW you’d apply the principle.)

Scenario 2: Design Question

Interviewer asks: “Design a system for processing different types of payments (credit card, PayPal, Bitcoin).”

Strong answer: “Each payment type has unique processing logic, so I’d use the Strategy pattern with a PaymentProcessor interface. However, they all share common concerns like logging, validation, and retry logic. I’d extract those into shared utilities that all processors use. For example, a TransactionLogger and RetryHandler that each processor composes. This keeps payment-specific logic separate while applying DRY to cross-cutting concerns.”

Weak answer: “I’d create one big function that handles all payment types.” (This violates the Single Responsibility Principle, and the shared concerns end up buried in branches where they cannot be reused.)
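
The strong answer above could be sketched roughly as follows. This is one possible shape, not a definitive design: TransactionLogger and RetryHandler are named in the answer, but their methods (log, run), the pay and _charge methods, and the choice of ConnectionError as the retryable error are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class TransactionLogger:
    """Shared logging concern (hypothetical helper)."""

    def __init__(self):
        self.entries = []

    def log(self, message):
        self.entries.append(message)


class RetryHandler:
    """Shared retry concern (hypothetical helper)."""

    def __init__(self, max_attempts=3):
        if max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")
        self.max_attempts = max_attempts

    def run(self, operation):
        last_error = None
        for _ in range(self.max_attempts):
            try:
                return operation()
            except ConnectionError as error:  # assumed retryable error type
                last_error = error
        raise last_error


class PaymentProcessor(ABC):
    """Common interface; each payment type supplies its own _charge()."""

    def __init__(self, logger, retry):
        self.logger = logger
        self.retry = retry

    def pay(self, amount):
        # Validation, retry, and logging are written once for every type.
        if amount <= 0:
            raise ValueError("Amount must be positive")
        receipt = self.retry.run(lambda: self._charge(amount))
        self.logger.log(receipt)
        return receipt

    @abstractmethod
    def _charge(self, amount):
        """Payment-specific logic lives in the subclass."""


class CreditCardProcessor(PaymentProcessor):
    def _charge(self, amount):
        return f"credit card charged {amount:.2f}"


class PayPalProcessor(PaymentProcessor):
    def _charge(self, amount):
        return f"paypal charged {amount:.2f}"


logger = TransactionLogger()
retry = RetryHandler(max_attempts=3)
for processor in (CreditCardProcessor(logger, retry), PayPalProcessor(logger, retry)):
    print(processor.pay(25))
```

Adding Bitcoin would mean one more subclass with its own _charge(); the logging and retry code would not be copied.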

Scenario 3: Refactoring Exercise

Interviewer gives you duplicated code and asks you to refactor it.

Approach:

Identify the duplication: “I see these three functions all calculate subtotal, tax, and total the same way.”
Explain the extraction: “I’ll create a calculate_order_total() function that takes the discount rate as a parameter.”
Show the refactor: Write the code, demonstrating the before and after.
Discuss trade-offs: “This makes the calculation logic easier to maintain, but adds a level of indirection. The trade-off is worth it because order calculation is complex and appears in multiple places.”
Mention testing: “I’d add unit tests for the extracted function to ensure all discount rates work correctly.”
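
The testing step could look something like this sketch, which copies calculate_order_total() from Example 1 and checks it with the standard unittest module. The test class name and the chosen cases are assumptions; the expected totals follow from subtotal + 8% tax - discount on an order of 2 items at 100 each.

```python
import unittest


def calculate_order_total(price, quantity, discount_rate):
    """Copy of the extracted function from Example 1."""
    subtotal = price * quantity
    tax = subtotal * 0.08
    discount = subtotal * discount_rate
    return subtotal + tax - discount


class CalculateOrderTotalTests(unittest.TestCase):
    def test_discount_rates(self):
        # (discount_rate, expected total for 2 items at 100 each)
        cases = [(0.10, 196.0), (0.20, 176.0), (0.30, 156.0)]
        for discount_rate, expected in cases:
            with self.subTest(discount_rate=discount_rate):
                total = calculate_order_total(100, 2, discount_rate)
                self.assertAlmostEqual(total, expected)

    def test_zero_quantity(self):
        self.assertAlmostEqual(calculate_order_total(100, 0, 0.10), 0.0)


# Run the suite directly so the example works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CalculateOrderTotalTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the logic lives in one function, these few cases cover every caller that used to carry its own copy.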

Key Phrases to Use

“This creates a single source of truth for…”
“If we need to change this logic, we only update one place.”
“I see this pattern repeated three times, which suggests we should extract it.”
“These look similar but represent different concepts, so I’d keep them separate.”
“Let’s extract the common logic while keeping the specific variations parameterized.”

Red Flags to Avoid

Don’t say: “We should never have any duplicate code.” (Shows you don’t understand trade-offs.)
Don’t: Extract code into a shared function without explaining why or what problem it solves.
Don’t: Create overly complex abstractions to eliminate minor duplication.
Don’t: Ignore the interviewer’s hints about whether duplication is intentional or accidental.

Advanced Interview Topics

When asked about DRY in larger systems:

Discuss shared libraries for common utilities across microservices
Mention configuration management to avoid duplicating settings
Talk about code generation for repetitive boilerplate (like ORMs)
Explain inheritance vs. composition trade-offs for sharing behavior
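
To make the inheritance vs. composition point concrete, here is a small sketch. All class names (EmailRules, InheritingValidator, ComposingValidator, AcceptEverything) are hypothetical; the email check mirrors the simple rule used in Example 2.

```python
class EmailRules:
    """Shared behavior: one implementation of the email check."""

    def is_valid_email(self, email):
        return bool(email) and "@" in email and len(email) >= 5


class InheritingValidator(EmailRules):
    """Inheritance: the subclass is an EmailRules and gets the method for free."""


class ComposingValidator:
    """Composition: the class has an EmailRules and delegates to it."""

    def __init__(self, email_rules=None):
        self._email_rules = email_rules or EmailRules()

    def validate_email(self, email):
        return self._email_rules.is_valid_email(email)


class AcceptEverything:
    """Stand-in showing that composition allows swapping the collaborator."""

    def is_valid_email(self, email):
        return True


print(InheritingValidator().is_valid_email("test@example.com"))      # True
print(ComposingValidator().validate_email("bad"))                    # False
print(ComposingValidator(AcceptEverything()).validate_email("bad"))  # True
```

Both approaches keep a single copy of the rule. Inheritance is shorter but ties the validator to the base class, while composition needs a delegating method and lets the rule be replaced, for example in tests.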

When asked about DRY violations you’ve fixed:

Have a real example ready: “In my last project, we had payment fee calculations duplicated across five services. I extracted them into a shared PaymentUtils library. This reduced bugs — we had fixed a rounding error in three places but missed two. After the refactor, we caught several edge cases in the shared code that would have required fixing in five places.”

Key Takeaways

DRY means every piece of knowledge exists in exactly one place — not just code, but business rules, configuration, and algorithms. Duplication creates maintenance burden and inconsistency.
Apply the Rule of Three: Wait until you see duplication three times before extracting. Premature abstraction can create unnecessary complexity and coupling between unrelated concepts.
Extract common logic into focused, reusable functions or classes — but keep abstractions simple and specific. A good abstraction should have a clear, single purpose.
Distinguish between accidental and essential duplication — code that looks similar today might represent different concepts that will evolve independently. Don’t couple unrelated parts of your system.
DRY improves maintainability and reduces bugs: changes happen in one place, tests cover one implementation, and there is no forgotten copy left to update. These are the concrete benefits to cite in interviews when discussing code quality and design decisions.