Deadlock, Livelock & Starvation in Java
TL;DR
Deadlock occurs when threads wait indefinitely for each other to release resources. Livelock happens when threads keep changing state in response to each other without making progress. Starvation means a thread never gets the resources it needs because other threads monopolize them.
Core Concept
What Are Concurrency Hazards?
Concurrency hazards are problems that arise when multiple threads access shared resources. Three critical hazards are deadlock, livelock, and starvation.
Deadlock
Deadlock occurs when two or more threads are blocked forever, each waiting for the other to release a resource. Think of two cars at a narrow bridge, each waiting for the other to back up first—neither can proceed.
Coffman’s Four Conditions
Deadlock happens only when ALL four conditions are met:
- Mutual Exclusion: Resources cannot be shared (only one thread can hold a lock at a time)
- Hold and Wait: Threads hold resources while waiting for others
- No Preemption: Resources cannot be forcibly taken from threads
- Circular Wait: A circular chain of threads exists, each waiting for a resource held by the next
Breaking ANY one condition prevents deadlock.
Livelock
Livelock is like deadlock, but threads are actively doing work—just not useful work. Threads keep changing state in response to each other without making progress. Imagine two people in a hallway, both stepping left, then both stepping right, repeatedly blocking each other while trying to be polite.
Starvation
Starvation occurs when a thread never gets access to a resource because other threads keep claiming it first. Unlike deadlock, the thread isn't permanently blocked; it just never wins the race for the resource. Common causes are unfair locks, unfair scheduling, and priority inversion.
Prevention Strategies
Deadlock Prevention: Lock ordering (always acquire locks in the same order), timeouts (give up after waiting too long), or try-lock (attempt acquisition without blocking).
Livelock Prevention: Add randomness to retry logic, use exponential backoff.
Starvation Prevention: Fair scheduling algorithms, priority aging (gradually increase waiting thread priority).
Visual Guide
Deadlock Scenario
graph LR
A[Thread 1] -->|holds| L1[Lock A]
A -->|waits for| L2[Lock B]
B[Thread 2] -->|holds| L2
B -->|waits for| L1
style A fill:#f9f,stroke:#333
style B fill:#f9f,stroke:#333
style L1 fill:#ff9,stroke:#333
style L2 fill:#ff9,stroke:#333
Classic deadlock: Thread 1 holds Lock A and waits for Lock B, while Thread 2 holds Lock B and waits for Lock A. Both threads are blocked indefinitely.
Coffman’s Four Conditions
graph TD
D[Deadlock Occurs] --> C1[Mutual Exclusion]
D --> C2[Hold and Wait]
D --> C3[No Preemption]
D --> C4[Circular Wait]
C1 --> P1[Prevention: Use shareable resources]
C2 --> P2[Prevention: Acquire all locks at once]
C3 --> P3[Prevention: Allow lock timeout]
C4 --> P4[Prevention: Lock ordering]
style D fill:#f66,stroke:#333
style P1 fill:#6f6,stroke:#333
style P2 fill:#6f6,stroke:#333
style P3 fill:#6f6,stroke:#333
style P4 fill:#6f6,stroke:#333
All four conditions must be present for deadlock. Breaking any one condition prevents deadlock.
Livelock vs Deadlock
stateDiagram-v2
[*] --> Running
Running --> Deadlock: Both threads blocked
Running --> Livelock: Both threads active but no progress
Deadlock --> Deadlock: Waiting forever (CPU idle)
Livelock --> Livelock: Constantly responding (CPU busy)
note right of Deadlock
Threads: BLOCKED
CPU Usage: Low
Progress: None
end note
note right of Livelock
Threads: RUNNING
CPU Usage: High
Progress: None
end note
Deadlock means threads are blocked and waiting. Livelock means threads are running but making no progress.
Examples
Example 1: Classic Deadlock
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread1_work():
    print("Thread 1: Acquiring Lock A")
    lock_a.acquire()
    print("Thread 1: Got Lock A")
    time.sleep(0.1)  # Simulate work
    print("Thread 1: Waiting for Lock B")
    lock_b.acquire()  # DEADLOCK: Thread 2 holds Lock B
    print("Thread 1: Got Lock B")
    lock_b.release()
    lock_a.release()

def thread2_work():
    print("Thread 2: Acquiring Lock B")
    lock_b.acquire()
    print("Thread 2: Got Lock B")
    time.sleep(0.1)  # Simulate work
    print("Thread 2: Waiting for Lock A")
    lock_a.acquire()  # DEADLOCK: Thread 1 holds Lock A
    print("Thread 2: Got Lock A")
    lock_a.release()
    lock_b.release()

t1 = threading.Thread(target=thread1_work)
t2 = threading.Thread(target=thread2_work)
t1.start()
t2.start()
t1.join(timeout=2)
t2.join(timeout=2)

if t1.is_alive() or t2.is_alive():
    print("\nDEADLOCK DETECTED: Threads are still running after timeout")
Expected Output:
Thread 1: Acquiring Lock A
Thread 1: Got Lock A
Thread 2: Acquiring Lock B
Thread 2: Got Lock B
Thread 1: Waiting for Lock B
Thread 2: Waiting for Lock A
DEADLOCK DETECTED: Threads are still running after timeout
Why it deadlocks: Thread 1 holds Lock A and waits for Lock B. Thread 2 holds Lock B and waits for Lock A. This creates a circular wait.
Try it yourself: Run this code and observe that it hangs. Press Ctrl+C to stop it.
Example 2: Deadlock Prevention with Lock Ordering
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_locks_ordered():
    """Always acquire locks in the same order: A then B"""
    lock_a.acquire()
    lock_b.acquire()

def release_locks_ordered():
    """Release in reverse order"""
    lock_b.release()
    lock_a.release()

def thread1_work():
    print("Thread 1: Acquiring locks in order")
    acquire_locks_ordered()
    print("Thread 1: Got both locks")
    time.sleep(0.1)
    print("Thread 1: Releasing locks")
    release_locks_ordered()

def thread2_work():
    print("Thread 2: Acquiring locks in order")
    acquire_locks_ordered()  # Same order as Thread 1
    print("Thread 2: Got both locks")
    time.sleep(0.1)
    print("Thread 2: Releasing locks")
    release_locks_ordered()

t1 = threading.Thread(target=thread1_work)
t2 = threading.Thread(target=thread2_work)
t1.start()
t2.start()
t1.join()
t2.join()

print("\nSUCCESS: Both threads completed without deadlock")
Expected Output:
Thread 1: Acquiring locks in order
Thread 1: Got both locks
Thread 2: Acquiring locks in order
Thread 1: Releasing locks
Thread 2: Got both locks
Thread 2: Releasing locks
SUCCESS: Both threads completed without deadlock
Why it works: Both threads acquire locks in the same order (A then B). This breaks the circular wait condition.
Try it yourself: Change thread2_work to acquire locks in reverse order (B then A) and observe the deadlock.
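A variant of the same idea removes the need for every call site to remember the order: sort the locks by a stable key before acquiring them. The sketch below uses id() as that key, which assumes the lock objects live for the whole program; the helper names (acquire_in_order, release_in_order) are made up for illustration.

```python
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_in_order(*locks):
    """Acquire locks sorted by a stable key (id()), so every caller
    uses the same global order and circular wait cannot form."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_in_order(ordered):
    """Release in reverse acquisition order."""
    for lock in reversed(ordered):
        lock.release()

def worker(first, second):
    held = acquire_in_order(first, second)
    time.sleep(0.05)  # Critical section with both locks held
    release_in_order(held)

# The two threads pass the locks in opposite orders, yet cannot deadlock
t1 = threading.Thread(target=worker, args=(lock_a, lock_b))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print("Both threads completed without deadlock")
```

The benefit over hand-written ordering is that a caller cannot get the order wrong, at the cost of requiring all lock acquisitions to go through the helper.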
Example 3: Deadlock Prevention with Timeout
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread_work_with_timeout(name, first_lock, second_lock):
    max_retries = 5
    for attempt in range(max_retries):
        print(f"{name}: Attempt {attempt + 1}")
        # Try to acquire first lock
        if not first_lock.acquire(timeout=0.5):
            print(f"{name}: Timeout on first lock, retrying...")
            continue
        print(f"{name}: Got first lock")
        # Try to acquire second lock
        if not second_lock.acquire(timeout=0.5):
            print(f"{name}: Timeout on second lock, releasing first and retrying...")
            first_lock.release()
            time.sleep(0.1)  # Back off before retry
            continue
        print(f"{name}: Got both locks, doing work")
        time.sleep(0.1)  # Simulate work
        # Release locks
        second_lock.release()
        first_lock.release()
        print(f"{name}: Completed successfully")
        return
    print(f"{name}: Failed after {max_retries} attempts")

t1 = threading.Thread(target=thread_work_with_timeout, args=("Thread 1", lock_a, lock_b))
t2 = threading.Thread(target=thread_work_with_timeout, args=("Thread 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
Expected Output (one possible interleaving):
Thread 1: Attempt 1
Thread 1: Got first lock
Thread 2: Attempt 1
Thread 2: Got first lock
Thread 1: Timeout on second lock, releasing first and retrying...
Thread 2: Timeout on second lock, releasing first and retrying...
Thread 1: Attempt 2
Thread 1: Got first lock
Thread 1: Got both locks, doing work
Thread 1: Completed successfully
Thread 2: Attempt 2
Thread 2: Got first lock
Thread 2: Got both locks, doing work
Thread 2: Completed successfully
Why it works: Timeouts prevent indefinite waiting. If a thread can’t get a lock, it releases what it has and retries.
Note (Java): Use ReentrantLock's tryLock(long timeout, TimeUnit unit) method, e.g. lock.tryLock(5, TimeUnit.SECONDS).
Try it yourself: Remove the timeout parameter and observe the deadlock.
Example 4: Livelock
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def polite_thread_work(name, first_lock, second_lock):
    """Thread that gives up immediately if it can't get both locks"""
    for attempt in range(10):
        print(f"{name}: Attempt {attempt + 1}")
        # Try first lock
        if first_lock.acquire(blocking=False):
            print(f"{name}: Got first lock")
            # Try second lock
            if second_lock.acquire(blocking=False):
                print(f"{name}: Got both locks, doing work")
                time.sleep(0.1)
                second_lock.release()
                first_lock.release()
                print(f"{name}: Completed")
                return
            else:
                # Be "polite" - immediately release and retry
                print(f"{name}: Can't get second lock, releasing first")
                first_lock.release()
        else:
            print(f"{name}: Can't get first lock")
        # No backoff - immediately retry (causes livelock)
        time.sleep(0.001)
    print(f"{name}: LIVELOCK - gave up after 10 attempts")

t1 = threading.Thread(target=polite_thread_work, args=("Thread 1", lock_a, lock_b))
t2 = threading.Thread(target=polite_thread_work, args=("Thread 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
Expected Output (typical livelock pattern):
Thread 1: Attempt 1
Thread 1: Got first lock
Thread 2: Attempt 1
Thread 2: Got first lock
Thread 1: Can't get second lock, releasing first
Thread 2: Can't get second lock, releasing first
Thread 1: Attempt 2
Thread 1: Got first lock
Thread 2: Attempt 2
Thread 2: Got first lock
Thread 1: Can't get second lock, releasing first
Thread 2: Can't get second lock, releasing first
...(repeats)...
Thread 1: LIVELOCK - gave up after 10 attempts
Thread 2: LIVELOCK - gave up after 10 attempts
Why it’s livelock: Both threads are actively running (not blocked), but they keep responding to each other without making progress.
Fix: Add randomized backoff:
import random
time.sleep(random.uniform(0.01, 0.1)) # Random delay before retry
Try it yourself: Add the random backoff and observe that threads eventually succeed.
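Applied to the "polite" workers above, the fix can be sketched as follows. The attempt cap of 50 and the delay range are arbitrary choices, and results is just a list used here to record completions.

```python
import threading
import time
import random

lock_a = threading.Lock()
lock_b = threading.Lock()
results = []

def backoff_thread_work(name, first_lock, second_lock):
    """Like the 'polite' workers, but with a randomized delay before
    each retry so the two threads fall out of lockstep."""
    for attempt in range(50):
        if first_lock.acquire(blocking=False):
            if second_lock.acquire(blocking=False):
                time.sleep(0.01)  # Simulate work with both locks held
                second_lock.release()
                first_lock.release()
                results.append(name)
                return
            first_lock.release()
        # Randomized backoff: breaks the symmetric retry pattern
        time.sleep(random.uniform(0.001, 0.01))
    results.append(f"{name}: gave up")

t1 = threading.Thread(target=backoff_thread_work, args=("Thread 1", lock_a, lock_b))
t2 = threading.Thread(target=backoff_thread_work, args=("Thread 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print(results)
```

With the random delays, the two threads almost always stop colliding within a few attempts and both complete.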
Example 5: Starvation
import threading
import time

resource_lock = threading.Lock()

def high_priority_worker(worker_id):
    """Aggressive worker that keeps taking the lock"""
    for i in range(5):
        resource_lock.acquire()
        print(f"High Priority Worker {worker_id}: Using resource (iteration {i+1})")
        time.sleep(0.1)  # Hold lock for a while
        resource_lock.release()
        time.sleep(0.01)  # Very short break before retrying

def low_priority_worker(worker_id):
    """Worker that tries to get the lock but keeps losing to high priority"""
    start_time = time.time()
    attempts = 0
    while time.time() - start_time < 2:  # Try for 2 seconds
        attempts += 1
        if resource_lock.acquire(blocking=False):
            print(f"Low Priority Worker {worker_id}: FINALLY got resource after {attempts} attempts!")
            time.sleep(0.1)
            resource_lock.release()
            return
        time.sleep(0.05)  # Wait before retry
    print(f"Low Priority Worker {worker_id}: STARVED - never got resource after {attempts} attempts")

# Start high priority workers
high_threads = []
for i in range(3):
    t = threading.Thread(target=high_priority_worker, args=(i,))
    t.start()
    high_threads.append(t)

# Start low priority worker
low_thread = threading.Thread(target=low_priority_worker, args=(0,))
low_thread.start()

# Wait for all threads
for t in high_threads:
    t.join()
low_thread.join()
Expected Output (one possible interleaving):
High Priority Worker 0: Using resource (iteration 1)
High Priority Worker 1: Using resource (iteration 1)
High Priority Worker 2: Using resource (iteration 1)
High Priority Worker 0: Using resource (iteration 2)
High Priority Worker 1: Using resource (iteration 2)
High Priority Worker 2: Using resource (iteration 2)
...(high priority workers dominate)...
Low Priority Worker 0: STARVED - never got resource after 35 attempts
Why starvation occurs: High priority workers acquire and release the lock quickly, giving the low priority worker no chance.
Fix: Use a fair lock or queue-based approach:
# Python's threading.Lock is not fair by default.
# A simple ticket lock grants the lock in FIFO order:
class FairLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0   # Next ticket to hand out
        self._now_serving = 0   # Ticket currently allowed to proceed

    def acquire(self):
        with self._cond:
            ticket = self._next_ticket
            self._next_ticket += 1
            while ticket != self._now_serving:
                self._cond.wait()  # Sleep until it's our turn

    def release(self):
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()  # Wake waiters; the next ticket proceeds
Note (Java): Construct a fair lock with new ReentrantLock(true).
Try it yourself: Implement the fair lock approach and verify all threads get access.
Common Mistakes
1. Acquiring Locks in Different Orders
Mistake:
# Thread 1
lock_a.acquire()
lock_b.acquire()
# Thread 2
lock_b.acquire() # Different order!
lock_a.acquire()
Why it’s wrong: This creates the circular wait condition for deadlock. Thread 1 holds A and waits for B, while Thread 2 holds B and waits for A.
Fix: Always acquire locks in the same global order. Document the lock ordering in your codebase.
2. Forgetting to Release Locks on Error Paths
Mistake:
lock.acquire()
if error_condition:
    return  # BUG: Lock never released!
do_work()
lock.release()
Why it’s wrong: If an exception or early return happens, the lock stays held forever. Other threads will deadlock waiting for it.
Fix: Use context managers (Python) or try-finally blocks:
with lock:  # Automatically releases on exit
    if error_condition:
        return
    do_work()

# Or use try-finally:
lock.acquire()
try:
    if error_condition:
        return
    do_work()
finally:
    lock.release()
3. Confusing Livelock with Deadlock
Mistake: Assuming high CPU usage means no deadlock. Seeing threads “running” and thinking progress is being made.
Why it’s wrong: Livelock threads are active but not making progress. They consume CPU while accomplishing nothing.
How to identify:
- Deadlock: Threads in BLOCKED state, low CPU usage, stack traces show waiting on locks
- Livelock: Threads in RUNNABLE state, high CPU usage, stack traces show repeated lock attempts
Fix: Add logging to track actual progress (not just activity). Use exponential backoff with randomization.
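The backoff mentioned above can be sketched as a small helper. This is a full-jitter variant: the delay ceiling doubles with each attempt, and the actual delay is drawn uniformly below it so competing threads desynchronize. The function name and default values are illustrative.

```python
import random

def backoff_delays(base=0.01, cap=1.0, attempts=6):
    """Full-jitter exponential backoff: each delay is drawn uniformly
    from [0, min(cap, base * 2**attempt)]."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(random.uniform(0, ceiling))
    return delays

# Each delay is bounded by a ceiling that doubles per attempt
print(backoff_delays())
```

A retry loop would sleep for each delay in turn; the randomness is what prevents two threads from retrying in lockstep, while the doubling ceiling reduces contention under sustained conflict.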
4. Using Sleep Instead of Proper Synchronization
Mistake:
while not resource_available:
    time.sleep(0.1)  # Polling - wastes CPU and causes starvation
Why it’s wrong: Polling wastes CPU cycles and can cause starvation if timing is unlucky. The sleep duration is arbitrary—too short wastes CPU, too long adds latency.
Fix: Use proper synchronization primitives:
condition = threading.Condition()

# Producer
with condition:
    resource_available = True
    condition.notify()  # Wake up waiting threads

# Consumer
with condition:
    while not resource_available:
        condition.wait()  # Efficient waiting
5. Ignoring Lock Timeouts
Mistake:
lock.acquire() # Blocks forever if deadlock occurs
Why it’s wrong: No way to detect or recover from deadlock. The thread hangs indefinitely.
Fix: Use timeouts and handle failure:
if lock.acquire(timeout=5):
    try:
        do_work()
    finally:
        lock.release()
else:
    # Handle timeout - log error, retry, or fail gracefully
    logging.error("Failed to acquire lock - possible deadlock")
    raise TimeoutError("Lock acquisition timeout")
Interview tip: Explain that timeouts help with both detection (you know something is wrong) and recovery (you can retry or fail gracefully).
Interview Tips
1. Explain Coffman’s Conditions Clearly
Interviewers love asking: “What are the conditions for deadlock?” Memorize all four and be ready to explain how breaking each one prevents deadlock.
Practice answer: “Deadlock requires four conditions: mutual exclusion, hold and wait, no preemption, and circular wait. We can prevent deadlock by breaking any one. For example, lock ordering breaks circular wait, and timeouts break hold and wait.”
2. Draw the Circular Wait Diagram
When asked about deadlock, immediately offer to draw the scenario. Draw circles for threads and rectangles for locks, with arrows showing “holds” and “waits for.”
Why it works: Visual explanations show clear thinking. Interviewers can see you understand the problem structure.
3. Know the Difference Between Deadlock and Livelock
Interviewers often ask: “How would you diagnose if threads are deadlocked or livelocked?”
Strong answer: “I’d check thread states and CPU usage. Deadlock shows threads in BLOCKED state with low CPU usage. Livelock shows threads in RUNNABLE state with high CPU usage but no progress. I’d add logging to track actual work completed, not just activity.”
4. Discuss Real-World Prevention Strategies
Don’t just say “use lock ordering.” Explain trade-offs:
Example answer: “Lock ordering is simple but requires global coordination—every developer must know the order. Timeouts are more flexible but add complexity in error handling. For databases, I’d use deadlock detection with automatic rollback. For real-time systems, I’d use lock-free data structures to avoid locks entirely.”
5. Connect to Database Deadlocks
Many interviewers ask about database deadlocks. Know that databases use deadlock detection (not prevention) and automatically abort one transaction.
Example answer: “Databases detect deadlocks using wait-for graphs. When a cycle is detected, the database picks a victim transaction to abort, usually the one with the least work done. The application should retry the transaction.”
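The wait-for-graph check described in that answer can be sketched as a depth-first search for a back edge. The graph representation here (a dict mapping each transaction to the transactions it waits on) is an assumption for illustration; real databases build this from their lock tables.

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph.

    wait_for maps each transaction to the transactions it waits on;
    a cycle means those transactions are deadlocked."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                return True  # Back edge: nxt is on the current DFS path
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in wait_for)

print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # True: T1 and T2 wait on each other
print(has_cycle({"T1": ["T2"], "T2": []}))      # False: no cycle
```

A database would run a check like this periodically or on lock-wait, then abort one transaction on the detected cycle.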
6. Mention Priority Inversion
For senior roles, mention priority inversion as a cause of starvation: “A high-priority thread can starve if it waits for a lock held by a low-priority thread that never gets scheduled. The solution is priority inheritance—temporarily boost the low-priority thread’s priority.”
7. Code a Deadlock Example Quickly
Be ready to write a simple deadlock scenario in 2 minutes:
# Thread 1
lock_a.acquire()
lock_b.acquire()
# Thread 2
lock_b.acquire()
lock_a.acquire()
Then immediately explain the fix: “Both threads should acquire in the same order.”
8. Discuss Testing and Detection
Interviewers ask: “How would you test for deadlocks?”
Strong answer: “Deadlocks are hard to reproduce because they depend on timing. I’d use stress testing with many threads and random delays. Tools like ThreadSanitizer (C++) or Java’s jstack can detect deadlocks. In production, I’d add lock timeout monitoring and alerting.”
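A minimal watchdog for such a stress test can be sketched in Python: start deadlock-prone workers as daemon threads, join with a timeout, and flag any survivors. The worker here is deliberately buggy (opposite lock orders) so the watchdog has something to catch; all names are illustrative.

```python
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def suspect_worker(first, second):
    # Deliberately deadlock-prone: acquires in caller-supplied order
    with first:
        time.sleep(0.3)  # Widen the race window so the deadlock is reliable
        with second:
            pass

# Stress with opposite lock orders (daemon threads so the process
# can still exit even if they hang)
threads = [
    threading.Thread(target=suspect_worker, args=(lock_a, lock_b), daemon=True),
    threading.Thread(target=suspect_worker, args=(lock_b, lock_a), daemon=True),
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=1.0)  # Watchdog: bounded wait instead of hanging forever

stuck = [t for t in threads if t.is_alive()]
if stuck:
    print(f"WATCHDOG: {len(stuck)} thread(s) still alive - possible deadlock")
```

In a real test suite the same pattern wraps production code paths, with random delays injected to shake out timing-dependent interleavings.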
9. Know Language-Specific Tools
- Python: No built-in deadlock detection. Use timeouts and logging.
- Java: jstack shows thread dumps with deadlock detection; ThreadMXBean.findDeadlockedThreads() for programmatic detection.
- C++: ThreadSanitizer detects deadlocks at runtime.
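For Python, a rough analog of a jstack thread dump can be assembled from sys._current_frames(). That function is a CPython implementation detail, so treat this as a sketch rather than a portable API:

```python
import sys
import threading
import traceback

def dump_all_stacks():
    """Return a jstack-style dump of every thread's current stack."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append(f"Thread {names.get(ident, '?')} (id {ident}):")
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(lines)

print(dump_all_stacks())
```

Calling this from a signal handler or monitoring endpoint shows which locks blocked threads are waiting on, which is usually enough to diagnose a deadlock by hand.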
10. Relate to System Design
For system design interviews, mention: “To avoid deadlocks in distributed systems, I’d use techniques like two-phase locking with timeouts, or avoid distributed locks entirely by using message queues or event sourcing for coordination.”
Key Takeaways
- Deadlock requires all four Coffman conditions: mutual exclusion, hold and wait, no preemption, and circular wait. Break any one condition to prevent deadlock.
- Lock ordering (always acquire locks in the same global order) is the most common deadlock prevention strategy. Timeouts provide a safety net for detection and recovery.
- Livelock differs from deadlock: threads are active (high CPU usage) but make no progress. Fix with randomized exponential backoff.
- Starvation occurs when unfair scheduling prevents a thread from ever getting resources. Use fair locks, priority aging, or queue-based synchronization.
- Always use context managers (Python's with statement) or try-finally blocks to ensure locks are released, even on error paths. Test concurrent code with stress testing and use language-specific tools for deadlock detection.