Deadlock, Livelock & Starvation in Java
TL;DR
Deadlock occurs when threads wait indefinitely for each other to release resources. Livelock happens when threads keep changing state in response to each other without making progress. Starvation means a thread never gets the resources it needs because other threads monopolize them.
Core Concept
What Are Concurrency Hazards?
Concurrency hazards are problems that arise when multiple threads access shared resources. Three critical hazards are deadlock, livelock, and starvation.
Deadlock
Deadlock occurs when two or more threads are blocked forever, each waiting for the other to release a resource. Think of two cars at a narrow bridge, each waiting for the other to back up first—neither can proceed.
Coffman’s Four Conditions
Deadlock happens only when ALL four conditions are met:
- Mutual Exclusion: Resources cannot be shared (only one thread can hold a lock at a time)
- Hold and Wait: Threads hold resources while waiting for others
- No Preemption: Resources cannot be forcibly taken from threads
- Circular Wait: A circular chain of threads exists, each waiting for a resource held by the next
Breaking ANY one condition prevents deadlock.
Livelock
Livelock is like deadlock, but threads are actively doing work—just not useful work. Threads keep changing state in response to each other without making progress. Imagine two people in a hallway, both stepping left, then both stepping right, repeatedly blocking each other while trying to be polite.
Starvation
Starvation occurs when a thread never gets access to a resource because other threads keep claiming it first. Unlike deadlock, the thread isn't permanently blocked; it just never wins the race for the resource. Common causes are unfair locks, unfair scheduling, and priority inversion.
Prevention Strategies
Deadlock Prevention: Lock ordering (always acquire locks in the same order), timeouts (give up after waiting too long), or try-lock (attempt acquisition without blocking).
Livelock Prevention: Add randomness to retry logic, use exponential backoff.
Starvation Prevention: Fair scheduling algorithms, priority aging (gradually increase waiting thread priority).
Visual Guide
Deadlock Scenario
graph LR
A[Thread 1] -->|holds| L1[Lock A]
A -->|waits for| L2[Lock B]
B[Thread 2] -->|holds| L2
B -->|waits for| L1
style A fill:#f9f,stroke:#333
style B fill:#f9f,stroke:#333
style L1 fill:#ff9,stroke:#333
style L2 fill:#ff9,stroke:#333
Classic deadlock: Thread 1 holds Lock A and waits for Lock B, while Thread 2 holds Lock B and waits for Lock A. Both threads are blocked indefinitely.
Coffman’s Four Conditions
graph TD
D[Deadlock Occurs] --> C1[Mutual Exclusion]
D --> C2[Hold and Wait]
D --> C3[No Preemption]
D --> C4[Circular Wait]
C1 --> P1[Prevention: Use shareable resources]
C2 --> P2[Prevention: Acquire all locks at once]
C3 --> P3[Prevention: Allow lock timeout]
C4 --> P4[Prevention: Lock ordering]
style D fill:#f66,stroke:#333
style P1 fill:#6f6,stroke:#333
style P2 fill:#6f6,stroke:#333
style P3 fill:#6f6,stroke:#333
style P4 fill:#6f6,stroke:#333
All four conditions must be present for deadlock. Breaking any one condition prevents deadlock.
Livelock vs Deadlock
stateDiagram-v2
[*] --> Running
Running --> Deadlock: Both threads blocked
Running --> Livelock: Both threads active but no progress
Deadlock --> Deadlock: Waiting forever (CPU idle)
Livelock --> Livelock: Constantly responding (CPU busy)
note right of Deadlock
Threads: BLOCKED
CPU Usage: Low
Progress: None
end note
note right of Livelock
Threads: RUNNING
CPU Usage: High
Progress: None
end note
Deadlock means threads are blocked and waiting. Livelock means threads are running but making no progress.
Examples
Example 1: Classic Deadlock
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread1_work():
    print("Thread 1: Acquiring Lock A")
    lock_a.acquire()
    print("Thread 1: Got Lock A")
    time.sleep(0.1)  # Simulate work
    print("Thread 1: Waiting for Lock B")
    lock_b.acquire()  # DEADLOCK: Thread 2 holds Lock B
    print("Thread 1: Got Lock B")
    lock_b.release()
    lock_a.release()

def thread2_work():
    print("Thread 2: Acquiring Lock B")
    lock_b.acquire()
    print("Thread 2: Got Lock B")
    time.sleep(0.1)  # Simulate work
    print("Thread 2: Waiting for Lock A")
    lock_a.acquire()  # DEADLOCK: Thread 1 holds Lock A
    print("Thread 2: Got Lock A")
    lock_a.release()
    lock_b.release()

t1 = threading.Thread(target=thread1_work)
t2 = threading.Thread(target=thread2_work)
t1.start()
t2.start()
t1.join(timeout=2)
t2.join(timeout=2)

if t1.is_alive() or t2.is_alive():
    print("\nDEADLOCK DETECTED: Threads are still running after timeout")
Expected Output:
Thread 1: Acquiring Lock A
Thread 1: Got Lock A
Thread 2: Acquiring Lock B
Thread 2: Got Lock B
Thread 1: Waiting for Lock B
Thread 2: Waiting for Lock A
DEADLOCK DETECTED: Threads are still running after timeout
Why it deadlocks: Thread 1 holds Lock A and waits for Lock B. Thread 2 holds Lock B and waits for Lock A. This creates a circular wait.
Try it yourself: Run this code and observe that it hangs. Press Ctrl+C to stop it.
Example 2: Deadlock Prevention with Lock Ordering
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_locks_ordered():
    """Always acquire locks in the same order: A then B"""
    lock_a.acquire()
    lock_b.acquire()

def release_locks_ordered():
    """Release in reverse order"""
    lock_b.release()
    lock_a.release()

def thread1_work():
    print("Thread 1: Acquiring locks in order")
    acquire_locks_ordered()
    print("Thread 1: Got both locks")
    time.sleep(0.1)
    print("Thread 1: Releasing locks")
    release_locks_ordered()

def thread2_work():
    print("Thread 2: Acquiring locks in order")
    acquire_locks_ordered()  # Same order as Thread 1
    print("Thread 2: Got both locks")
    time.sleep(0.1)
    print("Thread 2: Releasing locks")
    release_locks_ordered()

t1 = threading.Thread(target=thread1_work)
t2 = threading.Thread(target=thread2_work)
t1.start()
t2.start()
t1.join()
t2.join()

print("\nSUCCESS: Both threads completed without deadlock")
Expected Output:
Thread 1: Acquiring locks in order
Thread 1: Got both locks
Thread 2: Acquiring locks in order
Thread 1: Releasing locks
Thread 2: Got both locks
Thread 2: Releasing locks
SUCCESS: Both threads completed without deadlock
Why it works: Both threads acquire locks in the same order (A then B). This breaks the circular wait condition.
Try it yourself: Change thread2_work to acquire locks in reverse order (B then A) and observe the deadlock.
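A variant of the same idea removes the need for every call site to remember the order: sort the locks by a stable key before acquiring them. The sketch below uses id() as that key, which assumes the lock objects live for the whole program; the helper names (acquire_in_order, release_in_order) are made up for illustration.

```python
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def acquire_in_order(*locks):
    """Acquire locks sorted by a stable key (id()), so every caller
    uses the same global order and circular wait cannot form."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered

def release_in_order(ordered):
    """Release in reverse acquisition order."""
    for lock in reversed(ordered):
        lock.release()

def worker(first, second):
    held = acquire_in_order(first, second)
    time.sleep(0.05)  # Critical section with both locks held
    release_in_order(held)

# The two threads pass the locks in opposite orders, yet cannot deadlock
t1 = threading.Thread(target=worker, args=(lock_a, lock_b))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print("Both threads completed without deadlock")
```

The benefit over hand-written ordering is that a caller cannot get the order wrong, at the cost of requiring all lock acquisitions to go through the helper.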
Example 3: Deadlock Prevention with Timeout
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def thread_work_with_timeout(name, first_lock, second_lock):
    max_retries = 5
    for attempt in range(max_retries):
        print(f"{name}: Attempt {attempt + 1}")
        # Try to acquire first lock
        if not first_lock.acquire(timeout=0.5):
            print(f"{name}: Timeout on first lock, retrying...")
            continue
        print(f"{name}: Got first lock")
        # Try to acquire second lock
        if not second_lock.acquire(timeout=0.5):
            print(f"{name}: Timeout on second lock, releasing first and retrying...")
            first_lock.release()
            time.sleep(0.1)  # Back off before retry
            continue
        print(f"{name}: Got both locks, doing work")
        time.sleep(0.1)  # Simulate work
        # Release locks
        second_lock.release()
        first_lock.release()
        print(f"{name}: Completed successfully")
        return
    print(f"{name}: Failed after {max_retries} attempts")

t1 = threading.Thread(target=thread_work_with_timeout, args=("Thread 1", lock_a, lock_b))
t2 = threading.Thread(target=thread_work_with_timeout, args=("Thread 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
Expected Output (one possible interleaving):
Thread 1: Attempt 1
Thread 1: Got first lock
Thread 2: Attempt 1
Thread 2: Got first lock
Thread 1: Timeout on second lock, releasing first and retrying...
Thread 2: Timeout on second lock, releasing first and retrying...
Thread 1: Attempt 2
Thread 1: Got first lock
Thread 1: Got both locks, doing work
Thread 1: Completed successfully
Thread 2: Attempt 2
Thread 2: Got first lock
Thread 2: Got both locks, doing work
Thread 2: Completed successfully
Why it works: Timeouts prevent indefinite waiting. If a thread can’t get a lock, it releases what it has and retries.
Note (Java): Use ReentrantLock's tryLock(long timeout, TimeUnit unit) method, e.g. lock.tryLock(5, TimeUnit.SECONDS).
Try it yourself: Remove the timeout parameter and observe the deadlock.
Example 4: Livelock
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def polite_thread_work(name, first_lock, second_lock):
    """Thread that gives up immediately if it can't get both locks"""
    for attempt in range(10):
        print(f"{name}: Attempt {attempt + 1}")
        # Try first lock
        if first_lock.acquire(blocking=False):
            print(f"{name}: Got first lock")
            # Try second lock
            if second_lock.acquire(blocking=False):
                print(f"{name}: Got both locks, doing work")
                time.sleep(0.1)
                second_lock.release()
                first_lock.release()
                print(f"{name}: Completed")
                return
            else:
                # Be "polite" - immediately release and retry
                print(f"{name}: Can't get second lock, releasing first")
                first_lock.release()
        else:
            print(f"{name}: Can't get first lock")
        # No backoff - immediately retry (causes livelock)
        time.sleep(0.001)
    print(f"{name}: LIVELOCK - gave up after 10 attempts")

t1 = threading.Thread(target=polite_thread_work, args=("Thread 1", lock_a, lock_b))
t2 = threading.Thread(target=polite_thread_work, args=("Thread 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
Expected Output (typical livelock pattern):
Thread 1: Attempt 1
Thread 1: Got first lock
Thread 2: Attempt 1
Thread 2: Got first lock
Thread 1: Can't get second lock, releasing first
Thread 2: Can't get second lock, releasing first
Thread 1: Attempt 2
Thread 1: Got first lock
Thread 2: Attempt 2
Thread 2: Got first lock
Thread 1: Can't get second lock, releasing first
Thread 2: Can't get second lock, releasing first
...(repeats)...
Thread 1: LIVELOCK - gave up after 10 attempts
Thread 2: LIVELOCK - gave up after 10 attempts
Why it’s livelock: Both threads are actively running (not blocked), but they keep responding to each other without making progress.
Fix: Add randomized backoff:
import random
time.sleep(random.uniform(0.01, 0.1)) # Random delay before retry
Try it yourself: Add the random backoff and observe that threads eventually succeed.
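Applied to the "polite" workers above, the fix can be sketched as follows. The attempt cap of 50 and the delay range are arbitrary choices, and results is just a list used here to record completions.

```python
import threading
import time
import random

lock_a = threading.Lock()
lock_b = threading.Lock()
results = []

def backoff_thread_work(name, first_lock, second_lock):
    """Like the 'polite' workers, but with a randomized delay before
    each retry so the two threads fall out of lockstep."""
    for attempt in range(50):
        if first_lock.acquire(blocking=False):
            if second_lock.acquire(blocking=False):
                time.sleep(0.01)  # Simulate work with both locks held
                second_lock.release()
                first_lock.release()
                results.append(name)
                return
            first_lock.release()
        # Randomized backoff: breaks the symmetric retry pattern
        time.sleep(random.uniform(0.001, 0.01))
    results.append(f"{name}: gave up")

t1 = threading.Thread(target=backoff_thread_work, args=("Thread 1", lock_a, lock_b))
t2 = threading.Thread(target=backoff_thread_work, args=("Thread 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()
print(results)
```

With the random delays, the two threads almost always stop colliding within a few attempts and both complete.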
Example 5: Starvation
import threading
import time

resource_lock = threading.Lock()

def high_priority_worker(worker_id):
    """Aggressive worker that keeps taking the lock"""
    for i in range(5):
        resource_lock.acquire()
        print(f"High Priority Worker {worker_id}: Using resource (iteration {i+1})")
        time.sleep(0.1)  # Hold lock for a while
        resource_lock.release()
        time.sleep(0.01)  # Very short break before retrying

def low_priority_worker(worker_id):
    """Worker that tries to get the lock but keeps losing to high priority"""
    start_time = time.time()
    attempts = 0
    while time.time() - start_time < 2:  # Try for 2 seconds
        attempts += 1
        if resource_lock.acquire(blocking=False):
            print(f"Low Priority Worker {worker_id}: FINALLY got resource after {attempts} attempts!")
            time.sleep(0.1)
            resource_lock.release()
            return
        time.sleep(0.05)  # Wait before retry
    print(f"Low Priority Worker {worker_id}: STARVED - never got resource after {attempts} attempts")

# Start high priority workers
high_threads = []
for i in range(3):
    t = threading.Thread(target=high_priority_worker, args=(i,))
    t.start()
    high_threads.append(t)

# Start low priority worker
low_thread = threading.Thread(target=low_priority_worker, args=(0,))
low_thread.start()

# Wait for all threads
for t in high_threads:
    t.join()
low_thread.join()
Expected Output (one possible interleaving):
High Priority Worker 0: Using resource (iteration 1)
High Priority Worker 1: Using resource (iteration 1)
High Priority Worker 2: Using resource (iteration 1)
High Priority Worker 0: Using resource (iteration 2)
High Priority Worker 1: Using resource (iteration 2)
High Priority Worker 2: Using resource (iteration 2)
...(high priority workers dominate)...
Low Priority Worker 0: STARVED - never got resource after 35 attempts
Why starvation occurs: High priority workers acquire and release the lock quickly, giving the low priority worker no chance.
Fix: Use a fair lock or queue-based approach:
# Python's threading.Lock is not fair by default.
# A simple ticket lock grants the lock in FIFO order:
class FairLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0   # Next ticket to hand out
        self._now_serving = 0   # Ticket currently allowed to proceed

    def acquire(self):
        with self._cond:
            ticket = self._next_ticket
            self._next_ticket += 1
            while ticket != self._now_serving:
                self._cond.wait()  # Sleep until it's our turn

    def release(self):
        with self._cond:
            self._now_serving += 1
            self._cond.notify_all()  # Wake waiters; the next ticket proceeds
Note (Java): Construct a fair lock with new ReentrantLock(true).
Try it yourself: Implement the fair lock approach and verify all threads get access.
Common Mistakes
1. Acquiring Locks in Different Orders
Mistake:
# Thread 1
lock_a.acquire()
lock_b.acquire()
# Thread 2
lock_b.acquire() # Different order!
lock_a.acquire()
Why it’s wrong: This creates the circular wait condition for deadlock. Thread 1 holds A and waits for B, while Thread 2 holds B and waits for A.
Fix: Always acquire locks in the same global order. Document the lock ordering in your codebase.
2. Forgetting to Release Locks on Error Paths
Mistake:
lock.acquire()
if error_condition:
    return  # BUG: Lock never released!
do_work()
lock.release()
Why it’s wrong: If an exception or early return happens, the lock stays held forever. Other threads will deadlock waiting for it.
Fix: Use context managers (Python) or try-finally blocks:
with lock:  # Automatically releases on exit
    if error_condition:
        return
    do_work()

# Or use try-finally:
lock.acquire()
try:
    if error_condition:
        return
    do_work()
finally:
    lock.release()
3. Confusing Livelock with Deadlock
Mistake: Assuming high CPU usage means no deadlock. Seeing threads “running” and thinking progress is being made.
Why it’s wrong: Livelock threads are active but not making progress. They consume CPU while accomplishing nothing.
How to identify:
- Deadlock: Threads in BLOCKED state, low CPU usage, stack traces show waiting on locks
- Livelock: Threads in RUNNABLE state, high CPU usage, stack traces show repeated lock attempts
Fix: Add logging to track actual progress (not just activity). Use exponential backoff with randomization.
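The backoff mentioned above can be sketched as a small helper. This is a full-jitter variant: the delay ceiling doubles with each attempt, and the actual delay is drawn uniformly below it so competing threads desynchronize. The function name and default values are illustrative.

```python
import random

def backoff_delays(base=0.01, cap=1.0, attempts=6):
    """Full-jitter exponential backoff: each delay is drawn uniformly
    from [0, min(cap, base * 2**attempt)]."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        delays.append(random.uniform(0, ceiling))
    return delays

# Each delay is bounded by a ceiling that doubles per attempt
print(backoff_delays())
```

A retry loop would sleep for each delay in turn; the randomness is what prevents two threads from retrying in lockstep, while the doubling ceiling reduces contention under sustained conflict.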
4. Using Sleep Instead of Proper Synchronization
Mistake:
while not resource_available:
    time.sleep(0.1)  # Polling - wastes CPU and causes starvation
Why it’s wrong: Polling wastes CPU cycles and can cause starvation if timing is unlucky. The sleep duration is arbitrary—too short wastes CPU, too long adds latency.
Fix: Use proper synchronization primitives:
condition = threading.Condition()

# Producer
with condition:
    resource_available = True
    condition.notify()  # Wake up waiting threads

# Consumer
with condition:
    while not resource_available:
        condition.wait()  # Efficient waiting
5. Ignoring Lock Timeouts
Mistake:
lock.acquire() # Blocks forever if deadlock occurs
Why it’s wrong: No way to detect or recover from deadlock. The thread hangs indefinitely.
Fix: Use timeouts and handle failure:
if lock.acquire(timeout=5):
    try:
        do_work()
    finally:
        lock.release()
else:
    # Handle timeout - log error, retry, or fail gracefully
    logging.error("Failed to acquire lock - possible deadlock")
    raise TimeoutError("Lock acquisition timeout")
Interview tip: Explain that timeouts help with both detection (you know something is wrong) and recovery (you can retry or fail gracefully).
Interview Tips
1. Explain Coffman’s Conditions Clearly
Interviewers love asking: “What are the conditions for deadlock?” Memorize all four and be ready to explain how breaking each one prevents deadlock.
Practice answer: “Deadlock requires four conditions: mutual exclusion, hold and wait, no preemption, and circular wait. We can prevent deadlock by breaking any one. For example, lock ordering breaks circular wait, and timeouts break hold and wait.”
2. Draw the Circular Wait Diagram
When asked about deadlock, immediately offer to draw the scenario. Draw circles for threads and rectangles for locks, with arrows showing “holds” and “waits for.”
Why it works: Visual explanations show clear thinking. Interviewers can see you understand the problem structure.
3. Know the Difference Between Deadlock and Livelock
Interviewers often ask: “How would you diagnose if threads are deadlocked or livelocked?”
Strong answer: “I’d check thread states and CPU usage. Deadlock shows threads in BLOCKED state with low CPU usage. Livelock shows threads in RUNNABLE state with high CPU usage but no progress. I’d add logging to track actual work completed, not just activity.”
4. Discuss Real-World Prevention Strategies
Don’t just say “use lock ordering.” Explain trade-offs:
Example answer: “Lock ordering is simple but requires global coordination—every developer must know the order. Timeouts are more flexible but add complexity in error handling. For databases, I’d use deadlock detection with automatic rollback. For real-time systems, I’d use lock-free data structures to avoid locks entirely.”
5. Connect to Database Deadlocks
Many interviewers ask about database deadlocks. Know that databases use deadlock detection (not prevention) and automatically abort one transaction.
Example answer: “Databases detect deadlocks using wait-for graphs. When a cycle is detected, the database picks a victim transaction to abort, usually the one with the least work done. The application should retry the transaction.”
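The wait-for-graph check described in that answer can be sketched as a depth-first search for a back edge. The graph representation here (a dict mapping each transaction to the transactions it waits on) is an assumption for illustration; real databases build this from their lock tables.

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph.

    wait_for maps each transaction to the transactions it waits on;
    a cycle means those transactions are deadlocked."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / done
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                return True  # Back edge: nxt is on the current DFS path
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in wait_for)

print(has_cycle({"T1": ["T2"], "T2": ["T1"]}))  # True: T1 and T2 wait on each other
print(has_cycle({"T1": ["T2"], "T2": []}))      # False: no cycle
```

A database would run a check like this periodically or on lock-wait, then abort one transaction on the detected cycle.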
6. Mention Priority Inversion
For senior roles, mention priority inversion as a cause of starvation: “A high-priority thread can starve if it waits for a lock held by a low-priority thread that never gets scheduled. The solution is priority inheritance—temporarily boost the low-priority thread’s priority.”
7. Code a Deadlock Example Quickly
Be ready to write a simple deadlock scenario in 2 minutes:
# Thread 1
lock_a.acquire()
lock_b.acquire()
# Thread 2
lock_b.acquire()
lock_a.acquire()
Then immediately explain the fix: “Both threads should acquire in the same order.”
8. Discuss Testing and Detection
Interviewers ask: “How would you test for deadlocks?”
Strong answer: “Deadlocks are hard to reproduce because they depend on timing. I’d use stress testing with many threads and random delays. Tools like ThreadSanitizer (C++) or Java’s jstack can detect deadlocks. In production, I’d add lock timeout monitoring and alerting.”
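A minimal watchdog for such a stress test can be sketched in Python: start deadlock-prone workers as daemon threads, join with a timeout, and flag any survivors. The worker here is deliberately buggy (opposite lock orders) so the watchdog has something to catch; all names are illustrative.

```python
import threading
import time

lock_a = threading.Lock()
lock_b = threading.Lock()

def suspect_worker(first, second):
    # Deliberately deadlock-prone: acquires in caller-supplied order
    with first:
        time.sleep(0.3)  # Widen the race window so the deadlock is reliable
        with second:
            pass

# Stress with opposite lock orders (daemon threads so the process
# can still exit even if they hang)
threads = [
    threading.Thread(target=suspect_worker, args=(lock_a, lock_b), daemon=True),
    threading.Thread(target=suspect_worker, args=(lock_b, lock_a), daemon=True),
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=1.0)  # Watchdog: bounded wait instead of hanging forever

stuck = [t for t in threads if t.is_alive()]
if stuck:
    print(f"WATCHDOG: {len(stuck)} thread(s) still alive - possible deadlock")
```

In a real test suite the same pattern wraps production code paths, with random delays injected to shake out timing-dependent interleavings.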
9. Know Language-Specific Tools
- Python: No built-in deadlock detection. Use timeouts and logging.
- Java: jstack shows thread dumps with deadlock detection; ThreadMXBean.findDeadlockedThreads() for programmatic detection.
- C++: ThreadSanitizer detects deadlocks at runtime.
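For Python, a rough analog of a jstack thread dump can be assembled from sys._current_frames(). That function is a CPython implementation detail, so treat this as a sketch rather than a portable API:

```python
import sys
import threading
import traceback

def dump_all_stacks():
    """Return a jstack-style dump of every thread's current stack."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append(f"Thread {names.get(ident, '?')} (id {ident}):")
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(lines)

print(dump_all_stacks())
```

Calling this from a signal handler or monitoring endpoint shows which locks blocked threads are waiting on, which is usually enough to diagnose a deadlock by hand.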
10. Relate to System Design
For system design interviews, mention: “To avoid deadlocks in distributed systems, I’d use techniques like two-phase locking with timeouts, or avoid distributed locks entirely by using message queues or event sourcing for coordination.”
Key Takeaways
- Deadlock requires all four Coffman conditions: mutual exclusion, hold and wait, no preemption, and circular wait. Break any one condition to prevent deadlock.
- Lock ordering (always acquire locks in the same global order) is the most common deadlock prevention strategy. Timeouts provide a safety net for detection and recovery.
- Livelock differs from deadlock: threads are active (high CPU usage) but make no progress. Fix with randomized exponential backoff.
- Starvation occurs when unfair scheduling prevents a thread from ever getting resources. Use fair locks, priority aging, or queue-based synchronization.
- Always use context managers (Python's with statement) or try-finally blocks to ensure locks are released, even on error paths. Test concurrent code with stress testing and use language-specific tools for deadlock detection.