Scheduling Agent Supervisor Pattern
After this topic, you will be able to:
- Implement the scheduling agent supervisor pattern for distributed task coordination
- Design failure detection and recovery mechanisms for distributed workflows
- Evaluate when to use the scheduling agent supervisor pattern vs. simpler retry mechanisms
TL;DR
The Scheduling Agent Supervisor pattern coordinates distributed actions as a single logical operation by using a supervisor to monitor agent progress, detect failures, and orchestrate recovery. Unlike simple retry logic, it provides centralized failure detection and compensating actions across multiple steps. Use it when you need transactional semantics across distributed services without distributed transactions.
Cheat Sheet: Scheduler dispatches work → Agents execute tasks → Supervisor monitors state → On failure: retry, compensate, or escalate → Achieves eventual consistency without 2PC (two-phase commit).
The Problem It Solves
Distributed systems frequently need to coordinate multiple operations across different services as a single logical unit of work. Consider an e-commerce order fulfillment workflow: reserve inventory, charge payment, create shipment, send notification. Each step calls a different service, and any step can fail due to network issues, service unavailability, or business logic errors.
Simple retry logic doesn’t cut it here. If payment succeeds but shipment creation fails, you can’t just retry the entire workflow—you’d double-charge the customer. If inventory reservation times out but actually succeeded, retrying creates duplicate reservations. You need something that understands the workflow’s state, can detect which specific step failed, and knows whether to retry, compensate, or escalate.
The core problem is maintaining consistency across distributed operations without distributed transactions (which don’t scale and aren’t available in many cloud environments). You need a way to track progress, detect failures at any step, and recover intelligently—all while the underlying services may be temporarily unavailable or responding slowly. This is the pain point that keeps SREs awake at 3 AM when workflows get stuck in inconsistent states.
Solution Overview
The Scheduling Agent Supervisor pattern solves this by separating concerns into three distinct roles. The Scheduler breaks down complex workflows into discrete tasks and dispatches them to agents. Agents are lightweight workers that execute individual steps and report their status. The Supervisor continuously monitors agent progress, detects failures or timeouts, and orchestrates recovery actions.
This separation creates a control plane (supervisor) that’s independent of the data plane (agents doing actual work). The supervisor maintains a state machine for each workflow instance, tracking which steps completed, which are in-progress, and which failed. When an agent fails to report progress within expected timeframes, the supervisor can retry the operation, invoke compensating actions to undo previous steps, or escalate to human operators.
The pattern achieves transactional semantics through careful state management and idempotent operations rather than distributed locks. Each agent operation is designed to be safely retried, and the supervisor ensures that recovery actions match the failure mode. For example, if payment processing times out, the supervisor might query the payment service’s status before deciding whether to retry or compensate. This approach trades immediate consistency for eventual consistency with strong failure handling guarantees.
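As a rough sketch, the per-workflow state machine the supervisor reads might look like the following. The names `TaskState` and `WorkflowInstance` are illustrative, not from any particular framework:

```python
from dataclasses import dataclass, field
from enum import Enum

class TaskState(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"          # permanently failed after retries exhausted
    COMPENSATED = "compensated"

@dataclass
class WorkflowInstance:
    """Per-workflow record the supervisor consults to decide recovery."""
    workflow_id: str
    tasks: dict = field(default_factory=dict)  # task name -> TaskState

    def is_terminal(self) -> bool:
        # The workflow is done once every task reached a terminal state.
        terminal = {TaskState.COMPLETED, TaskState.FAILED, TaskState.COMPENSATED}
        return all(state in terminal for state in self.tasks.values())
```

In a real system this record lives in the durable state store (PostgreSQL in the diagram below), not in process memory.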
Scheduling Agent Supervisor Architecture
graph LR
Client["Client<br/><i>Order Service</i>"]
subgraph Control Plane
Scheduler["Scheduler<br/><i>Workflow Dispatcher</i>"]
Supervisor["Supervisor<br/><i>Monitors & Recovers</i>"]
StateStore[("Workflow State<br/><i>PostgreSQL</i>")]
end
subgraph Data Plane
Queue["Task Queue<br/><i>RabbitMQ/SQS</i>"]
Agent1["Inventory Agent"]
Agent2["Payment Agent"]
Agent3["Shipment Agent"]
end
subgraph External Services
InvSvc["Inventory Service"]
PaySvc["Payment Service"]
ShipSvc["Shipping Service"]
end
Client --"1. Submit Order"--> Scheduler
Scheduler --"2. Create Workflow"--> StateStore
Scheduler --"3. Dispatch Tasks"--> Queue
Queue --"4. Pull Tasks"--> Agent1 & Agent2 & Agent3
Agent1 --"5. Execute"--> InvSvc
Agent2 --"6. Execute"--> PaySvc
Agent3 --"7. Execute"--> ShipSvc
Agent1 & Agent2 & Agent3 --"8. Report Status"--> StateStore
Supervisor --"9. Monitor Progress"--> StateStore
Supervisor --"10. Retry/Compensate"--> Queue
The pattern separates control plane (scheduler and supervisor managing workflow state) from data plane (agents executing tasks). The supervisor monitors state independently of task execution, enabling centralized failure detection and recovery.
How It Works
Step 1: Workflow Initiation and Task Scheduling
When a workflow begins (e.g., processing an order), the scheduler creates a workflow instance with a unique ID and persists its initial state to durable storage. It then breaks the workflow into a directed acyclic graph (DAG) of tasks, determining dependencies and execution order. For our order example: [reserve_inventory] → [charge_payment] → [create_shipment, send_notification], where the last two can run in parallel.
The scheduler dispatches the first task(s) to available agents via a message queue, including the workflow ID, task type, input parameters, and a correlation token. It records in the state store that these tasks are now “in-progress” with timestamps. This persistent state is critical—if the scheduler crashes, another instance can pick up where it left off.
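A minimal sketch of the scheduler's dispatch decision, assuming the DAG is expressed as a prerequisite map (the `ORDER_WORKFLOW` task names are the hypothetical order example from above):

```python
# Hypothetical order-fulfillment DAG: task -> set of prerequisite tasks.
ORDER_WORKFLOW = {
    "reserve_inventory": set(),
    "charge_payment": {"reserve_inventory"},
    "create_shipment": {"charge_payment"},
    "send_notification": {"charge_payment"},
}

def ready_tasks(dag, completed, in_progress):
    """Tasks whose prerequisites are all complete and that aren't already running."""
    return sorted(t for t, deps in dag.items()
                  if t not in completed and t not in in_progress and deps <= completed)
```

The scheduler would call this after every status change, enqueue the returned tasks, and mark them in-progress in the state store.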
Step 2: Agent Execution and Progress Reporting
Agents pull tasks from the queue and execute them against their respective services. An inventory agent calls the inventory service’s reserve API, a payment agent interacts with Stripe, etc. Crucially, agents are designed for idempotency—if they receive the same task twice (due to retries), they produce the same outcome without repeating side effects.
As agents work, they send heartbeat messages to the supervisor: “Task X for workflow Y is still processing.” When a task completes, the agent reports success with any output data needed by downstream tasks. If a task fails with a known error (e.g., insufficient inventory), the agent reports a structured failure with error codes and context. The agent then acknowledges the message from the queue, removing it from the work backlog.
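The idempotency guard can be sketched like this, with an in-memory stand-in for the durable state store. `InMemoryStateStore` and `handle_task` are illustrative names, not any framework's API:

```python
class InMemoryStateStore:
    """Toy stand-in for the durable workflow state store."""
    def __init__(self):
        self.results = {}
        self.heartbeats = []

    def get_result(self, workflow_id, task_id):
        return self.results.get((workflow_id, task_id))

    def heartbeat(self, workflow_id, task_id):
        self.heartbeats.append((workflow_id, task_id))

    def record_success(self, workflow_id, task_id, result):
        self.results[(workflow_id, task_id)] = result

def handle_task(task, store, execute):
    """Idempotent task handler: a redelivered task returns the stored
    result instead of re-running the side effect."""
    key = (task["workflow_id"], task["task_id"])
    prior = store.get_result(*key)
    if prior is not None:
        return prior              # duplicate delivery: no second side effect
    store.heartbeat(*key)
    result = execute(task["params"])
    store.record_success(*key, result)
    return result
```

A production version would also pass an idempotency key to the downstream service itself, since the agent can crash between executing and recording success.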
Step 3: Supervisor Monitoring and Failure Detection
The supervisor runs as a separate process, continuously scanning the workflow state store for anomalies. It checks for tasks that haven’t sent heartbeats within their expected duration (e.g., payment processing should complete in 30 seconds, but it’s been 45 seconds with no update). It also watches for tasks that reported failures or for agents that crashed without reporting anything.
When the supervisor detects a problem, it consults the workflow’s recovery policy. For transient failures (network timeout), it might retry the task up to N times with exponential backoff. For business logic failures (payment declined), it triggers compensating actions—canceling the inventory reservation and notifying the customer. For unknown failures (agent crashed), it might attempt one retry, then escalate to a dead-letter queue for human investigation.
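One scan cycle might look like the following sketch, where each task record carries its last heartbeat timestamp, retry count, and optional compensation (the field names are assumptions):

```python
def scan_for_stuck_tasks(tasks, now, heartbeat_timeout=30.0, max_retries=3):
    """One supervisor scan cycle over in-progress tasks.
    Returns (task_id, action) pairs mirroring the recovery policy."""
    actions = []
    for t in tasks:
        if t["status"] != "in_progress":
            continue
        if now - t["last_heartbeat"] <= heartbeat_timeout:
            continue  # slow but alive: keep waiting
        if t["retries"] < max_retries:
            actions.append((t["task_id"], "retry"))       # transient: requeue with backoff
        elif t["compensation"] is not None:
            actions.append((t["task_id"], "compensate"))  # undo prior steps
        else:
            actions.append((t["task_id"], "escalate"))    # dead-letter queue + alert
    return actions
```

In practice the supervisor would also query the target service for the task's actual status before retrying, as described in Step 4 below.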
Step 4: Recovery and Compensation
Recovery actions depend on the failure mode and workflow semantics. If the payment step timed out but the supervisor queries Stripe and finds the charge succeeded, it marks the task complete and proceeds to shipment creation—no retry needed. If inventory reservation failed due to a transient database error, the supervisor requeues the task for another agent to attempt.
Compensation is trickier. If shipment creation fails after payment succeeded, the supervisor must decide: retry shipment creation (maybe the shipping API was temporarily down) or refund the payment and cancel the order. This decision is encoded in the workflow definition. The supervisor dispatches compensation tasks (refund, cancel reservation) as new work items, creating a compensating sub-workflow. It tracks these compensation tasks with the same rigor as forward-progress tasks.
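A sketch of deriving the compensating sub-workflow: undo completed steps in reverse order, assuming each forward task maps to a hypothetical inverse operation:

```python
# Hypothetical forward-task -> inverse-task mapping from the workflow definition.
COMPENSATIONS = {
    "reserve_inventory": "release_inventory",
    "charge_payment": "refund_payment",
    "create_shipment": "cancel_shipment",
}

def compensation_plan(completed_in_order):
    """Compensating tasks to dispatch, undoing completed steps in reverse order."""
    return [COMPENSATIONS[step]
            for step in reversed(completed_in_order)
            if step in COMPENSATIONS]
```

Reverse order matters: refunding a payment before releasing the inventory it paid for keeps intermediate states sensible if compensation itself fails partway through.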
Step 5: Workflow Completion and Cleanup
Once all tasks reach a terminal state (success, permanently failed, or compensated), the supervisor marks the workflow complete. It may trigger final actions like sending a summary notification or updating analytics. The workflow state is archived for audit purposes but removed from the active monitoring set. If the workflow failed despite retries and compensation, it’s marked as requiring manual intervention, with full context preserved for debugging.
Order Fulfillment Workflow Execution Flow
sequenceDiagram
participant Client
participant Scheduler
participant StateStore
participant Queue
participant InvAgent as Inventory Agent
participant PayAgent as Payment Agent
participant Supervisor
Client->>Scheduler: 1. POST /orders {items, payment}
Scheduler->>StateStore: 2. Create workflow_123<br/>[status: started]
Scheduler->>Queue: 3. Enqueue reserve_inventory task
Scheduler->>StateStore: 4. Update task status<br/>[reserve_inventory: in-progress]
InvAgent->>Queue: 5. Pull task
InvAgent->>StateStore: 6. Heartbeat (task alive)
InvAgent->>InvAgent: 7. Call inventory.reserve()
InvAgent->>StateStore: 8. Report success<br/>[reserve_inventory: completed]
Scheduler->>Queue: 9. Enqueue charge_payment task
PayAgent->>Queue: 10. Pull task
PayAgent->>PayAgent: 11. Call stripe.charge()<br/>(times out)
Note over Supervisor,StateStore: Supervisor detects timeout
Supervisor->>StateStore: 12. Check task status<br/>(no heartbeat for 45s)
Supervisor->>Supervisor: 13. Query payment service<br/>(charge succeeded!)
Supervisor->>StateStore: 14. Mark task complete<br/>[charge_payment: completed]
Supervisor->>Queue: 15. Enqueue create_shipment task
Client->>Scheduler: 16. GET /orders/123/status
Scheduler->>StateStore: 17. Read workflow state
Scheduler-->>Client: 18. {status: processing}
This sequence shows how the scheduler dispatches tasks, agents execute and report status, and the supervisor detects and recovers from a timeout failure. The supervisor’s independent monitoring allows it to query the payment service and determine the actual outcome when the agent fails to report.
Supervisor Failure Detection and Recovery Decision Tree
flowchart TB
Start(["Supervisor Scan Cycle"])
CheckTasks["Query StateStore for<br/>in-progress tasks"]
HasStuck{"Task exceeded<br/>expected duration?"}
CheckHeartbeat{"Recent heartbeat<br/>within threshold?"}
QueryService["Query target service<br/>for actual status"]
ServiceStatus{"Service reports<br/>task status?"}
MarkComplete["Mark task complete<br/>Proceed to next step"]
RetryDecision{"Retry count < max?"}
RetryTask["Requeue task with<br/>exponential backoff"]
CompensateDecision{"Compensation<br/>defined?"}
Compensate["Dispatch compensating<br/>tasks (refund, cancel)"]
Escalate["Move to dead-letter queue<br/>Alert on-call engineer"]
Continue(["Continue monitoring"])
Start --> CheckTasks
CheckTasks --> HasStuck
HasStuck -->|No| Continue
HasStuck -->|Yes| CheckHeartbeat
CheckHeartbeat -->|Yes, still alive| Continue
CheckHeartbeat -->|No heartbeat| QueryService
QueryService --> ServiceStatus
ServiceStatus -->|Completed| MarkComplete
ServiceStatus -->|Failed| RetryDecision
ServiceStatus -->|Unknown| RetryDecision
RetryDecision -->|Yes| RetryTask
RetryDecision -->|No| CompensateDecision
RetryTask --> Continue
CompensateDecision -->|Yes| Compensate
CompensateDecision -->|No| Escalate
Compensate --> Continue
Escalate --> Continue
MarkComplete --> Continue
The supervisor’s failure detection algorithm distinguishes between slow operations (still sending heartbeats) and actual failures. Recovery strategy depends on whether the service confirms completion, retry budget remains, and compensation is available.
Variants
Centralized Supervisor with Distributed Agents
The classic implementation uses a single supervisor process (or active-passive pair for HA) monitoring all workflows. Agents are stateless and horizontally scalable—you can run hundreds of payment agents pulling from the same queue. The supervisor maintains all state in a database like PostgreSQL or DynamoDB.
When to use: Most scenarios, especially when workflow volume is moderate (up to tens of thousands of active workflows) and you want operational simplicity. The centralized supervisor is easier to reason about and debug.
Pros: Simple mental model, easier to implement workflow-level policies (rate limiting, priority), straightforward monitoring.
Cons: Supervisor becomes a scaling bottleneck at very high workflow volumes. If the supervisor crashes, recovery time affects all workflows.
Distributed Supervisor with Sharding
For massive scale (millions of concurrent workflows), the supervisor itself is sharded. Workflows are partitioned by hash (e.g., workflow ID mod N), and each supervisor instance monitors its partition. This is how systems like Uber’s Cadence and Temporal operate.
When to use: When workflow volume exceeds what a single supervisor can monitor (typically >100K active workflows) or when you need geographic distribution.
Pros: Horizontally scalable supervision, fault isolation (one shard’s failure doesn’t affect others), lower latency for geographically distributed workflows.
Cons: Significantly more complex to implement and operate. Cross-shard workflows require coordination. Rebalancing shards during scaling is tricky.
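The partitioning itself is straightforward; a stable-hash sketch (illustrative only, not how Cadence or Temporal actually assign shards):

```python
import hashlib

def shard_for(workflow_id: str, num_shards: int) -> int:
    """Stable hash so a workflow is always monitored by the same supervisor shard.
    (Python's built-in hash() is salted per process, so use a fixed digest instead.)"""
    digest = hashlib.sha256(workflow_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

The hard parts this sketch omits are exactly the cons above: rebalancing when `num_shards` changes and coordinating workflows that span shards.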
Embedded Supervisor (Workflow Engine)
Some systems embed supervisor logic directly into a workflow engine like Temporal, Camunda, or AWS Step Functions. The engine provides the scheduler, supervisor, and state management as a platform service. You define workflows in code or DSL, and the engine handles all coordination.
When to use: When you don’t want to build coordination infrastructure yourself, or when workflow complexity justifies a full engine (dozens of steps, complex branching, long-running workflows spanning days).
Pros: Mature, battle-tested implementations with rich features (versioning, debugging tools, visual workflow editors). Operational burden shifts to the platform team or cloud provider.
Cons: Vendor lock-in, learning curve for the engine’s abstractions, potential cost at scale, less control over low-level behavior.
Centralized vs. Distributed Supervisor Architecture
graph TB
subgraph Centralized Supervisor
C_Client1["Client Requests"]
C_Supervisor["Single Supervisor<br/><i>Monitors all workflows</i>"]
C_StateStore[("Workflow State<br/><i>100K active workflows</i>")]
C_Queue["Task Queue"]
C_Agents["Agent Pool<br/><i>Horizontally scaled</i>"]
C_Client1 --> C_Queue
C_Supervisor <--> C_StateStore
C_Queue <--> C_Agents
C_Agents --> C_StateStore
end
subgraph Distributed Supervisor - Sharded
D_Client1["Client Requests"]
D_Router["Router<br/><i>Hash by workflow_id</i>"]
subgraph Shard 0
D_Super0["Supervisor 0<br/><i>workflow_id % 4 == 0</i>"]
D_State0[("State Partition 0")]
end
subgraph Shard 1
D_Super1["Supervisor 1<br/><i>workflow_id % 4 == 1</i>"]
D_State1[("State Partition 1")]
end
subgraph Shard 2
D_Super2["Supervisor 2<br/><i>workflow_id % 4 == 2</i>"]
D_State2[("State Partition 2")]
end
D_Queue2["Task Queue<br/><i>Shared across shards</i>"]
D_Agents2["Agent Pool"]
D_Client1 --> D_Router
D_Router --> D_Super0 & D_Super1 & D_Super2
D_Super0 <--> D_State0
D_Super1 <--> D_State1
D_Super2 <--> D_State2
D_Queue2 <--> D_Agents2
D_Agents2 --> D_State0 & D_State1 & D_State2
end
Centralized supervision (top) is simpler but the supervisor becomes a bottleneck at scale. Distributed supervision (bottom) shards workflows across multiple supervisor instances, each monitoring its partition independently. This enables horizontal scaling but adds complexity around partition management and cross-shard coordination.
Trade-offs
Consistency vs. Latency
Strong supervision (frequent heartbeats, aggressive failure detection) catches problems quickly but adds overhead. Each heartbeat is a network round-trip and a state store write. For a 5-second task, heartbeating every 500ms means ten heartbeat writes for a single unit of actual work.
Decision criteria: For critical workflows (payments, account creation), favor strong consistency—heartbeat every 1-2 seconds and fail fast. For batch processing or analytics workflows, relax to 30-60 second heartbeats and tolerate longer failure detection windows.
Retry vs. Compensation
When a task fails, you must decide: retry the operation or compensate previous steps. Retries are simpler but can amplify problems (retry storm). Compensation is safer but requires writing inverse operations for every task.
Decision criteria: Retry for transient failures (network blips, temporary service unavailability) with exponential backoff and jitter. Compensate for semantic failures (business rule violations, resource exhaustion) or when retries have been exhausted. Always set a maximum retry count to prevent infinite loops.
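Exponential backoff with full jitter can be sketched as follows (the constants are illustrative defaults, not recommendations):

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: a random delay in
    [0, min(cap, base * 2^attempt)]. The jitter spreads out retries so a
    burst of failures doesn't turn into a synchronized retry storm."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

The cap bounds worst-case delay; the attempt count, not the delay, is what the max-retry check should count against.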
Centralized vs. Distributed Supervision
A single supervisor is simpler to build and reason about but becomes a bottleneck. Distributed supervision scales horizontally but introduces complexity around partition management and cross-partition workflows.
Decision criteria: Start centralized. Move to distributed supervision only when you’ve measured that the supervisor is your bottleneck (typically >50K active workflows) and you’ve exhausted vertical scaling options (faster database, more CPU). The operational complexity of distributed supervision is significant—don’t pay that cost prematurely.
When to Use (and When Not To)
Use Scheduling Agent Supervisor when:
- You need to coordinate 3+ distributed operations as a logical unit without distributed transactions
- Individual steps can fail independently and require different recovery strategies
- Workflows are long-running (seconds to hours) where simple request-response patterns don’t work
- You need audit trails and visibility into workflow progress for debugging or compliance
- Operations must be idempotent and you can implement compensating actions
Avoid this pattern when:
- Your workflow is just 1-2 steps—use simple retry logic with exponential backoff instead
- You need strict ACID transactions—use a single-database transaction instead; for distributed transactional semantics built from compensating steps, see the Saga pattern
- Latency is critical and you can’t tolerate the overhead of state persistence and monitoring (sub-millisecond requirements)
- Your operations aren’t idempotent and you can’t make them so—the pattern relies on safe retries
- You’re working with legacy systems that can’t report status or accept correlation tokens
Red flags in interviews: Suggesting this pattern for simple request-response APIs, not discussing idempotency requirements, ignoring the operational complexity of running a supervisor, or claiming it provides ACID guarantees (it doesn’t—it provides eventual consistency with failure handling).
Real-World Examples
Uber’s Cadence (now Temporal)
Uber built Cadence to coordinate complex workflows across hundreds of microservices—driver dispatch, trip pricing, payment processing, fraud detection. Each workflow might involve 20+ steps with conditional logic and parallel execution. Cadence implements the scheduling agent supervisor pattern at massive scale: 100M+ workflow executions per day.
The interesting detail: Cadence uses event sourcing for workflow state. Every state change (task started, task completed, task failed) is appended to an immutable log. The supervisor reconstructs workflow state by replaying events, which enables time-travel debugging—you can replay a failed workflow from any point to understand what went wrong. This approach also makes the supervisor stateless, simplifying horizontal scaling.
AWS Step Functions
AWS Step Functions is a managed workflow orchestration service that embodies this pattern. You define workflows as state machines in JSON, specifying tasks (Lambda functions, ECS containers, API calls) and transitions. Step Functions acts as both scheduler and supervisor, managing execution and handling failures.
The interesting detail: Step Functions charges per state transition, not per workflow. This pricing model reflects the pattern’s overhead—each task completion, retry, or compensation is a state transition requiring persistence and monitoring. At scale, this can get expensive, which is why some companies build their own supervisors for high-volume workflows.
Netflix’s Conductor
Netflix built Conductor to orchestrate content encoding workflows. When you upload a video, Conductor coordinates: transcoding to multiple resolutions, generating thumbnails, extracting metadata, updating the content catalog, and invalidating CDN caches. Each step is a separate microservice, and failures are common (encoding can fail due to corrupted source files, catalog updates can timeout during deployments).
The interesting detail: Conductor uses a pull-based agent model where workers poll for tasks rather than receiving pushed messages. This design choice simplifies backpressure management—if workers are overloaded, they simply stop polling, and tasks queue up naturally. The supervisor detects stuck tasks by timestamp, not by missing heartbeats, which reduces network chatter.
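A pull-based worker loop in the spirit of Conductor's model might be sketched as follows (the `poll`, `execute`, and `report` callables are assumptions, not Conductor's actual client API):

```python
import time

def worker_loop(poll, execute, report, should_stop, idle_sleep=1.0):
    """Pull-based worker: ask for work instead of having it pushed.
    Backpressure falls out naturally: a busy worker simply isn't polling."""
    while not should_stop():
        task = poll()              # returns None when the queue is empty
        if task is None:           # (Conductor answers an empty poll with HTTP 204)
            time.sleep(idle_sleep)
            continue
        result = execute(task)
        report(task, result)       # completion timestamp lands in the state store
```

Because the worker only reports completions, the supervisor detects stuck tasks by comparing poll and completion timestamps against expected durations, with no heartbeat traffic at all.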
Netflix Conductor Pull-Based Agent Model
graph LR
subgraph Conductor Control Plane
API["Conductor API<br/><i>Workflow Management</i>"]
Supervisor["Supervisor<br/><i>Monitors by timestamp</i>"]
StateDB[("Workflow State<br/><i>Elasticsearch</i>")]
QueueMgr["Queue Manager<br/><i>Task queues per type</i>"]
end
subgraph Worker Pools - Pull Model
subgraph Encoding Workers
E1["Encoder 1"]
E2["Encoder 2"]
E3["Encoder 3"]
end
subgraph Thumbnail Workers
T1["Thumbnail 1"]
T2["Thumbnail 2"]
end
subgraph Catalog Workers
C1["Catalog 1"]
end
end
subgraph External Services
S3["S3<br/><i>Video Storage</i>"]
CatalogSvc["Content Catalog"]
CDN["CDN Invalidation"]
end
API --"1. Submit workflow"--> StateDB
API --"2. Create tasks"--> QueueMgr
E1 & E2 & E3 --"3. Poll /tasks/encode"--> QueueMgr
T1 & T2 --"4. Poll /tasks/thumbnail"--> QueueMgr
C1 --"5. Poll /tasks/catalog"--> QueueMgr
QueueMgr --"6. Return task or 204"--> E1 & E2 & E3 & T1 & T2 & C1
E1 --"7. Process video"--> S3
T1 --"8. Generate thumbnails"--> S3
C1 --"9. Update metadata"--> CatalogSvc
E1 & T1 & C1 --"10. POST /tasks/{id}/complete"--> StateDB
Supervisor --"11. Check task timestamps<br/>(no heartbeat needed)"--> StateDB
Supervisor --"12. Requeue stuck tasks"--> QueueMgr
Conductor uses pull-based polling where workers request tasks rather than receiving pushed messages. This naturally handles backpressure—overloaded workers stop polling, and tasks queue up. The supervisor detects stuck tasks by timestamp comparison rather than missing heartbeats, reducing network overhead while maintaining failure detection.
Interview Essentials
Mid-Level
Explain the three roles (scheduler, agent, supervisor) and how they interact. Describe how the supervisor detects failures (timeouts, missing heartbeats) and basic recovery strategies (retry with backoff). Discuss why operations must be idempotent. Walk through a simple example like order processing with 3-4 steps, showing what happens when one step fails. Understand the difference between this pattern and simple retry logic—the key is centralized failure detection and recovery across multiple steps.
Senior
Design a supervisor’s failure detection algorithm, including how to distinguish between slow operations and actual failures (hint: heartbeats + expected duration + percentile latencies). Explain compensation strategies and when to use each (retry vs. compensate vs. escalate). Discuss state management: what needs to be persisted, consistency requirements, and how to handle supervisor crashes. Compare this pattern to Saga pattern—both handle distributed transactions, but Saga focuses on compensating transactions while this pattern emphasizes supervision and recovery. Describe how to implement exactly-once semantics for task execution using idempotency keys and deduplication.
Staff+
Architect a distributed supervisor for 1M+ concurrent workflows, including sharding strategy, rebalancing during scaling, and handling cross-shard workflows. Design the observability stack: what metrics matter (workflow duration, task retry rates, compensation frequency), how to debug stuck workflows, and how to detect systemic issues (cascading failures, resource exhaustion). Discuss the trade-offs between pull-based and push-based agent models at scale. Explain how to evolve workflow definitions without breaking in-flight workflows (versioning, backward compatibility). Address the CAP theorem implications: the supervisor needs CP characteristics (consistent state) while agents can be AP (available, partition-tolerant). Describe failure modes: what happens when the state store is unavailable, when agents can’t reach the supervisor, or when the supervisor falls behind on monitoring.
Common Interview Questions
How does this pattern differ from Saga pattern? (Saga focuses on compensating transactions with choreography or orchestration; this pattern emphasizes supervision and failure detection with centralized coordination)
What happens if the supervisor crashes? (Another supervisor instance takes over by reading persisted workflow state; in-flight tasks may be retried if they don’t report completion)
How do you prevent duplicate task execution? (Idempotency keys, deduplication windows, and agents checking if work was already completed before starting)
When would you use this instead of a message queue with retries? (When you need cross-step coordination, compensation logic, or visibility into multi-step workflow progress)
How do you handle long-running tasks (hours/days)? (Periodic heartbeats, checkpoint/resume mechanisms, and supervisor persistence to survive restarts)
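The duplicate-execution answer above can be sketched with an idempotency-key registry (in-memory here; a real deployment would use something like Redis SETNX or a unique database constraint):

```python
class DedupStore:
    """In-memory idempotency-key registry: runs each keyed operation at most
    once and replays the stored result on duplicate delivery."""
    def __init__(self):
        self._seen = {}

    def run_once(self, key, fn):
        if key in self._seen:
            return self._seen[key]   # duplicate: return prior result, no re-execution
        result = fn()
        self._seen[key] = result
        return result
```

A natural key is `workflow_id:task_id`, which makes retries of the same task safe while still allowing the same task type to run in different workflows.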
Red Flags to Avoid
Claiming this pattern provides ACID transactions (it provides eventual consistency with failure handling, not atomicity)
Not discussing idempotency requirements for agents—retries will cause duplicate side effects
Ignoring the operational complexity of running a supervisor (monitoring, scaling, failure handling)
Suggesting synchronous communication between supervisor and agents (use async messaging for decoupling)
Not considering what happens when the supervisor falls behind or the state store becomes unavailable
Treating all failures the same way (transient vs. permanent failures require different recovery strategies)
Key Takeaways
The Scheduling Agent Supervisor pattern coordinates distributed operations through separation of concerns: scheduler dispatches work, agents execute tasks, supervisor monitors progress and orchestrates recovery.
Failure detection relies on timeouts, heartbeats, and persistent state tracking. Recovery strategies (retry, compensate, escalate) depend on failure mode and workflow semantics.
Operations must be idempotent to safely handle retries. Compensation actions (inverse operations) enable rollback when forward progress isn’t possible.
The pattern trades immediate consistency for eventual consistency with strong failure handling. It’s more complex than simple retries but less complex than distributed transactions.
At scale, the supervisor can become a bottleneck. Distributed supervision with sharding addresses this but adds significant operational complexity—start centralized and scale only when measured as necessary.