A Practical Introduction to Circuit Breakers and Fallback Design in FastAPI: Real-World Patterns for Preventing External API Failures from Becoming System-Wide Failures
Summary
- A circuit breaker is a design pattern that temporarily stops calls to an external dependency after repeated failures, in order to prevent your entire API from becoming slow or failing as it gets dragged down by that dependency. Martin Fowler describes the basic form as opening the breaker after a certain number of failures, after which subsequent calls fail immediately instead of being attempted.
- FastAPI is strong at asynchronous I/O and works very well with external API integrations, but if a slow dependency drags on, the event loop and worker processes can suffer across the whole application. That is why it is important to think about timeouts, retries, connection pooling, and fault isolation together.
- HTTPX provides features such as shared AsyncClient instances, Timeout, Limits, and transport-level support for connection retries, which makes it a good foundation for external API client design. Timeouts are split into connect, read, write, and pool, and connection counts can also be controlled explicitly.
- In Python, libraries such as PyBreaker exist and can be used to implement the Circuit Breaker pattern. PyBreaker describes itself as a “Python implementation of the Circuit Breaker pattern.”
- A fallback is the idea that, instead of simply failing when an outage occurs, you return cached data, degrade functionality, or hand the work to an alternative path and retry it later. In practice, combining a circuit breaker with fallback behavior and monitoring tends to produce more stable systems for both user experience and operations than using a circuit breaker alone.
Who will benefit from reading this
Individual developers and learners
This is for people who have started using one or two external APIs and are struggling with situations like “sometimes it is slow” or “sometimes it fails.”
It is especially useful if you are calling httpx.get() or AsyncClient.get() directly inside FastAPI and have experienced cases where your own API becomes slow just because the external service is unhealthy. FastAPI is well suited for async code and makes it easy to design for external I/O, but unless you add mechanisms to absorb delays from external dependencies, that strength is hard to fully use.
Backend engineers in small teams
This is for teams calling multiple external services such as shipping, payments, notifications, or authentication platforms from FastAPI.
If you keep running into questions and complaints like “How far should we retry?”, “When an external service is down, all we can do is return 500s,” or “Timeouts and connection limits differ depending on who wrote the code,” this article will help you organize those issues from the perspective of circuit breakers and fallbacks. HTTPX provides AsyncClient, timeout controls, connection limit controls, and transport configuration, making it a natural fit for a shared client layer.
SaaS teams and startups
This is for teams where trouble with external APIs directly affects your product’s SLOs and support ticket volume.
If you are at the stage where you want to say, “Even if one external API is unhealthy, we do not want it to become a system-wide outage,” or “We want to survive by returning cached results or degrading gracefully,” or “We also want proper observability during failures,” then circuit breakers and fallbacks are worth treating as core infrastructure, on the same level as authorization, auditing, and job queues. Circuit Breaker is widely known as a representative pattern for isolating failure, and it is also frequently discussed as an important pattern in microservice architectures.
Accessibility notes
- The article starts with a summary, then proceeds step by step through “why this is necessary,” “how to design it,” and “how to implement it.”
- Technical terms are explained briefly on first use, and the same terminology is used consistently afterward to make the flow easier to follow.
- Code examples are split into short sections so that each block shows just one responsibility.
- Each chapter is written so it can be read independently, with the needed context provided where relevant.
- The target level is roughly equivalent to WCAG AA readability expectations.
1. What is a circuit breaker?
A circuit breaker is a mechanism that temporarily stops calls to an external dependency when failures continue to occur.
Martin Fowler explains the basic form as wrapping a protected function call with a breaker object, and when the number of failures reaches a threshold, the breaker opens the circuit and stops making further calls, returning errors immediately instead. In practice, this is usually considered together with monitoring and alerting.
The electrical analogy makes this easy to understand.
If something is close to short-circuiting, you do not keep letting current flow until everything burns out. You cut it off once to prevent the damage from spreading. The same applies to applications. If an external API is taking dozens of seconds to respond, returning 5xx continuously, or causing connections to pile up, then if you keep calling it, your entire FastAPI application can get dragged down. The Circuit Breaker pattern exists specifically to stop that chain reaction.
2. Why this matters especially in FastAPI
FastAPI is built around async def and asynchronous I/O, so it is very well suited to I/O-bound work such as external APIs and databases.
At the same time, if an external API becomes very slow or starts failing repeatedly, the number of tasks waiting on the event loop increases, and this can affect connection pools and the perceived performance of the entire application. FastAPI’s documentation explains that async processing is especially useful for I/O-bound workloads, and external API calls are exactly that kind of work.
FastAPI’s lifespan feature is also well suited for managing resources at application startup and shutdown, which makes it a natural place to create and clean up shared HTTP clients. Its dependency injection system also makes it easy to assemble shared external API clients as common dependencies. In other words, FastAPI already provides the foundation needed to place circuit breakers and fallbacks cleanly.
3. The overall picture to understand first: timeout, retry, breaker, fallback
A circuit breaker alone is not enough to make your system resilient to external API failures.
In practice, it becomes much easier to reason about things if you think in terms of these four pieces together.
- Timeout
- Decide how long you are willing to wait in the first place
- Retry
- Retry briefly if the failure seems temporary
- Circuit breaker
- Temporarily stop calling a dependency that keeps failing
- Fallback
- Decide what to return when you cannot call it
HTTPX gives you timeouts, connection controls, and transport-level support for connection retries. Tenacity is a good fit for retry design with exponential backoff. PyBreaker can be used for the Circuit Breaker pattern itself. So the tooling around FastAPI is already quite mature.
4. Start with the foundation: keep a shared AsyncClient in lifespan
Before you even get to circuit breakers, it is easier to extend your design later if you first build a shared external API client foundation.
HTTPX recommends using AsyncClient in asynchronous environments and reusing the client instance. It also explicitly advises against creating clients repeatedly inside hot loops.
from contextlib import asynccontextmanager

import httpx
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # One shared client for the whole application, so connections are pooled and reused.
    timeout = httpx.Timeout(connect=2.0, read=5.0, write=5.0, pool=1.0)
    limits = httpx.Limits(
        max_keepalive_connections=20,
        max_connections=100,
        keepalive_expiry=5.0,
    )
    app.state.http_client = httpx.AsyncClient(
        timeout=timeout,
        limits=limits,
    )
    try:
        yield
    finally:
        # Release pooled connections on shutdown.
        await app.state.http_client.aclose()

app = FastAPI(lifespan=lifespan)
The Timeout and Limits used here are both official HTTPX features. Timeouts are divided into connect, read, write, and pool, and connection limits can be adjusted with max_connections and related settings.
5. Inject the shared client through a dependency
Using FastAPI dependencies, you can naturally pass the shared AsyncClient into each client class.
FastAPI describes its dependency system as powerful and intuitive, and it works very well for reusable components and testability.
import httpx
from fastapi import Request

def get_http_client(request: Request) -> httpx.AsyncClient:
    return request.app.state.http_client
On top of that, you can create a class for a specific external service.
import httpx

class BillingClient:
    def __init__(self, client: httpx.AsyncClient, base_url: str, api_key: str):
        self.client = client
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key
Once you structure it this way, you can keep the circuit breaker and fallback logic inside the client layer.
That prevents HTTPX-specific details from being scattered across routers and service layers, and it makes later improvements much easier.
6. Explicitly define timeouts: the minimum defense before a breaker
HTTPX enables timeouts by default (5 seconds of network inactivity), but in practice it is safer to define them explicitly.
The official documentation explains the four kinds of timeouts: connect, read, write, and pool. For example, pool timeout means how long to wait for an available connection from the pool.
import httpx

DEFAULT_TIMEOUT = httpx.Timeout(
    connect=1.5,
    read=3.0,
    write=3.0,
    pool=0.5,
)
What matters is recognizing that different external APIs deserve different waiting times.
A fast internal service might justify a much shorter timeout, while a generative AI API or heavy reporting API might need a bit longer. But a design that waits indefinitely, or simply “for a long time,” is dangerous even before you consider circuit breakers. If you keep waiting on a slow dependency, your own application becomes slow too.
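One way to make "different APIs deserve different waiting times" concrete is a small per-dependency table. The sketch below is only an illustration: `TimeoutPolicy` is a helper invented for this example (it is not part of HTTPX), and the service names and numbers are hypothetical. Its four fields mirror the HTTPX timeout kinds, so `httpx.Timeout(**policy.as_kwargs())` should accept them.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TimeoutPolicy:
    """Hypothetical per-dependency timeout budget, in seconds."""
    connect: float
    read: float
    write: float
    pool: float

    def as_kwargs(self) -> dict[str, float]:
        # Keys match the keyword arguments of httpx.Timeout.
        return asdict(self)

    def worst_case(self) -> float:
        # Rough estimate for one attempt: wait for a pooled connection,
        # connect, send the request, then wait for the response.
        # Read and write timeouts apply per chunk, so a slow streamed
        # body can take longer than this figure.
        return self.pool + self.connect + self.write + self.read


# Illustrative values only; tune them from real latency measurements.
TIMEOUT_POLICIES = {
    "internal-catalog": TimeoutPolicy(connect=0.5, read=1.0, write=1.0, pool=0.2),
    "billing": TimeoutPolicy(connect=1.5, read=3.0, write=3.0, pool=0.5),
    "report-generator": TimeoutPolicy(connect=1.5, read=15.0, write=5.0, pool=0.5),
}

for name, policy in TIMEOUT_POLICIES.items():
    print(f"{name}: roughly {policy.worst_case():.1f}s worst case per attempt")
```

Writing the budgets down in one place also makes it easier to see how much latency a single slow dependency can add to one of your own endpoints.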
7. Convert HTTPX exceptions into your own exception types
HTTPX has a well-structured exception hierarchy including RequestError, HTTPStatusError, and TimeoutException.
Instead of letting those bubble upward as-is, it is better to convert them into your own application exceptions so you can handle them consistently. That also makes it easier to define circuit breaker and fallback conditions. HTTPX’s quickstart and exception docs show how to use raise_for_status() and handle those exceptions.
import httpx

class ExternalAPIError(Exception):
    pass

class ExternalAPITimeoutError(ExternalAPIError):
    pass

class ExternalAPIUnavailableError(ExternalAPIError):
    pass

class ExternalAPIBadResponseError(ExternalAPIError):
    pass

async def safe_request(client: httpx.AsyncClient, method: str, url: str, **kwargs) -> httpx.Response:
    try:
        response = await client.request(method, url, **kwargs)
        response.raise_for_status()
        return response
    # TimeoutException is a subclass of RequestError, so it must be caught first.
    except httpx.TimeoutException as exc:
        raise ExternalAPITimeoutError(str(exc)) from exc
    except httpx.HTTPStatusError as exc:
        raise ExternalAPIBadResponseError(str(exc)) from exc
    except httpx.RequestError as exc:
        raise ExternalAPIUnavailableError(str(exc)) from exc
Once your exceptions are organized at this level, it becomes much easier to express decisions like “open the breaker after this many timeouts” or “do not open the breaker on 4xx errors.”
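Note that `safe_request` maps every `HTTPStatusError` to the same exception, so a 400 and a 503 look identical to the breaker. One possible refinement is a small classifier for status codes. The sketch below is an assumption about policy rather than a fixed rule: the names `FailureKind`, `classify_status`, and `counts_toward_breaker` are invented for this example, and treating 408 and 429 as dependency-side trouble is a judgment call you may decide differently.

```python
from enum import Enum


class FailureKind(Enum):
    OK = "ok"
    CALLER_ERROR = "caller_error"          # our request was wrong; the dependency is fine
    DEPENDENCY_ERROR = "dependency_error"  # the dependency looks unhealthy


# Assumption: these 4xx codes point at the dependency (request timeout,
# rate limiting) rather than at a mistake in the caller's input.
DEPENDENCY_SIDE_4XX = {408, 429}


def classify_status(status_code: int) -> FailureKind:
    """Decide what an HTTP status code says about the dependency's health."""
    if status_code >= 500 or status_code in DEPENDENCY_SIDE_4XX:
        return FailureKind.DEPENDENCY_ERROR
    if status_code >= 400:
        return FailureKind.CALLER_ERROR
    return FailureKind.OK


def counts_toward_breaker(status_code: int) -> bool:
    return classify_status(status_code) is FailureKind.DEPENDENCY_ERROR


for code in (200, 400, 404, 429, 503):
    print(code, classify_status(code).value)
```

Inside the `HTTPStatusError` handler, the status code is available as `exc.response.status_code`, so the handler could raise `ExternalAPIUnavailableError` for dependency-side codes and `ExternalAPIBadResponseError` for the rest.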
8. Understand the breaker states: Closed / Open / Half-Open
A circuit breaker is easiest to reason about if you think in terms of these three states.
This is the common basic form used by many implementations and explanations, and it aligns with Martin Fowler’s description.
- Closed
- Normal state. Calls to the external API are allowed
- Open
- Failures have continued, so the breaker is now blocking calls. Requests fail immediately
- Half-Open
- A small number of trial calls are allowed to check whether the dependency has recovered
Because of these three states, you avoid both extremes:
not “stop forever because it failed once,”
and not “wait for the full timeout every single time forever.”
Instead, you get a balanced recovery behavior.
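The three states and the moves between them can be written down as a tiny transition table. This is a conceptual sketch only: the `State` and `Event` names are invented for illustration, and a real breaker would derive the events from failure counts and timers.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class Event(Enum):
    SUCCESS = "success"
    FAILURE_THRESHOLD_REACHED = "failure_threshold_reached"
    RESET_TIMEOUT_ELAPSED = "reset_timeout_elapsed"
    TRIAL_FAILED = "trial_failed"


# (current state, event) -> next state.
# Pairs that are not listed leave the state unchanged.
TRANSITIONS = {
    (State.CLOSED, Event.FAILURE_THRESHOLD_REACHED): State.OPEN,
    (State.OPEN, Event.RESET_TIMEOUT_ELAPSED): State.HALF_OPEN,
    (State.HALF_OPEN, Event.SUCCESS): State.CLOSED,
    (State.HALF_OPEN, Event.TRIAL_FAILED): State.OPEN,
}


def next_state(state: State, event: Event) -> State:
    return TRANSITIONS.get((state, event), state)


# Walk through one outage-and-recovery cycle.
state = State.CLOSED
for event in (
    Event.FAILURE_THRESHOLD_REACHED,  # closed -> open
    Event.RESET_TIMEOUT_ELAPSED,      # open -> half_open
    Event.TRIAL_FAILED,               # half_open -> open
    Event.RESET_TIMEOUT_ELAPSED,      # open -> half_open
    Event.SUCCESS,                    # half_open -> closed
):
    state = next_state(state, event)
    print(f"{event.value} -> {state.value}")
```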
9. A minimal implementation idea using PyBreaker
PyBreaker describes itself as a “Python implementation of the Circuit Breaker pattern.”
It also supports Redis-backed state storage, which is useful when several processes need to share one breaker state.
Conceptually, you wrap risky calls with a breaker like this:
import pybreaker

billing_breaker = pybreaker.CircuitBreaker(
    fail_max=5,
    reset_timeout=30,
)
However, it is important to note that PyBreaker is most commonly used in synchronous call contexts.
If you want to use it directly in FastAPI with asynchronous HTTPX calls, you need to think a little more carefully about the surrounding implementation. In practice, one of these two approaches tends to work best.
- Manage only the breaker state while wrapping async calls through a compatible abstraction
- Start with a small custom “simple breaker” implementation that is easy to swap out later
In this article, to make the idea easier to understand in FastAPI, I will show the second approach first.
10. Designing a simple circuit breaker for FastAPI
If you want to start small, even a minimal self-built breaker can be very useful.
The core idea is this:
- Count failures
- Once the failure count reaches the threshold, move to Open for a fixed time
- While Open, fail immediately
- After the Open period expires, allow a trial request
- If the trial succeeds, return to Closed
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class SimpleCircuitBreaker:
    fail_max: int
    reset_timeout_sec: int
    failure_count: int = 0
    opened_at: datetime | None = None

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        now = datetime.now(timezone.utc)
        # Open only until the reset timeout has elapsed; after that, trial requests pass.
        return now < self.opened_at + timedelta(seconds=self.reset_timeout_sec)

    def allow_request(self) -> bool:
        return not self.is_open()

    def record_success(self) -> None:
        # A success (including a successful trial) returns the breaker to Closed.
        self.failure_count = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failure_count += 1
        if self.failure_count >= self.fail_max:
            # Open the breaker, or re-open it and restart the timer after a failed trial.
            self.opened_at = datetime.now(timezone.utc)
This is not a complete production-ready implementation, but it is more than enough to understand the basic behavior of a circuit breaker.
Simply no longer hammering a dependency that keeps failing goes a long way toward preventing system-wide outages.
11. Put the breaker into the client layer
Next, use the simple breaker inside your external API client.
class CircuitOpenError(Exception):
    pass

class ShippingClient:
    def __init__(self, client, base_url: str, api_key: str, breaker: SimpleCircuitBreaker):
        self.client = client
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key
        self.breaker = breaker

    async def get_quote(self, payload: dict) -> dict:
        # Fail fast while the breaker is Open, without touching the network.
        if not self.breaker.allow_request():
            raise CircuitOpenError("shipping circuit is open")
        try:
            response = await safe_request(
                self.client,
                "POST",
                f"{self.base_url}/quotes",
                json=payload,
                headers={"Authorization": f"Bearer {self.api_key}"},
            )
            self.breaker.record_success()
            return response.json()
        # Only timeouts and connection-level failures count toward the breaker.
        except (ExternalAPITimeoutError, ExternalAPIUnavailableError):
            self.breaker.record_failure()
            raise
Once you do this, decisions like
“Which exceptions count as failures?”
or “Should HTTP 4xx open the breaker?”
can stay inside the client layer.
For example, if a 400 is caused by a user input mistake, that is not a dependency outage.
So in many cases, treating every HTTPStatusError as a breaker failure would not be the right design.
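If several client methods need the same check-call-record sequence, it can be factored into one helper. The following is a minimal sketch, not a definitive implementation: `call_with_breaker`, the `Breaker` protocol, and `CountingBreaker` are names invented here, and the built-in `TimeoutError` stands in for the article's `ExternalAPITimeoutError` to keep the example self-contained.

```python
import asyncio
from typing import Awaitable, Callable, Protocol, TypeVar

T = TypeVar("T")


class Breaker(Protocol):
    def allow_request(self) -> bool: ...
    def record_success(self) -> None: ...
    def record_failure(self) -> None: ...


class CircuitOpenError(Exception):
    pass


async def call_with_breaker(
    breaker: Breaker,
    func: Callable[[], Awaitable[T]],
    failure_types: tuple[type[BaseException], ...],
) -> T:
    """Run func under the breaker; only failure_types count as failures."""
    if not breaker.allow_request():
        raise CircuitOpenError("circuit is open")
    try:
        result = await func()
    except failure_types:
        breaker.record_failure()
        raise
    # Any other exception propagates without touching the failure count.
    breaker.record_success()
    return result


class CountingBreaker:
    """Tiny stand-in for SimpleCircuitBreaker, for demonstration only."""

    def __init__(self, fail_max: int) -> None:
        self.fail_max = fail_max
        self.failures = 0

    def allow_request(self) -> bool:
        return self.failures < self.fail_max

    def record_success(self) -> None:
        self.failures = 0

    def record_failure(self) -> None:
        self.failures += 1


async def flaky() -> str:
    raise TimeoutError("simulated timeout")


async def demo() -> list[str]:
    breaker = CountingBreaker(fail_max=2)
    outcomes = []
    for _ in range(3):
        try:
            await call_with_breaker(breaker, flaky, (TimeoutError,))
        except CircuitOpenError:
            outcomes.append("rejected")
        except TimeoutError:
            outcomes.append("timeout")
    return outcomes


outcomes = asyncio.run(demo())
print(outcomes)  # ['timeout', 'timeout', 'rejected']
```

Passing the counted exception types as an argument keeps the "which errors open the breaker" decision visible at each call site while the mechanics stay in one place.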
12. What is fallback? Decide what to do when you cannot call the dependency
A circuit breaker is a defensive mechanism that says “do not call it,” but fallback is the idea of deciding how your system should behave when it cannot call it.
Typical fallback options include:
- Return the last successful cached result
- Return a degraded response
- Omit some information but still keep the screen or endpoint usable
- Clearly say “currently unavailable” while offering a way to try again later
- Stop doing the work synchronously and switch to background job submission
The Circuit Breaker pattern as described by Martin Fowler also assumes it will be combined with monitoring and operational behaviors, not just raw blocking logic. In microservice contexts it is often discussed together with things like timeouts and bulkheads, and fallback is the practical expression of that idea at the application level.
13. Fallback implementation example 1: return a cached response
One of the most practical forms of fallback is to return the most recent successful result for a limited time.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class CacheEntry:
    value: dict
    expires_at: datetime

class SimpleResponseCache:
    def __init__(self):
        self.data: dict[str, CacheEntry] = {}

    def get(self, key: str) -> dict | None:
        entry = self.data.get(key)
        if not entry:
            return None
        if datetime.now(timezone.utc) > entry.expires_at:
            return None
        return entry.value

    def set(self, key: str, value: dict, ttl_sec: int) -> None:
        self.data[key] = CacheEntry(
            value=value,
            expires_at=datetime.now(timezone.utc) + timedelta(seconds=ttl_sec),
        )
class ShippingClient:
    def __init__(self, client, base_url: str, api_key: str, breaker: SimpleCircuitBreaker, cache: SimpleResponseCache):
        self.client = client
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key
        self.breaker = breaker
        self.cache = cache

    async def get_quote_with_fallback(self, payload: dict) -> dict:
        cache_key = f"quote:{payload.get('zip')}:{payload.get('weight')}"
        # Breaker is Open: serve a recent cached result if one exists, otherwise fail fast.
        if not self.breaker.allow_request():
            cached = self.cache.get(cache_key)
            if cached is not None:
                return {"source": "cache", "data": cached}
            raise CircuitOpenError("shipping circuit is open")
        try:
            response = await safe_request(
                self.client,
                "POST",
                f"{self.base_url}/quotes",
                json=payload,
                headers={"Authorization": f"Bearer {self.api_key}"},
            )
            data = response.json()
            # Remember the last good answer for 60 seconds.
            self.cache.set(cache_key, data, ttl_sec=60)
            self.breaker.record_success()
            return {"source": "live", "data": data}
        except (ExternalAPITimeoutError, ExternalAPIUnavailableError):
            self.breaker.record_failure()
            # The live call failed: fall back to the cache before giving up.
            cached = self.cache.get(cache_key)
            if cached is not None:
                return {"source": "cache", "data": cached}
            raise
With this design, even if the shipping quote API is temporarily unhealthy, the screen can still remain usable if a recent successful result exists.
That said, you always need to decide carefully whether returning slightly stale data is acceptable in that specific use case.
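One detail worth checking in a cache fallback is the cache key. The example above keys on `zip` and `weight` only, so two requests that differ in some other field would share a cached answer. A possible alternative, sketched here with a hypothetical `make_cache_key` helper and made-up payload fields, is to derive the key from the whole payload.

```python
import hashlib
import json


def make_cache_key(prefix: str, payload: dict) -> str:
    """Build a deterministic cache key from every field in the payload."""
    # sort_keys makes the JSON text independent of dict insertion order;
    # compact separators avoid whitespace differences.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}:{digest[:32]}"


# Same fields in a different order produce the same key.
a = make_cache_key("quote", {"zip": "94105", "weight": 2.5, "service": "express"})
b = make_cache_key("quote", {"service": "express", "weight": 2.5, "zip": "94105"})
# A different field value produces a different key.
c = make_cache_key("quote", {"zip": "94105", "weight": 2.5, "service": "ground"})

print(a == b, a == c)
```

This assumes the payload is JSON-serializable. If some fields do not affect the response (a request ID, for example), it is probably better to drop them before building the key so they do not defeat the cache.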
14. Fallback implementation example 2: stop doing it synchronously and move to a job
Some types of external APIs are not a good fit for cache-based fallback.
For example, “report generation,” “heavy external aggregation,” or “large file conversion” may not complete quickly enough for synchronous request-response flows.
In those cases, a fallback can be to give up on synchronous completion and switch to an asynchronous job model.
- During normal operation
- Call the external API during the request and return the result
- During outages
- Return “accepted for processing” and retry later through a job
This is often kinder from a UX perspective than total failure, and it also protects your API better when the dependency is unhealthy.
FastAPI includes BackgroundTasks, which can be used for small post-response tasks. For heavier work, a dedicated job queue is a better fit.
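The "accepted for processing" fallback can be sketched with the standard library alone. Everything named here is hypothetical: `generate_report_or_enqueue`, the `generate` callable, and the in-process `asyncio.Queue` that stands in for a real job queue. The built-in `TimeoutError` stands in for the article's timeout exception.

```python
import asyncio
import uuid
from typing import Awaitable, Callable


class CircuitOpenError(Exception):
    pass


async def generate_report_or_enqueue(
    generate: Callable[[dict], Awaitable[dict]],  # hypothetical call to the external API
    queue: asyncio.Queue,
    params: dict,
) -> dict:
    """Return the live result, or accept the work for later if the call is blocked."""
    try:
        result = await generate(params)
        return {"status": "completed", "result": result}
    except (CircuitOpenError, TimeoutError):
        job_id = str(uuid.uuid4())
        # Hand the work to a background consumer instead of failing the request.
        await queue.put({"job_id": job_id, "params": params})
        return {"status": "accepted", "job_id": job_id}


async def demo() -> tuple[dict, dict]:
    queue: asyncio.Queue = asyncio.Queue()

    async def blocked(params: dict) -> dict:
        raise CircuitOpenError("report circuit is open")

    response = await generate_report_or_enqueue(blocked, queue, {"month": "2024-01"})
    queued = queue.get_nowait()
    return response, queued


response, queued = asyncio.run(demo())
print(response["status"], response["job_id"] == queued["job_id"])
```

In an endpoint, the "accepted" branch would typically map to an HTTP 202 response that includes the job ID. An in-memory queue loses its contents on restart, so anything that must not be lost belongs in a durable job queue.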
15. The relationship with retries: adding a breaker does not make retries unnecessary
Adding a circuit breaker does not mean retries are no longer needed.
In practice, their roles are different.
- Retry
- Absorb small temporary glitches
- Circuit breaker
- Stop calling a dependency that is continuously failing
- Fallback
- Decide what to return when you cannot call it
HTTPX transport-level retries can be used for ConnectError and ConnectTimeout. For broader retry behavior or exponential backoff, Tenacity is usually more suitable.
A practical mental model is something like this:
- First, let HTTPX do a single connection retry
- If that still fails, do limited retry with Tenacity
- If failures continue, open the breaker
- While the breaker is open, route requests to a fallback
This lets you build resilience gradually, without jumping immediately to something extreme.
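To show what "limited retry" means in code, here is a hand-rolled sketch of retry with exponential backoff and jitter. It is a stand-in written with the standard library, not Tenacity's API; `retry_with_backoff` and its parameters are invented for this example, and the `sleep` argument exists only so the demo can run without real delays.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(
    func: Callable[[], Awaitable[T]],
    retry_on: tuple[type[BaseException], ...],
    max_attempts: int = 3,
    base_delay: float = 0.2,
    max_delay: float = 2.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Call func up to max_attempts times, backing off between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return await func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts; the caller (or breaker) sees the failure
            # The delay ceiling doubles each attempt: 0.2s, 0.4s, ... up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter: pick a random delay below the ceiling to avoid synchronized retries.
            await sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")


async def demo() -> tuple[str, int, list[float]]:
    calls = 0
    delays: list[float] = []

    async def flaky() -> str:
        nonlocal calls
        calls += 1
        if calls < 3:
            raise ConnectionError("simulated glitch")
        return "ok"

    async def fake_sleep(delay: float) -> None:
        delays.append(delay)  # record the delay instead of actually sleeping

    result = await retry_with_backoff(flaky, (ConnectionError,), sleep=fake_sleep)
    return result, calls, delays


result, calls, delays = asyncio.run(demo())
print(result, calls, len(delays))
```

One design choice to consider is letting the breaker count a single failure per logical call, after retries are exhausted, rather than one per attempt. Otherwise a short glitch can open the breaker faster than intended. Retries are also safest for requests that are idempotent.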
16. Monitoring and metrics: a breaker is dangerous if nobody can see when it opens
Circuit breakers are useful, but they become dangerous if they open and nobody notices.
At minimum, it is good to make these kinds of metrics visible:
- Success rate per external API
- Timeout rate
- Retry count
- Number of times the breaker opens
- Number of times fallback is used
- Cache return rate
Martin Fowler’s description also notes that when a breaker opens, you typically want monitoring and alerting. In other words, a circuit breaker is not something you “just add and forget.” It belongs together with observability.
On the FastAPI side, if you are already using structured logs and metrics, leaving fields like circuit_state="open" or fallback="cache" makes later diagnosis much easier.
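As a rough illustration of the counters listed above, the sketch below keeps per-dependency counts in memory. `DependencyMetrics` and the event names are hypothetical; in a real service these counters would more likely live in whatever metrics system you already use.

```python
from collections import Counter


class DependencyMetrics:
    """In-memory counters keyed by (dependency, event). Illustration only."""

    def __init__(self) -> None:
        self.counts: Counter[tuple[str, str]] = Counter()

    def incr(self, dependency: str, event: str) -> None:
        self.counts[(dependency, event)] += 1

    def rate(self, dependency: str, event: str, total_event: str = "request") -> float:
        # Share of requests that ended in the given event; 0.0 if nothing was recorded.
        total = self.counts[(dependency, total_event)]
        return self.counts[(dependency, event)] / total if total else 0.0


metrics = DependencyMetrics()
for outcome in ("success", "success", "timeout", "fallback_cache"):
    metrics.incr("shipping", "request")
    metrics.incr("shipping", outcome)
metrics.incr("shipping", "breaker_opened")

print(f"timeout rate: {metrics.rate('shipping', 'timeout'):.0%}")
print(f"breaker opened: {metrics.counts[('shipping', 'breaker_opened')]} time(s)")
```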
17. Logging design: record breaker state transitions
Your logs should capture not only external API errors themselves, but also changes in breaker state.
import logging

logger = logging.getLogger("circuit_breaker")

def log_breaker_open(name: str) -> None:
    logger.warning("circuit opened", extra={"circuit": name})

def log_breaker_closed(name: str) -> None:
    logger.info("circuit closed", extra={"circuit": name})
If you can see when the breaker opened, when it recovered from Half-Open, and when a response was served via fallback,
it becomes much easier to answer questions like,
“Why did the user receive a degraded response here?”
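Keep in mind that values passed through `extra` become attributes on the log record, and the default formatter does not print them. A small formatter can surface them. The sketch below is one minimal way to do that; `ExtraJsonFormatter` and the field names `circuit` and `fallback` are assumptions made for this example.

```python
import io
import json
import logging


class ExtraJsonFormatter(logging.Formatter):
    """Render the message plus selected `extra` fields as one JSON line."""

    FIELDS = ("circuit", "fallback")  # hypothetical field names

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        for name in self.FIELDS:
            # Fields passed via `extra` appear as attributes on the record.
            if hasattr(record, name):
                payload[name] = getattr(record, name)
        return json.dumps(payload)


# Write to an in-memory stream so the demo output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(ExtraJsonFormatter())

demo_logger = logging.getLogger("circuit_breaker_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo output out of the root logger

demo_logger.warning("circuit opened", extra={"circuit": "shipping"})
demo_logger.info("served from fallback", extra={"circuit": "shipping", "fallback": "cache"})

lines = stream.getvalue().splitlines()
print("\n".join(lines))
```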
18. Testing strategy: the minimum cases worth protecting
Circuit breakers and fallbacks are fragile if you only test the happy path.
At minimum, these cases are worth testing:
- The breaker opens after repeated timeouts
- While Open, the external API is not actually called, and the system fails immediately or uses fallback
- If fallback cache exists, it is returned
- After recovery, a successful trial request returns the breaker to Closed
- User-caused errors such as 4xx do not open the breaker
Code close to policy logic, such as record_failure() or allow_request(), becomes much easier to test if you keep it in small classes or nearly pure functions.
19. Common failure patterns
19.1 Adding only a breaker, but no timeouts
If one failure takes too long to be recognized, the whole system can become slow before the breaker has any chance to help. It is safer to set up HTTPX timeouts first.
19.2 Counting both 4xx and 5xx as the same kind of failure
If user mistakes or authorization errors also count toward the breaker, it can open even when the dependency itself is healthy.
Usually it is more natural to count only errors that truly represent dependency instability.
19.3 Using careless fallbacks that return stale data too freely
Returning cached values is powerful, but not every kind of data can safely be stale.
For things like payment status or inventory, freshness matters much more.
19.4 Letting the breaker open without anyone noticing
Without alerts or metrics, you can end up in a state where the feature has been quietly degraded for a long time. Fowler also treats breaker monitoring as an important part of the pattern.
19.5 Different rules for every client
If your shipping API, payment API, and notification API each behave in completely different ways, team operations become painful.
Exceptions are fine, but your basic pattern should still be consistent.
20. A roadmap by reader type
Individual developers and learners
- First add a shared AsyncClient, timeouts, and connection limits
- Convert external API exceptions into your own exceptions
- Introduce a small simple breaker for just one client
- Try cache-based fallback in one place
- Leave the results in logs
Engineers in small teams
- Inventory each external API by importance and freshness requirements
- Build a shared client layer
- Align timeout, retry, and breaker rules across the team
- Turn breaker-open counts and fallback counts into metrics
- Revisit write operations with idempotency and job-queue options in mind
SaaS teams and startups
- Define per-dependency policies for “hard fail,” “degraded mode,” and “cache return”
- Centralize circuit breaker and fallback logic in the client layer
- Build audit logs, alerts, and dashboards
- Run outage drills or chaos-style tests to verify behavior when dependencies go down
- If needed, move toward Redis-backed shared breaker state or more production-grade implementations
Reference links
- FastAPI
- HTTPX
- Circuit Breaker
- Retry
Conclusion
- A circuit breaker is a very practical pattern for preventing your entire FastAPI application from being dragged down by repeatedly calling an external API that continues to fail. As Martin Fowler explains, opening the circuit after a failure threshold and stopping further calls helps prevent cascading damage.
- In FastAPI, a realistic approach is to build on top of shared AsyncClient, timeout settings, connection limits, and exception conversion, and then gradually layer in retries, breakers, and fallbacks. HTTPX already provides much of the groundwork for this.
- A fallback is not just “swallowing errors.” It is a design choice that protects both user experience and system stability through things like cached responses, degraded operation, or background-job handoff. In practice, circuit breakers are much stronger when designed together with fallbacks and monitoring.
- You do not need a perfect implementation from day one. Even if you start with just one external API client, adding timeouts, exception conversion, a simple breaker, and a cache fallback will make the design much more tangible and easier to understand.
A natural next article after this would be something like “Design Patterns for Internal Admin APIs in FastAPI” or “Job Queue Design and Graceful Degradation in FastAPI.”

