Practical Background Processing with FastAPI: A Job Queue Design Guide with BackgroundTasks and Celery
Summary (high-level overview first)
- For operations you don’t want to make users wait for (sending emails, image processing, report generation, etc.), the basic policy is to process them in the background, separately from the HTTP response.
- FastAPI’s built-in BackgroundTasks is suited to “running small tasks after sending the response, within the same process.”
- On the other hand, for heavy or long-running jobs, or when you need retries and queue management, combining FastAPI with the distributed task queue Celery gives you a more robust setup.
- In this article, we’ll integrate both BackgroundTasks and Celery into FastAPI, and sort out concretely which one to choose for which use case.
- Finally, we’ll summarize operational concerns and patterns around testing, monitoring, retries, scaling, etc., and provide a step-by-step rollout roadmap.
Who will benefit from this (concrete images of readers)
- Individual developers and learners
  “After a file upload I want to generate a thumbnail, but I want to return a response immediately.” “Every time I send an email, the response gets slow…”
  → You’ll learn how to implement simple background processing with BackgroundTasks, and understand up to what point it’s sufficient.
- Backend engineers in small teams
  You’ve accumulated time-consuming processes like PDF report generation and external API integrations, and things are getting painful mixed into the main API flow.
  → You’ll get a feel for how to move to a job queue + worker architecture by combining Celery with FastAPI.
- SaaS dev teams and startups
  You have tens of thousands of emails, notifications, or batch processes per day, and proper retries and monitoring are becoming essential.
  → You’ll grasp the basics of Celery’s architecture, monitoring (Flower, etc.), retry design, and scaling.
Accessibility assessment (readability and consideration)
- Information structure
  The article uses an “inverted pyramid” structure: first explain the concepts and differences in broad strokes, then dive into BackgroundTasks, then Celery, and finally present a comparison table and roadmap.
- Terminology and tone
  Technical terms are briefly explained when first introduced, and then used consistently. The tone is polite and relaxed, but not overly casual.
- Code blocks
  All code is shown in fixed-width blocks with concise comments. Extra blank lines are added so your eyes don’t get lost.
- Intended readers
  It assumes readers have touched Python or FastAPI a little, but sections are structured to be readable independently and progressively.
Overall, it’s written with accessibility roughly equivalent to AA level in mind, so a wide variety of readers can follow along comfortably.
1. Why do we need background processing?
First, let’s sort out why we even need background processing in the first place.
1.1 HTTP requests and the “waiting time” problem
Normally, a FastAPI endpoint follows this flow:
- Client sends a request
- Server processes it
- When processing finishes, it returns a response
If heavy processing is mixed into this flow…
- Users are forced to wait for several seconds
- The browser may hit a request timeout
- When concurrent access increases, workers get clogged up
and similar issues arise.
1.2 Tasks suited to background processing
Here are some typical examples of tasks that are much nicer if offloaded to the background:
- Sending emails, Slack notifications, push notifications
- Generating image thumbnails, creating PDFs, video transcoding
- Aggregating large CSV/Excel files and generating reports
- Calling external APIs multiple times (respecting rate limits while executing sequentially)
- Large-scale data import/export
In most cases, as long as users quickly receive a “we accepted your request” result, the rest can proceed slowly behind the scenes.
2. Using FastAPI’s built-in BackgroundTasks
Let’s start with BackgroundTasks, which FastAPI provides out of the box. This is “a mechanism to run small tasks in the same process after the response has been sent.”
2.1 The simplest example: email notification
```python
# app/main.py
from fastapi import FastAPI, BackgroundTasks

app = FastAPI(title="BackgroundTasks sample")


def send_email(to: str, subject: str, body: str) -> None:
    # In practice, you’d write the logic to connect to and send via a mail server
    print(f"[MAIL] to={to}, subject={subject}, body={body}")


@app.post("/users/{user_id}/welcome")
async def send_welcome(user_id: int, background_tasks: BackgroundTasks):
    # We’ll omit user lookup here
    email = f"user{user_id}@example.com"

    # Register the email to be sent after the response
    background_tasks.add_task(
        send_email,
        to=email,
        subject="Welcome!",
        body="Thank you for signing up.",
    )

    return {"status": "ok", "message": "Your registration has been accepted"}
```
In this endpoint:
- The HTTP response is returned immediately
- After that, send_email is executed within the same process
That’s the basic flow.
2.2 You can also register async functions
BackgroundTasks can register not only def functions but also async def functions.
```python
import httpx
from fastapi import BackgroundTasks, FastAPI

app = FastAPI()


async def notify_webhook(payload: dict) -> None:
    async with httpx.AsyncClient(timeout=5.0) as client:
        await client.post("https://example.com/webhook", json=payload)


@app.post("/events")
async def create_event(background_tasks: BackgroundTasks):
    # Assume some event has been saved
    background_tasks.add_task(notify_webhook, {"type": "created"})
    return {"status": "accepted"}
```
Even when registering async functions, the execution timing is still “after the response is sent,” but FastAPI/Starlette will handle them properly on the event loop.
2.3 Roughly understanding the mechanism
It’s helpful to think of BackgroundTasks simply as “a list of functions to run after the response, within the same app process.”
- Tasks do not run in other processes or on other machines
- There is no queue or retry mechanism
- If the worker process dies, so do the tasks
So it’s really suited for “lightweight tasks that might take a few seconds, which you’d like to run after the response.”
3. Design points and limits of BackgroundTasks
BackgroundTasks is convenient, but it has strengths and weaknesses. If you sort these out, it becomes much easier to judge when you need to move to a “serious job queue” like Celery.
3.1 Cases where it fits well
- Light operations that take on the order of a few to several seconds
- Task volume is not that high (tens to hundreds per minute)
- It’s okay for the same number of instances as the REST API to handle the load
- It’s not fatal if a task fails, or logging alone is sufficient
Concrete examples:
- Sending a single email after user registration
- Sending lightweight logs or adding audit records
- Generating small image thumbnails
3.2 Cases where it doesn’t fit well (watch out)
- You need to process a huge number of tasks (e.g., tens of thousands or more)
- Tasks take several minutes to tens of minutes
- You need retries, delayed execution, or scheduled execution
- You want to be able to check task status (success/failure/progress) after the fact
Why not? Because:
- If the process goes down, tasks are lost
- As the number of running tasks grows, your FastAPI workers get choked
- Implementing state management and retries yourself is a lot of work
3.3 Common pitfalls and countermeasures
-
Putting CPU-heavy tasks into BackgroundTasks
→ They will block the event loop and slow down other requests.
→ CPU-bound tasks should be offloaded to a separate process (Celery, etc.) for safety. -
Not handling exceptions
→ If an exception occurs inside a task, it just ends up in the logs and is hard to see from the caller side.
→ Make sure to log exceptions properly inside tasks, and send out notifications if needed. -
Throwing massive numbers of tasks into it
→ Responses will still be returned, but the backend can’t keep up with the background load, slowly consuming more memory and CPU.
→ Decide an upper bound like “this much volume is okay,” and once things get seriously heavy, it’s time to move to Celery.
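As a concrete illustration of the exception-handling point, here is a minimal sketch of a background task that logs its own failures instead of letting them disappear. It reuses the placeholder email logic from section 2.1, and the notification hook mentioned in the comment is hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def send_email(to: str, subject: str, body: str) -> None:
    # Same placeholder sender as in section 2.1
    print(f"[MAIL] to={to}, subject={subject}, body={body}")


def send_email_safely(to: str, subject: str, body: str) -> None:
    """Background task that never lets an exception vanish silently."""
    try:
        send_email(to, subject, body)
    except Exception:
        # Log with a stack trace so the failure shows up in your log aggregation
        logger.exception("Failed to send email to %s (subject=%r)", to, subject)
        # If needed, also push an alert to Slack or a monitoring channel here
```

You register send_email_safely with background_tasks.add_task exactly as before; the only difference is that failures now leave a clear trace.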
4. Basics of the distributed task queue Celery
When BackgroundTasks becomes insufficient, Celery enters the picture. Celery is a representative distributed task queue for Python, used in many production systems.
4.1 Components of Celery
Celery roughly consists of three parts:
- Broker
  Manages task queues. Redis and RabbitMQ are commonly used.
- Worker
  Processes that take tasks off the broker and execute them. They can be distributed across multiple machines.
- Result backend
  Stores task results and status. Redis or a database can be used.
FastAPI acts as “the entry point that enqueues tasks,” while the actual work is performed by separate Celery worker processes.
4.2 Typical directory layout
```
myapp/
├─ app/
│  ├─ main.py          # FastAPI
│  ├─ celery_app.py    # Celery app definition
│  ├─ tasks.py         # Celery task definitions
│  └─ ...
├─ docker-compose.yml
└─ requirements.txt
```
With this structure, you’ll often run FastAPI and Celery workers as separate containers.
5. Building the minimal FastAPI + Celery setup
From here, let’s try wiring up FastAPI and Celery together with a simple example.
5.1 Defining the Celery app
```python
# app/celery_app.py
from celery import Celery

celery_app = Celery(
    "myapp",
    broker="redis://redis:6379/0",   # Example assuming docker-compose
    backend="redis://redis:6379/1",
)

celery_app.conf.update(
    task_track_started=True,
    result_expires=3600,  # expire results after 1 hour
)
```
Here we’re using Redis both as broker and result backend.
5.2 Defining a task
```python
# app/tasks.py
import time

from app.celery_app import celery_app


@celery_app.task(name="tasks.long_add")
def long_add(x: int, y: int) -> int:
    # Pretend this is a time-consuming process
    time.sleep(10)
    return x + y
```
The @celery_app.task decorator registers a function as a task, making it callable via delay or apply_async.
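As a quick aside, here is a small sketch using the long_add task above: delay is simply shorthand for apply_async, and apply_async exposes additional options such as countdown or queue (the queue name below is illustrative).

```python
from app.tasks import long_add

# These two calls enqueue exactly the same task
long_add.delay(1, 2)
long_add.apply_async(args=(1, 2))

# apply_async also accepts scheduling and routing options
long_add.apply_async(args=(1, 2), countdown=30)     # start roughly 30 seconds from now
long_add.apply_async(args=(1, 2), queue="reports")  # send to a specific queue (if configured)
```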
5.3 Enqueuing tasks from FastAPI
```python
# app/main.py
from fastapi import FastAPI
from pydantic import BaseModel

from app.tasks import long_add

app = FastAPI(title="FastAPI + Celery sample")


class AddRequest(BaseModel):
    x: int
    y: int


@app.post("/jobs/add")
def enqueue_add(req: AddRequest):
    # Enqueue as a Celery task
    async_result = long_add.delay(req.x, req.y)
    return {"task_id": async_result.id}


@app.get("/jobs/{task_id}")
def get_result(task_id: str):
    result = long_add.AsyncResult(task_id)

    if not result.ready():
        return {"task_id": task_id, "status": result.status}

    if result.failed():
        return {"task_id": task_id, "status": "FAILURE"}

    return {
        "task_id": task_id,
        "status": "SUCCESS",
        "result": result.result,
    }
```
In this example:
- POSTing {"x": 1, "y": 2} to /jobs/add enqueues the task and returns a task ID.
- GETting /jobs/{task_id} checks the status or result.
5.4 Example docker-compose setup
In real projects you’ll often use docker-compose to spin up Redis and Celery workers along with the API.
```yaml
# docker-compose.yml (example)
version: "3.9"

services:
  api:
    build: .
    command: uvicorn app.main:app --host 0.0.0.0 --port 8000
    depends_on:
      - redis
    ports:
      - "8000:8000"

  worker:
    build: .
    command: celery -A app.celery_app.celery_app worker --loglevel=INFO
    depends_on:
      - redis

  redis:
    image: redis:7
    ports:
      - "6379:6379"
```
By running FastAPI and Celery workers as separate containers/processes:
- You decouple API responses from heavy processing
- You can scale workers horizontally with ease
Those are the key benefits.
6. Comparing BackgroundTasks and Celery
Now that we’ve seen FastAPI’s BackgroundTasks and the distributed task queue Celery, let’s sort out the differences in a comparison table.
| Aspect | BackgroundTasks | Celery |
|---|---|---|
| Execution location | Same process as FastAPI app | Separate worker processes; can be on separate hosts |
| Dependencies | None (FastAPI only) | Broker (Redis / RabbitMQ) and workers |
| Execution timing | After sending the HTTP response | Once enqueued, runs when a worker is free |
| Retries & delayed execution | Must be hand-crafted | Supported out of the box (retry, countdown, etc.) |
| Task state management | Must be implemented by you | Task IDs, state, and results are manageable |
| Scaling | Tied to API worker scaling | Workers can be scaled independently |
| Best suited for | Light, short-running tasks | Heavy tasks, large volumes, batch-style processing |
Very roughly:
- Start with BackgroundTasks
- When load or requirements grow, move to (or add) Celery
That tends to be the practical approach in real projects.
7. Some more practical Celery features
Now that we’ve come this far, let’s look at a few Celery features that are particularly handy in production.
7.1 Retries
External APIs or mail servers sometimes fail temporarily. With Celery, you can easily configure retries in the task definition.
```python
# app/tasks.py
import httpx

from app.celery_app import celery_app


@celery_app.task(bind=True, max_retries=3, default_retry_delay=10)
def send_webhook(self, url: str, payload: dict):
    try:
        resp = httpx.post(url, json=payload, timeout=5.0)
        resp.raise_for_status()
    except Exception as exc:
        # On failure, retry 10 seconds later (up to 3 times)
        raise self.retry(exc=exc)
```
7.2 Delayed and scheduled execution
Needs like “run 5 minutes from now” or “run every day at midnight” are also easy to handle.
```python
# Run 5 minutes later
send_webhook.apply_async(
    args=["https://example.com/webhook", {"foo": "bar"}],
    countdown=300,
)
```
For regular schedules, you typically combine Celery with celery beat or other schedulers.
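For reference, here is a minimal sketch of a celery beat schedule, appended to the app/celery_app.py from section 5.1 and reusing the send_webhook task from section 7.1; the schedule name and payload are illustrative assumptions.

```python
# app/celery_app.py (appended)
from celery.schedules import crontab

celery_app.conf.beat_schedule = {
    "daily-heartbeat-webhook": {                # the schedule name is arbitrary
        "task": "app.tasks.send_webhook",       # default task name: module path + function name
        "schedule": crontab(hour=0, minute=0),  # every day at midnight
        "args": ["https://example.com/webhook", {"type": "heartbeat"}],
    },
}
```

The schedule itself is then driven by a separate beat process (for example, celery -A app.celery_app.celery_app beat) running alongside your workers.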
7.3 Monitoring and management tools (Flower, etc.)
Flower is a common tool to visualize Celery’s state via a web UI.
- Lists of running and pending tasks
- History of successes and failures
- Worker status
If you use Celery in production, it’s well worth considering tools like Flower.
8. Tricks for testing and local development
More background processing means more impact on tests and local development. Here are small tricks to keep BackgroundTasks and Celery test-friendly.
8.1 Testing BackgroundTasks
BackgroundTasks simply “registers a function to call,” so in unit tests it’s best to test the task function itself directly.
```python
# app/tasks_local.py
def write_audit_log(user_id: int, action: str) -> None:
    ...


# app/main.py
from fastapi import FastAPI, BackgroundTasks

from app.tasks_local import write_audit_log

app = FastAPI()


@app.post("/do-something")
async def do_something(background_tasks: BackgroundTasks):
    # We’ll omit the main logic
    background_tasks.add_task(write_audit_log, 123, "do_something")
    return {"ok": True}
```
For tests:
- Use unit tests on write_audit_log to verify its logic
- In endpoint tests, just lightly confirm that the task is registered via BackgroundTasks (see the sketch below)
That two-layered approach is realistic.
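As an example of the second layer, here is a minimal endpoint-test sketch using FastAPI’s TestClient, which runs registered background tasks before the call returns. It assumes the app/main.py and app/tasks_local.py layout above; replacing the task with a recorder via pytest’s monkeypatch fixture is an assumption for illustration.

```python
# tests/test_do_something.py
from fastapi.testclient import TestClient

from app import main


def test_do_something_registers_audit_log(monkeypatch):
    calls = []

    # Swap the real task for a recorder so the test stays fast and isolated
    monkeypatch.setattr(
        main, "write_audit_log", lambda user_id, action: calls.append((user_id, action))
    )

    client = TestClient(main.app)
    response = client.post("/do-something")

    assert response.status_code == 200
    # TestClient executes background tasks before returning, so the call is visible here
    assert calls == [(123, "do_something")]
```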
8.2 Testing Celery tasks (eager mode)
Celery has a task_always_eager option. If you enable it, tasks execute immediately in the local process without using workers.
```python
# tests/conftest.py or similar
from app.celery_app import celery_app


def pytest_configure():
    celery_app.conf.update(task_always_eager=True)
```
With this:
```python
res = long_add.delay(1, 2)
assert res.result == 3
```
You can check results synchronously during tests. (Just make sure task_always_eager stays disabled in production.)
9. Operational points to watch (more Celery-oriented)
When running Celery in production, it helps to decide on the following points in advance to reduce trouble.
9.1 Task “granularity” and idempotency
- Avoid tasks that run for long periods (tens of minutes to hours). Split them into smaller tasks where possible.
- Design tasks to be idempotent, i.e., safe even if the same task is executed twice (beware of double-calling external APIs).
If you keep idempotency in mind, it’s much easier to reason about “how far we got” when doing retries or disaster recovery.
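As a small illustration of an idempotency guard (a sketch with hypothetical task and helper names; the in-memory set stands in for a real shared store such as a database table or Redis set):

```python
# app/tasks.py (sketch)
from app.celery_app import celery_app

_processed_orders: set[int] = set()  # in real code: a store shared by all workers


@celery_app.task
def charge_order(order_id: int) -> None:
    # The same message may arrive twice (retries, redelivery); do nothing the second time
    if order_id in _processed_orders:
        return

    # ... call the payment provider exactly once here (omitted) ...

    _processed_orders.add(order_id)
```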
9.2 Retries, timeouts, and dead letters
- Decide on retry counts and intervals (e.g., up to 3 times, exponential backoff).
- Set maximum runtime (timeouts) for tasks to automatically stop abnormal long-running processes.
- Route tasks that keep failing even after retries into a “dead letter” queue for special handling.
These are connected to your system’s SLOs (how often operations should succeed), so it’s good to align on them as a team.
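For reference, here is a small sketch of how some of these knobs look in Celery; the concrete numbers are illustrative, and the sync_external_data task is hypothetical.

```python
# app/celery_app.py (appended; values are examples)
celery_app.conf.update(
    task_soft_time_limit=60,   # raise SoftTimeLimitExceeded inside the task after 60 seconds
    task_time_limit=90,        # hard-terminate the task after 90 seconds
    task_acks_late=True,       # re-deliver tasks whose worker died mid-execution
)


# app/tasks.py (a task with automatic retries and exponential backoff)
from app.celery_app import celery_app


@celery_app.task(
    bind=True,
    autoretry_for=(Exception,),       # retry automatically on any exception
    retry_backoff=True,               # exponential backoff (1s, 2s, 4s, ... with jitter by default)
    retry_kwargs={"max_retries": 3},  # give up after 3 retries
)
def sync_external_data(self, resource_id: int) -> None:
    ...
```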
9.3 Logging and monitoring
- For important tasks, log start, success, and failure clearly (JSON logs are recommended).
- Visualize key metrics such as task volume per type, worker counts, and failure rates on dashboards.
- Trigger alerts when failure rates suddenly spike.
If you design logs and metrics together with FastAPI’s, troubleshooting becomes much easier when issues arise.
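For the logging point, a minimal sketch of a task that logs start, success, and failure with a consistent set of fields might look like this (the generate_report task and the field names are illustrative):

```python
# app/tasks.py (sketch)
import logging

from app.celery_app import celery_app

logger = logging.getLogger("app.tasks")


@celery_app.task(bind=True)
def generate_report(self, report_id: int) -> None:
    context = {"celery_task_id": self.request.id, "report_id": report_id}
    logger.info("report task started", extra=context)
    try:
        # ... build and store the report here (omitted) ...
        logger.info("report task succeeded", extra=context)
    except Exception:
        logger.exception("report task failed", extra=context)
        raise  # re-raise so Celery records the failure and can retry if configured
```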
10. Introduction patterns by target reader
Based on everything so far, let’s sort out where to start depending on your situation.
10.1 Individual developers and learners
- Step 1: Use BackgroundTasks to move email sending and small tasks behind the response.
- Step 2: When you get a long-running or high-volume task, try the Celery sample setup locally.
- Step 3: Use docker-compose to run FastAPI + Celery + Redis together.
By this point, “what you need to manage yourself, and to what extent” should be much clearer.
10.2 Backend engineers in small teams
- Step 1: Scan existing APIs for parts that can be offloaded to BackgroundTasks.
- Step 2: For heavy or high-volume operations, start by carving them out into Celery workers.
- Step 3: Design your job queues (queue names and priorities), add audit logging, and visualize everything with Flower or similar tools.
A gradual approach of “Celery for some functionalities first” is realistic. There is no need to migrate everything at once.
10.3 SaaS dev teams and startups
- Separate queues by task type (e.g., emails, reports, integrations).
- Adjust worker counts and resources per queue to control priorities.
- Build monitoring, alerts, and dashboards so you can quickly see “which task is causing the bottleneck.”
At this stage, responsibilities become clear: “Keep the FastAPI app light and simple; move all heavy lifting to Celery.”
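As a reference for queue separation, here is a minimal sketch using Celery’s task_routes setting, reusing task names from the earlier samples; the queue names match the examples above.

```python
# app/celery_app.py (appended)
celery_app.conf.task_routes = {
    "app.tasks.send_webhook": {"queue": "integrations"},
    "tasks.long_add": {"queue": "reports"},  # matches the explicit name set in section 5.2
    # Anything not listed here falls back to the default "celery" queue
}
```

Each worker is then started against a specific queue with the -Q option (for example, celery -A app.celery_app.celery_app worker -Q reports), so heavy queues can be scaled independently of the rest.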
11. Rollout roadmap (it’s okay to go step by step)
Finally, here’s a roadmap for introducing and improving background processing from now on:
- Use BackgroundTasks to offload light tasks like email sending and webhook notifications.
- Measure processing times and volumes to identify bottlenecks.
- Extract heavy tasks into Celery tasks and try running FastAPI + Celery locally with docker-compose.
- Provide a task ID and status API (like /jobs/{task_id}) so you can integrate with frontends and other services.
- Set up retries, timeouts, idempotency, and monitoring, then gradually route production traffic through Celery.
- As needed, split queues, scale workers, and introduce monitoring tools like Flower.
Doing “everything at once” is genuinely hard, so starting with BackgroundTasks and taking small steps is highly recommended.
Summary
- As a rule of thumb, you should offload tasks that you don’t want users to wait for to the background whenever possible.
- FastAPI’s built-in BackgroundTasks is very convenient and easy to use for running lightweight tasks after the response, but it’s not suited to heavy or high-volume jobs.
- For large-scale, high-load, or retry- and monitoring-sensitive work, combining FastAPI with the distributed task queue Celery makes stable operations much easier.
- A phased approach is practical and safe in real projects: start with BackgroundTasks, and introduce Celery once you truly need it.
This article got a bit long, but hopefully it helps you decide “which tasks to offload to the background” and “from where Celery becomes appropriate.”
It’s totally fine to go slowly, one step at a time—maybe start by trying background processing for something close at hand, like email sending or thumbnail generation.
