Fail-Proof DB Migrations: Evolving Schemas with FastAPI × SQLAlchemy × Alembic — Autogenerate, Manual Tuning, Data Migrations, Transactions, Testing & CI
Summary (big picture first)
- Manage schema versions with Alembic and safely reflect FastAPI + SQLAlchemy changes.
- Combine autogenerate with manual edits; handle multi-environment switching, data-migration scripts, and transaction semantics.
- From local to production: checklists, how to wire into tests and CI, and common operational pitfalls.
Who benefits
- Learner A (solo dev): tweaked models and it runs locally, but breaks elsewhere. Wants to run migrations correctly.
- Small team B (3-person contract shop): frequent spec changes; tracking DB diffs by hand. Wants to standardize autogenerate + review.
- SaaS team C (startup): wants safe rollout to staging and prod, and instant rollback if something goes wrong.
Accessibility assessment
- Structured with short paragraphs and bullet points per section; code is monospaced with light comments.
- Brief explanations for first-use terms; numbered steps to make order easy to follow.
- Target level roughly AA.
1. Assumptions & goals
- ORM: SQLAlchemy 2.x style; schema migrations with Alembic.
- You can learn on SQLite, but production parity assumes an RDB like PostgreSQL.
- Goal: standardize model change → diff generation → review → apply → verify so anyone gets the same result.
Decision points
- Always use a transactional DB in production.
- Autogenerate is handy, but index names, default values, and server-side functions often need manual touch-ups.
2. Initialize: bring in Alembic and get it running
2.1 Dependencies
SQLAlchemy>=2.0
alembic
psycopg[binary] # Example when using PostgreSQL
2.2 Directory layout (example)
fastapi-app/
├─ app/
│ ├─ main.py
│ ├─ db.py
│ ├─ models.py
│ └─ settings.py
├─ alembic/
│ ├─ env.py
│ ├─ script.py.mako
│ └─ versions/
└─ alembic.ini
2.3 Init command
alembic init alembic
What gets generated
- `alembic.ini`: includes a DB URL template.
- `alembic/env.py`: the entrypoint during migrations. Load the SQLAlchemy `Base` here so autogenerate can “see” your target tables.
3. DB settings in the app and Base
3.1 Settings
# app/settings.py
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    database_url: str = "postgresql+psycopg://user:pass@localhost:5432/appdb"

    # pydantic-settings v2 style; on pydantic v1 use an inner `class Config` instead
    model_config = SettingsConfigDict(env_file=".env")

def get_settings() -> Settings:
    return Settings()
3.2 Session & Base
# app/db.py
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, DeclarativeBase

class Base(DeclarativeBase):
    pass

def get_engine(url: str):
    return create_engine(url, pool_pre_ping=True)

def make_session_factory(url: str):
    return sessionmaker(bind=get_engine(url), autoflush=False, autocommit=False)
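In FastAPI, the session factory above is typically wrapped in a dependency that opens one session per request and always closes it. A minimal sketch, self-contained with an in-memory SQLite URL (in the app you would pass `get_settings().database_url` instead):

```python
# Sketch (assumption): a request-scoped session dependency built on the
# make_session_factory helper; recreated here so the snippet runs standalone.
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session, sessionmaker

def make_session_factory(url: str):
    engine = create_engine(url, pool_pre_ping=True)
    return sessionmaker(bind=engine, autoflush=False, autocommit=False)

SessionLocal = make_session_factory("sqlite://")  # illustrative URL only

def get_db():
    """FastAPI dependency: yield one Session per request, always close it."""
    db: Session = SessionLocal()
    try:
        yield db
    finally:
        db.close()

# In a route you would write: def list_articles(db: Session = Depends(get_db)): ...
```

The `try/finally` guarantees the connection returns to the pool even if the request handler raises.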
3.3 Model example
# app/models.py
from datetime import datetime

from sqlalchemy import String, Integer, DateTime, func
from sqlalchemy.orm import Mapped, mapped_column

from app.db import Base

class Article(Base):
    __tablename__ = "articles"

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    title: Mapped[str] = mapped_column(String(200), index=True, nullable=False)
    body: Mapped[str] = mapped_column(String, nullable=False)
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), server_default=func.now(), nullable=False
    )
4. Wire env.py to the FastAPI project
Add code in `alembic/env.py` to import the app’s Base and DB URL.
# alembic/env.py (excerpt)
from logging.config import fileConfig

from sqlalchemy import engine_from_config, pool

from alembic import context

# 1) Make the app importable, then pull in its Base and settings
import os, sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))
from app.db import Base
from app.settings import get_settings

config = context.config
# Override the placeholder URL from alembic.ini with the app's setting
if config.get_main_option("sqlalchemy.url") in (None, "", "sqlite:///"):
    config.set_main_option("sqlalchemy.url", get_settings().database_url)
target_metadata = Base.metadata
fileConfig(config.config_file_name)

def run_migrations_offline():
    url = config.get_main_option("sqlalchemy.url")
    context.configure(
        url=url,
        target_metadata=target_metadata,
        literal_binds=True,
        compare_type=True,  # detect type changes
        compare_server_default=True,  # detect server-default changes
    )
    with context.begin_transaction():
        context.run_migrations()

def run_migrations_online():
    connectable = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix="sqlalchemy.",
        poolclass=pool.NullPool,
    )
    with connectable.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
            compare_type=True,
            compare_server_default=True,
        )
        with context.begin_transaction():
            context.run_migrations()

if context.is_offline_mode():
    run_migrations_offline()
else:
    run_migrations_online()
Key points
- Enable `compare_type=True` and `compare_server_default=True` to stabilize diff detection.
- Pull the URL from the environment or `.env`, overriding the `alembic.ini` default.
5. Create the initial migration
5.1 Create base
alembic revision --autogenerate -m "init"
# Review → apply
alembic upgrade head
`alembic/versions/<rev>_init.py` gets generated. Review tables and indexes before applying.
5.2 Common manual tweaks
- Mind DB specifics when changing string lengths (shrinking may lose data).
- Expressions like `server_default=func.now()` differ by dialect; hand-specify if needed.
- For composite indexes and unique constraints, set naming conventions and make them explicit.
6. Make a change and run diffs
As an example, add a `slug` column to Article with a unique constraint.
6.1 Model change
# app/models.py (addition)
from sqlalchemy import UniqueConstraint

class Article(Base):
    __tablename__ = "articles"
    # existing columns omitted

    # Don't also pass unique=True here, or you get a second, auto-named
    # unique constraint alongside the explicit one below.
    slug: Mapped[str] = mapped_column(String(200), nullable=False)

    __table_args__ = (UniqueConstraint("slug", name="uq_articles_slug"),)
6.2 Create diff → review → apply
alembic revision --autogenerate -m "add slug to articles"
alembic upgrade head
The script will contain `op.add_column` and `op.create_unique_constraint`. Check names, ordering, and the sanity of the downgrade step.
7. Data migrations (how to write them)
If a schema change needs initial values or transforms for existing rows, put data ops in the same revision.
# alembic/versions/xxxx_add_slug.py (excerpt)
from alembic import op
import sqlalchemy as sa

def upgrade():
    # Safe order: allow NULL → backfill existing rows → make NOT NULL
    op.add_column("articles", sa.Column("slug", sa.String(length=200), nullable=True))
    conn = op.get_bind()
    articles = conn.execute(sa.text("SELECT id, title FROM articles")).mappings().all()
    for row in articles:
        slug = row["title"].lower().replace(" ", "-")
        conn.execute(sa.text("UPDATE articles SET slug=:s WHERE id=:i"), {"s": slug, "i": row["id"]})
    op.alter_column("articles", "slug", nullable=False)
    op.create_unique_constraint("uq_articles_slug", "articles", ["slug"])

def downgrade():
    op.drop_constraint("uq_articles_slug", "articles", type_="unique")
    op.drop_column("articles", "slug")
Notes
- With existing data, apply in stages: add column → backfill → add constraints.
- Large updates can cause long locks; split into batches or schedule off-hours.
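The batching advice can be sketched as a helper that updates a bounded chunk per loop. The slugify rule and batch size are illustrative; inside a real revision you would pass `op.get_bind()` as the connection. Here an in-memory SQLite demo makes it runnable:

```python
# Sketch: batched backfill to avoid one long-running UPDATE and its locks.
import sqlalchemy as sa

def backfill_slugs(conn, batch_size: int = 1000) -> None:
    """Backfill NULL slugs a batch at a time until none remain."""
    while True:
        rows = conn.execute(
            sa.text("SELECT id, title FROM articles WHERE slug IS NULL LIMIT :n"),
            {"n": batch_size},
        ).mappings().all()
        if not rows:
            break
        for row in rows:
            slug = row["title"].lower().replace(" ", "-")  # illustrative rule
            conn.execute(
                sa.text("UPDATE articles SET slug = :s WHERE id = :i"),
                {"s": slug, "i": row["id"]},
            )

# Demo on an in-memory SQLite DB
engine = sa.create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(sa.text("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, slug TEXT)"))
    conn.execute(sa.text("INSERT INTO articles (title) VALUES ('Hello World'), ('Alembic Rocks')"))
    backfill_slugs(conn, batch_size=1)
    slugs = [r[0] for r in conn.execute(sa.text("SELECT slug FROM articles ORDER BY id"))]
```

In PostgreSQL you could also commit between batches (or run outside one big transaction) so locks are released as you go.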
8. Transactions & rollback
- In PostgreSQL and similar, each migration generally runs in a transaction and auto-rolls back on failure.
- For heavyweight ops (e.g., type changes on huge tables), investigate impact in advance.
- SQLite’s DDL transaction behavior is weaker; autogenerate results can vary with constraints. Validate on a production-parity RDB.
9. Operating across environments (dev/stg/prod)
- dev: freely create/apply migrations; validate with seed data.
- stg: apply tests on prod-equivalent DB; measure duration and lock time.
- prod: coordinate with maintenance windows or rolling deploys; watch monitors and alerts as you run.
Tips
- `alembic current` shows the current version; `alembic history` shows revision history.
- Emergency rollback via `alembic downgrade -1`, but avoid data-losing downgrades. Prefer forward fixes.
10. Handling branches & merges
When multiple development lines create revisions in parallel, the version graph can split into multiple heads.
- To avoid conflicts, merge to the main branch often; when heads do split, unify them with `alembic merge heads -m "merge heads"`.
- In the merge revision, ensure both diffs apply in a correct order.
- Tight naming conventions, careful review, and aligning generation timing among devs reduce trouble.
11. Should FastAPI auto-apply migrations at startup?
- For local development, running `upgrade head` on startup is convenient.
- In production, prefer human review + manual execution, or an explicit CI/CD pipeline step.
- Running `alembic upgrade head` in a container entrypoint and aborting startup on failure works well with IaC.
Sample (shell):
set -e
alembic upgrade head
exec uvicorn app.main:app --host 0.0.0.0 --port 8000
12. Integrate into testing
- For unit tests: create a test DB → `alembic upgrade head` → run tests → rollback/teardown to keep it clean each time.
- For migrations that touch existing data, prepare snapshot-like dummy data and verify `upgrade` success and post-state consistency.
- Confirm reentrancy: failing then rerunning shouldn’t corrupt results.
pytest sketch:
import os
import subprocess

def setup_module():
    # Point Alembic at a dedicated test DB, then bring it to head
    os.environ["DATABASE_URL"] = "postgresql+psycopg://user:pass@localhost:5432/testdb"
    subprocess.check_call(["alembic", "upgrade", "head"])

def teardown_module():
    # Return the test DB to a clean slate for the next run
    subprocess.check_call(["alembic", "downgrade", "base"])
13. CI/CD flow
- If a PR includes an `alembic revision --autogenerate` output, have CI run `alembic upgrade head` to validate it.
- After deploying to staging, record migration runtime and logs; relax alert thresholds before prod.
- Clarify the release order: DB migration → app rollout.
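A useful CI guard is failing the build when the revision history has split into multiple heads. In a real pipeline you would run `alembic heads` and fail on more than one line of output; the sketch below models the same check on a toy revision graph (`{revision: down_revision}`) so the logic is visible:

```python
# Sketch: detect the "multiple heads" condition that forces an `alembic merge`.
from typing import Dict, List, Optional

def find_heads(revisions: Dict[str, Optional[str]]) -> List[str]:
    """A head is any revision that no other revision points down to."""
    parents = {down for down in revisions.values() if down is not None}
    return sorted(r for r in revisions if r not in parents)

# Hypothetical revision ids for illustration
linear = {"a1": None, "b2": "a1", "c3": "b2"}  # one head: c3
split = {"a1": None, "b2": "a1", "c3": "a1"}   # two branches off a1 → two heads
```

Wiring `len(find_heads(...)) == 1` (or the equivalent `alembic heads` check) into CI catches split histories before they reach staging.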
14. Small operational tips
- Naming: make intent obvious (`create_<table>`, `add_<col>_to_<table>`, `drop_<col>_from_<table>`) so history reads well.
- Split big changes into multiple steps; don’t do everything at once.
- For heavy index builds, consider parallel/online options (PostgreSQL `CONCURRENTLY`), though Alembic will need manual SQL.
- Always measure on a staging DB close to production size.
15. Common pitfalls and remedies
| Symptom | Cause | Remedy |
|---|---|---|
| Autogenerate sees nothing | Base not imported; `target_metadata` unset | Ensure `Base.metadata` is passed in `env.py` |
| Shrinking string length fails | Existing data too long | Query for overflow rows; agree on trimming policy; migrate in stages |
| Server default keeps diffing | Dialect differences / function forms | Keep `compare_server_default`, but pin function names by hand |
| Long locks | Rewriting huge tables | Consider online methods, staged runs, off-hours, concurrent index builds |
| Downgrade breaks things | Destructive ops | Prefer forward fixes; if rollback is mandatory, take backups beforehand |
16. Concrete example: introducing a tag table and associations
We’ll do many-to-many between articles and tags in stages:
1. Create a `tags` table
2. Create the join table `article_tags` (composite PK)
3. Attach tags to existing articles (data migration)
4. Add required indexes
Model changes (excerpt):
# app/models.py (added example)
from sqlalchemy import Table, Column, ForeignKey
from sqlalchemy.orm import relationship

article_tags = Table(
    "article_tags",
    Base.metadata,
    Column("article_id", ForeignKey("articles.id", ondelete="CASCADE"), primary_key=True),
    Column("tag_id", ForeignKey("tags.id", ondelete="CASCADE"), primary_key=True),
)

class Tag(Base):
    __tablename__ = "tags"

    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str] = mapped_column(String(50), unique=True, nullable=False)

class Article(Base):
    __tablename__ = "articles"
    # existing columns omitted

    tags = relationship("Tag", secondary=article_tags, backref="articles")
Generate the diff → review join-table indexes and the `ondelete` policy → add data-load SQL to `upgrade` if needed.
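The data-migration step (attaching tags to existing articles) can be sketched as a helper you would call from `upgrade()` with `conn = op.get_bind()`. The "untagged" default policy is purely illustrative; the in-memory SQLite demo below only exists to make the snippet runnable:

```python
# Sketch: backfill the join table so every existing article has at least one tag.
import sqlalchemy as sa

def attach_default_tag(conn) -> None:
    """Create an 'untagged' tag and link every article that has no tags yet."""
    conn.execute(sa.text("INSERT INTO tags (name) VALUES ('untagged')"))
    tag_id = conn.execute(
        sa.text("SELECT id FROM tags WHERE name = 'untagged'")
    ).scalar_one()
    conn.execute(
        sa.text(
            "INSERT INTO article_tags (article_id, tag_id) "
            "SELECT id, :t FROM articles "
            "WHERE id NOT IN (SELECT article_id FROM article_tags)"
        ),
        {"t": tag_id},
    )

# Demo on an in-memory SQLite DB with a minimal version of the three tables
engine = sa.create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(sa.text("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)"))
    conn.execute(sa.text("CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE)"))
    conn.execute(sa.text(
        "CREATE TABLE article_tags (article_id INTEGER, tag_id INTEGER, "
        "PRIMARY KEY (article_id, tag_id))"
    ))
    conn.execute(sa.text("INSERT INTO articles (title) VALUES ('a'), ('b')"))
    attach_default_tag(conn)
    links = conn.execute(sa.text("SELECT COUNT(*) FROM article_tags")).scalar_one()
```

Using plain SQL via the migration's connection (rather than ORM models) keeps the revision independent of future model changes.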
17. Roadmap
- Initialize Alembic and tidy up `env.py`.
- Create the first table set via autogenerate, review, and apply.
- Make model changes → generate diffs → hand-tune → apply as the team standard.
- Write data migrations in a safe order; measure time/locks in tests and staging.
- Integrate with CI/CD; document branch merges and emergency response steps.
References
- Alembic
- SQLAlchemy
- PostgreSQL
- FastAPI
Wrap-up
- Version-control your schema with Alembic; start with autogenerate and finish by hand.
- For existing data, design staged changes paired with data migrations; always measure transaction/lock impact.
- Make local, staging, and production run the same steps; wire checks into CI/CD.
- Keep branch merges and naming conventions explicit so history stays readable.

This way, model evolution directly maps to safe releases.