Amazon DynamoDB Deep Dive
A “Scalable NoSQL Design” Guide Through Comparison with Cloud Bigtable, Cloud Firestore, and Azure Cosmos DB
Introduction (Key Takeaways)
- In this article, we focus on Amazon DynamoDB and compare it with Google Cloud Bigtable / Cloud Firestore and Azure Cosmos DB to clarify how to think about designing scalable NoSQL databases.
- Amazon DynamoDB is a fully managed, serverless key-value / document NoSQL database, designed to horizontally scale to internet scale with single-digit millisecond latency.
- On GCP, Cloud Bigtable is a wide-column NoSQL database for large-scale data, while Cloud Firestore is a document database aimed at mobile and web apps. Azure’s Cosmos DB is positioned as a globally distributed, multi-model NoSQL / vector database.
- The core of the design is:
  - Start from the use case and access patterns and think in terms of “single-table design.”
  - Use the partition key and sort key to decide how you will scale.
  - Use GSIs/LSIs, indexes, and caches to optimize queries.
  - Control costs using on-demand / provisioned capacity and auto scaling.
  - Secure future extensibility with global tables and event-driven integration.
- Intended readers:
  - Backend developers for web / mobile applications
  - Engineers working on games, e-commerce, SaaS, and other services with fast-growing users and traffic
  - Architects considering migration from RDBMS
  - Technical leaders / SREs who want to evaluate NoSQL options across GCP / Azure as well
- By the end, the goal is for you to be able to explain in your own words “Given these requirements, which of DynamoDB / Bigtable / Firestore / Cosmos DB should we use, and how should we design it?”
1. What Is DynamoDB? — AWS’s “Internet-Scale NoSQL Engine”
1.1 Service Overview and Characteristics
Amazon DynamoDB is a fully managed, serverless NoSQL database provided by AWS.
Its main characteristics can be summarized as follows:
- Supports key-value and document data models (up to about 400 KB per item).
- Designed for single-digit millisecond latency, so it can maintain predictable performance even as requests and data volume grow.
- Fully managed and serverless: no need to manage instances, patching, or sharding. Capacity is controlled via on-demand billing or provisioned capacity plus auto scaling.
- Supports multi-region, multi-active global tables, with automatic replication across regions.
- Offers automatic backups, point-in-time recovery, encryption, and an in-memory cache (DAX) for acceleration.
In short, DynamoDB embodies a world where you move away from rigid RDBMS schemas and joins, and instead design tables around access patterns.
1.2 Positioning Relative to Other Cloud NoSQL Services
Let’s roughly place it side by side with other services:
- DynamoDB (AWS)
  - Key-value + document; strong at internet-scale online workloads.
- Cloud Bigtable (GCP)
  - Wide-column NoSQL. Ideal for time-series data, metrics, IoT, analytics backends—large, continuous datasets.
- Cloud Firestore (GCP / Firebase)
  - A document DB specialized in real-time sync and offline support for mobile / frontend apps.
- Azure Cosmos DB
  - A globally distributed, multi-model NoSQL / vector database with multiple consistency models as a key feature.
You can think of DynamoDB as optimal for very high-throughput OLTP and user-facing online workloads, Bigtable for analytics-oriented, large time-series data, Firestore for tight client-app integration, and Cosmos DB for multi-region distribution and multi-model flexibility.
2. DynamoDB’s Data Model and Core Concepts
2.1 Tables, Partition Keys, and Sort Keys
DynamoDB tables are built around a partition key (required) and an optional sort key.
- Partition key
  - The core factor in data distribution and scalability. If values are skewed, you get hot partitions, leading to performance degradation.
- Sort key
  - Defines the order within a single partition key. If you use timestamps or types here, range queries become much easier.
Each item can have flexible attributes (columns) and is effectively schema-less. When creating a table, you mainly define the keys; other attributes can vary from item to item.
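To make the schema-less nature concrete, here is a minimal sketch in plain Python dicts (the key values and attribute names are hypothetical) of two items that share a table but carry completely different attributes:

```python
# Two items in the same hypothetical table: both define the key attributes
# (partition key PK, sort key SK), but otherwise carry different fields --
# DynamoDB does not enforce a shared schema beyond the keys.
profile_item = {
    "PK": "USER#123",          # partition key: decides which partition stores the item
    "SK": "PROFILE",           # sort key: orders items within that partition
    "name": "Alice",
    "email": "alice@example.com",
}
order_item = {
    "PK": "USER#123",
    "SK": "ORDER#2025-11-20T10:00Z",
    "total": 4200,             # attributes the profile item simply doesn't have
    "status": "SHIPPED",
}

# Items with the same PK live in the same partition, sorted by SK.
same_partition = profile_item["PK"] == order_item["PK"]
print(same_partition)  # True
```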
2.2 How to Design from Access Patterns
In traditional RDBMS design, the mindset is “normalize → figure out queries later.”
With DynamoDB, the mindset is “list access patterns first, then derive table and key design from them.”
For example, in an e-commerce system for products and orders, you might list patterns such as:
- “Get a user’s order history in descending chronological order”
- “Get order details by order ID”
- “Get inventory records by product ID”
For each access pattern, you then decide on a partition key + sort key, and additional indexes if needed.
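One lightweight way to capture this step is a plain worksheet mapping each access pattern to its candidate key design before any table exists; in the sketch below, prefixes such as `USER#` and `ORDER#` are illustrative naming conventions, not DynamoDB syntax:

```python
# Hypothetical access-pattern -> key-design worksheet for the e-commerce example.
# Each pattern must resolve to a key condition before the table is created.
access_patterns = {
    "order history for a user, newest first": {
        "PK": "USER#<userId>",
        "SK": "begins_with(ORDER#)",   # range query over the sort key
        "order": "SK descending",
    },
    "order details by order ID": {
        "PK": "USER#<userId>",
        "SK": "ORDER#<timestamp>",     # exact key lookup
    },
    "inventory records by product ID": {
        "PK": "PRODUCT#<productId>",
        "SK": "begins_with(INVENTORY#)",
    },
}

for pattern, keys in access_patterns.items():
    print(f"{pattern}: PK={keys['PK']}, SK={keys['SK']}")
```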
2.3 GSI / LSI (Secondary Indexes)
DynamoDB is not designed for arbitrary ad-hoc queries as in an RDBMS. Instead, you rely on secondary indexes to create additional views.
- Local Secondary Index (LSI)
  - Shares the same partition key as the base table but uses a different sort key, giving you multiple sorted views per partition.
- Global Secondary Index (GSI)
  - Think of it as a logically separate table with its own key definition, backed by the base table. You can freely define both partition and sort keys.
Firestore and Cosmos DB also require careful index design, but with DynamoDB there is an especially strong need to draw the full picture upfront—access patterns and their indexes.
3. Typical Use Cases — When Should You Choose DynamoDB?
3.1 User Profiles, Sessions, and Auth Data
- Ideal for social networking services, games, SaaS, and other domains with large numbers of users and random read/write access.
- Use the user ID as the partition key and store profile, settings, and session items together via the “single-table design” pattern that’s popular in the DynamoDB world.
3.2 Shopping Carts and Order History
- Carts and orders are a textbook case where per-user, time-ordered access is common.
- If you design your keys like `PK = USER#<userid>` and `SK = ORDER#<timestamp>`, you can easily fetch a user’s orders in chronological order.
- One reason DynamoDB is widely used in retail and e-commerce is that it scales horizontally under peak loads, such as sales or campaigns.
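This works because ISO-8601 timestamps sort lexicographically, so the sort key gives chronological order “for free” within a partition. A small sketch (standard Python only; `order_sort_key` is a hypothetical helper, and `sorted()` stands in for DynamoDB returning items ordered by sort key):

```python
# ISO-8601 timestamps compare correctly as plain strings, which is why
# SK = ORDER#<timestamp> yields chronological order within a partition.
def order_sort_key(ts: str) -> str:
    return f"ORDER#{ts}"

sks = [
    order_sort_key("2025-11-21T09:00Z"),
    order_sort_key("2025-11-20T10:00Z"),
    order_sort_key("2025-11-20T08:30Z"),
]

# DynamoDB's Query returns items sorted by SK; passing ScanIndexForward=False
# reverses the order. sorted(..., reverse=True) emulates that here.
newest_first = sorted(sks, reverse=True)
print(newest_first[0])  # ORDER#2025-11-21T09:00Z
```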
3.3 IoT, Logs, and Event Stores
- Great for workloads involving high-speed ingestion and recent data queries, such as sensor data streams and event logs.
- For “whole-history” analytics, however, it is often more realistic to combine DynamoDB with other stores like Bigtable, BigQuery, or S3 + Athena.
3.4 When Bigtable / Firestore / Cosmos DB Shine
- Bigtable: Large-scale, continuous data requiring fast, low-latency processing, such as metrics, time-series data, logs, and ML feature stores.
- Firestore: Mobile / frontend apps requiring real-time sync and offline support.
- Cosmos DB: Large SaaS or enterprise applications needing global distribution, multi-model support, and flexible consistency options.
4. Performance and Scalability — Partition Design Is Everything
4.1 Per-Request Throughput
DynamoDB internally splits tables into partitions (physical shards), each with fixed throughput capacity.
- If your partition key is skewed, all requests tied to that key hit a single partition, leading to throttling (`ProvisionedThroughputExceededException`).
- Therefore, “how you distribute keys” is the heart of your scale design.
4.2 Best Practice Examples
- Avoid auto-increment IDs; instead, use IDs that combine hashing + time.
- Shard by something like “user ID × date” so that even one user’s data is distributed across partitions.
- Embed “bucket numbers” into partition keys to spread load across multiple partitions.
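The bucket-number idea can be sketched as follows. `N_BUCKETS` and the key format are assumptions you would tune to your write rate, and the trade-off is that reads must fan out across all buckets and merge the results:

```python
import hashlib

# Write-sharding sketch: spread one logical key across N partition-key values
# by appending a deterministic bucket number, so writes for a single hot user
# land on multiple partitions.
N_BUCKETS = 8  # assumption: tune to your write throughput

def sharded_pk(user_id: str, order_id: str) -> str:
    bucket = int(hashlib.sha256(order_id.encode()).hexdigest(), 16) % N_BUCKETS
    return f"USER#{user_id}#{bucket}"

def all_pks_for(user_id: str) -> list:
    # Reads fan out: query every bucketed key and merge the results.
    return [f"USER#{user_id}#{b}" for b in range(N_BUCKETS)]

# 1000 orders for one user spread across at most N_BUCKETS partition keys.
pks = {sharded_pk("123", f"order-{i}") for i in range(1000)}
print(len(pks))
```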
Bigtable similarly requires you to carefully design row-key prefixes for even load distribution. Cosmos DB also heavily depends on partition key design for throughput and cost.
5. Capacity Models and Cost Design
5.1 On-Demand vs Provisioned Capacity
DynamoDB offers two main capacity models:
- On-Demand Capacity Mode
  - No need to set throughput in advance. It automatically scales with the request volume, and you pay per request.
  - Great for unpredictable traffic patterns and early-stage workloads / PoCs.
- Provisioned Capacity Mode
  - You predefine read/write capacity units (RCU/WCU).
  - In combination with auto scaling, it’s easier to optimize cost under stable load.
Cloud Bigtable also charges based on nodes and throughput; Firestore and Cosmos DB bill based on request counts or RUs. Across these systems, cost is controlled by how accurately you size throughput.
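As a back-of-the-envelope aid for the provisioned model, DynamoDB’s published unit definitions (1 RCU = one strongly consistent read per second of up to 4 KB, halved for eventually consistent reads; 1 WCU = one standard write per second of up to 1 KB) can be turned into a small calculator. The function names are mine:

```python
import math

# Rough RCU/WCU sizing from DynamoDB's published unit definitions:
#   1 RCU = one strongly consistent read/sec of up to 4 KB
#           (an eventually consistent read costs half),
#   1 WCU = one standard write/sec of up to 1 KB.
def read_capacity_units(reads_per_sec, item_kb, strongly_consistent=True):
    units_per_read = math.ceil(item_kb / 4)   # each read is billed in 4 KB steps
    total = reads_per_sec * units_per_read
    return total if strongly_consistent else math.ceil(total / 2)

def write_capacity_units(writes_per_sec, item_kb):
    return writes_per_sec * math.ceil(item_kb)  # writes are billed in 1 KB steps

# 500 eventually consistent reads/sec of 6 KB items:
# ceil(6/4) = 2 units each, halved for eventual consistency -> 500 RCUs.
print(read_capacity_units(500, 6, strongly_consistent=False))  # 500
# 100 writes/sec of 2.5 KB items: ceil(2.5) = 3 units each -> 300 WCUs.
print(write_capacity_units(100, 2.5))  # 300
```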
5.2 Cost Optimization Design Points
- Don’t go overboard creating GSIs/LSIs—they incur read/write charges too.
- If you pack too much into a single item, you’ll hit the 400 KB item size limit more easily; if you fragment too much, request counts go up. Finding a balance is key.
- For hot read paths, use DAX (DynamoDB Accelerator) or application-level caching to reduce RCUs.
6. Reliability, Security, and Global Distribution
6.1 Fault Tolerance and Backups
- Within an AWS region, DynamoDB is replicated across multiple AZs, making it resilient to single-AZ failures.
- It supports on-demand backups and point-in-time recovery (PITR), making it easy to recover from accidental deletes or updates.
Bigtable and Cosmos DB also assume redundant setups within and across regions, and Firestore lets you choose regional / multi-regional configurations. High availability is essentially a given for modern managed NoSQL services.
6.2 Global Tables and Multi-Region
- DynamoDB global tables automatically replicate data across selected regions and allow local writes in any region (“multi-active” architecture).
- Cosmos DB likewise offers global distribution and several consistency levels, making it a popular choice for globally deployed SaaS.
6.3 Security and Access Control
- You can use IAM and resource-based policies to control access at the table and item level.
- Encryption with KMS and private access via VPC endpoints are also available.
Firestore and Cosmos DB provide IAM roles, role-based access control, and key-based auth. In all cases, the design principle is “grant least privilege per application identity.”
7. Design Example: Single-Table E-Commerce
Let’s walk through a simple design example using a single table.
7.1 Table Structure Overview
Table name: ecommerce
- PK (partition key): `PK`
- SK (sort key): `SK`
Example items:
| PK | SK | type | attributes |
|---|---|---|---|
| USER#123 | PROFILE | USER | name, email, created_at |
| USER#123 | ORDER#2025-11-20T10:00Z | ORDER | total, status, items… |
| USER#123 | ORDER#2025-11-21T09:00Z | ORDER | … |
| PRODUCT#ABC | META | PRODUCT | title, price, stock |
| PRODUCT#ABC | INVENTORY#TOKYO | INV | quantity |
| PRODUCT#ABC | INVENTORY#OSAKA | INV | quantity |
With this setup:
- “Order history for user 123”:
  - Query items where `PK = USER#123`, sorted by `SK` descending.
- “Product ABC info and inventory”:
  - Query items where `PK = PRODUCT#ABC` and get all item types for that product.
This design allows you to express common access patterns in a single table naturally.
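To see why a single table suffices, here is an in-memory stand-in for those two queries (no AWS calls; plain dicts and a `query` helper of my own naming). Filtering on the partition key plus a sort-key prefix mirrors what a DynamoDB `Query` with a key condition expression does:

```python
# In-memory stand-in for the single-table example above.
items = [
    {"PK": "USER#123", "SK": "PROFILE", "type": "USER"},
    {"PK": "USER#123", "SK": "ORDER#2025-11-20T10:00Z", "type": "ORDER"},
    {"PK": "USER#123", "SK": "ORDER#2025-11-21T09:00Z", "type": "ORDER"},
    {"PK": "PRODUCT#ABC", "SK": "META", "type": "PRODUCT"},
    {"PK": "PRODUCT#ABC", "SK": "INVENTORY#TOKYO", "type": "INV"},
]

def query(pk, sk_prefix="", descending=False):
    # Emulates Query: exact match on PK, optional begins_with on SK,
    # results ordered by SK (descending ~ ScanIndexForward=False).
    hits = [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(hits, key=lambda i: i["SK"], reverse=descending)

orders = query("USER#123", sk_prefix="ORDER#", descending=True)
product = query("PRODUCT#ABC")
print([o["SK"] for o in orders])  # newest order first
print(len(product))               # META + INVENTORY#TOKYO
```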
7.2 GSI Example: Recent Orders per Product
If you want to see recent orders by product ID, you can create a GSI to provide a different “view.”
- GSI1
  - `GSI1PK = PRODUCT#<productId>`
  - `GSI1SK = ORDER#<timestamp>`

Add `GSI1PK` and `GSI1SK` attributes to order items, and query GSI1 when you need product-centric order lists.
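Feeding the GSI is just a matter of writing those extra attributes on each order item; DynamoDB then propagates them to the index automatically. A minimal sketch (the helper name and the `product_id`/`timestamp` fields are assumptions about the item shape):

```python
# Sketch: a GSI is "fed" simply by storing extra attributes on the base item.
# GSI1PK/GSI1SK follow the naming convention used above.
def with_gsi1_keys(order: dict) -> dict:
    enriched = dict(order)  # leave the original item untouched
    enriched["GSI1PK"] = f"PRODUCT#{order['product_id']}"
    enriched["GSI1SK"] = f"ORDER#{order['timestamp']}"
    return enriched

order = {
    "PK": "USER#123",
    "SK": "ORDER#2025-11-20T10:00Z",
    "product_id": "ABC",
    "timestamp": "2025-11-20T10:00Z",
}
item = with_gsi1_keys(order)
print(item["GSI1PK"])  # PRODUCT#ABC
```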
In Firestore, you would use collections / subcollections + composite indexes; in Cosmos DB, partition keys + indexes can achieve similar views.
8. Common Pitfalls and How to Avoid Them
8.1 Designing the Schema with an RDB Mindset
- If you apply “normalize tables first and figure out queries later” directly, you’ll often end up with:
  - No joins to reassemble the normalized tables
  - Exploding numbers of GSIs for each access pattern
  - Soaring throughput and costs
- Countermeasure:
  - List all access patterns per screen or API before designing your schema.
  - Then derive table structure, keys, and indexes from those patterns.
8.2 Skewed Partition Keys
- This is hard to catch in small dev or test environments with few users, but once you go to production, key hot spots will cause throttling and higher latency.
- Countermeasure:
  - Use CloudWatch metrics and logs to visualize key distribution.
  - Revisit key design early, using hashing or sharding buckets as needed.
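A rough skew check along these lines can also be run offline over request logs before CloudWatch flags a problem; in this sketch the 20% threshold is an arbitrary assumption, and the samples are synthetic:

```python
from collections import Counter

# Flag partition-key values that account for an outsized share of traffic --
# candidates for hot partitions once load grows.
def hot_keys(samples, threshold=0.2):
    counts = Counter(samples)
    total = len(samples)
    return [k for k, c in counts.most_common() if c / total >= threshold]

# Synthetic log: one user generates 80% of requests.
samples = ["USER#1"] * 80 + ["USER#2"] * 10 + ["USER#3"] * 10
print(hot_keys(samples))  # ['USER#1']
```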
8.3 Overusing GSIs
- GSIs are powerful, but every write also updates the GSIs, increasing cost and latency.
- Countermeasure:
  - Ask whether you truly need a given view for real-time OLTP, or if analytics could be offloaded to another store.
  - For infrequent access patterns, consider separate tables or S3 + Athena instead of extra GSIs.
9. Who Benefits and How? (Concrete Benefits by Role)
9.1 Backend / Application Developers
- You’ll understand DynamoDB, Firestore, Cosmos DB, and other NoSQL systems not as “mysterious NoSQL” but as databases that must be designed around access patterns.
- This lets you more confidently propose data models that are:
  - Robust under scale
  - Resilient to change
  - Cost-predictable
9.2 SREs, Infrastructure Engineers, and Data Platform Teams
- You’ll have a clearer picture of operational concerns: partition design, capacity models, global tables, caching, etc.
- By comparing with Bigtable and Cosmos DB, you’ll more easily decide which workload belongs on which service.
9.3 Architects, Tech Leads, and CTOs
- You’ll gain a perspective on dividing a formerly RDBMS-centric system into “strongly transactional areas that stay on RDS / Aurora” and “high-throughput, scale-focused areas that move to DynamoDB / Bigtable / Cosmos DB”; in other words, a hybrid data layer architecture.
- For multi-cloud strategies, you’ll be able to reason about similarities and differences among the clouds’ NoSQL offerings in a structured way.
10. Three Things You Can Do Today
1. Write down 10 read/write patterns for your own service.
   - For each screen or API, roughly note: “Which key? How many reads/writes?”
2. Pick 1–2 of those patterns and map them to a DynamoDB table design.
   - On paper, sketch `PK` / `SK` / candidate GSIs / example items. This alone helps you grasp the design space.
3. Ask yourself how the same design would look in Bigtable / Firestore / Cosmos DB.
   - Try mapping to row keys, collections/documents, partition keys/containers, etc. This builds a cloud-agnostic sense of good design.
11. Conclusion: DynamoDB as an Entry Point to “Access-Pattern-First” Design
Amazon DynamoDB is:
- A serverless NoSQL engine that scales to internet scale, and
- A system where performance and cost are largely determined by partition key / sort key / index design, and
- A database that requires you to design tables from access patterns, a shift from the RDB mindset.
At the same time, GCP’s Bigtable / Firestore and Azure Cosmos DB each have their strengths. There’s no single “right answer”—the best choice depends on your use case and organizational context.
What matters most is:
- “How often, by whom, and in what form will this data be read and written?”
- “How far do we expect this to scale, and how strong does consistency need to be?”
Think about those questions carefully and then work backward to choose DynamoDB or other NoSQL services.
If you start small—with a tiny table or a PoC—you’ll gradually develop a feel for “data modeling that isn’t scary even at scale.”
Take your time, enjoy the process, and turn DynamoDB and the other cloud NoSQL offerings into tools that work for you, not against you.
References (Official Docs and Guides)
- Amazon DynamoDB Product Page
- What is Amazon DynamoDB? (Developer Guide)
- DynamoDB Features (Key-Value & Document Models)
- Google Cloud Bigtable Overview
- Cloud Firestore Overview
- Azure Cosmos DB Overview
Note: Pricing, limits, and new features can change over time. When designing or deploying real systems, always check the latest official documentation and pricing pages.
