AWS SNS

Amazon SNS Deep Dive: A Practical “Pub/Sub Design” Guide Compared with Cloud Pub/Sub and Azure Event Grid / Service Bus

Introduction (Key Takeaways)

  • This article focuses on Amazon Simple Notification Service (Amazon SNS) and,
    by comparing it with Google Cloud Pub/Sub, Azure Event Grid, and Azure Service Bus Topics,
    lays out how to design event-driven, Pub/Sub-style messaging.
  • Amazon SNS is a fully managed Pub/Sub service that supports both A2A (application-to-application) and A2P (application-to-person) messaging.
    Messages published to a topic can be fanned out to multiple SQS queues, Lambda functions, HTTPS endpoints, email, SMS, and mobile push notifications.
  • On GCP, Cloud Pub/Sub is widely used as a high-throughput asynchronous messaging backbone for microservice integration and data pipelines.
  • On Azure, two services sit at roughly the same layer as SNS:
    • Event Grid: a serverless Pub/Sub broker focused on event routing
    • Service Bus Topics: an enterprise-grade, feature-rich message broker (queues + topics)
  • This article is written for:
    • Backend engineers who want to adopt an event-driven architecture with microservices or serverless designs
    • SREs and architects who understand SQS and queues to some extent, but want to sort out the differences between SNS / Pub/Sub / Event Grid / Service Bus Topics
    • Product developers who want to send user notifications (email / SMS / mobile push) smartly via the cloud
  • By the end of this article, you should be able to explain decisions like these in your own words:
    • “This event goes through SNS topic → SQS fan-out”
    • “For log streams and analytics, we’ll use Pub/Sub / Event Grid / Service Bus Topics”
    • “User-facing alerts are sent via SNS email / SMS / mobile push”

1. What Is Amazon SNS? AWS’s Pub/Sub Hub Bridging A2A and A2P

1.1 Role and Characteristics of SNS

Amazon SNS is a fully managed Pub/Sub (Publish / Subscribe) messaging service provided by AWS.

To put it a bit more casually:

“You publish events or notifications to a ‘topic,’
and SNS acts like a delivery center that broadcasts them to multiple destinations (services or humans) that subscribed to that topic.”

Its main characteristics are:

  • A2A (Application-to-Application) messaging
    • Share events between microservices
    • Fan-out from SNS to SQS / Lambda / Amazon Data Firehose (formerly Kinesis Data Firehose) / HTTP(S) endpoints, etc.
  • A2P (Application-to-Person) notifications
    • Deliver notifications to end users via email, SMS, and mobile push (APNs / FCM, etc.)
  • Fully managed and scalable
    • AWS handles scaling and redundancy of topics and subscriptions
  • Deep integration
    • Many AWS services (CloudWatch Alarms, Auto Scaling, S3, Lambda, EventBridge, etc.) can directly send notifications to SNS topics.

1.2 Endpoint Types Supported by SNS

The main types of endpoints you can configure as subscribers to an SNS topic are:

  • Amazon SQS queues
  • AWS Lambda functions
  • Amazon Data Firehose delivery streams
  • HTTP / HTTPS endpoints (e.g., webhooks)
  • Email (plain text or JSON)
  • SMS (mobile text messages)
  • Mobile push notifications (APNs / FCM / ADM, etc.)

On GCP and Azure, the A2P side (email / SMS / push) is often handled by separate services, so
it’s fair to say that “one service handling both A2A and A2P” is a distinctive feature of SNS.


2. Positioning SNS vs. Equivalent Services on Other Clouds

Let’s briefly align mental models by looking at comparable services across clouds.

2.1 Google Cloud Pub/Sub

  • Google Cloud Pub/Sub is GCP’s representative Pub/Sub messaging backbone.
    • Publishers publish events to topics.
    • Subscribers receive messages from subscriptions via pull or push.
  • Key characteristics:
    • Very high throughput (Google describes it as built on the same infrastructure that backs products such as Search and Ads)
    • Supports both push and pull
    • Strong integration with BigQuery, Dataflow, Cloud Functions, Cloud Run

In terms of intent, SNS (+ SQS) and Cloud Pub/Sub are quite similar as systems that
“loosely couple applications around events.”

2.2 Azure Event Grid & Service Bus Topics

  • Azure Event Grid
    • An event broker that routes events from Azure resources or SaaS to various handlers
      (Azure Functions, Webhooks, Queues, Event Hubs, etc.).
  • Azure Service Bus Topics
    • The Pub/Sub feature of Azure Service Bus. Messages sent to a topic can be received by multiple subscriptions that each apply filters.

Roughly mapping them:

  • SNS topics → Azure Service Bus Topics / GCP Pub/Sub topics
  • EventBridge → Azure Event Grid / (and sometimes) Cloud Pub/Sub

Remembering this mapping helps keep the picture clear.


3. Core Concepts in SNS: Topics, Subscriptions, and Message Attributes

3.1 Topics

  • The central unit in SNS is the topic.
  • It’s common to split topics along “event type” or “domain” axes, such as:
    • orders (order-related events)
    • user-signup (user registration completed)
    • system-alerts (alert notifications)

Publishers (senders) call the Publish API on these topics.

3.2 Subscriptions

  • A subscription is how you tell SNS “please deliver messages from this topic to this destination.”
  • For a single topic, you can subscribe multiple endpoints, such as:
    • SQS Queue A, SQS Queue B
    • A Lambda function
    • A Slack webhook
    • An admin notification email

This enables a pattern where “one event can be processed by multiple systems and humans in their own ways.”

3.3 Message Attributes and Filtering

SNS lets you attach message attributes to messages, separate from the message body.

Example:

  • event_type = "order_created"
  • priority = "high"
  • region = "ap-northeast-1"

Using this, you can configure filter policies on subscriptions so that:

  • Only messages with priority = "high" go to an alert SQS queue
  • Only messages with region = "eu-west-1" go to an EU processing system

This makes fine-grained routing possible.
Conceptually it’s very similar to subscription filters in Azure Service Bus Topics or event filters in Azure Event Grid.
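
The attribute-based routing above can be sketched in Python as follows. This is a minimal illustration, not a full implementation: the topic ARN is a made-up placeholder, `build_publish_kwargs` and `matches` are hypothetical helpers, and the local matcher only covers exact string matching, whereas real SNS filter policies support more operators. The dictionary returned by `build_publish_kwargs` is shaped so it could be passed to boto3 as `sns.publish(**kwargs)`.

```python
import json


def to_message_attributes(attrs):
    """Convert a flat dict of strings into the SNS MessageAttributes shape."""
    return {
        name: {"DataType": "String", "StringValue": value}
        for name, value in attrs.items()
    }


def build_publish_kwargs(topic_arn, body, attrs):
    """Build keyword arguments for a boto3 sns.publish(**kwargs) call."""
    return {
        "TopicArn": topic_arn,
        "Message": json.dumps(body),
        "MessageAttributes": to_message_attributes(attrs),
    }


def matches(filter_policy, attrs):
    """Simplified local check of a filter policy against message attributes.

    Every key in the policy must be present in attrs, and its value must be
    one of the listed strings. Operators such as prefix or numeric matching
    are deliberately ignored in this sketch.
    """
    return all(attrs.get(key) in allowed for key, allowed in filter_policy.items())


# Hypothetical ARN, for illustration only.
TOPIC_ARN = "arn:aws:sns:ap-northeast-1:123456789012:orders"

# Filter policies for the two subscriptions described in the text.
ALERT_QUEUE_POLICY = {"priority": ["high"]}
EU_POLICY = {"region": ["eu-west-1"]}

attrs = {
    "event_type": "order_created",
    "priority": "high",
    "region": "ap-northeast-1",
}
kwargs = build_publish_kwargs(TOPIC_ARN, {"order_id": "o-1"}, attrs)

# On a subscription, the policy is set as a JSON string under the
# "FilterPolicy" subscription attribute.
filter_policy_attribute = json.dumps(ALERT_QUEUE_POLICY)
```

With these sample values, the message would reach the alert queue (priority is "high") but not the EU processing system (region is not "eu-west-1").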


4. Representative Use Cases: How to Use SNS

4.1 Application-to-Application Event Distribution (A2A)

Use case example: order event fan-out

  • When an order is placed on an e-commerce site, an event is published to the orders topic.
  • Subscriptions might include:
    • billing-queue SQS (billing process)
    • shipping-queue SQS (shipping process)
    • analytics-queue SQS (analytics data pipeline)
    • send-order-email Lambda (user-facing order confirmation emails)

With a single order event,
multiple backend processes can independently kick off.

On GCP, you’d implement a similar design with a Pub/Sub topic plus multiple subscriptions;
on Azure, Service Bus Topics + multiple subscriptions or Event Grid + Functions/Queues would achieve the same pattern.
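
As a rough sketch of the AWS side of this fan-out, the subscriptions could be described as a list of request dictionaries, one per destination, each shaped for boto3's `sns.subscribe(**kwargs)`. All ARNs below are invented placeholders and `build_fanout_subscriptions` is a hypothetical helper. In a real setup, each queue also needs an access policy that lets the topic send to it, and the Lambda function needs a resource-based permission for SNS to invoke it.

```python
def build_fanout_subscriptions(topic_arn, queue_arns, lambda_arns=()):
    """Return one sns.subscribe(**kwargs) request per destination."""
    requests = []
    for arn in queue_arns:
        requests.append({
            "TopicArn": topic_arn,
            "Protocol": "sqs",
            "Endpoint": arn,
            # Deliver the published body as-is instead of the SNS JSON envelope.
            "Attributes": {"RawMessageDelivery": "true"},
            "ReturnSubscriptionArn": True,
        })
    for arn in lambda_arns:
        requests.append({
            "TopicArn": topic_arn,
            "Protocol": "lambda",
            "Endpoint": arn,
            "ReturnSubscriptionArn": True,
        })
    return requests


# Hypothetical account ID, region, and resource names.
ACCOUNT = "123456789012"
REGION = "ap-northeast-1"
ORDERS_TOPIC = f"arn:aws:sns:{REGION}:{ACCOUNT}:orders"
QUEUES = [
    f"arn:aws:sqs:{REGION}:{ACCOUNT}:{name}"
    for name in ("billing-queue", "shipping-queue", "analytics-queue")
]
LAMBDAS = [f"arn:aws:lambda:{REGION}:{ACCOUNT}:function:send-order-email"]

subscription_requests = build_fanout_subscriptions(ORDERS_TOPIC, QUEUES, LAMBDAS)

# Applying the plan with boto3 would look roughly like this:
#   sns = boto3.client("sns")
#   for request in subscription_requests:
#       sns.subscribe(**request)
```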

4.2 System Alerts and Monitoring Integration

  • When a CloudWatch Alarm breaches a threshold:
    • It sends a notification to an SNS topic,
    • to which admin email addresses, Slack webhooks, on-call SMS numbers, etc. are subscribed.
  • Many services (Auto Scaling, RDS, S3, CodeBuild, CodeDeploy, etc.) can emit events via SNS.

On GCP, you can do something similar with Monitoring alerts → Cloud Pub/Sub.
On Azure, Monitor alerts → Event Grid / Service Bus works in a similar fashion.

4.3 User Notifications (Email, SMS, Mobile Push)

SNS also covers A2P scenarios:

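  • Email notifications
    • Operational and opt-in notices to subscribed addresses (each recipient must confirm the subscription before SNS delivers to it)
    • For transactional mail to arbitrary customers, such as order confirmations or password resets, a dedicated email service such as Amazon SES is usually the better fit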
  • SMS notifications
    • Two-factor authentication codes or critical alerts
  • Mobile push notifications
    • App update announcements, campaign messages, push-based engagement

On GCP/Azure, email/SMS/mobile push usually require separate services or external providers,
so one realistic approach is to “let SNS handle all notification flows in early development stages.”

4.4 IoT, Logs, and Event Stream Hubs

You can aggregate events from IoT devices and microservices into an SNS topic and then:

  • Send some of them to batch processing via SQS
  • Route some to real-time detection via Lambda
  • Stream some to a data lake via Amazon Data Firehose

This is a hub-and-spoke event distribution pattern.
On GCP the equivalent would be Pub/Sub → Dataflow / BigQuery; on Azure, Event Grid / Event Hubs / Service Bus → Stream Analytics / Synapse play a similar role.


5. Architecture Patterns: Combining SNS with SQS / Lambda / EventBridge

5.1 SNS + SQS: Fan-out + Buffering

The classic pattern with SNS and SQS is “SNS for distribution, SQS for buffering and retries.”

  • SNS topic orders
    • Multiple SQS queues subscribed to it
  • Workers (Lambda / ECS / EKS) pull from each queue
    • Each system processes messages at its own pace
  • With visibility timeouts and DLQs configured on SQS:
    • If one consumer is slow or failing, the impact on others is minimized.

On GCP, you often build a similar pattern using Pub/Sub → Cloud Run / GKE / Cloud Functions, choosing between push/pull.
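
Two details of the SNS + SQS pattern tend to trip people up: the queue must explicitly allow the topic to send to it, and (unless raw message delivery is enabled) the SQS message body is an SNS JSON envelope whose `Message` field holds the original payload. The sketch below illustrates both, plus a redrive policy for the DLQ. The ARNs are placeholders and the helper names are hypothetical.

```python
import json


def queue_policy_allowing_topic(queue_arn, topic_arn):
    """SQS access policy that lets one specific SNS topic send messages."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowSnsTopicToSend",
            "Effect": "Allow",
            "Principal": {"Service": "sns.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
        }],
    }


def redrive_policy(dlq_arn, max_receive_count=5):
    """Value for the SQS "RedrivePolicy" queue attribute (a JSON string)."""
    return json.dumps({
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    })


def unwrap_sns_envelope(sqs_body):
    """Return the original payload from an SQS message body.

    Without raw message delivery, the body is a JSON envelope with
    "Type": "Notification" and the published text in "Message".
    With raw delivery, the body is already the published payload.
    """
    parsed = json.loads(sqs_body)
    if isinstance(parsed, dict) and parsed.get("Type") == "Notification":
        return json.loads(parsed["Message"])
    return parsed


# Hypothetical ARNs, for illustration only.
TOPIC_ARN = "arn:aws:sns:ap-northeast-1:123456789012:orders"
QUEUE_ARN = "arn:aws:sqs:ap-northeast-1:123456789012:billing-queue"
DLQ_ARN = "arn:aws:sqs:ap-northeast-1:123456789012:billing-queue-dlq"

# These two values would be set on the queue, e.g. via
# sqs.set_queue_attributes(QueueUrl=..., Attributes=queue_attributes).
queue_attributes = {
    "Policy": json.dumps(queue_policy_allowing_topic(QUEUE_ARN, TOPIC_ARN)),
    "RedrivePolicy": redrive_policy(DLQ_ARN),
}
```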

5.2 SNS + Lambda: Lightweight Event-Driven Processing

If you subscribe a Lambda function directly to an SNS topic:

  • When an event occurs, it’s published to an SNS topic
  • A corresponding Lambda function is immediately triggered to run lightweight processing

This yields a fully serverless event-driven flow.

Example: for a user-signup topic that receives registration-completed events, a Lambda function can send welcome emails, register the user in CRM, and forward events to analytics tools.

Azure Functions + Event Grid and Cloud Functions + Pub/Sub on GCP implement the same model.
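
For the AWS variant, a Lambda handler subscribed to the user-signup topic might look roughly like the sketch below. Lambda delivers SNS notifications under `event["Records"]`, with the published text in `record["Sns"]["Message"]`. The `handle_signup` function and the `user_id` field are assumptions made for this example, and the sample event is trimmed to the fields the handler reads.

```python
import json


def handle_signup(body):
    """Hypothetical business logic; a real one would send the welcome email."""
    return f"welcome email queued for {body['user_id']}"


def lambda_handler(event, context):
    """Entry point for a Lambda function subscribed to an SNS topic."""
    results = []
    for record in event["Records"]:
        sns = record["Sns"]
        # The publisher is assumed to send a JSON document as the message.
        body = json.loads(sns["Message"])
        results.append(handle_signup(body))
    return {"processed": len(results), "results": results}


# Trimmed-down sample of an SNS-triggered Lambda event.
sample_event = {
    "Records": [{
        "EventSource": "aws:sns",
        "Sns": {
            "MessageId": "m-1",
            "Message": json.dumps({"user_id": "u-42"}),
            "MessageAttributes": {},
        },
    }]
}

result = lambda_handler(sample_event, None)
# result == {"processed": 1, "results": ["welcome email queued for u-42"]}
```

If several independent actions hang off the same event (welcome email, CRM registration, analytics), separate subscribers per action usually isolate failures better than one handler doing everything.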

5.3 SNS + EventBridge: Dividing Roles for Complex Routing

EventBridge is also an event broker, but its role differs somewhat from SNS:

  • SNS
    • Simple topic-based Pub/Sub
    • Covers A2P notifications (email/SMS/push)
  • EventBridge
    • An “event bus” for events from AWS services, SaaS, and custom apps
    • Routes events based on pattern matching

You can adopt a division of labor where:

  • Simple Pub/Sub stays on SNS,
  • Complex routing and SaaS integration migrate to EventBridge.

On Azure, Event Grid is closer to EventBridge, while Service Bus Topics are closer to SNS + SQS.
On GCP, Cloud Pub/Sub plays a role that overlaps with both SNS and a part of EventBridge’s functionality.


6. Design Considerations: Message Modeling, Retries, and Security

6.1 Message Structure and Schema

The SNS message body is arbitrary text (usually JSON).

Some design tips:

  • Always include at least an event ID and a schema version
    • e.g., event_id, event_type, occurred_at, schema_version
  • Consider whether heavy payloads should be referenced by ID instead of embedded
    • Either embed all details in the message, or
    • Just include IDs and let consumers fetch details from a database
    • Balance this against throughput and size limits (SNS messages are up to about 256 KB).
  • For future changes:
    • Plan how you evolve schema_version.
    • Decide how long you’ll support older versions.

Thinking about schema evolution early saves a lot of pain later.
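
One possible shape for such an event envelope is sketched below. The field names follow the list above, while the `payload` key, the `build_event` and `serialize_for_sns` helpers, and the exact byte constant are assumptions for illustration. The size check is approximate, so it is worth leaving headroom below the limit, for example for message attributes.

```python
import json
import uuid
from datetime import datetime, timezone

# Approximate SNS message size limit (256 KB).
MAX_SNS_BYTES = 256 * 1024


def build_event(event_type, payload, schema_version=1):
    """Wrap a payload in a versioned envelope with a unique event ID."""
    return {
        "event_id": str(uuid.uuid4()),
        "event_type": event_type,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "schema_version": schema_version,
        "payload": payload,
    }


def serialize_for_sns(event):
    """Serialize the event and reject it if it is too large to publish."""
    data = json.dumps(event, separators=(",", ":"))
    size = len(data.encode("utf-8"))
    if size > MAX_SNS_BYTES:
        raise ValueError(
            f"event is {size} bytes; send an ID and let consumers fetch details"
        )
    return data


# Small event: IDs only, consumers look up the rest.
event = build_event("order_created", {"order_id": "o-1"})
message = serialize_for_sns(event)
```

Consumers can then branch on `schema_version` and keep handling older versions for as long as the team has agreed to support them.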

6.2 Idempotency and Retries

With SNS → SQS or SNS → Lambda combinations,
you must design assuming messages can be delivered more than once due to network glitches or failures.

  • Give each event an event_id, and
    • Store processed IDs in a DB or cache to avoid double-processing.
  • When calling external APIs:
    • Use a request ID so retries aren’t charged or applied twice.
  • Write Lambda handlers such that receiving the same event multiple times doesn’t change the final outcome (idempotency).

GCP Pub/Sub and Azure Service Bus are also fundamentally “at-least-once” delivery, so the exact same mindset applies.
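
A minimal sketch of the "already processed?" check follows. The in-memory `ProcessedStore` is only a stand-in for a durable store (for example, a database table with a unique key on `event_id`), and `charge_customer` is a hypothetical side effect. Note the trade-off stated in the comments: this version marks the event before running the side effect, which prevents duplicates but needs extra care if the process can crash in between.

```python
class ProcessedStore:
    """In-memory stand-in for a durable store of processed event IDs."""

    def __init__(self):
        self._seen = set()

    def mark_if_new(self, event_id):
        """Record the ID and return True only the first time it is seen."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True


def process_once(event, store, side_effect):
    """Run side_effect for an event unless its event_id was already handled.

    The ID is marked before the side effect runs. A production version would
    make the mark and the side effect atomic, or pass the event_id to the
    downstream API as a request ID so the retry itself is safe.
    """
    if not store.mark_if_new(event["event_id"]):
        return False
    side_effect(event)
    return True


charges = []


def charge_customer(event):
    """Hypothetical side effect that must not run twice for one event."""
    charges.append(event["event_id"])


store = ProcessedStore()
event = {"event_id": "evt-001", "event_type": "payment_requested"}

first = process_once(event, store, charge_customer)   # runs the side effect
second = process_once(event, store, charge_customer)  # duplicate delivery, skipped
```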

6.3 Security and Access Control

SNS allows fine-grained access control via IAM policies and resource policies:

  • Who is allowed to publish to a topic
  • Which accounts or services can subscribe
  • Cross-account subscriptions (e.g., sending to SQS queues in another AWS account)

You can also enable encryption with KMS and use VPC endpoints (PrivateLink) for private access,
which helps meet stricter enterprise security requirements.

On Azure, Service Bus / Event Grid use Entra ID (formerly Azure AD) and RBAC;
on GCP, Pub/Sub uses IAM roles. The common principle is:
“explicitly define who can publish/subscribe per topic/event bus.”
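
On the SNS side, a topic resource policy along these lines would express "who can publish and who can subscribe." It is a sketch under assumptions: the account IDs and topic ARN are placeholders, `topic_policy` is a hypothetical helper, and granting access to an account root principal is broader than most teams would want in production, where specific roles are preferable.

```python
import json


def topic_policy(topic_arn, publisher_account_id, subscriber_account_id):
    """Resource policy: one account may publish, another may subscribe SQS queues."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPublishFromAccount",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{publisher_account_id}:root"},
                "Action": "sns:Publish",
                "Resource": topic_arn,
            },
            {
                "Sid": "AllowSqsSubscribeFromAccount",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{subscriber_account_id}:root"},
                "Action": "sns:Subscribe",
                "Resource": topic_arn,
                # Restrict cross-account subscriptions to the SQS protocol.
                "Condition": {"StringEquals": {"sns:Protocol": "sqs"}},
            },
        ],
    }


# Hypothetical values, for illustration only.
TOPIC_ARN = "arn:aws:sns:ap-northeast-1:111111111111:orders"
policy = topic_policy(TOPIC_ARN, "111111111111", "222222222222")

# The policy is attached as a JSON string, roughly:
#   sns.set_topic_attributes(
#       TopicArn=TOPIC_ARN, AttributeName="Policy", AttributeValue=policy_json)
policy_json = json.dumps(policy)
```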


7. High-Level Comparison with Other Clouds: Different “Personalities”

Let’s compare them in words:

  • Amazon SNS
    • Pub/Sub + A2P notifications (email / SMS / mobile push)
    • Fan-out to SQS / Lambda / HTTP / Amazon Data Firehose, etc.
    • Simple topic-based design
  • Amazon SQS (quick recap)
    • 1-to-1 queuing for worker load balancing
    • Combined with SNS for buffered fan-out to many consumers
  • Google Cloud Pub/Sub
    • High-throughput, low-latency Pub/Sub backbone
    • Supports push and pull
    • Deep integration with many GCP services
  • Azure Event Grid
    • An event bus for events from Azure resources and SaaS
    • Handlers include Functions, Webhooks, Queues, etc.
  • Azure Service Bus Topics
    • A more “message broker”-style service emphasizing transactions, sessions, and high reliability

For everyday web service development:

  • On AWS: SNS + SQS + Lambda
  • On GCP: Cloud Pub/Sub + Cloud Run / Functions
  • On Azure: Event Grid + Functions / Service Bus Topics

are good “standard patterns” to keep in mind,
and they help you reason consistently across clouds.


8. Common Pitfalls and How to Avoid Them

8.1 Trying to “Do Everything with One SNS Topic”

  • If you attach too many things to a single SNS topic—email, SMS, push, webhooks, SQS, Lambda, etc.—
    it becomes harder to see:
    • Which events are going where
    • Where failures are happening
  • Mitigation:
    • Split topics by usage (user notifications, inter-service events, alerts, etc.)
    • Use clear naming and tagging so operations remain understandable.

8.2 Splitting Topics Too Much Instead of Using Message Attributes

  • If you create a separate topic for every variant like
    order-created-high-priority, order-created-low-priority, etc.,
    management becomes a nightmare.
  • Mitigation:
    • First split topics by domain,
    • Then use message attributes + subscription filters for finer branching.

8.3 Ignoring Idempotency

  • If you assume messages will arrive exactly once and code accordingly,
    you risk double billing, duplicate emails, and other nasty incidents when duplicates occur.
  • Mitigation:
    • Treat every event handler as idempotent from the start.
    • At minimum, implement event_id-based “already processed?” checks.

8.4 Not Watching Logs and Metrics

  • If you aren’t looking at SNS metrics (delivery success/failure), or SQS/Lambda metrics,
    you may overlook that:
    • One of the destinations is failing continuously
    • Consumers are falling behind
  • Mitigation:
    • Set up CloudWatch metrics and alarms so that:
      • You get notified when delivery failures spike
      • You get notified when DLQs are receiving messages

9. Who Benefits and How? (Concrete Benefits per Reader Persona)

9.1 For Backend and API Developers

  • Instead of connecting services purely via synchronous APIs,
    you gain the option to:
    • “Emit an event to SNS and let each system process it on its own.”
  • This makes it easier to:
    • Add new services by simply subscribing them to existing events, without changing existing APIs
    • Smooth out traffic spikes by inserting SQS buffers behind SNS

In other words, it helps you implement highly extensible designs.

9.2 For SREs and Platform Engineers

  • Once you understand SNS / SQS / EventBridge / Cloud Pub/Sub / Event Grid / Service Bus Topics side by side,
    it becomes easier to define standard rules for event flows across clouds:
    • “Alerts go here, business events go there, logs go over here.”
  • Combined with monitoring, logging, and tracing, this is the foundation of an observable system:
    • one where you can track which path each event took and how far it got.

9.3 For Product Managers and Tech Leads

  • Instead of adding new direct API integrations for every new feature or external service,
    you can shift to a style where:
    • “New services just subscribe to events that are already emitted.”
  • As your product grows and becomes more complex, this reduces architectural tangling.
  • When discussing multi-cloud strategies, being able to say
    “On AWS we use SNS / EventBridge, on GCP we use Pub/Sub, on Azure we use Event Grid / Service Bus”
    helps you justify your technology choices to stakeholders.

9.4 For Startup CTOs and Small Team Leads

  • In early stages, a stack as simple as
    • “API Gateway + Lambda + SNS + SQS”
      can provide a powerful event-driven foundation.
  • With a small team, you can:
    • “First emit events into a single SNS topic, then add subscribers only when needed.”
  • This approach lets you keep the architecture simple for now,
    while still leaving a path open for future feature growth and microservice decomposition.

10. Three Steps You Can Take Today

  1. List out “things that could become events.”
    • e.g., user registration completed, password reset, order confirmed, payment succeeded, error occurred, etc.
    • Write down what the topic names might be for each one.
  2. Create one small SNS + SQS / Lambda setup.
    • Create a demo-events topic,
      • subscribe an SQS queue and a Lambda function to it,
      • publish a test message and confirm both endpoints react.
  3. Mentally port this design to GCP / Azure.
    • Ask yourself: “How would I build the same system with Cloud Pub/Sub or Event Grid / Service Bus?”
    • This nurtures a cloud-agnostic sense for Pub/Sub design.

11. Summary: SNS as an Entry Point to “Putting Events at the Center”

Amazon SNS is:

  • A Pub/Sub backbone for loosely coupling applications around events, and
  • At the same time, a hub for user notifications (email / SMS / mobile push).

The same idea applies to GCP Cloud Pub/Sub and Azure Event Grid / Service Bus Topics.
Shifting your mindset from:

  • “Services call each other’s APIs directly”
    to
  • “Events sit in the middle, and each system subscribes to what it needs”

gradually transforms your architecture into something more resilient, extensible, and observable.

You don’t need a perfect event-driven architecture from day one.
Start by picking one spot in your existing system where you think,
“Maybe this flow alone could be event-ified,”
and try a small experiment with SNS + SQS / Lambda.

That first step will help you grow a “Pub/Sub mindset” that will hold up even in a multi-cloud era.


References (Official Docs, Mostly)

Note: Pricing, limits, and supported protocols change over time. Be sure to check the latest official documentation and pricing pages when designing and implementing your system.

By greeden
