AWS VPC

Comprehensive Guide to Amazon VPC: Master “Unbreakable Network Design” with a Practical Comparison to GCP VPC and Azure VNet

Introduction (Key Takeaways)

  • This long-form, practitioner-focused guide centers on Amazon VPC (Virtual Private Cloud) and compares it with GCP VPC and Azure Virtual Network (VNet) across terminology mapping, design philosophy, secure internet reachability, hybrid connectivity, observability, and cost.
  • Bottom line first: the “five decisions to make in the first week” are (1) address planning (IPv4/IPv6), (2) routing (hub-and-spoke), (3) boundaries (egress/ingress), (4) ID-based permissions (SG/IAM), and (5) audit (Flow Logs/Traffic Mirroring/CloudTrail). With these set, apps can be deployed reliably with minimal friction.
  • AWS’s strength is a broad set of building blocks (NAT Gateway, PrivateLink, Interface/Gateway Endpoints, Network Firewall, Transit Gateway) that offer high flexibility. GCP’s “single global VPC” concept is straightforward, with Cloud NAT and Private Service Connect kept simple. Azure excels at cohesion among VNet/Subnet/NSG/Route Table and enterprise integration via Private Endpoint/Firewall/Virtual WAN.
  • Included: CIDR design templates, minimum secure posture, route table boilerplates, SG/NSG term mapping, three hub-and-spoke patterns (described), an incident runbook, and Terraform conceptual samples.
  • Audience: enterprise network/cloud migration teams, SRE/platform engineers, app developers, and security/governance personnel. Ideal for streamlining stepwise on-prem migration and multi-account/multi-cloud connectivity in one go.

1. VPC Core Concepts and Cross-Cloud Terminology

A VPC is your private virtual network in the cloud. You define the CIDR (address range), carve out subnets, control paths with route tables, and restrict reachability with Security Groups (SG) / Network ACLs (NACL).

  • AWS: Create a VPC within a region, place subnets per AZ, and connect inside/out via IGW/NAT GW/Endpoints/TGW, etc.
  • GCP: The VPC is global; subnets are regional. Connect via Cloud Router/Cloud NAT/PSC/VPN/Interconnect.
  • Azure: VNet is regional; subnets live inside the VNet. Connect via NAT Gateway/Private Endpoint/Firewall/ExpressRoute/VPN.

Mental model:

  • Security is ID-first (SG/NSG/identity), routing is route tables, boundaries are NAT/Firewall/Endpoints.
  • Replace the physical “inside/outside” mindset with: “Which identity may reach which destination, over which path/crypto/audit?”

2. Address Planning (CIDR) and Naming: Decide Once, Never Change

2.1 CIDR Strategy

  • Avoid overlaps: Plan RFC1918 usage with hybrid/multi-cloud/mergers in mind.
  • Growth room: Allocate /16 (65,536 IPs) to a VPC; slice /19–/24 subnets (rule of thumb).
  • Separation by role: Fix subnets by purpose—App/DB/Shared/Ingress/Egress/Admin, etc.
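
The /16 rule of thumb can be made concrete with a few lines of Python. This is a minimal sketch using the standard ipaddress module; the helper name usable_hosts is an assumption for illustration. It relies on the fact that AWS reserves five addresses in every subnet (the first four and the last).

```python
import ipaddress

vpc = ipaddress.ip_network("10.64.0.0/16")

# Slice the VPC into /19 blocks, the large end of the /19-/24 range.
blocks = list(vpc.subnets(new_prefix=19))

# AWS reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_hosts(subnet: ipaddress.IPv4Network) -> int:
    """Addresses actually assignable to ENIs in an AWS subnet."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


print(vpc.num_addresses)        # 65536
print(len(blocks))              # 8
for block in blocks[:3]:
    print(block, usable_hosts(block))
```

A /16 yields eight /19 blocks, so fixing one block per role still leaves room to grow.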

2.2 Naming (examples)

  • VPC: {org}-{env}-{region}-vpc (e.g., acme-prod-apne1-vpc)
  • Subnets: {vpc}-{az}-{purpose} (e.g., acme-prod-apne1a-app)
  • RouteTable/NACL/SG: {env}-{layer}-{purpose} so intent is visible at a glance.
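
Naming rules hold up better when names are generated rather than typed. The sketch below is one possible interpretation of the templates above; the function names and the character check are assumptions, and the subnet form follows the example (region plus AZ letter) rather than embedding the full VPC name.

```python
import re

# Lowercase alphanumerics, optionally hyphen-separated (e.g. "web-in").
_PART = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def _check(*parts: str) -> None:
    for part in parts:
        if not _PART.match(part):
            raise ValueError(f"invalid name part: {part!r}")


def vpc_name(org: str, env: str, region: str) -> str:
    _check(org, env, region)
    return f"{org}-{env}-{region}-vpc"


def subnet_name(org: str, env: str, region: str, az: str, purpose: str) -> str:
    _check(org, env, region, az, purpose)
    return f"{org}-{env}-{region}{az}-{purpose}"


def layered_name(env: str, layer: str, purpose: str) -> str:
    """Route tables, NACLs, and SGs: {env}-{layer}-{purpose}."""
    _check(env, layer, purpose)
    return f"{env}-{layer}-{purpose}"


print(vpc_name("acme", "prod", "apne1"))
print(subnet_name("acme", "prod", "apne1", "a", "app"))
print(layered_name("prod", "app", "web-in"))
```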

2.3 Sample CIDR Allocation (described)

  • VPC: 10.64.0.0/16
    • Public (Ingress/Egress): 10.64.0.0/20 (evenly across three AZs a/c/e)
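    • App (private): 10.64.16.0/20 + 10.64.32.0/20 (two /20s; a /19 must start on a multiple of 32 in the third octet, e.g., 10.64.32.0/19)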
    • Data (private): 10.64.48.0/20
    • Shared/Admin: 10.64.64.0/20
      Maintain an organization-wide “reservation ledger” to prevent overlaps with on-prem/other clouds.
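
A plan like this can be checked mechanically before it goes into the ledger. The sketch below uses Python's standard ipaddress module; check_plan and the sample inputs are illustrative assumptions, not part of any AWS tooling. It flags blocks that are misaligned for their prefix length, fall outside the VPC, or overlap each other.

```python
import ipaddress
from itertools import combinations


def check_plan(vpc_cidr: str, subnets: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the plan is clean."""
    vpc = ipaddress.ip_network(vpc_cidr)
    problems = []
    parsed = {}
    for name, cidr in subnets.items():
        try:
            # Strict parsing rejects blocks that do not start on their own boundary.
            net = ipaddress.ip_network(cidr)
        except ValueError as exc:
            problems.append(f"{name}: {exc}")
            continue
        if not net.subnet_of(vpc):
            problems.append(f"{name}: {net} is outside {vpc}")
        parsed[name] = net
    for (a, net_a), (b, net_b) in combinations(parsed.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} overlaps {b}")
    return problems


# Hypothetical inputs: one clean plan, one with a misaligned /19.
good = {"public": "10.64.0.0/20", "app": "10.64.32.0/19", "data": "10.64.64.0/20"}
misaligned = {"public": "10.64.0.0/20", "app": "10.64.16.0/19"}

print(check_plan("10.64.0.0/16", good))        # → []
print(check_plan("10.64.0.0/16", misaligned))  # one problem reported for "app"
```

Running a check like this in CI against the reservation ledger catches alignment and overlap mistakes before any VPC is created.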

3. Routing & Boundaries: Choose a Hub-and-Spoke Without Hesitation

3.1 Three Standard Patterns

  1. Minimal (small/single VPC)
    • Public subnets: Internet via IGW; place ALB/NLB here.
    • Private subnets: NAT GW for outbound; inbound only via ALB/NLB.
  2. Hub-and-Spoke (mid/large)
    • Centralize NAT GW/Firewall/audit in a network hub VPC.
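    • Connect workload (spoke) VPCs to the hub via Transit Gateway (AWS). VPC Peering also links VPCs, but it is non-transitive, so spokes cannot reach the internet through a hub NAT GW over peering.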
  3. Multi-cloud/Hybrid
    • Terminate VPN/dedicated circuits (DX/Interconnect/ExpressRoute) into the hub for unified route aggregation and audit.

3.2 Route Table Boilerplate (example: spoke VPC/private)

Destination      Target
10.64.0.0/16     local
0.0.0.0/0        tgw-xxxxxxxx   # Delegate to hub (NAT/Firewall)
::/0             tgw-xxxxxxxx   # If using IPv6
10.20.0.0/16     tgw-xxxxxxxx   # On-prem/other-cloud prefix

Principle: All spoke outbound goes to the hub; egress control lives there. No direct internet from spokes, keeping control & audit centralized.
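
Route tables pick the most specific matching prefix, which is why the local /16 wins over the default route. The sketch below models that longest-prefix match in Python; the lookup function is an assumption for illustration, and the target IDs are the placeholders from the table.

```python
import ipaddress

# The spoke route table from the boilerplate, with placeholder targets.
ROUTES = [
    ("10.64.0.0/16", "local"),
    ("0.0.0.0/0", "tgw-xxxxxxxx"),
    ("::/0", "tgw-xxxxxxxx"),
    ("10.20.0.0/16", "tgw-xxxxxxxx"),
]


def lookup(dest: str, routes=ROUTES):
    """Return (matched prefix, target) using longest-prefix match, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for cidr, target in routes:
        net = ipaddress.ip_network(cidr)
        if net.version != addr.version or addr not in net:
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, target)
    return (str(best[0]), best[1]) if best else None


print(lookup("10.64.5.5"))    # stays inside the VPC
print(lookup("10.20.1.1"))    # on-prem prefix via the TGW
print(lookup("8.8.8.8"))      # default route via the TGW to the hub
```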

3.3 Cross-Cloud Boundary Mapping

  • Egress (NAT): AWS NAT Gateway / GCP Cloud NAT / Azure NAT Gateway
  • Private service reach: AWS Interface/Gateway Endpoints & PrivateLink / GCP Private Service Connect / Azure Private Endpoint
  • Central aggregation: AWS Transit Gateway / GCP VPC Peering + Cloud Router / Azure Virtual WAN/Hub
  • L7 exposure: Use ALB / Cloud Load Balancing / Application Gateway; don’t expose VMs/functions directly.

4. Security Model: SG vs. NACL vs. Firewall

4.1 Security Groups (SG)

  • Stateful filters attached to instances/ENIs.
  • Allowlist rules; prefer ID-based permissions (SG → SG referencing).
  • Express app tiers (Web → App → DB) as SG boundaries for cleaner ops.
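
SG-to-SG referencing can be pictured as a small allowlist keyed by group identity rather than IP address. The following is a toy model, not the AWS API; the group names, the DB port 5432, and the allowed function are assumptions for illustration.

```python
# Each ingress rule: (destination SG, port, source SG allowed to connect).
INGRESS_RULES = {
    ("app-sg", 80, "alb-sg"),
    ("app-sg", 443, "alb-sg"),
    ("db-sg", 5432, "app-sg"),   # hypothetical DB port
}


def allowed(src_sg: str, dst_sg: str, port: int, rules=INGRESS_RULES) -> bool:
    """Allowlist semantics: anything not explicitly permitted is denied."""
    return (dst_sg, port, src_sg) in rules


print(allowed("alb-sg", "app-sg", 443))   # → True
print(allowed("app-sg", "db-sg", 5432))   # → True
print(allowed("alb-sg", "db-sg", 5432))   # → False
```

Because rules name groups instead of addresses, instances can scale in and out without any rule changes, and the ALB can never reach the DB tier directly.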

4.2 Network ACLs (NACL)

  • Stateless subnet boundary filters.
  • Ordered explicit deny/allow. Best for special cases (temporary blocks, minimal external links).
  • Day-to-day allow via SGs; treat NACLs as additional insurance.
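
The ordered evaluation of NACLs is easy to misread, so here is a minimal sketch of it: rules are checked in ascending rule number, the first match decides, and anything unmatched is denied. The NaclRule type, the evaluate function, and the sample rules are assumptions for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int
    cidr: str
    port_from: int
    port_to: int
    allow: bool


def evaluate(rules, src_ip: str, port: int) -> bool:
    """First matching rule (lowest number) wins; no match means implicit deny."""
    addr = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_range = rule.port_from <= port <= rule.port_to
        if in_range and addr in ipaddress.ip_network(rule.cidr):
            return rule.allow
    return False


# A temporary block (rule 90) placed ahead of the general HTTPS allow (rule 100).
inbound = [
    NaclRule(100, "0.0.0.0/0", 443, 443, True),
    NaclRule(90, "203.0.113.0/24", 0, 65535, False),
]

print(evaluate(inbound, "203.0.113.7", 443))    # → False
print(evaluate(inbound, "198.51.100.1", 443))   # → True
print(evaluate(inbound, "198.51.100.1", 22))    # → False
```

Since NACLs are stateless, a real configuration also needs a matching outbound rule for return traffic on ephemeral ports; the sketch covers one direction only.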

4.3 Network Firewalls/IDPS

  • Place AWS Network Firewall / Azure Firewall / Cloud Armor/Cloud IDS in the hub for L3–L7 control of N/S and E/W traffic.
  • Gateway Load Balancer (AWS) eases inline firewall appliances.

4.4 Cross-Cloud Mapping

  • SG ↔ Azure NSG ↔ GCP firewall rules
  • NACL ↔ Azure: no exact equivalent (an NSG associated with the subnet is closest) ↔ GCP: no subnet-level ACL; firewall rules are VPC-scoped
  • Network Firewall ↔ Azure Firewall ↔ GCP Cloud Firewall/IDS

5. Internet Exposure: Default to “CDN & L7 LB”

  • S3/static: Use CloudFront (GCP: Cloud CDN; Azure: Front Door/Azure CDN) to front with TLS/WAF/cache.
  • API/app: ALB (GCP: HTTP(S) LB; Azure: App Gateway) for TLS termination/routing; put WAF at the LB/CDN.
  • NLB/GLB: L4 use (gRPC/very high throughput/fixed IP). AWS GWLB helps inline security appliances.
  • Avoid “public IP direct exposure” for compute: It scatters maintenance/governance/observability and complicates incident triage.

6. Private Connectivity: Where Endpoints/PrivateLink Shine

  • Gateway Endpoints (S3/DynamoDB): Reach S3/DynamoDB without NAT; reduce data exfil risk/egress cost.
  • Interface Endpoints (PrivateLink): Call AWS services/own services over private IP; powerful for Zero Trust.
  • Counterparts: GCP Private Service Connect; Azure Private Endpoint. The trend is “private access to external/SaaS” across all three clouds.

7. Hybrid Connectivity: VPN/Dedicated Lines/DNS

  • Site-to-Site VPN: For short-term use or where lower availability is acceptable; use redundant tunnels for stability.
  • Dedicated lines (DX/Interconnect/ExpressRoute): For bandwidth/latency/SLA.
  • Name resolution: Use Private Hosted Zone (Route 53) / Cloud DNS / Private DNS to define “which names resolve to where.”
  • For split-horizon DNS (same FQDN inside/out), manage cache/TTL and boundary audits.

8. IPv6, Egress, Zero Trust: A 2025 Pragmatic View

  • IPv6: Beyond addressing headroom, fewer one-way NATs simplify troubleshooting. Tighten egress control accordingly.
  • Egress control: Declare allowed destinations explicitly, combining destination FQDNs/tags with PrivateLink/Endpoints/WAF.
  • Zero Trust: ID-based permissions (SG/NSG/firewall) + mTLS/OIDC to verify who → where every time; assume bastion-less (SSM/IAP/Bastion JIT) access.

9. Observability: Flow Logs + the “Four Essentials”

  • VPC Flow Logs: Capture src/dst/port/allow/deny; detect anomalous egress/port scans.
  • Traffic Mirroring: Packet-level deep-dive; invaluable during incidents.
  • CloudWatch/Cloud Logging/Azure Monitor: Dashboards for connections/denies/bandwidth/latency.
  • CloudTrail/Audit Logs/Activity Log: Track who changed network configs.
    Aggregate logs in a separate account/project for better tamper resistance.
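
As a rough illustration of port-scan detection, the sketch below parses records in the default (version 2) Flow Logs format and lists sources that were rejected on several distinct destination ports. The sample records, the threshold of three ports, and the function names are assumptions; real detection would run over aggregated logs with tuned thresholds.

```python
from collections import defaultdict

# Field order of the default (version 2) VPC Flow Logs record.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse(line: str) -> dict:
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def suspected_scanners(lines, min_ports: int = 3) -> list[str]:
    """Sources rejected on at least min_ports distinct destination ports."""
    ports = defaultdict(set)
    for line in lines:
        rec = parse(line)
        if rec["action"] == "REJECT":
            ports[rec["srcaddr"]].add(int(rec["dstport"]))
    return sorted(src for src, seen in ports.items() if len(seen) >= min_ports)


# Made-up sample records for illustration only.
SAMPLE = [
    "2 123456789012 eni-0123456789abcdef0 198.51.100.9 10.64.16.10 40001 22 6 1 40 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0123456789abcdef0 198.51.100.9 10.64.16.10 40002 23 6 1 40 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0123456789abcdef0 198.51.100.9 10.64.16.10 40003 3389 6 1 40 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0123456789abcdef0 203.0.113.4 10.64.16.10 40004 22 6 1 40 1700000000 1700000060 REJECT OK",
    "2 123456789012 eni-0123456789abcdef0 10.64.0.5 10.64.16.10 40005 443 6 10 5200 1700000000 1700000060 ACCEPT OK",
]

print(suspected_scanners(SAMPLE))   # → ['198.51.100.9']
```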

10. Cost Optimization: Be Smart with Egress/NAT/Audit

  • NAT count & pathing: Centralize in the hub for shortest paths/fewer instances.
  • Use Endpoints: Gateway Endpoints (S3/DynamoDB) to cut NAT/egress.
  • Log granularity/retention: Right-size Flow Logs/Traffic Mirroring; aggregate, then keep summaries to smooth cost.
  • CDN caching: Reduce outbound and origin load; lower egress fees.
  • Standardize architecture: IaC (Terraform/CloudFormation/Bicep/Deployment Manager) for a reproducible minimum that prevents sprawl.
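
The NAT and Endpoint points can be sanity-checked with simple arithmetic, since NAT Gateways are billed per gateway-hour plus per GB processed, while Gateway Endpoints carry no additional charge. The sketch below is a back-of-the-envelope model only: the rates passed in are placeholders rather than actual AWS prices, and the function names are assumptions.

```python
HOURS_PER_MONTH = 730  # common approximation


def nat_monthly_cost(gateways: int, gb_processed: float,
                     hourly_rate: float, per_gb_rate: float) -> float:
    """Hourly charge per gateway plus data-processing charge."""
    return gateways * HOURS_PER_MONTH * hourly_rate + gb_processed * per_gb_rate


def after_gateway_endpoints(gb_processed: float, gb_to_s3_dynamodb: float) -> float:
    """GB still crossing NAT once S3/DynamoDB traffic uses Gateway Endpoints."""
    return max(gb_processed - gb_to_s3_dynamodb, 0)


# Placeholder rates; look up the current price list for your region.
HOURLY, PER_GB = 0.05, 0.05

before = nat_monthly_cost(3, 1000, HOURLY, PER_GB)
after = nat_monthly_cost(3, after_gateway_endpoints(1000, 400), HOURLY, PER_GB)
print(round(before, 2), round(after, 2))
```

The model ignores Transit Gateway attachment and data-processing charges, which a hub design adds, so compare both sides before concluding that centralization is cheaper.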

11. Cross-Cloud Phrasebook (Essentials)

  • Virtual network: VPC (AWS) / VPC (GCP: global) / VNet (Azure)
  • Security group: SG (AWS) / Firewall rule (GCP) / NSG (Azure)
  • Private service access: Interface/Gateway Endpoint & PrivateLink / Private Service Connect / Private Endpoint
  • NAT/Egress: NAT Gateway / Cloud NAT / NAT Gateway
  • Central connectivity: Transit Gateway / VPC Peering + Cloud Router / Virtual WAN (Hub & Spoke)
  • Firewall: Network Firewall / Cloud Firewall/IDS / Azure Firewall
  • Logs: VPC Flow Logs / VPC Flow Logs (GCP) / NSG Flow Logs/Diagnostics

12. Minimum Secure Posture (described)

  1. Spoke VPCs (App/Data) use private subnets only.
  2. ALB/NLB live in hub VPC public subnets.
  3. App → external goes TGW → hub NAT GW.
  4. S3/DynamoDB via Gateway Endpoints; other AWS/internal SaaS via Interface Endpoint/PrivateLink.
  5. SGs: minimal two-hop chain: ALB → App, App → DB.
  6. Flow Logs ON + CloudTrail centralized; Network Firewall in the hub.

13. Sample Configs

13.1 Security Groups (minimal; ALB-only inbound)

# App SG: allow only 80/443 from ALB SG; DB access handled by a separate SG
APP_SG=$(aws ec2 create-security-group --group-name app-sg --description "app" --vpc-id vpc-xxx --query GroupId --output text)
ALB_SG=$(aws ec2 create-security-group --group-name alb-sg --description "alb" --vpc-id vpc-xxx --query GroupId --output text)

# Quote the variables so an empty value fails loudly instead of shifting arguments
aws ec2 authorize-security-group-ingress --group-id "$APP_SG" --protocol tcp --port 80  --source-group "$ALB_SG"
aws ec2 authorize-security-group-ingress --group-id "$APP_SG" --protocol tcp --port 443 --source-group "$ALB_SG"

13.2 Route Table (spoke private → hub)

aws ec2 create-route --route-table-id rtb-xxxx --destination-cidr-block 0.0.0.0/0 --transit-gateway-id tgw-xxxx
aws ec2 create-route --route-table-id rtb-xxxx --destination-ipv6-cidr-block ::/0 --transit-gateway-id tgw-xxxx

13.3 Flow Logs (to CloudWatch Logs)

aws ec2 create-flow-logs \
  --resource-type VPC --resource-ids vpc-xxxx \
  --traffic-type ALL \
  --log-destination-type cloud-watch-logs \
  --deliver-logs-permission-arn arn:aws:iam::123456789012:role/vpc-flow-logs-role \
  --log-group-name /vpc/flow/prod

14. Terraform Concept (Hub-and-Spoke Highlights)

# VPC (spoke)
resource "aws_vpc" "spoke" {
  cidr_block           = "10.64.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true
  tags = { Name = "acme-prod-apne1-spoke" }
}

# Private Subnet (example)
resource "aws_subnet" "spoke_app_a" {
  vpc_id                  = aws_vpc.spoke.id
  cidr_block              = "10.64.16.0/20"
  availability_zone       = "ap-northeast-1a"
  map_public_ip_on_launch = false
  tags = { Name = "app-a" }
}

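# Transit Gateway attachment (to hub)
# Assumes aws_ec2_transit_gateway.hub is defined elsewhere (e.g., in the hub configuration).
resource "aws_ec2_transit_gateway_vpc_attachment" "spoke_attach" {
  # List one subnet per AZ that should reach the TGW; only AZ a is shown here.
  subnet_ids         = [aws_subnet.spoke_app_a.id]
  transit_gateway_id = aws_ec2_transit_gateway.hub.id
  vpc_id             = aws_vpc.spoke.id
}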

15. Common Pitfalls & Remedies

  • CIDR overlap: Hard to undo. Create a company-wide reservation ledger on day one with a request workflow.
  • Direct public exposure: Don’t expose VMs/DBs; front with ALB/CloudFront/PrivateLink.
  • NAT sprawl: A NAT GW in every VPC inflates cost; centralize in the hub and use Endpoints.
  • SG vs. NACL confusion: Lead with SGs, NACLs as supplemental. Embrace “ID-based allow.”
  • Missing logs: Troubleshooting in the dark—make Flow Logs/CloudTrail standard kit.
  • Ad-hoc DNS: Define Private DNS/Hosted Zones and naming/TTL upfront.

16. Case Studies (3 Patterns)

16.1 Enterprise Portal (internet-facing + internal API)

  • Design: Hub VPC (ALB/Firewall/NAT) + Spoke VPC (App/Data). External path: CloudFront → ALB; internal via PrivateLink.
  • Notes: Aggregated egress simplifies audit; WAF at CDN/L7. Flow Logs stored in a separate account.
  • GCP/Azure: Cloud CDN → HTTP(S) LB / Front Door → App Gateway, with PSC/Private Endpoint—same idea.

16.2 SaaS Platform (multi-tenant + zero trust)

  • Design: Isolate tenants in spokes; central IdP + mTLS; Interface Endpoints for internal SaaS.
  • Notes: SG-to-SG references enforce zero inter-tenant reachability; centralize audit logs.
  • GCP/Azure: VPC Peering + firewall rules / VNet Peering + NSG; leverage PSC/Private Endpoint.

16.3 Hybrid (plants/branches × cloud)

  • Design: On-prem dual VPN → future dedicated circuit; routes aggregate in hub; minimal plant → cloud prefixes.
  • Notes: Define split-horizon DNS, bandwidth monitoring, and local breakout criteria.
  • GCP/Azure: Cloud Router + HA VPN / Virtual WAN + VPN Gateway for symmetry.

17. First-Week Design Checklist (10 Items)

  1. CIDR/naming rules: shared reservation ledger + request flow.
  2. Hub-and-spoke: which VPC is hub; egress/firewall/audit landing zone.
  3. Subnet roles: separate Public/Private/Admin; spread across AZs.
  4. Routes: send 0.0.0.0/0 to TGW/hub; manage on-prem/other-cloud prefixes.
  5. SG/NACL: SG-first; NACL supplemental; SG→SG references by default.
  6. Endpoints: Gateway (S3/DynamoDB); Interface/PrivateLink for others.
  7. DNS: split Private Hosted Zones/Private DNS; naming; TTLs.
  8. Observability: Flow Logs/CloudTrail/Traffic Mirroring; store in separate account.
  9. Zero Trust: bastion-less (SSM/IAP/Bastion JIT), mTLS/OIDC, least reachability.
  10. Cost: review NAT/Endpoints/log retention cycles.

18. Operations Runbook (Network Incident First Response)

  1. Detect: alert on p95 latency/path failures/deny surges.
  2. Isolate: trace DNS → SG → route table → NAT/Firewall → peer; confirm allows/denies via Flow Logs.
  3. Mitigate: enable alternate paths (standby NAT/other AZ); tune LB health thresholds.
  4. Identify root cause: review recent IaC diffs/CloudTrail; deep-dive with Traffic Mirroring.
  5. Fix: improve route convergence, DNS TTL, and SG ID-references.
  6. Retro: update dashboards/runbooks; add “route-impact review” to change control.

19. Readers & Concrete Outcomes

  • Enterprise IT (many sites/integration)
    Using CIDR ledger + hub-and-spoke + Private Endpoint, eliminate collisions/backdoors; centralized audit logs simplify compliance.
  • SRE/Platform engineers
    Turn the minimum secure pattern (ALB fronting/SG-first/Endpoints) into IaC to launch new products in a day with the same pattern.
  • App developers
    With “L7 at the LB, reach by SG, outbound via hub,” deploy safely without deep network design.
  • Security/Governance
    Standardize bastion-less access, least reachability, zero trust; achieve non-person-dependent audit/change tracking.
  • Startup CTO/Tech leads
    Start small, but keep hub centralization & CIDR headroom for growth; mechanize NAT/log cost control.

20. Q&A (Frequently Asked)

  • Q: SG or NACL as the primary control?
    A: SG-first. You get ID-based allow and statefulness. Use NACL as a supplemental boundary.
  • Q: Where should NAT Gateways live?
    A: In the hub, unifying egress. Combine with Endpoints to minimize NAT traffic.
  • Q: Should we adopt IPv6?
    A: Yes—address headroom and fewer NATs are valuable. Strengthen egress control & audit.
  • Q: Transit Gateway vs. VPC Peering?
    A: For scale and centralized ops, use TGW. For point-to-point, Peering is fine. If growth is likely, start with a hub.

21. Appendix A: CIDR Reservation Ledger (Excerpt)

  • Envs: dev / stg / prod
  • VPCs: 10.64.0.0/16 (prod) / 10.65.0.0/16 (stg) / 10.66.0.0/16 (dev)
  • Subnet roles: public-ingress / private-app / private-data / shared-admin
  • Rules: No overlap at /16, keep ≥ /19 free, requests tracked by tickets.
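
The ledger's "no overlap" rule is straightforward to enforce in code. The class below is a minimal in-memory sketch; CidrLedger, its method names, and the ticket IDs are assumptions, and a real ledger would persist to a shared store behind the request workflow.

```python
import ipaddress


class CidrLedger:
    """In-memory reservation ledger that rejects overlapping requests."""

    def __init__(self):
        self._reservations = []  # list of (network, owner, ticket)

    def reserve(self, owner: str, cidr: str, ticket: str):
        net = ipaddress.ip_network(cidr)
        for other_net, other_owner, _ in self._reservations:
            if net.overlaps(other_net):
                raise ValueError(f"{net} overlaps {other_net} held by {other_owner}")
        self._reservations.append((net, owner, ticket))
        return net

    def owners(self) -> list[str]:
        return [owner for _, owner, _ in self._reservations]


ledger = CidrLedger()
ledger.reserve("prod", "10.64.0.0/16", "NET-1")   # hypothetical ticket IDs
ledger.reserve("stg", "10.65.0.0/16", "NET-2")
ledger.reserve("dev", "10.66.0.0/16", "NET-3")

try:
    ledger.reserve("onprem", "10.64.128.0/17", "NET-4")
except ValueError as exc:
    print("rejected:", exc)
```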

22. Appendix B: Naming & Tag Standards (Example)

  • Tags: env=prod, owner=platform, data-classification=internal, egress=hub, zone=apne1
  • SG names: prod-app-web-in (ALB → App), prod-app-db-in (App → DB)
  • RouteTable name: prod-spoke-private-rt (0/0 → tgw)

23. Appendix C: Three Things You Can Do Today

  1. Document VPC CIDR/subnet roles/naming, and establish a “reservation ledger.”
  2. Enable Flow Logs/CloudTrail on every VPC and centralize in a separate account.
  3. Add Endpoints (S3/DynamoDB) to immediately reduce NAT egress & risk.

Summary: Allow by Identity, Aggregate Paths, and Make It Observable

Design Amazon VPC by standardizing in this order: CIDR plan → hub-and-spoke → SG-first → Endpoints → observability—you’ll balance safety, scalability, and cost.
The same principles apply to GCP VPC and Azure VNet: enforce private connectivity, unified egress, and centralized logs. Aggregate L7 at the LB and allow reach by identity—follow this path and you’ll grow an unbreakable network regardless of team size or cloud.
Next time we’ll dive into Amazon CloudFront, contrasting it with Cloud CDN (GCP) and Azure Front Door, and explore WAF/origin design/cache/cost to find the front-end optimum in depth. Stay tuned.

By greeden
