Production Live
ap-south-1
Terraform ≥1.6
Region: ap-south-1
AZs: 1a · 1b · 1c
VPC: 10.1.0.0/16
DB: MySQL 8.0.39 Multi-AZ
Arrows:
Data / Traffic
Egress / IAM
Monitoring / Logs
Config / TLS / Storage
DB / Inspection
KMS Encryption
Config reference
① Edge Layer: DNS · CDN · WAF · TLS · SSM
🌐
Internet
Users / Public traffic
HTTP / HTTPS
DNS query
🔗
Route 53
DNS + A alias record
apex + www
evaluate_target_health
30s health polling
A alias
☁️
CloudFront
CDN · TLS termination
Origin → ALB
Global Edge POP
req → inspect
🛡️
WAF v2
Web ACL rules
Logs → CW Log Group
Managed Rules
TLS config
🔐
ACM Certificate
TLS cert ap-south-1
Attached to ALB HTTPS
Auto-renew
💻
SSM Session Mgr
No-SSH shell access
via EC2 IAM Role
shell tunnel to EC2
IMDSv2 required
💓
R53 Health Check
HTTPS :443 /health
Interval 30s · fail 3
Fail → SNS alert (us-east-1)
② VPC 10.1.0.0/16 — Networking Layer 3 AZ · IGW · NAT GW per-AZ · ALB · Target Group · Bastion
filtered traffic in
🌉
Internet Gateway (IGW)
public route 0.0.0.0/0 → IGW · Attached to VPC 10.1.0.0/16
🟢 Public Subnets — 1a 10.1.1.0/24 · 1b 10.1.2.0/24 · 1c 10.1.3.0/24
⚖️
App Load Balancer
Internet-facing · multi-AZ
:80 → 301 redirect HTTPS
:443 → TG :8080 · TLS1.3
Logs → S3 alb-logs
ALB-SG
:8080 TG
🎯
Target Group
Port 8080 · HTTP
Health /health · 3×30s
Deregister delay 30s
Stickiness off
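The `3×30s` setting means a target changes state only after three consecutive checks, 30 seconds apart. A minimal Python sketch of that counting logic follows. It assumes both the healthy and unhealthy thresholds are 3, as the resilience summary states; the names `HealthTracker`, `observe`, and `seconds_to_flip` are hypothetical, and this is an illustration rather than the ALB's actual implementation.

```python
from dataclasses import dataclass

INTERVAL_S = 30          # seconds between checks (from the diagram)
UNHEALTHY_THRESHOLD = 3  # consecutive failures before marking unhealthy
HEALTHY_THRESHOLD = 3    # consecutive successes before marking healthy


@dataclass
class HealthTracker:
    """Hypothetical model of consecutive-count health checking."""
    healthy: bool = True
    streak: int = 0  # consecutive results that contradict the current state

    def observe(self, check_passed: bool) -> bool:
        """Record one check result and return the resulting state."""
        if check_passed == self.healthy:
            self.streak = 0  # result agrees with the current state; reset
            return self.healthy
        self.streak += 1
        needed = UNHEALTHY_THRESHOLD if self.healthy else HEALTHY_THRESHOLD
        if self.streak >= needed:
            self.healthy = not self.healthy
            self.streak = 0
        return self.healthy


def seconds_to_flip(threshold: int, interval_s: int = INTERVAL_S) -> int:
    """Approximate time for `threshold` consecutive checks to complete."""
    return threshold * interval_s
```

Under these assumptions a failing instance is taken out of rotation after roughly `seconds_to_flip(3)` seconds, and a single passing check in between resets the count.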
🔀
NAT GW — 1a
Elastic IP
10.1.1.0/24
Egress EC2-1a
🔀
NAT GW — 1b
Elastic IP
10.1.2.0/24
Egress EC2-1b
🔀
NAT GW — 1c
Elastic IP
10.1.3.0/24
Egress EC2-1c
🏰
Bastion Host
t3.nano · AL2023
Elastic IP · gp3 8GB enc
IMDSv2 · SSH key auth
Auto-stop 00:00 UTC
Bastion-SG
Public Route Table
0.0.0.0/0 → IGW
ALB-SG: :80/:443 ← 0.0.0.0/0
Bastion-SG: :22 ← CIDRs
TG forwards :8080 health OK → 1a 1b 1c
🔵 Private Subnets (App Tier) — 1a 10.1.10.0/24 · 1b 10.1.11.0/24 · 1c 10.1.12.0/24
📈
Auto Scaling Group
min 2 · max 8 · desired 3
CPU target 60%
Rolling refresh 50%
Scale ↓20:00 ↑07:00 UTC
3-AZ distribution
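To make the sizing numbers concrete, here is a simplified proportional model of CPU target tracking in Python. It is a sketch only: the real target tracking algorithm is managed by AWS and is not reproduced here, and the function name `proposed_capacity` is hypothetical. The bounds and target come from the card above.

```python
import math

MIN_SIZE, MAX_SIZE = 2, 8   # ASG bounds from the diagram
CPU_TARGET = 60.0           # percent, from the diagram


def proposed_capacity(current: int, avg_cpu: float,
                      target: float = CPU_TARGET) -> int:
    """Scale capacity in proportion to avg_cpu / target, clamped to bounds.

    Simplified model: assumes load spreads evenly across instances.
    """
    if current < 1:
        return MIN_SIZE
    raw = math.ceil(current * avg_cpu / target)
    return max(MIN_SIZE, min(MAX_SIZE, raw))
```

For example, three instances averaging 90% CPU would suggest five instances under this model, while very low CPU never drops the group below the minimum of two.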
🖥️
EC2 — 1a
t3.small · AL2023
App :8080 · gp3 enc
IMDSv2 required
IAM profile attached
ap-south-1a
🖥️
EC2 — 1b
t3.small · AL2023
App :8080 · gp3 enc
IMDSv2 required
IAM profile attached
ap-south-1b
🖥️
EC2 — 1c
t3.small · AL2023
App :8080 · gp3 enc
IMDSv2 required
IAM profile attached
ap-south-1c
📋
Launch Template
AL2023 AMI (HVM)
IAM Instance Profile
Key pair ecia-prod-key
user-data bootstrap
Private Route Tables (×3)
0.0.0.0/0 → NAT GW (per-AZ)
App-SG: :8080 ← ALB-SG
App-SG: :22 ← allowed CIDRs
No direct internet to EC2
EC2 outbound via per-AZ NAT GW → 1a 1b 1c → IGW → Internet
Bastion SSH jump :22 → EC2-1a · 1b · 1c
EC2 (all) → RDS Primary MySQL :3306 → queries
🔴 DB Subnets — 1a 10.1.20.0/24 · 1b 10.1.21.0/24 · 1c 10.1.22.0/24
🗄️
RDS — Primary
MySQL 8.0.39 · db.t3.small
gp3 20GB (max 100GB)
KMS encrypted · Multi-AZ: true
IAM DB auth · port 3306
DB-SG · :3306 ← App-SG
sync rep
🗄️
RDS — Standby
Multi-AZ sync replica
Auto-failover < 2 min
No app config change
Different AZ from primary
Auto-failover
DB Subnet Group
ecia-prod-db-subnet-grp
Spans 1a + 1b + 1c
DB-SG: :3306/:5432
Inbound ← App-SG only
DB Parameter Group
mysql8.0 family
slow_query_log = 1
long_query_time = 2s
Logs → CloudWatch
RDS Monitoring
Enhanced Mon 60s
Backup retain 1d · 02:00–03:00
Maint: Sun 04:00
final_snapshot on delete
③ Data & Storage Layer: S3 Buckets · DynamoDB TF Lock · Secrets Manager · KMS Keys · ECR
🪣
S3 — App Bucket
KMS-SSE (aws:kms)
Versioning enabled
→ IA 30d → Glacier 90d
Expire 365d · CORS on
Public access blocked
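The lifecycle rule above can be read as a mapping from object age to storage class. The Python sketch below assumes objects start in the Standard class and that transitions take effect exactly at the stated day counts; the function name `storage_class_for_age` and the `"EXPIRED"` label are hypothetical.

```python
def storage_class_for_age(age_days: int) -> str:
    """Return the storage class an object of the given age would be in.

    Thresholds from the diagram: IA at 30 days, Glacier at 90 days,
    expiry at 365 days. "EXPIRED" is an illustrative label, not an S3 class.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days >= 365:
        return "EXPIRED"
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"
```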
🪣
S3 — Logs Bucket
AES-256 SSE
→ IA 30d → Glacier 90d
Expiry 365d
Public access blocked
🪣
S3 — ALB Logs
ALB access logs prefix
Bucket policy: ELB svc
PutObject only
Public access blocked
🪣
S3 — TF State
KMS-SSE versioned
Terraform remote backend
Public access blocked
DynamoDB TF Lock
ecia-prod-tf-lock
PAY_PER_REQUEST
Hash key: LockID
PITR enabled
🐳
ECR (read-only)
Container image registry
EC2 IAM: ECR ReadOnly
Pulled at launch
🔒
Secrets Manager
ecia-prod/db/credentials
Recovery window 7d
KMS Secrets key
EC2 GetSecretValue
KMS Encryption Keys
🔑
KMS — S3 Key
alias/ecia-prod-s3
Auto key rotation
Deletion window 7d
encrypts
🔑
KMS — RDS Key
alias/ecia-prod-rds
Auto key rotation
Deletion window 7d
encrypts
🔑
KMS — Secrets Key
alias/ecia-prod-secrets
Auto key rotation
Deletion window 7d
Data Flows:
EC2 → S3 GetObject/PutObject
EC2 → Secrets Manager GetSecretValue
EC2 → ECR pull container image
ALB → S3 ALB access logs
KMS → S3/RDS/Secrets (encryption)
④ Security & IAM Layer: IAM Roles · Instance Profile · Security Groups
👤
EC2 IAM Role
AmazonSSMManagedInstanceCore
CloudWatchAgentServerPolicy
ECR ReadOnly
Custom: S3 r/w · Secrets
KMS Decrypt · ec2:Desc*
Principal: ec2.amazonaws.com
→ Instance Profile
wraps
🏷️
EC2 Instance Profile
ecia-prod-ec2-profile
Wraps EC2 IAM Role
Attached via Launch Template
Passed to EC2 at launch
→ all EC2 instances
🚀
Deploy Role
Principal: CodePipeline + root
ec2:* · autoscaling:*
elasticloadbalancing:*
s3:* · logs:* · cloudwatch:*
🌊
VPC Flow Logs Role
Principal: vpc-flow-logs svc
logs:CreateLogGroup
logs:CreateLogStream
logs:PutLogEvents · Describe*
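As an illustration of what this role might look like as IAM JSON, the Python sketch below builds a trust policy and a permissions policy. Assumptions: `Describe*` is expanded to `logs:DescribeLogGroups` and `logs:DescribeLogStreams`, the resource ARN is supplied by the caller, and the function names are hypothetical. The project's actual policy documents may differ.

```python
import json


def flow_logs_trust_policy() -> dict:
    """Trust policy letting the VPC Flow Logs service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "vpc-flow-logs.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def flow_logs_permissions(log_group_arn: str) -> dict:
    """Permissions policy for writing flow logs to CloudWatch Logs.

    The two Describe actions are an assumed expansion of "Describe*".
    """
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogGroups",
                "logs:DescribeLogStreams",
            ],
            "Resource": log_group_arn,
        }],
    }


if __name__ == "__main__":
    # Placeholder ARN with a dummy account ID, for illustration only.
    arn = ("arn:aws:logs:ap-south-1:111122223333:"
           "log-group:/aws/vpc/flowlogs/ecia-prod:*")
    print(json.dumps(flow_logs_permissions(arn), indent=2))
```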
📊
RDS Monitoring Role
Principal: monitoring.rds svc
AmazonRDSEnhancedMonitoringRole
Policy attached · Interval 60s
Bastion Events Role
Principal: events.amazonaws.com
ssm:StartAutomationExecution
ec2:StopInstances
Used by EventBridge 00:00
Security Groups — Rules
SG | Ingress | Egress | Note
ALB-SG | :80/:443 ← 0.0.0.0/0 | All outbound | create_before_destroy
App-SG | :8080 ← ALB-SG (SG ref) · :22 ← 205.254.163.158/32 | All outbound | No direct internet to EC2
DB-SG | :3306 ← App-SG · :5432 ← App-SG (SG ref) | All outbound | SG ref, not CIDR based
Bastion-SG | :22 ← allowed CIDRs only | All outbound | WAF-SG via CloudFront
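The security group rules above form a chain of references: the internet reaches the ALB, the ALB reaches the app tier, and the app tier reaches the database. A small Python sketch can check that chain. The `RULES` table is transcribed from the rules above, the names `OPEN` and `hops_from_internet` are hypothetical, and restricted CIDR sources are treated as not open to the internet.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

OPEN = "0.0.0.0/0"

# (port, source) ingress rules per security group, from the table above.
RULES: Dict[str, List[Tuple[int, str]]] = {
    "ALB-SG": [(80, OPEN), (443, OPEN)],
    "App-SG": [(8080, "ALB-SG"), (22, "205.254.163.158/32")],
    "DB-SG": [(3306, "App-SG"), (5432, "App-SG")],
    "Bastion-SG": [(22, "allowed CIDRs")],
}


def hops_from_internet(sg: str,
                       rules: Dict[str, List[Tuple[int, str]]] = RULES,
                       seen: FrozenSet[str] = frozenset()) -> Optional[int]:
    """Fewest SG hops from 0.0.0.0/0 to `sg`, or None if unreachable.

    A hop is one ingress rule; sources that name another SG are followed
    recursively. Restricted CIDRs do not count as internet-open.
    """
    if sg in seen:
        return None  # guard against reference cycles
    best: Optional[int] = None
    for _port, source in rules.get(sg, []):
        if source == OPEN:
            hops: Optional[int] = 1
        elif source in rules:
            inner = hops_from_internet(source, rules, seen | {sg})
            hops = None if inner is None else inner + 1
        else:
            hops = None
        if hops is not None and (best is None or hops < best):
            best = hops
    return best
```

In this model only ALB-SG is directly exposed; DB-SG is reachable only through two intermediate groups, and Bastion-SG is not open to arbitrary internet sources.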
⑤ Observability & Monitoring Layer: CloudWatch · SNS · VPC Flow Logs · EventBridge · R53 Health Alarms
📢
SNS Alerts Topic
ecia-prod-alerts
Email subscription
Receives all alarm actions
📊
CW Dashboard
ASG CPU · Network I/O
ALB req count / errors / latency
RDS CPU / conn / storage
📝
CW Log Groups
/aws/vpc/flowlogs/ecia-prod
/aws/waf/ecia-prod
Retention: 30 days
🌊
VPC Flow Logs
Traffic type: ALL
→ CW Log Group
IAM: flow-logs-role
EventBridge
cron(0 0 * * ? *)
Target: SSM Automation
AWS-StopEC2Instance
Stops Bastion at 00:00
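The expression `cron(0 0 * * ? *)` fires once a day at 00:00 UTC. The Python sketch below computes the next fire time for that specific daily schedule. It is not a general cron parser, and the function name `next_daily_run` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now: datetime, hour: int = 0, minute: int = 0) -> datetime:
    """Return the next daily run strictly after `now`, in UTC.

    `now` should be timezone-aware; a naive value would be interpreted
    as local time by astimezone().
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = now_utc.replace(hour=hour, minute=minute,
                                second=0, microsecond=0)
    if candidate <= now_utc:
        candidate += timedelta(days=1)
    return candidate
```

A bastion started at 23:30 UTC would therefore be stopped 30 minutes later, while one started just after midnight would run for almost a full day.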
💓
R53 Health Alarm
Region: us-east-1
HealthCheckStatus < 1
Period 60s → SNS (alarm+ok)
Monitoring Flows:
EC2/ASG → CW Alarms → SNS
VPC → Flow Logs → CW Log Group
WAF → CW Log Group
EventBridge → SSM → Bastion stop
RDS Enhanced Mon 60s → CW
Alarm Name | Metric / Namespace | Threshold | Periods | Action
asg-cpu-high | CPUUtilization / AWS/EC2 | ≥ 80% | 2 × 2min | SNS (alarm + ok)
asg-cpu-low | CPUUtilization / AWS/EC2 | ≤ 10% | 3 × 2min | SNS (alarm)
alb-5xx | HTTPCode_ELB_5XX_Count / AWS/ApplicationELB | ≥ 10/min | 1 × 60s | SNS (alarm + ok)
target-5xx | HTTPCode_Target_5XX_Count / AWS/ApplicationELB | ≥ 10/min | 1 × 60s | SNS (alarm)
alb-unhealthy-hosts | UnHealthyHostCount / AWS/ApplicationELB | ≥ 1 | 2 × 60s | SNS (alarm + ok)
alb-latency | TargetResponseTime / AWS/ApplicationELB | ≥ 2s | 2 × 60s | SNS (alarm)
rds-cpu | CPUUtilization / AWS/RDS | ≥ 80% | 2 × 60s | SNS (alarm + ok)
rds-storage | FreeStorageSpace / AWS/RDS | ≤ 2 GB | 1 × 60s | SNS (alarm + ok)
rds-connections | DatabaseConnections / AWS/RDS | ≥ 80% max | 2 × 60s | SNS (alarm + ok)
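Each alarm row pairs a threshold with a number of evaluation periods. The Python sketch below shows one way to read that: the alarm fires when the most recent N datapoints all breach the threshold. It assumes "2 × 60s" means two consecutive 60-second periods and ignores missing-data handling; the function name `in_alarm` is hypothetical, and CloudWatch's actual evaluation has more options than this.

```python
import operator
from typing import Callable, Sequence


def in_alarm(datapoints: Sequence[float],
             threshold: float,
             periods: int,
             breaches: Callable[[float, float], bool] = operator.ge) -> bool:
    """True when the last `periods` datapoints all breach the threshold.

    `breaches` is the comparison: operator.ge for ">=" alarms,
    operator.le for "<=" alarms such as rds-storage.
    """
    if periods < 1 or len(datapoints) < periods:
        return False  # not enough data to evaluate
    recent = datapoints[-periods:]
    return all(breaches(value, threshold) for value in recent)
```

For instance, `in_alarm([50, 85, 90], 80, 2)` models asg-cpu-high firing after two consecutive breaching periods, while a single spike followed by a normal reading does not trigger it.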
⑥ Backup & Recovery Layer: RDS Auto Backup · Final Snapshot · S3 Versioning · DynamoDB PITR · Multi-AZ
💾
RDS Auto Backups
Retention: 1 day
Window: 02:00–03:00
delete_automated_backups=false
copy_tags_to_snapshot=true
→ S3 (managed by RDS)
📸
RDS Final Snapshot
skip_final_snapshot = false
ID: ecia-prod-db-final-<id>
Taken on: terraform destroy
Stored: RDS managed S3
🗂️
S3 Versioning
App + TF State buckets
Noncurrent → IA: 30d
Noncurrent → Glacier: 90d
Expire noncurrent: 365d
Abort incomplete MPU: 7d
DynamoDB PITR
TF lock table
Point-in-time recovery: on
PAY_PER_REQUEST billing
Deletion protection: false
🔄
Multi-AZ Failover
RDS: sync standby replica
Auto-failover: < 2 min
No app DNS change needed
ASG: 3-AZ distribution
NAT GW: per-AZ (×3)
Resilience & HA Mechanisms
Multi-AZ RDS
Sync standby · auto-failover < 2 min · no DNS change required
3-AZ ASG
min 2 · max 8 · rolling refresh min_healthy=50%
ALB Health Checks
/health · 3×30s unhealthy · 3×30s healthy threshold
Route 53
alias evaluate_target_health=true · health alarms → SNS
Per-AZ NAT GW ×3
No single point of failure for all EC2 egress traffic
KMS + S3 + Secrets
Auto key rotation · versioning · 7d recovery window
⑦ Multi-Environment Overview — Terraform Managed dev · staging · prod
dev
VPC CIDR: 10.0.0.0/16
AZs: 1a, 1b
Public subnets: 10.0.1/2.0/24
Private: 10.0.10/11.0/24
DB subnets: 10.0.20/21.0/24
EC2 type: t3.micro
ASG: min 1 · max 2 · desired 1
RDS: db.t3.micro · single-AZ
NAT GW: Single (not per-AZ)
ACM: none
Key pair: ecia-dev-key
DB name: appdb_dev
Flow logs: enabled
staging
VPC CIDR: 10.2.0.0/16
AZs: 1a, 1b
Public subnets: 10.2.1/2.0/24
Private: 10.2.10/11.0/24
DB subnets: 10.2.20/21.0/24
EC2 type: t3.small
ASG: min 1 · max 3 · desired 2
RDS: db.t3.micro · single-AZ
NAT GW: Single (not per-AZ)
ACM: none (HTTP only)
Key pair: ecia-staging-key
DB name: appdb_staging
Flow logs: enabled
prod ← (this diagram)
VPC CIDR: 10.1.0.0/16
AZs: 1a, 1b, 1c (3 AZs)
Public subnets: 10.1.1/2/3.0/24
Private: 10.1.10/11/12.0/24
DB subnets: 10.1.20/21/22.0/24
EC2 type: t3.small
ASG: min 2 · max 8 · desired 3
RDS: db.t3.small · Multi-AZ = true
NAT GW: Per-AZ (×3 gateways)
ACM: Real ARN (HTTPS enabled)
Key pair: ecia-prod-key
DB name: appdb
Flow logs: enabled
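The three environments use separate /16 ranges with /24 subnets carved from each. The Python sketch below checks that layout with the standard `ipaddress` module. The CIDRs are expanded from the overview's shorthand (for example `10.0.1/2.0/24` is read as `10.0.1.0/24` and `10.0.2.0/24`, which is an assumption about its meaning), and the names `ENVIRONMENTS` and `find_cidr_problems` are hypothetical.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List

# CIDRs expanded from the overview's shorthand (assumed reading).
ENVIRONMENTS: Dict[str, Dict] = {
    "dev": {
        "vpc": "10.0.0.0/16",
        "subnets": ["10.0.1.0/24", "10.0.2.0/24",
                    "10.0.10.0/24", "10.0.11.0/24",
                    "10.0.20.0/24", "10.0.21.0/24"],
    },
    "staging": {
        "vpc": "10.2.0.0/16",
        "subnets": ["10.2.1.0/24", "10.2.2.0/24",
                    "10.2.10.0/24", "10.2.11.0/24",
                    "10.2.20.0/24", "10.2.21.0/24"],
    },
    "prod": {
        "vpc": "10.1.0.0/16",
        "subnets": ["10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24",
                    "10.1.10.0/24", "10.1.11.0/24", "10.1.12.0/24",
                    "10.1.20.0/24", "10.1.21.0/24", "10.1.22.0/24"],
    },
}


def find_cidr_problems(envs: Dict[str, Dict]) -> List[str]:
    """Return human-readable problems; an empty list means the plan is clean."""
    problems: List[str] = []
    vpcs = {name: ipaddress.ip_network(env["vpc"])
            for name, env in envs.items()}

    # VPC ranges must not overlap across environments.
    for a, b in combinations(sorted(vpcs), 2):
        if vpcs[a].overlaps(vpcs[b]):
            problems.append(f"VPC overlap: {a} and {b}")

    for name, env in envs.items():
        subnets = [ipaddress.ip_network(s) for s in env["subnets"]]
        # Every subnet must sit inside its own VPC.
        for subnet in subnets:
            if not subnet.subnet_of(vpcs[name]):
                problems.append(f"{name}: {subnet} is outside {vpcs[name]}")
        # Subnets within one VPC must not overlap each other.
        for s1, s2 in combinations(subnets, 2):
            if s1.overlaps(s2):
                problems.append(f"{name}: {s1} overlaps {s2}")
    return problems
```

Non-overlapping VPC ranges matter mainly if the environments are ever peered or connected through a transit gateway; the overview does not say whether they are.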
Terraform Modules
module.vpc
module.security_groups
module.iam
module.s3
module.load_balancer
module.ec2
module.rds
module.cloudwatch
module.kms
module.waf
module.route53
module.bastion
ECIA — AWS Production Architecture  ·  ap-south-1  ·  Terraform ≥1.6 · AWS Provider ~5.0  ·  MySQL 8.0.39 Multi-AZ  ·  3 AZ · 99.99% SLA