Identity and Access Services
AWS Account Root User
Root User Privileges
- The root user of an AWS account (considered the account owner, not an IAM identity) should not be used for routine tasks, as it cannot have restricted permissions.
- Create an admin IAM user for daily administrative activities.
- Protect the root user with MFA; avoid creating root access keys, and securely store any that already exist.
- The root user is required for certain critical tasks, which may be tested on exams.
- Key root user privileges to remember:
- Modify account settings (account name, email, root password, root access keys)
- Close or delete the AWS account
- Change or cancel the AWS Support Plan
- Register as a seller in the Reserved Instance Marketplace
- Example: If you reserve an instance for three years but only use it for two, you can sell the remaining term in this marketplace; registration requires root access.
- Additional privileges (good to know, not mandatory to memorize):
- Some billing and invoice management functions
- Restore IAM user permissions
- Enable MFA Delete on an S3 bucket
- Edit or delete S3 bucket policies containing invalid VPC IDs or endpoints
- Sign up for GovCloud
Introduction to AWS IAM (CLF-C02)
IAM Identities
- Users (long-term credentials)
- Typically, an IAM user represents an individual who can log in to an AWS account.
- Groups (containers for IAM users)
- Simplify management: assigning a policy to a group automatically applies it to all members.
- Note: IAM groups cannot log in to an AWS account, unlike users or roles.
- Roles (short-term credentials)
- Mainly used by AWS services (e.g., EC2 instances, Lambda functions) to act on resources on your behalf. The service assumes the role and receives temporary credentials that expire automatically.
- Also used for federated access, e.g., an external user logging in via a social identity (like Facebook) can assume a role to access AWS resources.
IAM Policy
- JSON document that defines permissions for granting or denying access to services and resources.
- Identity policies: assigned to IAM identities (users, groups, or roles).
- Resource policies: attached to AWS resources to allow or deny access for specific identities.
- Example: An S3 bucket policy restricting access only to IP addresses within a defined range.
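As a rough illustration of the bucket-policy example above, the sketch below builds such a JSON document in Python. The bucket name and the CIDR range are hypothetical placeholders, and the deny-unless-in-range pattern is only one common way to express this restriction.

```python
import json


def ip_restricted_bucket_policy(bucket: str, cidr: str) -> dict:
    """Build a resource policy that denies S3 requests from outside `cidr`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRange",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # The bucket itself and every object inside it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"NotIpAddress": {"aws:SourceIp": cidr}},
            }
        ],
    }


# Hypothetical bucket name and documentation-only IP range.
policy = ip_restricted_bucket_policy("example-notes-bucket", "203.0.113.0/24")
print(json.dumps(policy, indent=2))
```

An identity policy has the same shape but omits the Principal element, because the identity it is attached to is the principal.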
IAM Security – Best Practices
- Root user: use only for account setup (e.g., creating an admin user) or closing the account. Avoid daily usage.
- Password policy: enforce strong passwords for IAM users.
- MFA (Multi-Factor Authentication): require MFA for all IAM users to strengthen security.
- Principle of least privilege: grant users and roles only the permissions necessary for their tasks to prevent misuse.
- Privacy: never share IAM user credentials or access keys.
IAM Audit Tools
- IAM Credential Reports
- Account-level report listing all users and the status of their credentials.
- Provided in CSV format; useful for identifying credentials that haven’t been rotated recently.
- IAM Access Advisor
- User-level tool showing services a user has access to and when they last used them.
- Useful for reviewing permissions and adjusting policies accordingly.
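To make the credential-report use case concrete, here is a minimal sketch that scans a report for access keys that have not been rotated recently. The sample CSV is invented and trimmed to three columns; the column names are assumed to match the real report, which contains many more fields. The 90-day threshold is an arbitrary choice.

```python
import csv
import io
from datetime import datetime, timezone

# Invented, trimmed sample; a real credential report has many more columns.
SAMPLE_REPORT = """user,access_key_1_active,access_key_1_last_rotated
alice,true,2023-01-15T09:30:00+00:00
bob,true,2025-05-01T12:00:00+00:00
carol,false,N/A
"""


def stale_keys(report_csv: str, now: datetime, max_age_days: int = 90) -> list:
    """Return users whose active access key is older than `max_age_days`."""
    stale = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        if row["access_key_1_active"] != "true":
            continue  # no active key, nothing to rotate
        rotated = datetime.fromisoformat(row["access_key_1_last_rotated"])
        if (now - rotated).days > max_age_days:
            stale.append(row["user"])
    return stale


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(stale_keys(SAMPLE_REPORT, now))
```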
Shared Responsibility Model in IAM
- AWS responsibilities:
- Securing the underlying infrastructure and global network
- Performing configuration and vulnerability analysis
- Ensuring compliance
- Customer responsibilities:
- Managing and monitoring IAM users, groups, roles, and policies
- Enabling and enforcing MFA for all accounts
- Rotating access keys regularly
- Applying appropriate permissions using IAM tools
- Reviewing access patterns and auditing permissions
Additional Identity Services
Advanced Identity Services
- AWS STS (Security Token Service): provides temporary credentials with limited permissions to access AWS resources
- Used whenever an IAM role is assumed (sts:AssumeRole operation)
- IAM roles can grant access within the same account or across multiple accounts
- Roles can support identity federation, allowing external identities (e.g., Google, Facebook) to access AWS resources by assuming a role
- AWS services use roles (service roles) to perform actions on your behalf, e.g., a Lambda execution role allows the function to write logs to CloudWatch
- Amazon Cognito: manages user databases for mobile and web applications
- IAM has a limit of 5,000 users per account; for applications with millions of users, IAM is impractical, making Cognito the preferred solution
- Supports identity federation with social providers like Google and Facebook
- AWS Directory Service: integrates a user directory (e.g., Microsoft Active Directory) with AWS
- Microsoft Active Directory (AD) is commonly used in Windows environments and provides centralized management for users, computers, printers, and file shares
- Other directories, such as Samba-based directories or on-premises directories, can also be integrated
- IAM Identity Center: single login for multiple AWS accounts and applications
- Formerly known as AWS Single Sign-On (AWS SSO)
- Supports SAML 2.0, EC2 Windows instances, and business applications such as Salesforce or Microsoft 365
- Can use a built-in identity store or integrate third-party identity providers (MS AD, OneLogin, Okta, etc.)
- Used by AWS Organizations to manage identities across multiple accounts
Cloud Compute Services
Amazon EC2 (Elastic Compute Cloud) – CLF-C02 Overview
EC2 Instance – Key Concepts
- EC2 Instance = Virtual Machine (VM) = Virtual Server (VS)
- Hosted on a physical EC2 host
- Deployed inside a VPC subnet within a single Availability Zone (AZ)
- Provides OS-level access and control
- Configurations:
- Operating System (OS): Linux, Windows, macOS
- Instance Type: general purpose, compute optimized, memory optimized, storage optimized, accelerated computing, etc.
- Size: CPU and RAM allocation
- Storage: either local hardware (Instance Store) or network-attached (EBS – Elastic Block Store)
- EC2 User Data: bootstrap script that, by default, runs once at the instance's first launch
- EC2 Instance Role: IAM role granting permissions for actions on other AWS resources
- ENI (Elastic Network Interface): provides network connectivity and IP addresses
- Security managed via VPC Security Groups acting as firewalls
Secure Shell (SSH) Protocol
- SSH is a secure protocol to access a command-line shell on a remote host
- Shell = Linux CLI terminal
- Runs on TCP port 22
- SSH keys for EC2 instances:
- Allow secure login and access to the instance for management
- Linux and macOS can connect natively
- Windows machines may require PuTTY if using versions older than Windows 10
- Instance Connect: browser-based SSH access
- No SSH key creation required
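- Preinstalled on Amazon Linux 2 and some other AMIs (e.g., recent Ubuntu releases); other images need the EC2 Instance Connect package installed first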
EC2 Purchasing Options
- Shared tenancy (default): instances run on physical hosts shared with other customers, but remain isolated from them
- On-Demand Instances: pay-per-second, no discounts, suitable for short, uninterrupted workloads
- Spot Instances: utilize spare EC2 capacity at significant discounts; can be interrupted and shut down if capacity is reclaimed, suitable only for flexible workloads
- Reserved Instances: commit for 1 or 3 years to receive discounts, ideal for long-term workloads; may choose full, partial, or no upfront payment
- Convertible Reserved Instances: allow changing instance type, family, OS, scope, and tenancy
- Capacity Reservations: reserve On-Demand capacity in a specific AZ; guarantees availability but does not reduce instance costs
- Zonal Reserved Instances: scoped to a specific AZ; include a capacity reservation in that AZ
- Regional Reserved Instances: the discount applies in any AZ in the region, but no capacity is reserved
- Dedicated Instances: dedicated physical hardware for your instances; AWS manages the host
- Useful for compliance or special security requirements
- Dedicated Host: full control over the physical host; pay for the host rather than individual instances
- Necessary for server-bound licenses tied to physical cores or sockets
Savings Plans (1 or 3 years)
- Commit to a consistent amount of compute spend (measured in $/hour) for a discounted rate; usage beyond the commitment is billed at On-Demand prices
- Can cover EC2 only or multiple compute services (EC2, Fargate, Lambda)
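The billing rule above can be sketched with a little arithmetic. This is a simplified model, not AWS's actual billing engine: it assumes a single flat discount rate, and every number in the usage example is made up.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Estimate one hour's bill under a simplified Savings Plan model.

    on_demand_usage: the hour's usage priced at On-Demand rates ($)
    commitment:      committed spend per hour ($), paid even if unused
    discount:        assumed flat discount rate, e.g. 0.30 for 30%
    """
    # The commitment buys this much On-Demand-equivalent usage.
    covered_on_demand = commitment / (1 - discount)
    # Anything beyond the covered amount is billed at On-Demand prices.
    excess = max(0.0, on_demand_usage - covered_on_demand)
    return commitment + excess


# Hypothetical numbers: $7/hour commitment at a 30% discount covers $10 of usage.
print(hourly_bill(12.0, 7.0, 0.30))  # $12 of usage: commitment plus $2 excess
print(hourly_bill(5.0, 7.0, 0.30))   # under-used: the commitment is still paid
```

The second call shows the main risk of over-committing: the commitment is charged whether or not the usage materializes.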
Shared Responsibility Model for EC2
- AWS Responsibilities:
- Infrastructure management, including global network security
- Isolation of instances on physical hosts
- Replacement of faulty hardware
- Compliance validation
- Customer Responsibilities:
- OS patching and updates
- Installation of software and utilities on the instance
- Configuration of Security Groups
- Assignment of IAM roles and user access management
- Securing data within the instance
EC2 Resilience and Scaling with ELB and ASG
Infrastructure – Key Concepts
- Resilience: the ability of a system to recover automatically from failures
- AZ-resilient: data replicated within a single AZ; system can continue operating if some hardware fails, as long as the AZ itself remains operational
- Regionally-resilient: data replicated across multiple AZs in a region; system can withstand failure of an entire AZ
- High Availability (HA): systems designed to maximize uptime and recover quickly from failures, often automatically
- Short downtime may still occur during recovery, but it is much less than in non-HA systems
- Related to resilience: resilience is the ability to self-heal, HA ensures fast recovery within a target time
- Fault Tolerance (FT): systems that continue operating without downtime even when some components fail
- Typically involves redundant hardware
- FT provides higher reliability than HA
- Scalability: ability to adjust system resources to handle varying loads
- Overprovisioning: too much capacity for the load → wasted resources and costs
- Underprovisioning: insufficient capacity → poor performance and user experience
- Goal: maintain proper capacity at all times by scaling resources according to load
- Types of scaling:
- Vertical scaling: increase or decrease the size/capacity of a single server (Scale UP/DOWN)
- Simple to implement in non-distributed systems but can be costly and has limits
- Horizontal scaling: add or remove identical server instances (Scale OUT/IN)
- Ideal for distributed systems, supports stateless architecture, integrates with Auto Scaling Groups
- Enables resilience and HA; failed instances can be replaced automatically
- Elasticity: automated scalability that adjusts capacity in real-time according to load, optimizing costs and performance
- Agility: ability to quickly provision or terminate resources, enabling fast experimentation and testing
- Not the same as scalability
- Right-sizing: selecting instance types and sizes that best match workload requirements for performance and cost
- Avoid overprovisioning; cloud environments can scale on demand
- Right-sizing is important:
- Before migration to the cloud, to avoid copying overprovisioned servers
- Continuously after deployment, as workload requirements change
Elastic Load Balancing (ELB)
- Load Balancer (LB): distributes incoming traffic across multiple backend EC2 instances
- Single DNS endpoint for all backend instances
- Supports multi-AZ deployment for high availability
- Health checks ensure traffic is sent only to healthy instances
- Types of Load Balancers:
- Classic Load Balancer (CLB): legacy; avoid for new deployments
- Application Load Balancer (ALB): Layer 7, HTTP/HTTPS traffic
- Network Load Balancer (NLB): Layer 4, TCP/UDP traffic; very high performance
- Gateway Load Balancer (GWLB): Layer 3, handles GENEVE protocol for network security
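The combination of traffic distribution and health checks can be pictured with a toy round-robin balancer. This is only a conceptual sketch: the class, its methods, and the instance IDs are hypothetical, and real ELB routing algorithms are more sophisticated.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer: rotates through targets, skipping unhealthy ones."""

    def __init__(self, targets):
        self.health = {t: True for t in targets}
        self._ring = cycle(targets)

    def mark(self, target, healthy):
        """Record the result of a (simulated) health check."""
        self.health[target] = healthy

    def route(self):
        """Return the next healthy target, or fail if none is healthy."""
        for _ in range(len(self.health)):
            target = next(self._ring)
            if self.health[target]:
                return target
        raise RuntimeError("no healthy targets")


lb = RoundRobinBalancer(["i-a", "i-b", "i-c"])
lb.mark("i-b", False)  # simulate a failed health check
picks = [lb.route() for _ in range(4)]
print(picks)
```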
EC2 Auto Scaling Groups (ASGs)
- ASG provides multi-AZ elasticity for applications
- Automatically scales EC2 instances horizontally based on load
- Replaces unhealthy instances automatically
- Configuration parameters: MIN_CAPACITY, DESIRED_CAPACITY, MAX_CAPACITY
- ASG maintains the desired number of instances and scales automatically according to rules
- Scaling policies:
- Manual Scaling
- Dynamic Scaling: Simple, Step, Target Tracking
- Scheduled Scaling
- Predictive Scaling
- Integration with ELB: attach ASG to an ELB target group to distribute traffic across all instances in the group
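How the capacity bounds interact with a target-tracking policy can be sketched as follows. This is a simplified proportional model, assumed here for illustration; the actual service also applies cooldowns, warm-up periods, and more conservative scale-in behavior.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_capacity: int, max_capacity: int) -> int:
    """Propose a new desired capacity so the metric moves toward its target.

    Simplified model: capacity scales in proportion to metric / target,
    then is clamped to the [min_capacity, max_capacity] range.
    """
    proposed = math.ceil(current * metric / target)
    return max(min_capacity, min(max_capacity, proposed))


# 4 instances at 80% average CPU with a 50% target: scale out.
print(desired_capacity(4, 80, 50, min_capacity=2, max_capacity=10))
# A load spike cannot push the group past MAX_CAPACITY.
print(desired_capacity(4, 200, 50, min_capacity=2, max_capacity=10))
# Very low load cannot shrink the group below MIN_CAPACITY.
print(desired_capacity(4, 10, 50, min_capacity=2, max_capacity=10))
```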

Other Cloud Computing Services
Serverless Compute
- Serverless = running code/software without managing servers
- Serverless services automatically manage infrastructure to execute code and then remove the resources when the task is complete.
- Reduces administrative overhead for developers.
- Developers provide serverless functions to define the code, without configuring servers.
- Pay-per-use billing: charges are based on resources consumed while executing code; there are no ongoing fees for idle infrastructure.
- Ideal for workloads that are unpredictable or intermittent.
- Many AWS services are either serverless or offer a serverless option.
Typical AWS Serverless Architecture for Custom Compute
- AWS Lambda: Function-as-a-Service (FaaS)
- Refresher: AWS Lambda 101
- Provides short, on-demand, scalable, and cost-efficient code execution.
- Amazon API Gateway (APIGW): launch and manage APIs
- Supports RESTful APIs and WebSocket APIs.
- Can expose Lambda functions through HTTP APIs.
- Event-Driven Architecture (EDA): triggers actions (e.g., Lambda functions) in response to events (e.g., S3 uploads or scheduled CRON tasks).
- Typical services involved: Lambda, APIGW, S3, EventBridge, DynamoDB, etc.
- Example: serverless thumbnail generation
- Example: serverless scheduled job executed daily
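For the thumbnail example, a Lambda function triggered by S3 uploads might start out like the sketch below. It only parses the event and plans the work; the image processing is omitted. The event is a trimmed, hand-written sample of an S3 notification, and the "-thumbnails" destination bucket is a hypothetical naming convention.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect one thumbnail job per uploaded object in the S3 event."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append({
            "source": f"s3://{bucket}/{key}",
            # Hypothetical convention: write results to a sibling bucket.
            "thumbnail": f"s3://{bucket}-thumbnails/{key}",
        })
    return {"processed": len(jobs), "jobs": jobs}


# Trimmed sample event; real notifications carry many more fields.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "photos"},
                "object": {"key": "holiday/beach+day.jpg"}}}
    ]
}
result = lambda_handler(sample_event, None)
print(result)
```

Writing thumbnails to a separate bucket avoids the function re-triggering itself on its own output.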
Containerized Compute
- Containers = packaging an application with its runtime environment (RTE)
- Containers are lightweight and portable.
- Can run on any machine with a compatible container runtime, ensuring predictable behavior.
- Easier to scale than virtual machines or traditional servers.
- Docker: widely used container technology.
- Beginner-friendly course: Docker Fundamentals
- Example: EC2 instance hosting multiple Docker containers.
AWS Container Services
- Amazon ECS (Elastic Container Service): run Docker containers.
- Containers can run on EC2 instances or Fargate.
- AWS Fargate: serverless containers.
- No infrastructure provisioning required; EC2 instances are not needed.
- Amazon ECR (Elastic Container Registry): stores Docker images.
- Repository = storage location for objects with versioning.
- Docker image = template used to launch containers.
- Docker Hub = popular public repository; ECR is AWS’s equivalent.
- Amazon EKS (Elastic Kubernetes Service): deploy Kubernetes clusters in AWS.
- Kubernetes (K8s): orchestrates multiple containers.
- Open-source, cloud-agnostic (works in AWS, Azure, GCP).
- K8s-launched Docker containers can run on EC2 or Fargate.
- AWS Batch: run batch jobs using Docker images.
- Batch job: a job with a defined start and finish.
- Automated execution: Docker images loaded into ECS, EC2 instances provisioned (spot instances possible) to run the job, then resources are removed. Billing applies only for active infrastructure.
Amazon Lightsail
- Simplified cloud offering with an easy-to-use interface for launching virtual servers
- Designed for beginners with minimal cloud experience.
- Provides basic virtual servers, storage, databases, and networking.
- Underlying AWS resources (EC2, RDS, etc.) are automatically provisioned.
- Advantages:
- Predictable and affordable pricing.
- Extremely easy to configure.
- Limitations:
- Limited scalability (some high-availability features available).
- Limited integration with broader AWS services.

Storage Services
Amazon S3 (Simple Storage Service) – CLF-C02 Overview
S3 Security
- User-based access control: managed through IAM policies to allow or deny operations on S3
- Resource-based access control:
- Bucket policies: define which users or identities can access the bucket and what actions they can perform
- Can allow cross-account or public (external) access
- IAM Access Analyzer helps review common access patterns and optimize bucket policies
- Block Public Access: enabled by default; prevents any public access and overrides other configurations regarding public access
- Access Control Lists (ACLs): simple, legacy permissions applied at the bucket or object level; avoid if possible
- S3 encryption: protects data from unauthorized access
- In-transit encryption: enforce HTTPS (SSL/TLS) for data transfer
- At-rest encryption: can use client-side or server-side encryption (CSE/SSE)
S3 Static Website Hosting
- Amazon S3 can host a static website
- Configure a root file (e.g., index.html)
- Website is accessible via a default S3 URL based on the bucket name
- Only static content (S3 objects) can be hosted
- Security requirements:
- Disable “Block Public Access” to allow external access
- Bucket policy must allow public read access; otherwise, requests will return 403 Forbidden
Additional S3 Features
- S3 Versioning: keeps multiple versions of an object rather than overwriting it
- Disabled by default
- Helps prevent accidental deletion; deleting an object adds a delete marker instead of removing previous versions
- S3 Replication: automatically replicates objects from one bucket to another
- Asynchronous process, supports same-region and cross-region replication (SRR/CRR)
- Versioning must be enabled for replication
- S3 Storage Classes: offer trade-offs between storage cost and access speed
- Standard
- Infrequent Access (IA): cheaper storage, pay for retrieval; suitable for infrequently accessed data
- One Zone-IA (1Z-IA): even cheaper, data stored in a single AZ
- Amazon S3 Glacier: archival storage with low cost but slower retrieval
- Instant Retrieval
- Flexible Retrieval
- Deep Archive
- Intelligent-Tiering: automatically moves objects between access tiers based on usage patterns
- Lifecycle Policies: move objects to different storage classes on a schedule, not based on usage
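A lifecycle policy is essentially a schedule keyed on object age. The sketch below models one rule and resolves which storage class an object would be in after a given number of days. The rule ID, prefix, and day thresholds are invented; the dictionary shape loosely follows the S3 lifecycle configuration format.

```python
# Hypothetical rule: thresholds and prefix are illustrative only.
LIFECYCLE_RULE = {
    "ID": "archive-old-logs",
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
        {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
    ],
}


def storage_class_at(age_days: int, rule: dict = LIFECYCLE_RULE) -> str:
    """Return the storage class an object reaches at a given age in days."""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (10, 45, 120, 400):
    print(age, storage_class_at(age))
```

Note that the schedule depends only on age, never on how often the object is read, which is the difference from Intelligent-Tiering.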
Shared Responsibility Model for S3
- AWS responsibilities:
- Provide infrastructure with global security, durability, and availability
- Unlimited storage availability
- Encryption capabilities
- Data separation between customers
- Prevent AWS employees from accessing customer data
- Configuration and vulnerability management
- Compliance validation
- Customer responsibilities:
- Configure bucket policies and public access settings
- Manage data encryption at rest and in transit
- Enable logging and monitor access
- Enable and manage versioning
- Set up replication
- Choose and manage storage classes
Storage for Private AWS Services (such as EC2)
Databases & Data Services
Databases 101
- Databases provide structure to stored data, making it searchable and queryable
- Define schemas, indexes, and relationships
- Like organizing raw text into a table of contents, numbered pages, highlighted sections, and notes for easier navigation
Relational Databases (SQL / RDBMS)
- Rigid schemas with tables, rows (items), and columns (attributes)
- Use SQL (Structured Query Language) to query data

- Tables are similar to Excel spreadsheets, with relationships linking them
- Examples:
- Open-source: MySQL, PostgreSQL, MariaDB
- Proprietary: Oracle Database, Microsoft SQL Server
- Two main types of optimization:
- Row-based / OLTP (Online Transaction Processing): optimized for transactions, e.g., Amazon RDS, Aurora
- Column-based / OLAP (Online Analytical Processing): optimized for analytics, e.g., Amazon Redshift
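To illustrate tables, relationships, and SQL queries, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The schema and the sample rows are invented for illustration; SQLite stands in for a managed engine such as those offered by RDS.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE directors (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE movies (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    year        INTEGER,
    director_id INTEGER REFERENCES directors(id)  -- relationship between tables
);
INSERT INTO directors VALUES (1, 'James Cameron');
INSERT INTO movies VALUES (1, 'Avatar', 2009, 1);
""")

# A JOIN follows the relationship from movies to directors.
rows = conn.execute("""
    SELECT m.title, m.year, d.name
    FROM movies AS m
    JOIN directors AS d ON d.id = m.director_id
""").fetchall()
print(rows)
```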
Non-relational Databases (NoSQL)
- Flexible schemas
- Cannot use standard SQL (some support SQL-like query languages)
- Generally scale horizontally more easily than SQL DBs, often at the cost of weaker consistency guarantees
- Examples:
- Document DBs: e.g., MongoDB, store JSON documents
{ "name": "Avatar", "year": 2009, "genre": "epic science fiction", "director": { "name": "James Cameron", "nationality": "Canada" } }
- Graph DBs: e.g., Amazon Neptune
- Key-value DBs: e.g., Amazon DynamoDB (DDB)

Databases in AWS
- You can run database software on EC2, but it requires full instance management
- Recommended approach: AWS-managed database services
- Quick provisioning, high availability, scalable vertically and horizontally
- Automated backups and restores
- Patches and updates handled by AWS
- Easy integration with monitoring, alerts, and other AWS services
- Less control than self-managed databases
AWS SQL Database Services
- Amazon RDS (Relational Database Service):
- Row-based, OLTP, free-tier eligible
- Runs in your VPC, supports multiple SQL engines (PostgreSQL, MariaDB…)
- Optional features:
- Read Replicas: scale read operations across multiple instances
- Multi-AZ deployments: high availability with quick failover
- Amazon Aurora:
- Proprietary Amazon SQL engine (compatible with MySQL and PostgreSQL), OLTP, not free-tier
- Higher performance than the standard RDS engines
- Optional serverless mode for flexible scaling
- Amazon Redshift:
- Column-based, OLAP, ideal for analytics
- Runs in VPC, with serverless option for automatic scaling
- AWS DMS (Database Migration Service):
- Migrates databases to and from AWS securely
- Original DB can remain operational during migration
- Supports migrations between different engines
- Mostly for SQL databases; some NoSQL targets, such as DynamoDB and DocumentDB, are also supported
Shared Responsibility Model for RDS
- AWS responsibilities:
- Manage underlying EC2 instances (no SSH access)
- Automatic OS and database software patching
- Ensure underlying instances and storage are operational
- Customer responsibilities:
- Manage network access (ports, IPs, security groups)
- Create in-database users and assign permissions
- Enable or disable public access
- Configure encryption in transit and at rest
AWS NoSQL Database Services
- Amazon ElastiCache: in-memory cache DB, supports Redis and Memcached
- Amazon DynamoDB: serverless key-value DB with high availability and low-latency performance
- DynamoDB Accelerator (DAX): in-memory cache for DDB only
- Amazon DocumentDB: managed MongoDB-compatible document DB
- Amazon Neptune: graph DB for storing nodes and relationships; ideal for knowledge graphs, fraud detection, recommendations, and social networks
- Amazon Timestream: time-series DB for sequential data (e.g., telemetry, stock prices)
- Amazon QLDB: immutable ledger database for cryptographically verifiable transactions (discontinued July 31, 2025)
- Amazon Managed Blockchain: fully managed Hyperledger Fabric and Ethereum blockchains for decentralized transactions
AWS Data Engineering & Analytics Services
- Amazon EMR: runs Apache Hadoop clusters for processing large-scale data
- AWS Glue: serverless ETL and data catalog service
- ETL: extract, transform, and load data
- Data Catalog: stores metadata about datasets
- Amazon Athena: serverless SQL engine for querying data in S3
- Schema-on-read (ELT), billed per query based on data scanned
- Amazon QuickSight: BI service to create dashboards and reports
- Serverless, ML-powered, integrates with RDS, Aurora, Redshift, Athena, S3
Additional Storage Services
AWS Snowball
- Physical hardware devices designed to collect and process data at the edge
- Secure and portable
- Use cases:
- Edge computing
- “Edge” refers to locations with limited or no internet connectivity
- Examples: trucks, ships, mining stations
- Offline data migration
- Moving large datasets to or from AWS when network transfer is impractical
AWS Storage Gateway
- Hybrid storage solution: combines on-premises infrastructure with cloud storage
- Storage Gateway: acts as a bridge between on-premises storage and AWS cloud storage (S3)
- Useful for backups or extending storage capacity to the cloud
- Types of Storage Gateway:
- Volume Gateway
- Tape Gateway
- File Gateway
Networking Services
Amazon VPC (Virtual Private Cloud) – CLF-C02 Overview
IP Addresses in AWS
- IPv4 & IPv6 refresher: IP Address Space
- When a resource (e.g., an EC2 instance) is launched in a VPC subnet:
- It automatically receives a static Private IPv4 address, used for routing traffic internally within the VPC.
- Additional Private IPv4 addresses can be assigned to the instance.
- Stopping and restarting the instance does not change its Private IPv4 addresses.
- It may receive a dynamic Public IPv4 address if deployed in a public subnet.
- Stopping and restarting the instance changes its Public IPv4 address.
- It can be assigned an Elastic IP (EIP), which is a static Public IPv4 address, if deployed in a public subnet.
- Stopping and restarting the instance does not change its EIP.
- It can have multiple IPv6 addresses, all routable on the internet.
- The subnet must be configured with an IPv6 CIDR block.
- Billing considerations:
- All public IPv4 addresses in AWS incur charges ($0.005/hour).
- This includes any allocated EIPs. If an EIP is allocated but not attached to an instance, it is still billed.
- Free Tier provides 750 hours per month, which covers running one instance continuously with a single public IPv4 or EIP for a month.
- IPv6 addresses and Private IPv4 addresses are free of charge.
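The billing rules above reduce to simple arithmetic, sketched here. The hourly rate and the 750 free hours come from the notes; the 730-hour month is an assumed average, and the function name is hypothetical.

```python
PUBLIC_IPV4_RATE = 0.005   # dollars per address per hour, as stated above
FREE_TIER_HOURS = 750      # monthly Free Tier allowance, as stated above


def monthly_ipv4_cost(address_count: int, hours_in_month: int = 730,
                      free_tier: bool = False) -> float:
    """Estimate the monthly charge for public IPv4 addresses (including EIPs)."""
    billable_hours = address_count * hours_in_month
    if free_tier:
        # The allowance is shared across all addresses in the account.
        billable_hours = max(0, billable_hours - FREE_TIER_HOURS)
    return billable_hours * PUBLIC_IPV4_RATE


print(monthly_ipv4_cost(1))                  # one address, no Free Tier
print(monthly_ipv4_cost(1, free_tier=True))  # fully covered by the allowance
print(monthly_ipv4_cost(2, free_tier=True))  # second address is mostly billable
```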
VPC Components
- VPC Subnet: a segment of a VPC associated with a specific Availability Zone (AZ).
- AWS resources are deployed into subnets based on service type.
- Private subnet: no internet access.
- Public subnet: accessible from the internet.
- Route Table (RT): manages routing within the subnet and outbound traffic.
- Internet Gateway (IGW): enables access to the internet and public AWS services; attached to the VPC (a subnet is public when its route table has a route to the IGW).
- NAT (Network Address Translation): translates private IPv4 addresses to public IPv4 addresses and vice versa, allowing outbound internet connections from private subnets.
- NAT Instance: EC2 instance performing NAT, managed by the customer.
- NAT Gateway (NATGW): fully managed by AWS.
- Firewalls: control incoming and outgoing network traffic.
- Network Access Control List (NACL): stateless firewall at the subnet level.
- Security Group (SG): stateful firewall at the instance or ENI level.
- VPC Flow Logs: capture metadata of network traffic, without recording the actual content.
Hybrid Networking Services/Products
- VPC Peering: private connection between two VPCs with non-overlapping IP ranges.
- Non-transitive: if VPC A is peered with B, and B is peered with C, A is not automatically peered with C.
- VPC Endpoint (VPCE): allows private access from within a VPC to public AWS services, avoiding internet routing.
- Includes Gateway Endpoints (GWEs) and Interface Endpoints (IEs).
- AWS PrivateLink: establishes private connections to services in a third-party VPC and powers VPC Interface Endpoints (IEs).
- AWS Site-to-Site VPN: secure VPN over the public internet between an on-premises network and AWS, with in-transit encryption.
- AWS Client VPN: OpenVPN-based connection from a user’s computer to a VPC over the internet.
- AWS Direct Connect (DX): dedicated physical connection between on-premises network and AWS, offering higher speeds but requiring installation time.
- AWS Transit Gateway (TGW): connects multiple VPCs and/or on-premises networks.
- Provides transitive, scalable connectivity, simplifying network topology for large environments.
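Two of the VPC Peering rules above, non-overlapping IP ranges and non-transitivity, can be checked with a few lines of Python. The CIDR blocks and the VPC names A, B, and C are hypothetical, and the helper functions are illustrative rather than part of any AWS SDK.

```python
from ipaddress import ip_network


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Peering requires the two VPC CIDR blocks not to overlap."""
    return not ip_network(cidr_a).overlaps(ip_network(cidr_b))


# Peering connections are stored as explicit pairs; nothing is inferred.
peerings = {("A", "B"), ("B", "C")}


def directly_peered(x: str, y: str) -> bool:
    """True only if an explicit peering exists between x and y."""
    return (x, y) in peerings or (y, x) in peerings


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # disjoint ranges
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # second range is inside the first
print(directly_peered("A", "C"))                 # A-B and B-C do not imply A-C
```

A Transit Gateway removes the need to maintain one explicit pair per VPC combination.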
Global Cloud Infrastructure Services (AWS)
Global Applications
- Global application: an application deployed across multiple geographic locations.
- For example, deploying an app across several AWS regions and/or edge locations.
- Refresher: AWS Global Infrastructure
- Benefits:
- Reduced latency:
- Users located closer to the infrastructure experience lower latency.
- Deploying infrastructure in multiple regions ensures more users are near a resource.
- Resilience and disaster recovery (DR):
- If one region fails, other regions remain operational, allowing the application to continue functioning.
- Distributed global infrastructure is harder to attack.
- Drawbacks:
- Higher infrastructure costs: more deployment locations increase expenses.
- Increased management complexity: requires configuring global services, data replication, redundancy, and failover mechanisms.
- In general, making an application global is recommended when performance improvements are needed or when the user base grows significantly.
| AWS App Deployment | High Availability? | Good Global Read Latency? | Good Global Write Latency? |
|---|---|---|---|
| Single-region, single-AZ | No | No | No |
| Single-region, multi-AZ | Yes | No | No |
| Multi-region, Active-Passive | Yes | Yes | No |
| Multi-region, Active-Active | Yes | Yes | Yes |
Amazon Route 53
- Amazon Route 53: a global Domain Name System (DNS) service.
- Refresher: Route 53 Basics
- Route 53 can direct users to different infrastructure endpoints when they query a DNS name.
- Useful for sending users to the nearest infrastructure to reduce latency.
- Supports disaster recovery by routing traffic to healthy resources if the primary resource fails.
- Common routing policies:
- Simple Routing: routes to a single resource without health checks.
- Failover Routing: monitors the health of the primary resource and routes traffic to a secondary resource if the primary is unhealthy, providing a basic DR solution.
- Weighted Routing: distributes traffic across multiple resources according to specified weights (e.g., 70% to server A, 20% to server B, 10% to server C).
- Latency Routing: directs users to the resource with the lowest latency for them.
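The weighted and failover policies can be simulated in a few lines. This is a conceptual sketch of the selection logic only, not of how Route 53 resolves DNS queries; the server names reuse the 70/20/10 example above, and the function names are hypothetical.

```python
import random


def weighted_pick(records: dict, rng: random.Random) -> str:
    """Pick one record with probability proportional to its weight."""
    names = list(records)
    return rng.choices(names, weights=[records[n] for n in names], k=1)[0]


def failover_pick(primary: str, secondary: str, primary_healthy: bool) -> str:
    """Answer with the primary unless its health check is failing."""
    return primary if primary_healthy else secondary


records = {"server-a": 70, "server-b": 20, "server-c": 10}
rng = random.Random(42)  # seeded so the simulation is repeatable
sample = [weighted_pick(records, rng) for _ in range(10_000)]
share_a = sample.count("server-a") / len(sample)
print(f"server-a share: {share_a:.2%}")
print(failover_pick("primary", "secondary", primary_healthy=False))
```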
Services Optimized for the AWS Global Network
- Amazon CloudFront (CF): a global Content Delivery Network (CDN) for caching content at AWS edge locations, improving read performance and reducing latency.
- S3 Transfer Acceleration: speeds up uploads and downloads by routing data through the AWS Global Network via the nearest edge location, minimizing reliance on the public internet.
- AWS Global Accelerator: improves availability and performance for global applications over TCP/UDP by routing traffic through the AWS Global Network.
- Note: it does not cache content like CloudFront.
Edge Deployments of AWS Infrastructure
- AWS Outposts: a private cloud solution that extends AWS services to your own data center.
- You can run EC2, RDS, S3, and other AWS services on-premises using the same APIs and tools as public AWS, enabling hybrid applications.
- Benefits: low latency, data residency on-premises, simplified migration between on-premises and cloud.
- Responsibility: the customer is responsible for the physical security of the Outposts racks and for the site's power, cooling, and networking.
- Deploying Outposts alongside public AWS infrastructure constitutes a Hybrid Cloud setup.
- AWS Local Zones: extend AWS infrastructure to a local data center in a specific city.
- Brings AWS compute, database, and storage resources closer to end-users.
- Useful for latency-sensitive applications where users are far from the nearest Availability Zone.
- Note: the local data center is managed by AWS, not the customer.
- Example: the us-east-1 region includes multiple AZs and Local Zones in cities like Boston, Chicago, and Miami.
- AWS Wavelength: deploys AWS resources within 5G networks for ultra-low latency applications, leveraging mobile networks provided by telecom companies or communication service providers (CSPs).
Additional AWS Services
Infrastructure as Code (IaC) & Deployment Services
Infrastructure as Code (IaC) Services
- AWS CloudFormation (CFN): AWS-native IaC service using JSON/YAML templates
- Refresher: AWS CloudFormation (CFN) 101
- AWS Infrastructure Composer: tool for visually creating CloudFormation templates.
- AWS Cloud Development Kit (CDK): library for defining AWS infrastructure in code
- Allows defining cloud resources using familiar programming languages (JavaScript/TypeScript, Python, Java, .NET).
- The code is synthesized into a CloudFormation template for deployment.
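To give a feel for what a CloudFormation template contains, the sketch below assembles a minimal one as a Python dictionary and serializes it to JSON. It does not use the CDK itself; it only illustrates the kind of document a synthesis step produces. The logical ID NotesBucket and the description are hypothetical.

```python
import json

# Minimal template: a single S3 bucket with versioning enabled.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch: one versioned S3 bucket",
    "Resources": {
        "NotesBucket": {  # hypothetical logical ID
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "VersioningConfiguration": {"Status": "Enabled"}
            },
        }
    },
    "Outputs": {
        # Ref on a bucket resource resolves to the bucket name.
        "BucketName": {"Value": {"Ref": "NotesBucket"}}
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```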
AWS Elastic Beanstalk (EB)
- Platform-as-a-Service (PaaS) designed for developers.
- Deploy applications without managing underlying infrastructure.
- Automatically provisions required AWS resources (EC2, Auto Scaling Groups, Load Balancers, RDS, etc.) managed by Elastic Beanstalk.
- Supports multiple programming languages and Docker images.
- Deployment follows standard architectures, e.g., 3-tier web applications with ALB, EC2, and RDS.
- Elastic Beanstalk manages:
- Instance operating systems and configuration.
- Deployment strategies.
- Capacity provisioning.
- Load balancing and auto-scaling.
- Application health monitoring and responsiveness.
AWS Systems Manager (SSM)
- Manage systems at scale across cloud and on-premises infrastructure.
- Hybrid service: manages EC2 instances as well as on-premises servers.
- Supports multiple operating systems: Linux, Windows, macOS, Raspberry Pi OS.
- SSM originally stood for Simple Systems Manager, but the service is now just called Systems Manager.
- To register a server in the SSM Fleet Manager, the SSM Agent must be installed.
- Pre-installed on Amazon Linux AMIs and some Ubuntu AMIs.
- Key Systems Manager capabilities:
- Automate updates and patching to maintain compliance.
- Execute commands across multiple servers simultaneously.
- SSM Session Manager: open a secure terminal session without configuring SSH.
- Eliminates need for SSH keys, bastion hosts, or opening port 22.
- Session activity can be logged to S3 or CloudWatch Logs for auditing.
- SSM Parameter Store: securely store configuration values, environment variables, and secrets.
AWS Code* Family Services
- AWS CodeCommit: version-controlled Git repository for code storage.
- Discontinued: AWS recommends using third-party Git services (GitHub, GitLab, Bitbucket).
- Git enables version control and collaboration, which is central to Continuous Integration (CI).
- AWS CodeBuild: compiles, builds, and tests code in AWS.
- Produces deployable artifacts that can be used by CodeDeploy.
- AWS CodeDeploy: automates deployment and updates of applications to servers.
- Hybrid capability: deploys to EC2 instances and/or on-premises servers.
- AWS CodePipeline: orchestrates CI/CD pipelines.
- CI (Continuous Integration) with GitHub or CodeCommit.
- CD (Continuous Deployment) with CodeBuild and CodeDeploy.
- Example: automatically updates Elastic Beanstalk whenever code is pushed to production.

- AWS CodeArtifact: stores software packages and dependencies.
- Manages code dependencies such as libraries and packages.
- Compatible with common dependency management tools: Maven, Gradle, npm, Yarn, pip, and others.
Application Integration & Decoupling Services
Application Integrations and Communication in AWS
- Application integration refers to how two or more applications exchange data and interact with one another.
- Application communication models:
- Synchronous communication: the sender waits for the receiver to respond or acknowledge the request.
- Asynchronous communication: the sender sends data without waiting for an immediate response; acknowledgements occur independently if required.
- Event-driven architectures (EDA): actions are triggered by events. Applications are loosely coupled and respond to changes or events rather than direct requests.
- Decoupling applications involves shifting from synchronous to asynchronous communication.
- Applications operate independently and do not rely directly on one another.
- Enables independent scaling and isolates failures between systems.
- Decoupling is commonly achieved using messaging services such as queues and topics.
AWS Application Messaging Services
Amazon SQS (Simple Queue Service)
- Fully managed, serverless message queuing service.
- Queue-based communication: one-to-one messaging.
- FIFO queues: maintain message order, prevent duplicates, but have limited throughput.
- Standard queues: do not guarantee ordering, may deliver duplicate messages, and scale more easily.

- One or more producers send messages to a queue.
- Producers do not wait for consumers to process messages, enabling asynchronous communication.
- Messages can be stored in the queue for up to 14 days.
- One or more consumers poll messages from the queue.
- After processing, consumers delete messages from the queue.
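The send / poll / delete flow above can be sketched with an in-memory stand-in. This is not the SQS API (the real service is accessed through an AWS SDK or the CLI); `SketchQueue` and its methods are hypothetical names, and features such as visibility timeouts and message retention are deliberately left out.

```python
from collections import deque


class SketchQueue:
    """In-memory stand-in for a queue: producers send, consumers poll and delete."""

    def __init__(self):
        self._messages = deque()
        self._in_flight = {}
        self._next_id = 0

    def send(self, body: str) -> None:
        # The producer returns immediately; it never waits for a consumer.
        self._messages.append((self._next_id, body))
        self._next_id += 1

    def receive(self):
        # A consumer polls; the message stays "in flight" until it is deleted.
        if not self._messages:
            return None
        msg_id, body = self._messages.popleft()
        self._in_flight[msg_id] = body
        return msg_id, body

    def delete(self, msg_id: int) -> None:
        # Deleting after successful processing removes the message for good.
        self._in_flight.pop(msg_id, None)

    def pending(self) -> int:
        # Messages that are waiting or received but not yet deleted.
        return len(self._messages) + len(self._in_flight)


queue = SketchQueue()
queue.send("order-1")
queue.send("order-2")
msg_id, body = queue.receive()
queue.delete(msg_id)
```

After this run, `order-1` has been processed and deleted, while `order-2` is still waiting in the queue for the next poll.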
Amazon SNS (Simple Notification Service)
- Fully managed, serverless messaging service using topics.
- Topic-based communication: one-to-many messaging.
- Uses the publisher-subscriber (pub/sub) model.
- Unlike queues, messages are not retained if subscribers do not receive them.

- A single publisher sends a message to a topic.
- The topic distributes the message to one or more subscribers.
- Supported subscribers include email, Lambda, SQS, HTTP/S endpoints, and mobile services.
- Example: when an EC2 instance is terminated, an event can be published to SNS, which then triggers multiple actions such as sending notifications or invoking Lambda functions.
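The one-to-many fan-out above can be sketched the same way. Again, this is an in-memory illustration rather than the SNS API: `SketchTopic` is a hypothetical class, and the subscribers are plain Python callables standing in for email, Lambda, or SQS endpoints.

```python
class SketchTopic:
    """In-memory stand-in for a pub/sub topic with no message retention."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback) -> None:
        self._subscribers.append(callback)

    def publish(self, message: str) -> int:
        # Every current subscriber gets its own copy; nothing is stored.
        for callback in self._subscribers:
            callback(message)
        return len(self._subscribers)


emails, lambda_calls = [], []
topic = SketchTopic()

# Published before anyone subscribed: the message is simply lost.
missed = topic.publish("instance i-123 terminated")

topic.subscribe(emails.append)
topic.subscribe(lambda_calls.append)
delivered = topic.publish("instance i-456 terminated")
```

The first message reaches nobody and is not replayed later, which is the key contrast with a queue.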
Amazon MQ
- Managed message broker service supporting Apache ActiveMQ and RabbitMQ.
- Not serverless; message broker servers are provisioned and managed.
- Supports both queue-based and topic-based messaging.
- Uses standard messaging protocols such as MQTT and AMQP, enabling compatibility with existing applications.
Amazon Kinesis
- Platform for real-time data streaming, storage, and analysis.
- A stream is a time-ordered sequence of data records with a defined retention period (default is 24 hours).
- Key services include:
- Kinesis Data Streams: ingest and process large volumes of streaming data.
- Kinesis Video Streams: stream video and video-related data.
- Amazon Data Firehose: delivers streaming data to storage and analytics services such as S3 and Redshift.
- This service is now independent and no longer part of the Kinesis family.
- Amazon Managed Service for Apache Flink: performs real-time analytics and enrichment of streaming data.
- Previously known as Kinesis Data Analytics and no longer categorized under Kinesis.
Monitoring Services for Cloud Applications
Amazon CloudWatch (CW)
- Amazon CloudWatch: AWS’s native monitoring and alerting service, providing Metrics, Alarms, Logs, and Events.
- Refresher: CloudWatch 101
- Key CloudWatch components:
- CW Logs: collects logs from multiple sources, such as EC2 instances, on-premises servers, Lambda functions, and ECS containers.
- CW Metrics: monitors performance and billing metrics for AWS resources, e.g., CPUUtilization, NetworkIn, NumberOfObjects.
- CW Alarms: trigger notifications or actions when metrics reach or exceed defined thresholds.
- Examples: scale an EC2 Auto Scaling Group, send an SNS notification, or trigger a billing alert if costs exceed a limit.
- CW Events: previously allowed reacting to AWS events or scheduled triggers.
- Amazon EventBridge now replaces CW Events, offering enhanced features such as aggregating events across multiple AWS accounts.
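As a rough illustration of how a CW Alarm turns metric datapoints into a state, the sketch below evaluates the most recent datapoints against a threshold. The function `alarm_state` and the sample values are assumptions made for this example; the state names mirror CloudWatch's alarm states, but real alarms offer more options (other comparison operators, "M out of N" datapoints, missing-data handling).

```python
def alarm_state(datapoints, threshold, evaluation_periods=3):
    """Return 'ALARM', 'OK', or 'INSUFFICIENT_DATA' for the latest datapoints."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Alarm only when every recent datapoint reaches or exceeds the threshold.
    if all(value >= threshold for value in recent):
        return "ALARM"
    return "OK"


# Hypothetical CPUUtilization samples, one per evaluation period.
cpu_utilization = [42.0, 81.5, 90.2, 95.0]
state = alarm_state(cpu_utilization, threshold=80.0)
```

Here the last three samples are all at or above 80, so the alarm would fire and could, for example, notify an SNS topic or scale an Auto Scaling Group.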
Other Monitoring Services
- AWS CloudTrail (CT): records API calls made within your AWS account.
- Enables auditing to track who performed what action and when.
- CloudTrail Insights: automatically analyzes CloudTrail events for anomalies or unusual activity.
- AWS X-Ray: provides end-to-end tracing of requests in distributed applications.
- Offers visibility into infrastructure and performance insights for applications.
- Amazon CodeGuru: uses machine learning to optimize application code and performance.
- CodeGuru Reviewer: automated static code analysis for reviewing code.
- CodeGuru Profiler: provides performance recommendations for applications, both pre- and post-production.
- AWS Health Dashboard: monitors the status and health of AWS services.
- Service Health: historical status of all AWS services across all regions (formerly AWS Service Health Dashboard).
- Your Account: shows events affecting only your AWS resources, such as maintenance or incidents (formerly Personal Health Dashboard).
Security & Regulatory Compliance Services
Network Protection Services
- AWS Shield: protects against Distributed Denial of Service (DDoS) attacks automatically.
- Shield Advanced provides premium 24/7 support and additional security features.
- AWS WAF (Web Application Firewall): Layer 7 firewall that filters incoming web requests based on configurable rules.
- Protects against common web exploits such as SQL injection and cross-site scripting (XSS).
- AWS Network Firewall: protects an entire VPC against network-level attacks.
- Security Groups (SGs) provide resource-level protection, NACLs operate at the subnet level, and Network Firewall protects at the VPC level.
- AWS Firewall Manager: centralizes and enforces security rules across multiple AWS accounts within an Organization.
- Applies rules immediately to existing and new resources and accounts, covering SGs, WAF rules, and Shield configurations.
Penetration Testing in AWS
- Penetration testing: intentionally testing your own systems and infrastructure to identify vulnerabilities.
- AWS allows penetration testing on certain services without prior approval:
- EC2 instances, NAT Gateways, ELBs, RDS, CloudFront, Aurora, API Gateway, Lambda, Lightsail, Elastic Beanstalk.
- AWS prohibits certain tests as they may affect AWS infrastructure:
- DNS zone walking through Route 53 hosted zones.
- DoS and DDoS attacks, including port, protocol, and request flooding.
- For other simulated security events, contact: aws-security-simulatedevent@amazon.com
Encryption Services
- AWS KMS (Key Management Service): manage encryption keys for data at rest.
- Default encryption service used by many AWS services.
- Keys can be customer-managed or AWS-managed.
- AWS CloudHSM: dedicated hardware for secure key storage.
- HSM = Hardware Security Module, a tamper-resistant device providing higher security than KMS.
- AWS does not manage the keys stored in CloudHSM.
- AWS Certificate Manager (ACM): create and manage SSL/TLS certificates to encrypt data in transit (e.g., HTTPS, FTPS, SMTPS).
- AWS Secrets Manager: securely store and manage application secrets such as database passwords.
- Supports cross-region replication and automatic secret rotation (e.g., for RDS credentials).
Threat and Vulnerability Detection Services
- Amazon GuardDuty: machine learning-powered threat detection to identify malicious or anomalous behavior in VPC, DNS, and CloudTrail logs.
- Amazon Inspector: scans compute resources (EC2 instances, ECR images, Lambda functions) for software vulnerabilities.
- Amazon Macie: detects sensitive data, such as personally identifiable information (PII), in S3 buckets.
- Amazon Detective: investigates the root cause of security issues or suspicious activity by analyzing findings from services like Macie and GuardDuty.
- IAM Access Analyzer: identifies resources shared outside trusted boundaries.
- Generates findings for S3 buckets, IAM roles, KMS keys, and other resources accessed beyond the defined zone of trust.
- Helps refine access policies for improved security.
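The "zone of trust" idea behind IAM Access Analyzer can be sketched as a scan over a resource policy. This is a heavily simplified illustration, not how the service works internally: `external_principals`, the account IDs, and the bucket name are hypothetical, and conditions, service principals, and wildcards in ARNs are ignored.

```python
def external_principals(policy: dict, trusted_accounts: set) -> list:
    """List principals in Allow statements that fall outside the zone of trust."""
    findings = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if principal == "*":
            findings.append("*")  # anyone, including anonymous access
            continue
        aws = principal.get("AWS", []) if isinstance(principal, dict) else []
        arns = [aws] if isinstance(aws, str) else aws
        for arn in arns:
            # ARNs look like arn:aws:iam::<account-id>:root; field 4 is the account.
            account = arn.split(":")[4] if arn.startswith("arn:") else arn
            if account not in trusted_accounts:
                findings.append(arn)
    return findings


# Hypothetical bucket policy with one trusted and one external account.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow",
         "Principal": {"AWS": ["arn:aws:iam::999999999999:root"]},
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
}
findings = external_principals(policy, trusted_accounts={"111122223333"})
```

Only the second statement produces a finding, because its principal belongs to an account outside the trusted set.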
Compliance and Audit Services
- AWS Artifact: provides access to compliance reports and agreements (PCI, ISO, HIPAA, etc.).
- AWS Config: monitors configuration changes to resources and assesses compliance against rules.
- AWS CloudTrail: logs API calls made within an AWS account for auditing purposes.
- AWS Security Hub: centralizes security findings from multiple services (Config, Macie, GuardDuty) across multiple accounts within an Organization.
- AWS Abuse: report AWS resources used for abusive or illegal activities, such as spam, DDoS attacks, illegal content hosting, or malware distribution.
- Contact options:
- Abuse form: AWS Abuse Form
- Email: abuse@amazonaws.com
Cloud Migration Services
Cloud Migration Strategies: The 7 Rs

- Retire: shut down systems or applications that are no longer needed.
- Reduces costs, allows focus on other resources, and improves security by eliminating potential attack vectors.
- Retain: choose not to migrate certain systems for now.
- Specific performance, security, or business reasons may justify leaving resources on-premises.
- Avoid migration if there is no significant business value.
- Relocate: move applications from on-premises to the cloud, or within cloud environments.
- Examples: migrate EC2 instances to a different VPC, region, or AWS account.
- Move servers from a VMware Software-defined Data Center (SDDC) to VMware Cloud on AWS.
- Rehost (“Lift and Shift”): migrate applications as-is to AWS without cloud-specific optimization.
- Suitable for physical servers, virtual machines, or other cloud environments.
- Example: migrate using AWS Application Migration Service.
- Replatform (“Lift and Reshape”): maintain core architecture but apply some cloud optimizations.
- Cloud enhancements may include fully managed services or serverless paradigms.
- Examples: move an on-premises SQL database to Amazon RDS or deploy an app to Elastic Beanstalk (PaaS).
- Repurchase (“Drop and Shop”): replace existing software with a different product, often a SaaS solution.
- Can be costly initially but provides rapid deployment.
- Examples: migrate from an on-premises CMS to Drupal or from an on-premises CRM to Salesforce.
- Refactor (Rearchitect): redesign applications and infrastructure to be cloud-native.
- Maximizes cloud benefits such as scalability, performance, agility, and security.
- Requires significant effort, time, and engineering resources.
- Examples: convert monolithic applications to microservices, migrate on-premises apps to serverless architectures, or store media in Amazon S3.
AWS Migration Services
- AWS Application Discovery Service: collects information from on-premises data centers to plan migrations to AWS.
- Gathers server utilization and dependency mapping data.
- Offers both agentless and agent-based discovery methods.
- AWS Migration Evaluator: builds a data-driven business case for cloud migration.
- Analyzes the current state, defines the target state, and develops a migration plan.
- Agentless collector inventories on-premises resources and dependencies to establish a baseline.
- AWS Application Migration Service (MGN): rehost (lift-and-shift) solution for moving applications to AWS.
- Formerly CloudEndure Migration; replaced AWS Server Migration Service (SMS).
- Uses a replication agent on on-premises servers to continuously copy data to a staging environment in AWS, followed by a cutover event to launch the production environment.
- AWS DataSync: automates large-scale data migrations between on-premises systems and AWS.
- Supports online migrations; Snowball devices are used for offline migrations.
- Can perform incremental replication after the initial full-load migration to keep data synchronized.
- AWS Database Migration Service (DMS): migrate databases to or from AWS.
- Primarily used for SQL database migrations; some NoSQL targets, such as DynamoDB and Amazon DocumentDB, are also supported.
- AWS Migration Hub: centralized dashboard for migration operations.
- Consolidates inventory and progress from AWS Application Migration Service and Database Migration Service.
- Helps plan, track, and manage migrations, and supports automation for lift-and-shift.
- Migration Hub Orchestrator: provides predefined migration templates for streamlining workflows.
Machine Learning & Artificial Intelligence (ML/AI) Services
AI & Machine Learning (ML) 101
- A concise introduction to AI and ML is available here: AI 101
- It is recommended to pursue the AWS AIF-C01 certification to gain foundational knowledge of AI and ML, including key terminology and AWS-specific AI/ML applications.
- For the CLF-C02 exam, only a high-level understanding of some AWS-managed AI services is required.
AWS-Managed AI Services
- Amazon Rekognition: detects objects, scenes, and faces in images and videos.
- Examples: facial recognition, object labeling, celebrity recognition.
- Amazon Comprehend: Natural Language Processing (NLP) service.
- Examples: identify names in text, analyze sentiment or tone.
- Amazon Textract: extracts text and structured data from documents.
- Examples: retrieve costs, VAT, or other information from scanned receipts.
- Amazon Transcribe: Automatic Speech Recognition (ASR) converts audio into text.
- Examples: generate subtitles, transcribe phone calls.
- Amazon Polly: converts text into spoken audio (Text-to-Speech, TTS).
- Amazon Translate: translates text between different human languages.
- Amazon Kendra: ML-powered search engine.
- Example: locate semantically related keywords across multiple documents.
- Amazon Lex: builds intent-driven and conversational chatbots.
- Unlike GenAI or large language model chatbots, Lex chatbots focus on understanding user intent to perform specific actions, similar to Amazon Alexa.
- Amazon Connect: cloud-based contact and call center service.
- Amazon Personalize: provides real-time, personalized recommendations.
- Technology behind product recommendations on amazon.com.
- Amazon SageMaker: end-to-end platform for building, training, and deploying custom ML models.
- Core focus of the MLA-C01 and MLS-C01 certifications.
- Amazon Bedrock: marketplace for foundation models (FMs) used in Generative AI.
- Core focus of the AIF-C01 certification.
Payment and Support Services
General Overview of AWS Cloud Costs
AWS Pricing Models
- Pay-as-you-go / Pay-per-use: pay for the exact resources you consume at full price.
- Advantages: allows agility, responsiveness, and the ability to scale on demand.
- This is the default pricing model unless otherwise specified.
- Reserved pricing: reserve capacity in advance in exchange for a discounted rate; the discount only pays off if the reserved resources are actually used.
- Advantages: predictable budgeting, reduced risk, and support for long-term planning.
- Disadvantages: less flexibility; money is wasted if reserved capacity is underutilized (although unused EC2 Reserved Instances can be resold in the Reserved Instance Marketplace).
- Reservations are available for select services, including EC2 Reserved Instances, DynamoDB Reserved Capacity, ElastiCache Reserved Nodes, RDS Reserved Instances, and Redshift Reserved Nodes.
- Volume-based discounts: pay less as usage increases.
- Example: AWS Organization accounts consuming resources collectively can benefit from discounts.
- Economies of scale: as AWS gains more customers, it can provision more resources and offer lower prices.
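A small worked example shows why aggregated usage benefits from volume-based discounts. The tier sizes and prices below are made up for illustration and are not real AWS prices; `tiered_cost` is a hypothetical helper that charges each slice of usage at the price of its tier. Amounts are in cents to keep the arithmetic exact.

```python
# Hypothetical tiers: (units covered by the tier, price per unit in cents).
# None means the tier has no upper limit.
TIERS = [(100, 100), (400, 80), (None, 50)]


def tiered_cost(units: int) -> int:
    """Charge each slice of usage at the price of the tier it falls into."""
    cost, remaining = 0, units
    for size, price in TIERS:
        in_tier = remaining if size is None else min(remaining, size)
        cost += in_tier * price
        remaining -= in_tier
        if remaining == 0:
            break
    return cost


# Two accounts billed separately vs. the same usage aggregated on one bill.
separate = tiered_cost(300) + tiered_cost(300)
combined = tiered_cost(600)
```

Billed separately, each account pays the expensive first tier on its own; combined, the usage reaches the cheaper tiers sooner, so the total is lower.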
AWS Free Offerings
- 12-month Free Tier: https://aws.amazon.com/free/
- Example: run a t2.micro EC2 instance for one year or receive a limited number of S3 GET requests for free.
- Free Tier can be used continuously by migrating infrastructure across different AWS accounts each year.
- Ideal for small projects; Infrastructure as Code (IaC) helps automate recreation of resources.
- Free Trials: some services offer temporary free usage (e.g., one month).
- Always free services:
- Completely free: IAM, VPC subnets.
- Free to use but resources incur costs: CloudFormation stacks, Elastic Beanstalk, Auto Scaling Groups.
Compute, Storage, and Networking Costs
- AWS Cloud Pricing overview: What is AWS?
- Recommended: review the lecture for a concise pricing overview; exams focus on concepts, not detailed numbers.
- Tip: search for “<AWS_SERVICE> pricing” to quickly access official documentation.
AWS Billing and Cost Management Tools
Cost Tracking
- Billing and Cost Management Dashboard: provides a high-level view of costs, billing, and free-tier consumption.
- AWS Resource Groups: group resources for easier searching and management.
- Tags: define logical groupings of resources.
- Cost Allocation Tags: used to track costs and generate detailed reports.
- AWS-generated: aws: prefix (e.g., aws:createdBy).
- User-defined: user: prefix.
- Example: resources created by a CloudFormation stack share the same tags for cost tracking.
- Cost and Usage Reports: comprehensive dataset of usage and costs in CSV format.
- Can integrate with Excel, Athena, Redshift, or QuickSight.
- Includes metadata, reserved instance usage, and cost data.
- Cost Explorer: visualize and analyze AWS costs and usage over time.
- View detailed usage by service or resource type.
- Forecast costs up to 12 months based on historical usage.
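To show how tags support cost tracking, the sketch below groups billing line items by the value of one tag key. The line items, the `user:project` tag, and the helper `cost_by_tag` are all hypothetical; real Cost and Usage Reports contain many more columns and would normally be queried with a tool such as Athena.

```python
from collections import defaultdict

# Hypothetical billing line items (costs in whole dollars for simplicity).
line_items = [
    {"service": "EC2", "cost": 120, "tags": {"user:project": "shop"}},
    {"service": "S3", "cost": 30, "tags": {"user:project": "shop"}},
    {"service": "RDS", "cost": 75, "tags": {"user:project": "blog"}},
    {"service": "Lambda", "cost": 5, "tags": {}},
]


def cost_by_tag(items, tag_key):
    """Sum costs per value of one tag key; untagged items are grouped together."""
    totals = defaultdict(int)
    for item in items:
        totals[item["tags"].get(tag_key, "(untagged)")] += item["cost"]
    return dict(totals)


report = cost_by_tag(line_items, "user:project")
```

The untagged Lambda item ends up in its own bucket, which is a common reason to enforce a standardized tagging scheme.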
Cost Optimization
- AWS Savings Plans: commit to long-term usage of compute resources for discounts.
- Commitments are for 1 or 3 years and billed at a fixed $/hour rate.
- Types:
- EC2 Savings Plan: applies to a specific instance family in a region; size, OS, AZ, and tenancy can vary.
- Compute Savings Plan: applies across EC2, Fargate, and Lambda with flexibility for family, size, region, AZ, OS, and service.
- Machine Learning Savings Plan: applies to SageMaker usage.
- AWS Compute Optimizer: ML-based recommendations for optimal compute configurations to reduce costs.
- AWS Trusted Advisor: account assessment tool providing recommendations on cost optimization, security, and performance. Higher support plans unlock more functionality.
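The fixed $/hour commitment of a Savings Plan can be illustrated with a simplified billing model. The discount rate and dollar amounts are hypothetical (real rates depend on the plan type, term, and instance family), and `hourly_bill` is a name invented for this sketch. The model assumes the commitment is always charged and that usage beyond it is billed at On-Demand rates.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Bill for one hour, given usage priced at On-Demand rates.

    commitment: committed spend per hour at plan rates, charged even if unused.
    discount:   hypothetical fractional discount of plan rates vs. On-Demand.
    """
    # On-Demand-equivalent usage that the commitment covers.
    covered = commitment / (1 - discount)
    # Usage beyond the covered amount is charged at On-Demand rates.
    overflow = max(0.0, on_demand_usage - covered)
    return commitment + overflow


# $10/hour commitment with an assumed 50% discount covers $20 of On-Demand usage.
busy_hour = hourly_bill(on_demand_usage=30.0, commitment=10.0, discount=0.5)
quiet_hour = hourly_bill(on_demand_usage=5.0, commitment=10.0, discount=0.5)
```

In the busy hour the bill is lower than the On-Demand price of the same usage, while in the quiet hour the full commitment is still charged, which is the flexibility trade-off of any commitment-based discount.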
Estimating, Planning, and Monitoring Costs
- AWS Pricing Calculator: estimate infrastructure costs in advance (https://calculator.aws/).
- CloudWatch Billing Alarms: alert when actual costs exceed a threshold; tracks actual, not projected costs.
- AWS Budgets: create usage, cost, reservation, or savings plan budgets with SNS notifications when thresholds are approached or exceeded.
- AWS Cost Anomaly Detection: ML-based detection of unusual costs or usage patterns.
- AWS Service Quotas: monitor service limits to prevent unexpected charges.
- Alerts can notify you before reaching quotas; exceeding quotas throttles service usage.
- Customers can request quota increases through AWS Support.
AWS Support Plans
From lowest to highest cost:
- Basic Support Plan (Free)
- 24/7 access to Customer Service and AWS communities.
- Read documentation, whitepapers, and participate in forums.
- Access 7 core Trusted Advisor checks and AWS Personal Health Dashboard.
- Developer Support Plan
- Includes all Basic plan features.
- Business-hours email access to Cloud Support Associates; unlimited cases.
- Response times: General guidance < 24 business hours; System impaired < 12 business hours.
- Business Support Plan
- For production workloads.
- Includes Developer Support features.
- Full set of Trusted Advisor checks, API access.
- 24/7 phone, email, and chat support; unlimited cases.
- Infrastructure Event Management available for additional fee.
- Response times: General guidance < 24 hours; System impaired < 12 hours; Production impaired < 4 hours; Production down < 1 hour.
- Enterprise On-Ramp Support Plan
- For production or business-critical workloads.
- Includes Business Support features.
- Access to Technical Account Managers (TAMs), Concierge Support Team, and Well-Architected / Operations Reviews.
- Response times: General guidance < 24 hours; System impaired < 12 hours; Production impaired < 4 hours; Production down < 1 hour; Business-critical down < 30 minutes.
- Enterprise Support Plan
- For mission-critical workloads.
- Includes Business Support features.
- Dedicated TAM, Concierge Support Team, Infrastructure Event Management, Well-Architected / Operations Reviews.
- Optional AWS Incident Detection and Response.
- Response times: General guidance < 24 hours; System impaired < 12 hours; Production impaired < 4 hours; Production down < 1 hour; Business-critical system down < 15 minutes.
Account Administration Services
AWS Accounts – Best Practices
- Use AWS CloudFormation (CFN) to deploy stacks across accounts and regions in a fast, consistent, and reliable manner.
- Security:
- Follow IAM guidelines: enable MFA, implement the least-privilege principle, enforce strong password policies, and rotate passwords regularly.
- Send service and access logs to CloudWatch Logs and/or S3.
- Use AWS CloudTrail to record all API calls within your account.
- Configure AWS Config to track all resource configurations and compliance over time.
- If an account is compromised: change the root user password, delete and rotate all keys/passwords, and contact AWS support.
- Billing:
- Use Tags and Cost Allocation Tags for easier management and billing.
- Use AWS Trusted Advisor to get insights and recommendations tailored to your support plan.
- The above practices apply to both single accounts and organizations with multiple accounts. This section also explores tools and services for managing multi-account environments.
Multi-Account Management in AWS
Possible Multi-Account Strategies
- Create separate accounts per department (e.g., dev, sales, HR) or per cost center.
- Create separate accounts per development environment (DEV/TEST/PROD).
- Create separate accounts based on regulatory requirements (using Service Control Policies or SCPs).
- Create separate accounts for enhanced resource isolation (sometimes separate VPCs are not sufficient).
- Create isolated accounts for centralized functions (e.g., identities, monitoring, logging).
- Create separate accounts to establish different per-account service limits.
AWS Organizations
- AWS Organizations enables multi-account management using a management account (formerly called the master account) and member accounts.
- Cost Benefits:
- Consolidated Billing: one bill and a single payment method for all accounts; the master account pays for all.
- Volume Discounts: aggregated usage can reduce costs for compute, storage, and other services.
- Pooling Reserved EC2 Instances: cost-effective sharing of reserved capacity across accounts.
- Automate account creation through API.
- Group accounts into Organizational Units (OUs).
- Restrict member account privileges using Service Control Policies (SCPs):
- SCPs are JSON documents similar to IAM policies that allow or deny access.
- SCPs applied to an OU affect all member accounts (excluding the master account).
- The master account cannot be restricted by SCPs.
- Root users in member accounts are restricted according to SCPs applied to their OU.
- Access to resources requires that both account policies and SCPs permit it; if either denies access, it will be blocked.
- Many single-account best practices also apply to AWS Organizations:
- Evaluate multi-account setup versus a single account with multiple VPCs.
- Use standardized tagging for billing.
- Enable CloudTrail on all accounts, optionally sending logs to a central S3 account.
- Send CloudWatch Logs to a central logging account.
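The SCP rule described above (access requires that both the SCPs and the account's own policies permit an action, and a deny on either side blocks it) can be sketched as a small check. This is a simplification under stated assumptions: `is_allowed` is a hypothetical helper, actions are matched as exact strings, and real policy evaluation also involves wildcards, conditions, resource policies, and permission boundaries.

```python
def is_allowed(action: str, scp_allow: set, iam_allow: set,
               scp_deny: frozenset = frozenset(),
               iam_deny: frozenset = frozenset()) -> bool:
    """Simplified check: an explicit deny anywhere wins; otherwise both must allow."""
    if action in scp_deny or action in iam_deny:
        return False
    return action in scp_allow and action in iam_allow


# Hypothetical permissions for a user in a member account.
scp_allow = {"s3:GetObject", "s3:PutObject"}       # what the OU's SCP permits
iam_allow = {"s3:GetObject", "ec2:RunInstances"}   # what the IAM policy grants

can_read = is_allowed("s3:GetObject", scp_allow, iam_allow)
can_launch = is_allowed("ec2:RunInstances", scp_allow, iam_allow)
can_write = is_allowed("s3:PutObject", scp_allow, iam_allow)
```

Only `s3:GetObject` passes, because it is the only action permitted by both layers. An SCP never grants permissions by itself; it only limits what IAM policies in the account can grant.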
Other Account Management Services
- AWS RAM (Resource Access Manager): manage sharing of resources across accounts.
- Works with accounts in the same organization or external accounts.
- Helps avoid resource duplication (e.g., databases, VPC subnets, TGWs, EC2 dedicated hosts).
- AWS Service Catalog: self-service portal for launching authorized resources.
- Administrators approve resource stacks in CFN, which are then available in the portal.
- Users must have IAM permissions to launch pre-approved stacks, preventing unauthorized resource creation.
- AWS Control Tower: simplifies setup, orchestration, and governance of secure multi-account environments.
- Enforces best practices across all accounts created via Control Tower.
- Integrates and orchestrates multiple AWS services (e.g., AWS Organizations, Service Catalog, IAM).
DR (Disaster Recovery) Services
Disaster Recovery (DR) and Business Continuity (BC)
Active/Passive DR Strategy

- In this model, the primary site handles all production traffic while the secondary site remains on standby.
- Failover occurs only when the primary site fails.
Active/Active DR Strategy

- Both sites are active and share the workload.
- Provides faster failover since all systems are already running and can handle traffic distribution.
Four DR Strategies in AWS Cloud
- Trade-off: cost versus speed of recovery in case of a disaster.
- Ordered from lowest cost / slowest recovery to highest cost / fastest recovery:
- Backup and Restore: maintain snapshots or backups; recovery requires full restoration.
- Pilot Light: essential parts of the application are pre-configured; other components can be quickly started when needed.
- Warm Standby: the full application is deployed at minimal capacity; can be scaled up quickly during recovery.
- Multi-Site / Hot-Site: full application deployed and running at full capacity on multiple sites; can be used to load-balance traffic immediately.

Disaster Recovery Services in AWS
- AWS Backup: centralized management and automation of backups across AWS services.
- Supports on-demand or scheduled backups.
- Allows point-in-time recovery (PITR).
- Can perform cross-region and cross-account backups via AWS Organizations.
- AWS Elastic Disaster Recovery (DRS): quickly restore physical, virtual, or cloud servers in AWS.
- Formerly known as CloudEndure Disaster Recovery.
- Provides continuous block-level replication of servers.
Other Services
Media, Mobile, and Web Application Support Services
- Amazon WorkSpaces: Desktop-as-a-Service (DaaS) – provision Windows or Linux desktops in the cloud.
- Launches an instance hosting an OS desktop, which users can connect to as if it were a local laptop. Deploying instances close to users improves performance and reduces latency.
- Eliminates the need to manage on-premises Virtual Desktop Infrastructure (VDI), commonly used for remote access to internal resources.
- Amazon AppStream 2.0: application streaming service – run desktop applications directly in a web browser.
- Does not require a VDI setup.
- Example: stream applications like Blender or Paint via a web browser.
- AWS IoT Core: connect Internet of Things (IoT) devices to the AWS cloud and handle massive volumes of messages.
- IoT refers to devices like cars, smart home appliances, TVs, and more communicating through internet protocols.
- Amazon Elastic Transcoder: convert media files stored in S3 to different formats, such as video conversion.
- AWS AppSync: serverless GraphQL service for real-time data storage and synchronization across mobile and web applications.
- Unlike REST APIs, GraphQL APIs allow clients to request exactly the data they need.
- AWS Amplify: build and deploy full-stack web and mobile applications with scalable backend infrastructure.
- Automatically provisions resources for authentication, storage, APIs (REST or GraphQL), CI/CD, Pub/Sub, analytics, AI/ML, and monitoring.
- Offers functionality similar to platforms like Vercel.
- AWS Device Farm: test mobile and web applications on real devices, including desktop browsers, mobile phones, and tablets.
- Uses AWS-managed real devices rather than emulators.
Additional AWS Services
- AWS Infrastructure Composer: visually design and deploy infrastructure on AWS.
- Create infrastructure through a GUI and automatically generate Infrastructure as Code (IaC) using CloudFormation templates.
- Can also visualize existing CFN/SAM templates to understand deployments.
- AWS Fault Injection Simulator (FIS): perform fully-managed fault injection experiments to test AWS workloads.
- Supports chaos engineering by introducing disruptive events, such as increased CPU usage or terminating instances, to evaluate system resilience and identify performance bottlenecks.
- AWS Step Functions: serverless state machine service for orchestrating complex workflows.
- Allows coordination of multiple AWS services and Lambda functions depending on workflow state.
- Supports human intervention at workflow steps; use cases include order processing and data pipelines.
- AWS Ground Station: manage satellite communications and operations.
- Provides a global network of satellite ground stations near AWS regions for fast satellite data transfer.
- Use cases include weather monitoring and TV broadcasting.
- Amazon Simple Email Service (SES): automate sending emails at scale, similar to how SNS handles notifications.
- Amazon Pinpoint: scalable two-way marketing communications platform supporting email, SMS, push notifications, voice, and in-app messages.
- Handles billions of messages daily and allows targeting audience segments without manual message management.
- Provides analytics on delivery, open rates, and user responses for marketing campaigns.
- AWS OpsWorks: previously a configuration management service using Chef and Puppet for automating server setup across EC2 or on-premises servers.
- Discontinued.
- AWS Customer Carbon Footprint Tool: track and project carbon emissions from AWS usage to support sustainability initiatives.