Elastic Compute Cloud (EC2) Basics
OS Virtualization 101
OS Virtualization – Core Idea
- Definition: the ability to run multiple operating systems on a single physical machine without causing instability or crashes
- Evolution:
(i) no virtualization →
(ii) software-based virtualization (emulation & para-virtualization) →
(iii) hardware-assisted virtualization (CPU support & SR-IOV)
- Amazon EC2 is an example of IaaS (Infrastructure-as-a-Service), which can be thought of as virtual machines delivered as a service (VMaaS)
OS Virtualization – Evolution Over Time
Traditional Setup (No Virtualization)

- Kernel
- The central component of the OS
- Operates in privileged mode, allowing direct control over hardware
- Applications
- Run in user mode (non-privileged)
- Cannot access hardware directly
- Must communicate through the OS using system calls
- If an application bypasses the OS and tries to access hardware directly → this can lead to application failure or even a full system crash
Early Virtualization Attempt

- Initial approach: run multiple operating systems on the same hardware
- Issue: frequent crashes
- CPUs at the time allowed only one entity in privileged mode
- Each OS expects full privileged access, causing conflicts
Emulation-Based Virtualization

- Hypervisor
- Software layer running on the host OS
- Has full hardware access (privileged mode)
- Virtual Machine (VM)
- A packaged environment containing a guest OS + applications
- Uses virtualized hardware resources (CPU, memory, storage)
- Provides simulated devices (network, disk, graphics)
- The guest OS behaves as if it’s running on real hardware
- Installs drivers and makes privileged calls normally
- Hypervisor uses binary translation
- Intercepts and converts system calls dynamically
- Downside: very slow performance
- Often reduces speed significantly
- Not suitable for high-performance workloads
Para-Virtualization

- Works only with modified operating systems
- The guest OS is altered to avoid privileged instructions
- Uses para-virtualized drivers
- Instead of calling hardware directly → uses hypercalls to the hypervisor
- Benefit: much better performance
- Eliminates the need for binary translation
- Limitation:
- OS must be customized for a specific hypervisor/vendor
Hardware-Assisted Virtualization

- Major advancement: hardware (CPU) supports virtualization natively
- CPU includes special instructions for virtualization
- A hypervisor can directly manage these capabilities
- Guest OS system calls are:
- Captured by the CPU
- Redirected to the hypervisor for execution
- Advantage: near-native performance
- Remaining challenge: IO performance
- Network and disk operations still rely on shared physical devices
- High IO workloads increase CPU overhead due to mapping
Single Root I/O Virtualization (SR-IOV)

- Hardware devices become virtualization-aware
- A single physical network card can appear as multiple virtual network interfaces
- Each VM:
- Sees a dedicated network device
- Can interact with hardware more directly
- Hypervisor involvement is minimized or removed for IO
- Benefits:
- Significantly improved network performance
- Lower CPU usage on the host
- Reduced latency and more consistent performance under load
- AWS implementation:
- Uses the Nitro system for modern virtualization
- Enhanced Networking in EC2 leverages SR-IOV for high-performance networking
OS Virtualization – Summary Table
| OS Virtualization | Hardware Access Method | Limitations |
|---|---|---|
| No virtualization | Applications access hardware through the OS kernel using system calls | Does not support running multiple operating systems on the same machine |
| Emulation-based | Guest OS issues system calls that are translated by the hypervisor into instructions for physical hardware | Binary translation creates significant overhead, greatly reducing performance |
| Para-virtualization | Modified guest OS sends hypercalls directly to the hypervisor instead of making privileged hardware calls | Works only with certain OSes and requires customization for specific hypervisors |
| Hardware-assisted | CPU intercepts guest OS system calls and routes them to the hypervisor | IO operations still rely on software translation between virtual and physical devices |
| SR-IOV (Single Root I/O Virtualization) | Hardware devices expose multiple virtual interfaces, allowing guest OSes near-direct access to physical resources | Requires compatible SR-IOV-enabled hardware, but provides excellent scalability and performance |
Amazon EC2 (Elastic Compute Cloud) 101
Amazon EC2 – Core Concepts
- AWS’s primary compute offering
- Falls under IaaS (Infrastructure-as-a-Service) → the operating system is the main unit you manage
- Users launch instances, which run on underlying physical EC2 hosts
- Instances are also known as virtual machines (VMs) or virtual servers (VSs)
- Responsibility split:
- The customer manages the instance
- AWS manages the physical host
- (Exception: Dedicated Hosts, which customers control)
- Instances are placed inside VPC subnets
- Instances are private by default
- Designed to be Availability Zone (AZ) resilient
EC2 Instances
- An instance is essentially a virtual machine
- Users select and install the operating system
- Can also configure runtime environments, databases, and applications within it
- Instance size and features are chosen at launch
- Some attributes can be modified after deployment
- Default pricing model: On-Demand
- Charged based on usage time (per second in most cases)
- Networking
- Launched inside VPC subnets
- Not publicly accessible unless explicitly configured
- Storage options
- Instance Store → local storage attached to the host
- Amazon EBS (Elastic Block Store) → persistent external storage
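A minimal AWS CLI sketch of launching an instance into a VPC subnet (the AMI ID, subnet ID, and key-pair name are placeholders):

```bash
# Launch a single On-Demand instance into an existing subnet
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type t3.micro \
  --key-name A4L \
  --subnet-id subnet-0abcdef1234567890 \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=demo-instance}]'
```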
EC2 Instance States

- The state reflects the current condition of an instance
- Three primary states:
- Running (Active)
- The instance is operational
- Charges apply for compute, memory, networking, and storage
- Stopped (Inactive)
- The instance is powered off
- Charges apply only for storage
- Can be restarted later
- Terminated (Deleted)
- The instance is permanently removed
- Cannot be recovered or restarted
- No further charges apply
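The three states map directly onto AWS CLI calls; a quick sketch with a placeholder instance ID:

```bash
aws ec2 stop-instances      --instance-ids i-0123456789abcdef0   # stopped: storage still billed
aws ec2 start-instances     --instance-ids i-0123456789abcdef0   # running again (may move hosts)
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0   # terminated: permanent, no recovery
```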
Connecting to Amazon EC2 Instances via SSH
- SSH (Secure Shell) is a secure protocol used to remotely access instances
- Operates over port 22
- Uses key pair authentication (public key + private key)
- The private key is downloaded once and kept on your local machine (e.g., A4L.pem)
- The public key is stored on the EC2 instance by AWS
- After establishing an SSH connection:
- You gain access to a command-line shell
- This allows you to run commands and perform system administration tasks
- The private key file must have restricted permissions
- Example command (Mac/Linux): chmod 400 A4L.pem
- This ensures only the file owner can read it
- If other users on the same machine can read the key, the SSH connection is refused
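A typical connection sequence on Mac/Linux, assuming an Amazon Linux instance (so the default user is ec2-user):

```bash
chmod 400 A4L.pem                              # restrict the private key to the file owner
ssh -i A4L.pem ec2-user@<instance-public-ip>   # connect over port 22
```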
Summary diagram:

Connecting to Older Windows Instances via RDP
- Older Windows versions (before Windows 10) do not support SSH natively
- Third-party tools (like PuTTY) were required for SSH
- For these instances, a different protocol is used:
- RDP (Remote Desktop Protocol)
- Uses port 3389
- Authentication process:
- SSH key pair is used to retrieve or decrypt the Administrator password
- Once obtained, you log in via RDP to access the Windows desktop environment
Amazon Machine Image (AMI)
- AMI = a template used to create EC2 instances
- Comparable to a pre-configured server image or a bootable OS installer
- Includes:
- A disk image
- The operating system and kernel
- Additional configuration settings for the virtual machine
- An AMI is used to launch new EC2 instances
- AMIs can be created from an existing instance
- Acts like a snapshot of the instance’s current setup
- Captures:
- Operating system
- Installed software
- Current configuration at the time of creation
AMI Components:
- Permissions
- Control which AWS accounts can use the AMI
- Types:
- Public AMI → accessible by anyone
- Private AMI → restricted access
- The owner has automatic access
- Can grant access to specific AWS accounts
- By default, AMIs are private and only accessible to the owner
- Root (Boot) Volume
- The primary storage used to start the OS
- Examples:
C:\ drive (Windows), / root volume (Linux)
- Block Device Mapping
- Defines how storage volumes are attached and presented to the instance
- Specifies volume configuration (e.g., size, type, mount points)
Summary diagram:

EC2 Architecture & Resilience
EC2 Architecture

EC2 Instances
- An instance is a virtual machine (VM) that includes an OS and allocated virtual resources
- Defined by its instance type, which determines:
- Family (general purpose, compute-optimized, etc.)
- Generation (hardware version)
- Size (CPU, memory capacity)
- Additional features (e.g., enhanced networking, GPU support)
EC2 Hosts
- Physical servers where instances are executed
- Provide CPU, memory, storage, and networking resources
- Fully managed by AWS
- Host types:
- Shared Host
- Used by multiple AWS customers
- No control or ownership of the underlying hardware
- Billing is based on instance usage and allocated resources
- Instances remain isolated despite sharing the same host
- Default option
- Dedicated Host
- An entire physical server is allocated to a single AWS account
- No sharing with other customers
- Billing is for the entire host, not individual instances
- Hosts vary by:
- Hardware generation, CPU type, memory, and storage configuration
- Placement behavior:
- Typically runs instances of similar types (different sizes may coexist)
- Different instance types may be placed on separate hosts
- Instance-to-host relationship:
- An instance remains on the same host unless:
- The host fails
- AWS performs maintenance
- The instance is stopped and then started again
- A simple reboot does not move the instance to a new host
EC2 Resiliency
- EC2 provides Availability Zone (AZ)-level resiliency
- All resources (compute, storage, networking) exist within a single AZ
- If an AZ fails:
- The hosts in that AZ fail
- All instances on those hosts are affected
- This design is beneficial for high availability (HA):
- Failures are isolated within one AZ (limited blast radius)
- Workloads can be distributed across multiple AZs
- Limitations:
- Instances cannot automatically move across AZs
- Moving to another AZ requires creating a new instance in that AZ
- Resources like EBS volumes and networking are AZ-specific and cannot be directly shared across AZs
EC2 Storage
- Instance Store
- Local storage is physically attached to the host
- Very high performance
- Ephemeral → data is lost if the instance is moved or stopped
- Amazon EBS (Elastic Block Store)
- Network-attached storage service for EC2
- Provides persistent volumes
- Volumes are tied to a single AZ
- Cannot be directly attached to instances in a different AZ
EC2 Storage – Summary Table
| EC2 Storage | Location | Durability | Key Benefit |
|---|---|---|---|
| Instance Store | Physically attached to the EC2 host | Temporary (data lost if instance stops or moves) | Very high performance and low latency |
| Amazon EBS Volume | Network-based storage attached to the instance | Persistent (data retained independently of instance lifecycle) | Reliable storage with strong durability and availability |
EC2 Networking
- Storage networking
- Handles communication between the instance and storage services
- Example: accessing EBS volumes
- Data networking
- Manages standard network traffic (inbound/outbound data)
- Uses a mapping between a virtual network interface and the physical hardware
- Elastic Network Interface (ENI)
- Represents a virtual network card attached to an instance
- Maps to an underlying physical NIC on the host
- When an instance is launched inside a VPC subnet:
- It is automatically assigned a primary ENI
- Instances can have:
- Multiple ENIs attached
- Secondary ENIs can be placed in different subnets, as long as those subnets are within the same Availability Zone (AZ)
What is Amazon EC2 best used for?
- Traditional OS + application workloads
- Ideal when you need full control over the operating system and installed software
- Long-running, persistent compute
- Designed for workloads that run continuously
- Unlike many AWS services that are short-lived or event-driven
- Server-based applications
- Systems that remain active and respond to incoming requests
- Can handle both bursty traffic and consistent workloads
- Monolithic architectures
- Applications where components (app, database, runtime) are tightly integrated within a single OS environment
- Workload migration to the cloud
- Suitable for moving existing on-premises systems to AWS
- Common with environments using physical servers or tools like VMware
- Lift-and-shift migrations are straightforward, requiring minimal changes to the application
- Disaster Recovery (DR) setups
- Acts as a backup environment for traditional infrastructure
- Cloud-based instances can take over if on-prem systems fail
- Default compute choice in AWS
- A strong general-purpose option for most workloads
- However, alternatives like Amazon ECS or AWS Lambda may be better for:
- Containerized applications
- Event-driven or serverless architectures
- Choosing the right service depends on the specific use case and architecture
EC2 Instance Types
Selecting an EC2 Instance Type
- Choosing an instance type involves selecting its family, generation, size, and additional capabilities
- Provides fine-grained control over the instance’s resources
- Correct choice → optimized performance for your workload
- Incorrect choice → poor performance and bad user experience
- No single type fits all workloads; multiple types may meet requirements, but careful selection is essential
Factors Influenced by Instance Type
- Raw resources
- CPU, memory, local storage type, and storage capacity
- Resource ratios
- Example: a compute-optimized instance provides more CPU but less memory for the same cost
- Storage and network bandwidth
- Example: EBS throughput depends on the instance’s capabilities; insufficient instance bandwidth may become the bottleneck
- System architecture
- ARM vs x86 architectures
- CPU vendor
- Example: Intel vs AMD processors
- Extra features/capabilities
- Specialized hardware such as GPUs, FPGAs, or enhanced networking
Key takeaway: EC2 instances are highly customizable, allowing you to tailor compute resources to match your workload requirements precisely.
EC2 Instance Categories
Purpose: Group instance types based on workload characteristics and optimization goals
1. General Purpose
- Balanced CPU, memory, and networking resources
- Suitable for a wide range of workloads
- Default starting point: start here, switch to specialized types if needed
- Common types:
- A1, M6g → ARM-based, efficient for small workloads, low cost
- T3, T3a → “Burstable” instances, low baseline CPU with occasional spikes
- M5, M5a, M5n → Steady-state workloads
2. Compute Optimized
- Focused on high CPU performance
- Resource ratio: more CPU, less memory
- Ideal for:
- Media processing
- HPC & scientific modeling
- Gaming
- Machine Learning inference
3. Memory Optimized
- Optimized for large in-memory datasets
- Resource ratio: more memory, less CPU
- Use cases:
- In-memory caches (e.g., Redis, ElastiCache)
- Certain database workloads
4. Accelerated Computing
- Equipped with specialized hardware (e.g., GPUs, FPGAs)
- Designed for parallel processing or custom computation
- Scenarios:
- High-performance modeling or simulations
- Hardware-accelerated ML training
- Custom programmable workloads
5. Storage Optimized
- Provides high-speed, large-capacity local storage
- Ideal for workloads with high IOPS or sequential throughput
- Use cases:
- Scale-out transactional databases
- Data warehousing (e.g., Amazon Redshift)
- Elasticsearch & analytics workloads
Mnemonic to remember categories:
“Great Pirates Conquer Oceans, Master Onboard Acrobatics, Collect Stolen Opulence”
Decoding EC2 Instance Type

- Each instance type precisely and uniquely identifies the compute configuration you need
Example: R5dn.8xlarge
1. Family (R)
- Indicates the purpose of the instance
- Common families:
- C → Compute Optimized
- R → Memory Optimized (RAM-heavy)
- I → IO Optimized
- D → Dense storage
- G → GPU
- P → Parallel processing / GPU-intensive
2. Generation (5)
- Denotes the hardware generation of the instance
- Each generation specifies a combination of CPU, memory, and storage technology
- AWS releases new generations frequently
- Best practice: select the latest generation for optimal price-to-performance
- Exceptions: if unavailable in your region or business requirements dictate otherwise
3. Size (8xlarge)
- Determines the amount of CPU and memory allocated
- Typical size hierarchy:
nano < micro < small < medium < large < xlarge < 2xlarge < 4xlarge < 8xlarge
- Larger sizes carry a price premium
- Horizontal scaling (multiple smaller instances) is often more cost-effective than vertical scaling (one large instance)
4. Extra Capabilities (dn)
- An optional suffix indicating special features or hardware
- Examples:
- a → AMD CPU
- d → NVMe local storage
- n → Network-optimized
- e → Extra memory or storage capacity
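The capabilities behind any type code can be checked with the AWS CLI; a small sketch using the example type above (the query fields shown are assumptions about the output shape):

```bash
# Inspect vCPU and memory for r5dn.8xlarge
aws ec2 describe-instance-types \
  --instance-types r5dn.8xlarge \
  --query 'InstanceTypes[0].{vCPUs:VCpuInfo.DefaultVCpus,MemoryMiB:MemoryInfo.SizeInMiB}'
```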
EC2 Instance Connect vs EC2 SSH
Well-Known Ports for Layer 7 Protocols
- SSH: 22
- HTTP: 80
- HTTPS: 443
Connecting to EC2 via SSH
- SSH (Secure Shell) requires generating and storing key pairs
- Private key stored locally (e.g., A4L.pem)
- Public key stored on the EC2 instance
- File permissions are critical
- chmod 400 A4L.pem → only the owner can read the file
- If permissions are too open, the SSH connection is rejected
- Server authenticity verification
- Prompt: Are you sure you want to continue connecting (yes/no/[fingerprint])?
- Optionally, provide a fingerprint shared by the server admin to confirm you are connecting to the correct server
- Protects against DNS spoofing or man-in-the-middle attacks
- SSH connections use the IP address of the local machine initiating the connection
EC2 Instance Connect
- AWS-managed connection service
- Uses IAM permissions instead of local key pairs
- Reduces administrative overhead and scales better than traditional SSH
- Requirements:
- EC2 instance must have the EC2 Instance Connect package installed
- Installed by default on Amazon Linux 2+ and Ubuntu 16.04+
- Not all instances support this package
- AWS determines the correct user account to use:
- Usually works with default AMIs
- May fail or require manual user specification with custom AMIs
- Connection flow:
- Your local machine connects to AWS, not directly to the instance
- AWS then connects to the EC2 instance
- Therefore, the IP address used for the connection is AWS’s, not your local machine
- Security groups must allow AWS IP ranges for this connection
- AWS IP ranges: https://ip-ranges.amazonaws.com/ip-ranges.json
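A sketch of the Instance Connect flow driven from the CLI (instance ID and key file are placeholders; the pushed key is only valid for a short window, roughly 60 seconds):

```bash
# Push a temporary public key via the IAM-authenticated API, then SSH normally
aws ec2-instance-connect send-ssh-public-key \
  --instance-id i-0123456789abcdef0 \
  --instance-os-user ec2-user \
  --ssh-public-key file://my_key.pub
ssh -i my_key ec2-user@<instance-public-ip>
```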
Storage Refresher
Storage Types in EC2
1. By Location
- Local Storage (Directly-Attached)
- Extremely fast because it’s physically attached to the host hardware
- Example: EC2 Instance Store volumes reside inside the EC2 host
- Limitations:
- Lost if the host disk fails
- Lost if the host hardware fails
- Lost if the EC2 instance is moved to another host
- Network Storage (Network-Attached)
- NAS (Network-Attached Storage) ≈ SAN (Storage Area Network)
- NAS → single device; SAN → cluster of storage devices
- Storage accessed over a network
- On-premises: fiber channel, iSCSI
- AWS: Amazon EBS (Elastic Block Store)
- Advantages:
- Decoupled from host hardware → highly resilient
- Survives EC2 host failures
- Can be easily attached to other instances
2. By Persistence
- Ephemeral Storage (Temporary)
- Lives only in the short term
- Examples: RAM, in-memory caches
- EC2 Instance Store is ephemeral → disappears when the instance stops, terminates, or moves
- Persistent Storage (Permanent)
- Survives beyond instance lifetime
- Examples: HDDs, SSDs
- EC2 uses EBS volumes → persist across instance stops and terminations
AWS Storage Categories

- Storage categories define how storage is presented to the system and what it is best used for
1. Block Storage
- Storage is divided into addressable blocks → think of cubes numbered sequentially
- Presented to the OS as:
- Blank physical drive (HDD, SSD…)
- Logical volume mapped to physical storage (e.g., EBS volume)
- Characteristics:
- No built-in structure
- Mountable → OS can apply its own file system (FS), e.g., NTFS, ext3
- Bootable → most EC2 instances use EBS volumes to boot
- Network-attached block storage is also possible (e.g., SAN)
- Advantage: high-performance, OS-level storage
2. File Storage
- Collection of files organized in a file system (FS)
- Presented as structured file shares
- Characteristics:
- Mountable
- Not bootable → OS cannot directly control storage hardware
- Access files via traversing the file structure
- Examples:
- On-premises: Windows File Server
- AWS: Amazon EFS, Amazon FSx
- Advantage: allows multiple servers/clients to share the same file system
3. Object Storage
- Collection of objects
- Each object = key + value (the data blob) + metadata
- The value contains the actual data (binary, text, image…)
- Characteristics:
- Flat structure → no hierarchy, no inherent file system
- Not mountable
- Not bootable
- AWS example: Amazon S3 (buckets contain S3 objects)
- Advantage: extremely scalable, accessible simultaneously by many users
Storage Categories – Summary Table
| Storage Category | Collection of | Structure | Mountable | Bootable |
|---|---|---|---|---|
| Block Storage | Addressable blocks | No built-in structure, configurable by OS | Yes | Yes |
| File Storage | Files | File system (FS) | Yes | No |
| Object Storage | Objects | Flat (cannot configure hierarchy) | No | No |
Storage Performance

Storage performance is determined by three interrelated metrics—they cannot be considered in isolation:
- Block Size (IO Size)
- Size of data chunks written or read from storage
- Measured in bytes (kB, MB…)
- IOPS (Input/Output Operations Per Second)
- The number of read/write operations per second that the storage can handle
- Throughput
- Amount of data processed per second
- Measured in bytes per second (e.g., MB/s)
Relationship: Throughput = Block Size × IOPS
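Worked example: at a 16 KB block size and 16,000 IOPS, throughput = 16 KB × 16,000 ≈ 256,000 KB/s ≈ 250 MB/s (the GP2 ceiling discussed later). At a fixed throughput, doubling the block size halves the achievable IOPS.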
Analogy: Racing car
- Wheel size = Block size
- Revolutions per second (rps) = IOPS
- End speed = Throughput
Key Considerations
- Throughput optimization is not trivial
- Different storage media have different strengths: some favor high IOPS, others favor high throughput
- Increasing the block size can reduce IOPS
- Network limitations can also affect IOPS
- Performance is limited by the slowest link in the storage chain
- Application, OS, storage subsystem, transport mechanisms, network, storage interface, or the physical drive itself
Best practice in AWS:
- Hands-on experience is crucial—experiment with different storage types, block sizes, and IOPS/throughput settings to find the optimal configuration
Amazon EBS (Elastic Block Store) – Architecture
Amazon EBS – Key Concepts & Architecture

- EBS = Elastic Block Store
- Provides raw block storage as volumes
- Volumes can be attached to EC2 instances or other AWS services
- Instances see an EBS volume as a block device, and you can mount a filesystem on top (e.g., ext3/4, XFS, NTFS)
- Encryption:
- Volumes can be encrypted via KMS or left unencrypted
- AZ-specific storage:
- Volumes are provisioned in a single Availability Zone (AZ)
- Built-in resilience within the AZ
- Cross-AZ attachments not allowed
- Attachment & persistence:
- Usually attached to one instance
- Some volumes support multi-attach, but the application must handle concurrent writes safely
- Volumes can be detached and reattached to the same or different instances
- Independent of instance lifecycle → Volumes are persistent
- Snapshots & backups:
- Snapshots stored in Amazon S3
- Can restore a volume from a snapshot, enabling cross-AZ migration
- Snapshots are regionally resilient
- For global resilience, copy snapshots to other regions
- Volume types & features:
- Volumes come in different storage types, sizes, and performance profiles
- Elastic Volumes:
- Modify volume on-the-fly (size, type, performance)
- No need to detach or stop the instance
- Can increase volume size, but cannot decrease
- Billing:
- Charged per GB/month
- Some volume types also charge based on performance metrics
- Example: Two 1GB volumes billed separately → sum of individual sizes
EBS Volume Types
EBS – General Purpose SSD (GP2 & GP3)
- Balanced price & performance → versatile storage for most workloads
- Size range: 1 GB – 16 TB
- Performance standardization:
- 1 IOPS = 1 IO (a block of up to 16 KB) per second; on GP2, each IO consumes 1 credit (the 16 KB IO size applies to all SSD volumes)
Types:
| Type | Notes | Performance | Use cases |
|---|---|---|---|
| GP2 | Default SSD | Burstable IOPS based on volume size | Boot volumes, low-latency apps, Dev/Test |
| GP3 | Newer SSD, generally cheaper | IOPS & throughput configurable independently from size | Boot volumes, low-latency apps, Dev/Test, virtual desktops, medium single-instance DBs (MSSQL, Oracle) |
Exam tip: Remember the size range, IOPS credit system, and GP2 vs GP3 differences. These details often appear in multiple-choice questions.
EBS GP2 Volumes – Key Points

- Default general-purpose SSD for EBS
- Size range: 1 GB – 16 TB
- Performance
- Baseline IOPS (minimum guaranteed)
- 1 GB ≤ Size < 33.33 GB: 100 IOPS
- 33.33 GB ≤ Size ≤ 5.33 TB: 3 IOPS per GB
- Example: 100 GB → 3 × 100 = 300 IOPS
- 5.33 TB ≤ Size ≤ 16 TB: 16,000 IOPS (max baseline)
- Max throughput: 16 KB × 16,000 IOPS ≈ 250 MB/s
- Burst IOPS (max achievable)
- 3000 IOPS → only for volumes < 1 TB
- Volumes ≥ 1 TB → baseline > 3000 IOPS → burst irrelevant
- IO Credit Model (for volumes < 1 TB)
- Each volume has a credit bucket (5.4 million credits at start)
- Credit mechanics:
- Actual < Baseline → bucket refills
- Actual = Baseline → bucket stays the same
- Baseline < Actual ≤ Burst → bucket depletes
- Bucket empty → potential performance penalties
- The bucket allows ≈30 minutes at 3000 IOPS without a refill
- Bucket refills continuously at baseline rate → can burst longer
- Volumes ≥ 1 TB
- Baseline > burst → no credit system
- Always perform at baseline IOPS
- Exam tips:
- Remember baseline vs burst IOPS for GP2
- The credit model only applies to <1 TB
- Max throughput ≈ 250 MB/s
- Useful for boot volumes and general-purpose workloads
GP3 – General Purpose SSD (newer version of GP2)

- Newer EBS SSD type, likely to become the default over GP2.
- Generally, more performant and cheaper than GP2.
- Simplified architecture compared to GP2’s credit bucket model:
- Every volume starts with standard performance regardless of size:
- Baseline IOPS: 3,000
- Baseline throughput: 125 MB/s
- If more performance is needed, you can pay to increase IOPS or throughput.
- Maximum performance: 16,000 IOPS or 1,000 MB/s
- GP3 max throughput (1,000 MB/s) is 4x GP2 max throughput (250 MB/s)
- Every volume starts with standard performance regardless of size:
- Base price ~20% cheaper than GP2; even with performance upgrades, usually cheaper than GP2.
- Conceptually, GP3 is like a combination of GP2 and IO1: baseline affordability + scalable high performance.
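A sketch of moving an existing volume to GP3 and paying for extra performance beyond the baseline (volume ID and values are illustrative; Elastic Volumes lets this happen without detaching):

```bash
aws ec2 modify-volume \
  --volume-id vol-0123456789abcdef0 \
  --volume-type gp3 \
  --iops 6000 \
  --throughput 250
```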
EBS – Provisioned IOPS SSD (IO1 / IO2 / IO2 Block Express)

- Main feature: IOPS can be provisioned independently of volume size
- Allows extreme performance on small volumes
- Great for workloads that require consistent, low-latency I/O
- Use Cases
- I/O-intensive relational and NoSQL databases
- Latency-sensitive workloads
- Applications needing high, predictable IOPS
- Key Notes
- 1 IO = 1 IOPS = 16 KB (applies to all SSD volumes)
- Billed for:
- Volume size (GB/month)
- Volume type (IO1 / IO2 / IO2 Block Express)
- Provisioned IOPS
- Volume Types
- IO1
- IO2 – successor to IO1
- IO2 Block Express
- Exam tip:
- IOPS independent of size = main distinction from GP2/GP3
- Best for workloads needing guaranteed, low-latency I/O, like production databases
EBS Provisioned IOPS SSD – Performance Comparison
| Volume Type | Size Range | Max IOPS | Max Throughput | Max IOPS per GB | Notes |
|---|---|---|---|---|---|
| IO1 | 4 GB – 16 TB | 64,000 | 1,000 MB/s | 50 | Original provisioned IOPS SSD |
| IO2 | 4 GB – 16 TB | 64,000 | 1,000 MB/s | 500 | Higher durability & better price/performance vs IO1 |
| IO2 Block Express | 4 GB – 64 TB | 256,000 | 4,000 MB/s | 1,000 | Enterprise-grade extreme performance |
- Key Takeaways
- Max IOPS for IO1/IO2 = 4× GP2/GP3 (16,000 IOPS)
- Max throughput for IO1/IO2 = 4× GP2; same as GP3 (1,000 MB/s)
- Max IOPS per GB shows the performance density advantage
- IO1: 50 IOPS/GB vs GP2: 3 IOPS/GB → huge difference
- IO2 Block Express → scales IOPS, throughput, and volume size dramatically for enterprise workloads
- Instance Store volumes can outperform all EBS volumes in raw speed, but they’re ephemeral
- Mnemonic for exam and practical use:
- GP2/3 → general workloads
- IO1/2 → high-performance persistent I/O
- IO2 Block Express → extreme enterprise I/O
EBS – Maximum Performance per EC2 Instance
- Definition: Maximum IOPS and throughput achievable between EBS volumes and a single EC2 instance
- Factors affecting limits:
- EBS volume type
- EC2 instance type & size
- Only modern, large EC2 instances can achieve the maximum numbers
- Multiple volumes may be needed to reach the instance cap
| EBS Volume Type | Max IOPS per EC2 Instance | Max Throughput per EC2 Instance |
|---|---|---|
| GP2 & GP3 | 260,000 | 7,000 MB/s |
| IO1 | 260,000 | 7,500 MB/s |
| IO2 | 160,000 | 4,750 MB/s |
| IO2 Block Express | 260,000 | 7,500 MB/s |
- Notes / Takeaways
- IO1: need ~4 volumes at max performance to hit instance cap
- IO2: interestingly lower max IOPS and throughput per instance than IO1, despite higher durability
- GP2/GP3 can scale to 260k IOPS, but usually require multiple volumes attached
- Always consider instance-level limits in addition to volume-level limits when designing high-performance storage
EBS HDD-Based Volumes

- SSD vs HDD: SSDs are faster and more expensive, using flash memory; HDDs are cheaper, with spinning platters and moving heads.
- Use case: Optimized for sequential access workloads where throughput and cost efficiency matter more than IOPS. Avoid HDDs if high IOPS are needed—use SSD instead.
- Size range: 125 GB – 16 TB.
- Block size: 1 IO = 1 MB. Larger than SSD-based volumes.
- IOPS: 1 IOPS = 1 IO per second = 1 MB/s.
- Performance model: Uses a credit-bucket system similar to GP2, with baseline and burst rates. Performance is measured in MB/s rather than IOPS.
- Types:
- ST1 – Throughput Optimized HDD: Low-cost, designed for frequently accessed, throughput-intensive workloads, e.g., Big Data, Data Warehousing, log processing.
- SC1 – Cold HDD: Lowest-cost, for infrequently accessed workloads; slower than ST1, ideal for archival storage where performance is not critical.
- Key choice tip: If performance matters, pick ST1; if economy matters more and performance is secondary, pick SC1.
EBS ST1 – Throughput Optimized HDD
- Low-cost HDD for frequently accessed, throughput-intensive workloads.
- Ideal for Big Data, Data Warehousing (DWH), and log processing.
- Optimized for high sequential throughput rather than IOPS.
EBS SC1 – Cold HDD
- Lowest-cost HDD, designed for infrequently accessed workloads.
- Performance is much lower than ST1; the trade-off is economy.
- Good for archival storage or colder data that is rarely scanned.
- Rule of thumb: choose SC1 if performance is not needed, otherwise ST1.
EBS HDD Volumes – Type Performance Table
| HDD Type | Max IOPS | Max Throughput | Baseline Rate | Burst Rate |
|---|---|---|---|---|
| ST1 | 500 IOPS | 500 MB/s | 40 MB/s per TB | 250 MB/s per TB |
| SC1 | 250 IOPS | 250 MB/s | 12 MB/s per TB | 80 MB/s per TB |
EC2 Instance Store Volumes – Architecture
Knowing the advantages and limitations of instance store volumes helps optimize cost and performance, while misusing them can lead to major issues.
EC2 Instance Store – Key Features
- High-performance, temporary, locally-attached block storage
- Block storage
- Raw volumes that can be mounted to an EC2 instance
- File systems can be created on these volumes
- Local to the host
- Physically attached to a single EC2 host (not networked)
- Each host has its own isolated instance store volumes
- Instances on that host can access them if configured
- Cost is included with the instance
- Supported volume types and sizes depend on instance type/size
- Some instance types do not offer an instance store at all
- Larger instance sizes usually provide larger volumes
- You pay for it regardless → better to use it than leave it idle
- Attached only at launch
- Cannot be added after the instance is running
- You must decide to use them during launch; any changes require a relaunch
- Main advantage: performance
- Local attachment delivers the highest I/O and throughput in AWS
- Examples:
- D3 instance (storage-optimized) → 4.6 GB/s throughput using HDD storage
- I3 instance (storage-optimized) → NVMe SSDs with 16 GB/s sequential throughput, IOPS in the millions for large configs
- Main limitation: ephemeral nature
- Do not store critical or irreplaceable data
- Typical use cases: buffers, caches, temporary scratch space, replicated data in load-balanced environments
EC2 Instance Store – Architecture

- An EC2 instance can have zero or more instance store volumes.
- These volumes are physically attached to the host machine.
- Ephemeral volumes: ephemeral[0-23]
- Data can be lost if:
- The instance is moved to a different host
- The instance changes type or size (usually triggers a host change)
- Hardware failure occurs (entire host or specific device failure)
- Example: If you write a file to ephemeral0 and then the instance migrates, an ephemeral0 volume will exist on the new host, but your file will not be there
- Each ephemeral0 is tied to its host, so data does not transfer automatically between hosts
Choosing Between EC2 Instance Store and EBS Volumes
Common Requirements for Storing EC2 Data
- Persistent storage → Avoid instance store
- Most important requirement
- Instance store data can be lost in many ways
- If storage must survive instance lifecycle → choose EBS
- Resilient storage → Usually avoid instance store
- Instance store is not inherently resilient
- EBS offers AZ-level redundancy + optional S3 snapshots
- Exception: Apps with built-in replication can safely use multiple instance store volumes for high performance
- High performance → Depends
- Both EBS and instance store can deliver high performance
- EBS limits (for larger instances):
- GP2/3: up to 16k IOPS per volume
- IO1/2: up to 64k IOPS per volume
- IO2 Block Express: up to 256k IOPS per volume
- Can combine multiple EBS volumes in RAID0 → max 260k IOPS per EC2 instance
- For super high performance (>260k IOPS) → instance store may be better, but must tolerate ephemeral storage
- Cost considerations → Instance store is free with the instance
- If using EBS, the cheapest options are ST1 or SC1
- ST1: throughput/streaming workloads
- Boot volumes cannot use ST1/SC1
- If using EBS, the cheapest options are ST1 or SC1
EBS Snapshots, Restore, and Fast Snapshot Restore (FSR)
EBS Snapshots – Key Characteristics

- EBS snapshots = backups of EBS volumes stored in S3
- Adds region-level durability on top of EBS’s AZ resiliency
- Snapshots can be restored cross-AZ or cross-region → supports migrations and disaster recovery
- Efficient storage: Only copies the used data
- EBS volume size = 40GB, but only 10GB used → snapshot = 10GB
- You’re billed only for used data
- Incremental backups
- First snapshot = full copy of used data
- Subsequent snapshots = incremental (store only changes)
- Smaller, faster, and space-efficient
- Original snapshot + incremental differences = full backup of current volume
- EBS snapshots do not affect volume performance
- Each snapshot is self-sufficient → deleting an intermediate snapshot does not break recovery
- Cross-AZ/region restore
- Useful for migration or global disaster recovery
- Billing
- Charged per GB/month
- Frequent snapshots are affordable due to incremental storage
- One snapshot every 5 minutes costs only slightly more than one per hour
- Encryption
- EBS encryption affects snapshots
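A minimal CLI sketch of the snapshot workflow (volume/snapshot IDs and regions are placeholders):

```bash
# Snapshot a volume, then copy the snapshot to another region for DR
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "pre-deployment backup"
aws ec2 copy-snapshot \
  --source-region us-east-1 \
  --source-snapshot-id snap-0123456789abcdef0 \
  --region eu-west-1
```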
EBS Snapshot Restore – Performance & Fast Snapshot Restore (FSR)
- EBS volumes can be created blank or from a snapshot
- Blank volume → full performance immediately
- From snapshot → lazy restore
- Data fetched gradually from S3
- Accessing unfetched blocks triggers on-demand fetch → lower performance than direct EBS
- Options to achieve full performance immediately:
- Manual pre-read
- Read all blocks (e.g., with dd on Linux) to force the data to be fetched from S3
- Requires admin effort
- Read all blocks (e.g.,
- Fast Snapshot Restore (FSR)
- Instantly restores full snapshot performance
- Limit: 50 snapshot-AZ pairs per region
- Extra cost, can get expensive if enabled widely
- Manual pre-read
DEMO: Useful Commands for Storage Volumes (Linux only)
lsblk→ List all block devicessudo file -s /dev/xvdf→ Detect filesystem type on/dev/xvdf- No filesystem → returns
data - XFS → returns something like:
SGI XFS filesystem data (blksz 4096, inosz 512, v2 dirs) - The
filecommand checks file data, not extensions
- No filesystem → returns
sudo mkfs -t xfs /dev/xvdf→ Create an XFS filesystem on the device- sudo mount /dev/xvdf /ebstest → Mount the device to the directory
/ebstest df -k→ Show disk space usage in kBsudo blkid→ Display block device attributes- Filesystem table (
/etc/fstab)- Config file for auto-mounting filesystems at boot
- Helps avoid manual mounting after every restart
Example: Mount an EBS Volume & Add a File

- This sequence:
- Checks the volume and its filesystem
- Formats it (if needed)
- Mounts it to a directory
- Creates and edits a file
- List files to confirm
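A sketch of that sequence in shell form, using the same device name and mount point as the demo above:

```bash
lsblk                                   # confirm the device name (e.g. /dev/xvdf)
sudo file -s /dev/xvdf                  # "data" means no filesystem yet
sudo mkfs -t xfs /dev/xvdf              # format only if the volume is blank
sudo mkdir -p /ebstest
sudo mount /dev/xvdf /ebstest
sudo sh -c 'echo "hello from EBS" > /ebstest/testfile.txt'
ls -l /ebstest                          # confirm the file exists
```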
EBS Encryption
EBS Encryption – Overview
- At-rest encryption for EBS volumes and snapshots
- Managed by AWS KMS
- Mitigates security risks
- Optional → default is unencrypted
- Recommended → efficient, free, and no performance impact
EBS Volume Encryption

- Encrypt volume at creation using KMS key
- Default AWS-managed key (aws/ebs) or a customer-managed key
- KMS generates an encrypted DEK (GenerateDataKeyWithoutPlaintext)
- Encrypted DEK stored with the volume on disk
- Volume usage:
- EC2 requests KMS to decrypt DEK
- Plaintext DEK loaded in the memory of the EC2 host
- DEK encrypts/decrypts data on the volume
- Data on volume always ciphertext; OS sees plaintext transparently
- Stopping/moving EC2 → host discards plaintext DEK
- New host requests KMS again to decrypt the DEK for usage
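A sketch of creating an encrypted volume with the default key (AZ and size are illustrative):

```bash
aws ec2 create-volume \
  --availability-zone us-east-1a \
  --size 10 \
  --volume-type gp3 \
  --encrypted \
  --kms-key-id alias/aws/ebs
```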
EBS Snapshot Encryption

- Snapshots of encrypted volumes are automatically encrypted
- Snapshots store the same DEK as the original volume
- Volumes restored from these snapshots reuse the same DEK
Additional Facts
- Encryption algorithm: AES-256
- Default encryption can be enforced per account (KMS key choice)
- DEKs are unique per volume, except:
- Snapshots & restored volumes use the same DEK as the source
- Cannot remove encryption from an existing volume or snapshot
- Workaround: clone data to a new unencrypted volume at the OS level
Software Disk Encryption (OS-level)
- OS can encrypt volumes independently
- Unrelated to EBS encryption
- OS is unaware of EBS encryption
- EBS is unaware of OS encryption
- Can be combined, but usually unnecessary
- EBS encryption preferred → lower admin overhead, simpler
EC2 Network Interfaces, Instance IPs, and DNS
EC2 Networking – Architecture

- EC2 networking is handled via ENIs (Elastic Network Interfaces)
- Each EC2 instance has at least one primary ENI
- Primary ENI holds network configurations, not the instance itself
- Primary ENI and instance can be thought of as “one unit” for networking purposes
- Secondary ENIs: optional, attachable to the same instance
- Can belong to a different subnet within the same AZ
- Number depends on instance type/size
- Can detach/reattach to another instance → main difference from primary ENI
- Use case: Multi-homed systems (e.g., separate ENIs for management & data subnets)
ENI Attributes & Configurations
AWS Console shows ENI attributes under the instance, but they belong to the ENI(s), not the instance itself.
- MAC address → hardware identifier, useful for legacy software licensing
- Reattaching ENI to a different instance can move licensing
- Primary private IPv4 (static, e.g., 10.16.0.10)
- Can have private DNS (e.g., ip-10-16-0-10.ec2.internal)
- Secondary private IPv4s → optional, dynamic if ENI moved
- Public IPv4 → optional, dynamic
- Changes if the instance is stopped/moved to another host
- Use an Elastic IP (EIP) for static public IPv4
- Public IP is stored in IGW, not ENI; OS cannot see it directly
- Elastic IP → 1 per private IPv4, persists across instance stop/start or host move
- Default account quota: 5 EIPs per region → billed if allocated but unused
- IPv6 addresses → optional
- Security Groups (SGs) → rules applied to all ENI IPs
- Different ENIs can have different SGs for flexible rules
- Source/Destination Check → enabled by default
- Discards traffic not destined for ENI IPs
- Must be disabled for NAT instances or traffic forwarding
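A sketch of attaching a static public IPv4 via an Elastic IP (instance and allocation IDs are placeholders):

```bash
aws ec2 allocate-address --domain vpc
aws ec2 associate-address \
  --instance-id i-0123456789abcdef0 \
  --allocation-id eipalloc-0123456789abcdef0
```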
Amazon Machine Images (AMIs)
Amazon Machine Image (AMI) – Key Concepts
- AMI = template for launching EC2 instances
- Logical container including:
- AMI ID
- Permissions
- Volume snapshots
- 1 Boot volume (OS + apps)
- 0+ Data volumes (data)
- Referenced via Block Device Mapping (Device ID ↔ Snapshot ID)
- EC2 instance launched from an AMI is effectively a clone of the source instance
- Types of AMIs
- AWS-provided → e.g., Amazon Linux 2
- Community → e.g., Ubuntu, CentOS, RedHat
- Marketplace → may include commercial software, extra license cost
- Custom → created by customers
- Regional
- AMIs have region-unique IDs → can only launch in the same region
- Can be copied between regions → creates a new AMI with a new ID
- Permissions
- Private → only the owner can use
- Shared → specific accounts
- Public → everyone can use
- Ensure no sensitive info in public AMIs
- AMI stores references to snapshots, not volumes themselves
- Boot volume → e.g., /dev/xvda
- Data volume → e.g., /dev/xvdf
- Billing includes snapshot storage in S3 (used data only)
AMI Creation (Baking)
- Existing EC2 instance used as a template
- AMI baking = prepare instance with desired apps/configuration → create AMI
- Example: Install WordPress → create AMI → launch many instances with WordPress pre-installed
- AMIs are immutable
- Update instance → create a new AMI for updated configuration
AMI Lifecycle (4 Phases)

- Launch → start instance from existing AMI
- Configure → prepare instance with business-specific software/configuration
- Create Image → AMI creation generates volume snapshots and updates Block Device Mapping
- Launch (again) → new instance launched from AMI
- EBS volumes restored from snapshots
- Device IDs preserved
- Supports the rapid creation of many instances with identical configuration
DEMO: Running WordPress with EC2
WordPress (WP) – Overview
- Popular web Content Management System (CMS)
- Publish blogs, websites, and online stores
- Free and Open-Source Software (FOSS)
- Written in PHP
- Uses MariaDB/MySQL for data storage
- Can run on web servers, local machines, or cloud (EC2, containers, multi-tier systems)
- Cantrill demos: showcase WP installation and deployment across different architectures and AWS services
Manual WP Install on EC2
- Disadvantages vs automated install:
- Prone to human errors
- Slower, more effort
- Less consistent
- Must manually clean up installation files
- systemctl → start/stop system services
Bake AMI with WP and Launch EC2
- Stop instance before creating AMI → ensures snapshot consistency
- AMI stores the current EBS configuration, optionally modified during creation
- The default login when launching from a custom AMI is often root → may need to switch to ec2-user
root→ may need to switch toec2-user - Baked AMIs = automated deployment at scale
- Install WP once → launch hundreds of identical instances
- Saves time and avoids repetitive manual installation
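A sketch of the baking step from the CLI (instance ID and AMI name are placeholders):

```bash
# Stop the configured instance for a consistent snapshot, then bake the AMI
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 create-image \
  --instance-id i-0123456789abcdef0 \
  --name "wordpress-baked-v1" \
  --description "Amazon Linux 2 with WordPress pre-installed"
```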
Copying and Sharing a WordPress AMI
- Copying AMIs / EBS snapshots:
- Time depends on:
- Source and destination regions
- Distance between regions
- Snapshot size and amount of data
- Can range from a few minutes to much longer
- Sharing AMIs:
- Prefer sharing with specific, trusted accounts over making it public
- Public AMIs risk exposing sensitive information
- Can share with AWS Organizations or Organizational Units (OUs)
- Option to grant “create volume” permission to selected accounts:

- “Create volume” permission:
- Allows other accounts to create EBS volumes from snapshots linked to the AMI
- Denied by default for accounts that don’t own the AMI, even if they can launch it
- Use case: Good practice for internal organizational sharing, where permissions can be more flexible
- Public sharing best practices: Follow AWS guidelines
- Ensure no sensitive information is included in public AMIs
EC2 Purchase Options
EC2 Purchase Options / Launch Types – Overview
- Definition: Different ways to launch EC2 instances, each with trade-offs in cost, flexibility, and availability.
- Key trade-offs:
- Discounts vs Flexibility → commit ahead → pay less, but may allow interruptions or restrict changes.
- Reserved Capacity → pay extra to ensure instances are prioritized when EC2 capacity is limited.
- Dedicated Hardware → pay extra for exclusive use of physical hosts, avoiding shared hardware with other customers.
- Terminology:
- Official: EC2 Purchase Options
- Unofficial/colloquial: EC2 Launch Types
EC2 Shared Hosts
- Definition: Instances run on EC2 hosts shared with other AWS customers.
- Default launch environment.
- Billing: Only instance and storage fees.
- Per-second instance fee while running
- Storage fee while allocated (running or stopped)
- Customer control: None over physical hardware.
- Isolation: Instances are isolated, even on shared hardware.
- Launch types using shared hosts: On-Demand, Spot, Reserved
EC2 On-Demand Instances

- Definition: Default EC2 purchase option; balanced flexibility and cost.
- Characteristics:
- No interruptions
- No capacity reservation
- Predictable pricing (per-second billing while running)
- No upfront cost or discounts
- Best use cases:
- Short-term or unpredictable workloads
- Apps that cannot tolerate interruption
EC2 Spot Instances

- Definition: Runs on spare EC2 capacity; you bid for instances.
- Pros: Very cheap (up to 90% off On-Demand)
- Cons: Instances may be terminated if the spot price exceeds your max bid.
- Pricing: Pay only the spot price, never your max bid.
- Best use cases:
- Non-critical workloads
- Jobs that can be rerun (e.g., scientific computations, batch jobs)
- Bursty or cost-sensitive workloads
- Stateless applications
- Never use for: Long-running, critical services (mail servers, websites, domain controllers).
Example:
- Initial spot price: 2 coins
- Spot price rises to 4 coins
- Customer A’s instance (max bid 2) is terminated
- Customer B’s instance (max bid 4) continues running
- The freed capacity may now be used by On-Demand customers
- Key point: You only pay the current spot price, not your max bid.
- Risk: Spot instances can be terminated if the spot price exceeds your max bid.
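A sketch of requesting the same capacity as a Spot instance via run-instances (AMI ID is a placeholder):

```bash
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type t3.large \
  --instance-market-options 'MarketType=spot,SpotOptions={SpotInstanceType=one-time}'
```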
EC2 Reserved Instances (RIs)
- Definition: Commit to use specific instance types in an AZ or region for a fixed term.
- Benefit: AWS offers billing discounts in exchange for commitment.
- Best use case: Long-term, predictable workloads and consistent usage.
EC2 Dedicated Hosts – Key Points

- Definition: A whole EC2 host dedicated to a single AWS account.
- You pay for the host itself, not individual instances.
- Instances run on this host only; no other customers’ instances share it.
- Use Cases / Benefits:
- Legacy software licensing tied to physical sockets or cores
- Host affinity – instances remain on the same host after stop/start
- Dedicated hardware – full control, no sharing with other AWS accounts
- Considerations / Downsides:
- The customer must manage the host capacity
- Full host → cannot launch new instances
- Underutilized host → wasted cost
- Admin overhead is higher than using Dedicated Instances
- The customer must manage the host capacity
- Comparison to Dedicated Instances:
- Dedicated Instances also provide dedicated HW
- But AWS manages the host, reducing management burden
- Choose Dedicated Instances if dedicated HW is required without full host management
EC2 Dedicated Instances – Overview

- Run instances on AWS-managed servers with dedicated hardware
- Serves as a compromise between shared hosts and fully dedicated hosts
- Exclusive hardware for your instances
- No other AWS accounts’ instances are placed on the same host
- You get dedicated HW without paying for the entire host
- Ideal when strict HW isolation is required, while still leveraging EC2 features safely
- Exclusivity incurs additional costs
- A flat hourly fee per region while dedicated instances are running (applies regardless of instance count)
- Plus per-second instance fees (same as on-demand) and storage costs
- AWS handles host management
- No need to worry about capacity planning or host maintenance, unlike dedicated hosts
EC2 Host Models – Overview Table
| EC2 Host Model | Who Manages Host Capacity | Dedicated Hardware? | Billing Summary | Typical Use Case |
|---|---|---|---|---|
| Shared Hosts | AWS | No | Instance fees only (runtime + allocated resources like storage & network) | Default option for general workloads |
| Dedicated Instances | AWS | Yes | 1. Hourly regional fee for dedicated instances 2. Instance runtime fees | When dedicated hardware is required but AWS manages host capacity |
| Dedicated Hosts | Customer | Yes | Pay for the full host | Software licensed per physical HW (cores/sockets) or strict isolation needs |
EC2 Shared Host Instance Types – Overview Table
| Shared Host Instance Type | Cost Advantage | Interruptions | Recommended Use Cases |
|---|---|---|---|
| On-Demand | No | None | • Short-term or unpredictable workloads • Applications that cannot be interrupted |
| Spot | High (up to 90%) | Possible (if spot price exceeds your max) | • Non-urgent workloads • Jobs that can be stopped and restarted • Bursty or parallelizable workloads • Stateless applications |
| Reserved | Moderate to High (depends on commitment) | None (with capacity reservation, interruptions avoided even under high demand) | • Long-term, steady workloads • Business-critical applications where uptime is essential |
EC2 Reservations
AWS Reservations – Core Concepts
- Long-term commitments to use AWS resources in exchange for cost benefits
- Applicable across many AWS services, not just EC2
- Commitments are binding – you pay whether you fully use the resources or not
- Plan carefully to avoid wasting money
- Key types of EC2 reservations:
- Billing reservations – commit to a specific instance type in an AZ/region for a term and get billing discounts
- Capacity reservations – pay to guarantee EC2 capacity in a specific AZ, useful for business-critical workloads
- EC2 Savings Plans – commit to an hourly spend for a fixed term to receive discounted compute costs
EC2 Reserved Instances (RIs)

- Reserve instances for a 1- or 3-year term and receive cost savings
- Common practice in large-scale AWS deployments
- Ideal for predictable, long-term workloads
- Suitable for always-on infrastructure or stable services
- Commitment length: 1 or 3 years
- 3-year term offers higher discounts but carries more risk if requirements change
- Early termination still incurs full billing
- Workaround: resell unused RIs on AWS Marketplace
Payment options:
- No Upfront – pay per-second instance fee at a reduced rate, no initial payment
- Low barrier to entry
- Smaller discount compared to upfront options
- Partial Upfront – pay part upfront, reduce per-second fees significantly
- Balanced option
- All Upfront – full payment upfront, no per-second charges
- Maximum discount
- RIs are tied to a specific instance type and AZ or region
- Partial coverage is possible if the instance size exceeds the reservation
RI Scope:
- Regional – discounts apply across all AZs in a region; no capacity guarantees
- Flexible, but the priority is the same as on-demand
- Zonal – discounts only in the specified AZ; includes capacity reservation
- Less flexible but ensures higher launch priority
Standard RIs
- Original and most common type, designed for always-on, uninterrupted workloads
- Best for infrastructure with consistent, known usage
Scheduled RIs (Discontinued)
- Allowed usage only in predefined time slots
- No longer offered; previously useful for long-term, periodic workloads
EC2 Capacity Reservations
- Guarantee priority access to EC2 capacity in a specific AZ
- Ensures computing for workloads that cannot tolerate interruption
- AWS’s priority for allocating EC2 resources:
- Reserved capacity (zonal reservations, on-demand capacity reservations)
- On-demand instances & regional reservations
- Spot instances (leftover capacity)
- Usage patterns:
- RIs with Regional Reservation – get a discount but no capacity guarantee
- RIs with Zonal Reservation – discount + reserved capacity in AZ
- On-Demand Capacity Reservation – reserve capacity anytime, flexible term
- No billing discount; pay full on-demand cost
Compute Savings Plans
- Commit to a fixed hourly spend on compute for 1 or 3 years to get discounts
- Instead of reserving a specific instance, commit to a fixed hourly dollar amount of compute usage
- Works across AWS compute services with on-demand and savings plan rates
- Pay a discounted rate until the commitment is met, then revert to on-demand pricing
- Types:
- General Savings Plan – applies to EC2, Fargate, Lambda
- Up to 66% savings
- Ideal for organizations moving workloads from EC2 → Fargate → Lambda
- EC2 Savings Plan – EC2-only, flexible in instance size & OS
- Up to 72% savings, higher than the general plan
- General Savings Plan – applies to EC2, Fargate, Lambda
- Exam tip: General savings plans are great for organizations transitioning to containers or serverless architectures
EC2 Instance Status Checks & Auto Recovery
EC2 Instance Status Checks (Monitoring EC2 Health)

- Each EC2 instance is continuously monitored via two main status checks:
- System status check
- Detects issues affecting the EC2 platform or host machine
- Failures can indicate: power loss, network outage, host OS/software errors, or hardware problems
- Instance status check
- Detects issues specific to the EC2 instance itself
- Failures can indicate: filesystem corruption, misconfigured networking, or OS kernel errors
- 💡 Each check runs a different set of tests; failing one points to a distinct problem type compared to the other
- System status check
- On instance launch, status checks start in an initializing state
- When you reach 2/2 checks passed, the instance is considered healthy by EC2 standards
- “Healthy” from the EC2 perspective does not guarantee your custom configs or applications are running correctly
- If a check fails and the instance fails to launch → you must intervene
- Fixes can be manual or automated
- When you reach 2/2 checks passed, the instance is considered healthy by EC2 standards
EC2 Status Check Alarms
- CloudWatch alarms can monitor EC2 status checks and trigger notifications or actions automatically if a check fails
- Alerts are typically sent via SNS topics
- Possible automated responses to instance failures:
- Reboot → standard method to restore functionality
- Stop → allows diagnostic work on the instance
- Terminate → suitable when high availability is in place (e.g., via Auto Scaling Group)
- Avoids unnecessary billing for stopped instances
- Recover → uses EC2 Auto Recovery to attempt automated restoration
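A sketch of wiring a status-check alarm to the recover action (region, instance ID, and thresholds are illustrative):

```bash
aws cloudwatch put-metric-alarm \
  --alarm-name ec2-auto-recover \
  --namespace AWS/EC2 \
  --metric-name StatusCheckFailed_System \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:automate:us-east-1:ec2:recover
```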
EC2 Auto Recovery
- Feature to automatically restore an EC2 instance when a status check fails
- Not intended for large-scale or complex system failures – handles isolated instance issues only
- Helps reduce manual intervention for common failures
- Recovery process:
- Moves the instance to a new EC2 host within the same AZ
- Restarts the instance with the same configuration and settings
- Maintains IP addressing (except for public IPv4 that isn’t an EIP)
- The instance may reboot if it was previously configured to do so
- Considerations/limitations:
- Requires spare EC2 host capacity; otherwise, recovery will fail
- Supported only on modern instance types (e.g., A1, C4–C5, M4–M5, R3–R5…)
- Does not work with instance store volumes; EBS-backed instances are supported
EC2 Termination Protection and Shutdown Behavior
Shutdown Behavior of EC2 Instances
- Running instances can normally be stopped, rebooted, or terminated.
- The default shutdown behavior of an instance is to stop it.
- Shutdown behavior can be configured to terminate the instance instead:
- This is a specialized feature.
- Useful when the instance state is not important:
- Eliminates storage costs associated with stopped instances.
- Reduces administrative overhead when many instances are frequently stopped.
- Example: Instances in an EC2 Auto Scaling Group that are frequently stopped can be terminated automatically, avoiding unnecessary storage costs while supporting automation.
EC2 Termination Protection
- By default, terminating an instance triggers a confirmation dialog in the AWS Management Console, but accidental termination is still possible.
- Termination protection provides an additional safeguard:
- Attempting to terminate a protected instance fails with an error; termination is blocked until protection is disabled.
- Example dialog in the console:
- Failed to terminate an instance: The instance ‘i-068b2e3ba2f2a2fbd’ may not be terminated. Modify its ‘disableApiTermination’ instance attribute and try again.
- Key advantages:
- Prevents accidental termination of critical instances.
- Introduces an extra approval step:
- Permission to enable or disable termination protection is controlled via the `disableApiTermination` attribute.
- This permission is separate from the permission to terminate instances.
- Supports role separation, allowing only senior admins to remove protection while restricting junior admins.
- Recommended for production environments, while TEST or DEV environments may not require protection.
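A minimal sketch of toggling these attributes from the AWS CLI (the instance ID is a placeholder):

```bash
# Enable termination protection (sets the disableApiTermination attribute)
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --disable-api-termination

# Disable it again when the instance may be terminated
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --no-disable-api-termination

# Change shutdown behavior so an OS-level shutdown terminates the instance instead of stopping it
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
  --instance-initiated-shutdown-behavior "{\"Value\": \"terminate\"}"
```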
EC2 Instance Metadata Service (IMDS)
EC2 Instance Metadata – Overview

- Every EC2 instance can access an internal metadata endpoint by default.
- This endpoint connects to the EC2 Instance Metadata Service (IMDS), which is accessible from inside the instance and exposes metadata that is otherwise not directly available to the instance OS.
- Examples: bootstrap launch scripts (`user-data`), the instance’s public IPv4 address, instance ID, security group info.
- Metadata can also be used to configure or manage a running instance.
- The metadata endpoint is available at: `http://169.254.169.254/latest/meta-data`
- This is a key address to remember when working with EC2 instances.
- Common use cases:
- Environment information
- Fetch instance details such as hostname, network settings, and security groups.
- One important category: network info, including public IPv4 addresses that are otherwise invisible inside the instance.
- Authentication and credentials
- Instances with assigned IAM roles can fetch temporary credentials via IMDS, which are automatically rotated.
- EC2 Instance Connect can use IMDS to retrieve SSH credentials.
- Accessing user-data
- Useful for running automated configuration scripts during instance launch.
- Security considerations:
- IMDS does not enforce authentication and does not encrypt data.
- Any user with access to the instance’s shell can reach the metadata endpoint by default.
- Mitigation options include local firewall rules to block `169.254.169.254`, but this adds administrative overhead per instance.
DEMO: Useful Commands for EC2 Instance Metadata
- `ifconfig` → Shows network interfaces as seen by the instance OS.
- Displays private IPv4 and IPv6 addresses.
- Does not show public IPv4 addresses (configured at the IGW). IMDS is required to retrieve public addresses.
- `curl http://169.254.169.254/latest/meta-data/public-ipv4` → Returns the instance’s public IPv4 address.
- `curl http://169.254.169.254/latest/meta-data/public-hostname` → Returns the instance’s public DNS name.
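On instances where IMDSv2 (token-based access) is enforced, the plain curl calls above need a session token first; a minimal sketch:

```bash
# Request an IMDSv2 session token valid for 6 hours, then use it for metadata queries
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/public-ipv4
```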
EC2 Instance Metadata Query Tool
- A command-line tool that simplifies access to IMDS.
- Installation on a Linux instance:
- Download: `wget http://s3.amazonaws.com/ec2metadata/ec2-metadata`
- Make executable: `chmod u+x ec2-metadata`
- Display available commands and options: `ec2-metadata --help`
- Example usage: `ec2-metadata -a` → Shows the AMI ID used to launch the instance.
Vertical & Horizontal Scaling
System Scaling
- Systems face challenges when the load fluctuates:
- High load → system may struggle to keep up → customers experience slow response times, failures, or even crashes.
- Low load → resources remain idle → leads to wasted capacity and increased costs.
- Scaling refers to adjusting system resources to handle changes in load.
- Resources are added or removed depending on demand to maintain performance and cost efficiency.
- Two approaches to scaling:
- Vertical scaling – increasing or decreasing the capacity of a single server.
- Horizontal scaling – increasing or decreasing the number of identical server instances.
Vertical Scaling

- Vertical scaling involves resizing a single server instance, such as an EC2 instance.
- Increase resources (CPU, memory) to handle higher load.
- Decrease resources to reduce cost when load is lower.
- Advantages:
- Simple to implement, does not require changes to the application.
- Works with any type of application, including monolithic designs.
- Limitations:
- Resizing usually requires a reboot, causing temporary downtime.
- Must be scheduled during maintenance windows.
- Limits the ability to respond quickly to sudden load spikes.
- Larger instances often have higher costs, and cost increases are not always linear.
- Maximum capacity is capped by the largest available instance type.
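For illustration, a manual vertical-scaling (resize) workflow with the AWS CLI might look like the sketch below; the instance ID and target type are placeholders, and the stop/start cycle is what causes the downtime mentioned above.

```bash
# Stop the instance, wait for it to stop, change its type, then start it again
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
  --instance-type "{\"Value\": \"m5.xlarge\"}"
aws ec2 start-instances --instance-ids i-0123456789abcdef0
```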
Horizontal Scaling

- Horizontal scaling adjusts the number of server instances to match the workload.
- Add instances to handle a higher load.
- Remove instances to reduce cost when demand decreases.
- Systems run multiple identical instances of the same application.
- Each instance shares the workload evenly.
- A Load Balancer distributes incoming traffic across all instances.
- User session management is critical:
- With multiple instances, sessions cannot be stored on a single server.
- Example:
- The load balancer routes a user to instance 1, the user logs in, and adds items to a cart.
- Later, the load balancer routes the same user to instance 2, which has no session data → the user is logged out, and the cart is empty.
- Solutions:
- Off-host sessions: store session data externally (e.g., ElastiCache, a database).
- Servers remain stateless, making any instance interchangeable.
- Advantages:
- Scaling can occur without downtime.
- No inherent limits on capacity; instances can be added indefinitely.
- Often more cost-efficient than large single instances.
- Fine-grained control over capacity increases.
- Vertical: doubling instance size = 100% more capacity.
- Horizontal: adding 1 instance to 4 = 25% more capacity → finer-grained, more flexible scaling.
- Limitations:
- The application must be designed for horizontal scaling.
- Many legacy apps require modification to support stateless operation and off-host sessions.
Horizontal vs Vertical Scaling – Key Comparison

Advanced EC2
EC2 Bootstrapping with User Data
Bootstrapping Concepts
- Bootstrapping (automation approach) refers to a mechanism that enables a system to configure itself automatically
- In EC2, bootstrapping means executing setup scripts at instance launch
- Automates instance setup to reach a ready-to-use state
- e.g., installing software, applying configurations after install
- Implemented using EC2 User Data (through the Instance Metadata Endpoint)
- Unlike pre-built images, setup occurs after the instance has started
EC2 User Data
- A data payload that can be supplied to an EC2 instance at launch
- Primarily intended for bootstrapping tasks
- Retrieved via the Metadata Endpoint: `http://169.254.169.254/latest/user-data`
- Processed by the operating system after the instance starts
- User Data runs only once during initial launch
- Modifying User Data and rebooting will not trigger it again
- To rerun it, a new instance must be created
- Launching again does not reuse the same instance—it creates a separate one
- Not interpreted by EC2
- Execution depends entirely on the OS
- EC2 does not validate or process the content
- Runs with root privileges, so incorrect scripts can cause issues
- Not secure by design
- Accessible from within the instance
- Should not contain sensitive data like passwords or long-term credentials
- Size limit: 16 KB
- Larger setups should download additional scripts or data externally
- Can be updated when the instance is stopped
- Changes are visible after restart but won’t be executed again
- Better alternatives exist for updating running instances
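A minimal illustrative User Data script (assuming an Amazon Linux instance with `dnf`, and Apache as the example workload):

```bash
#!/bin/bash
# Runs once as root at first boot: install Apache and publish a placeholder page
dnf install -y httpd
systemctl enable --now httpd
echo "<h1>Bootstrapped via EC2 User Data</h1>" > /var/www/html/index.html
```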
EC2 Bootstrapping – Architecture

- An EC2 instance is launched with its boot volume attached
- User Data is provided to the instance at launch
- The operating system checks for the presence of User Data
- If present, it is executed as a startup script
- The instance remains in a running state during execution
- If the script succeeds → instance becomes service-ready
- If the script fails → instance may run but be incorrectly configured
- Instance health checks may pass even if setup is still ongoing, so “running” does not always mean “ready”
EC2 Bootstrapping – Boot-Time-To-Service-Time

- A measure of how long it takes for an instance to become usable after launch
- Includes:
- Time for AWS to start the instance
- Time required to complete configuration (manual or automated)
- Bootstrapping reduces this time by automating setup tasks, improving speed and consistency
- AMI baking can preconfigure instances:
- Advantage: reduces setup time after launch
- Disadvantage: limits flexibility due to fixed configurations
- Best practice: use both AMI baking and bootstrapping
- Perform heavy setup tasks in advance using AMIs
- Apply final, lightweight configuration during bootstrapping
- This approach balances efficiency and flexibility, which is useful for scaling and high availability
Enhanced EC2 Bootstrapping with CFN-INIT
CFN-INIT – Key Concepts
- CFN-INIT (AWS::CloudFormation::Init) is a lightweight configuration management tool
- Commonly described by AWS as a helper script on the EC2 OS, but it provides broader functionality
- Enables defining advanced bootstrapping instructions for EC2 instances
- More capable and structured compared to basic User Data scripts
- Can operate in a procedural manner, similar to User Data (executing steps sequentially)
- Also supports a desired state model, where multiple instructions define the final configuration
- For example, specifying a required Apache version ensures it is installed or updated as needed
- Desired state definitions are included within the CloudFormation logical resource
- Supports configuration of multiple elements to achieve the target state:
- Packages (including version control)
- OS groups
- OS users
- Sources (downloading and extracting software, with optional authentication)
- Files (with permissions handling)
- Commands (with success validation)
- Services (ensuring services are started or enabled at boot)
CFN-INIT – Architecture

- CFN-INIT is triggered through `UserData` provided to the instance
- Example: `/opt/aws/bin/cfn-init -v --stack ${AWS::StackId} --resource EC2Instance --configsets wordpress_install --region ${AWS::Region}`
- It retrieves configuration details from the CloudFormation template
- Found under the `Metadata` section → `AWS::CloudFormation::Init`
- It applies the configuration to move the instance toward the defined desired state
- Works with stack updates
- Unlike User Data (which runs only once), CFN-INIT can reapply configuration whenever the stack is updated
- This allows ongoing configuration management after launch
CFN CreationPolicy and CFN-SIGNAL

- By default, EC2 provisioning does not consider bootstrapping progress
- An instance is marked as `CREATE_COMPLETE` once provisioning finishes and status checks pass
- If bootstrapping fails, CloudFormation is not aware and still treats the resource as successful
- This can cause the stack to proceed with an improperly configured instance
- CFN-SIGNAL is used to communicate the result of post-launch configuration back to CloudFormation
- CreationPolicy
- Applied to a CloudFormation resource
- Defines a timeout period (e.g., 15 minutes)
- Forces CloudFormation to wait for a success signal before marking the resource as complete
- Even if EC2 reports the instance as running, the stack pauses until a signal is received
- CFN-SIGNAL
- Executed from `UserData` within the instance
- Example: `/opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackId} --resource EC2Instance --region ${AWS::Region}`
- Uses stack ID, resource name, and region to report back to CloudFormation
- `-e $?` reflects the exit status of the previous command:
- Success → sends a success signal → resource marked complete
- Failure → sends an error signal → resource marked as failed
- No signal before timeout → treated as failure and marked accordingly
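Putting the two commands together, a typical `UserData` body (wrapped in `!Sub` so the `${...}` placeholders resolve; the logical resource name and configset are taken from the examples above) might look like this sketch:

```bash
#!/bin/bash -xe
# Apply the desired state defined under Metadata -> AWS::CloudFormation::Init
/opt/aws/bin/cfn-init -v --stack ${AWS::StackId} --resource EC2Instance \
  --configsets wordpress_install --region ${AWS::Region}
# Report cfn-init's exit status back to the resource's CreationPolicy
/opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackId} --resource EC2Instance \
  --region ${AWS::Region}
```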
DEMO: Bootstrapping EC2 WordPress Installation
Configuring Bootstrap Scripts in EC2 User Data
- In this example, EC2 User Data is set up in two approaches:
- Through the EC2 console prior to launching the instance
- Within a CloudFormation (CFN) template
- Using CloudFormation enables input from users via parameters (e.g., usernames, passwords), which can also include predefined default values

- By default, EC2 expects User Data in Base64-encoded format
- When using the EC2 console, plain text can be automatically encoded
- In CloudFormation, Base64 encoding must be explicitly defined in the template

- The provided User Data script provisions an EC2 instance to:
- Install required software (web server, database, dependencies)
- Configure and start services
- Deploy and configure WordPress
- Set up permissions and initialize the database
- Customize the system message using cowsay


Diagnosing Problems with EC2 Bootstrap Scripts
- User Data can always be retrieved from the Instance Metadata Endpoint
- For newer AMIs (e.g., Amazon Linux 2023), a token may be required before accessing it.

- Log files are located in `/var/log` and provide execution details:
- `cloud-init-output.log` → includes executed commands and their output
- `cloud-init.log` → includes only the commands executed during boot
Bootstrapping WordPress with CFN-INIT
- A CloudFormation template can be used to deploy an EC2 instance and configure WordPress using CFN-INIT
- Parameters allow customization of values such as database name, user, and passwords
- These values can be validated and optionally hidden (e.g., `NoEcho`)
- Key components of the template:
- configSets define the sequence of configuration steps
- Example flow: install CFN tools → install software → configure instance → install WordPress → finalize configuration
- Each step contains specific instructions (files, commands, services, sources)
- `!Sub` is used for variable substitution
- Replaces placeholders (e.g., `${AWS::Region}`) with actual deployment values
- `--configsets wordpress_install` specifies which set of instructions CFN-INIT should execute
- Each config key runs in order as part of the overall setup process
- `cfn-auto-reloader.conf` enables automatic re-execution of CFN-INIT when the template metadata changes
- In User Data:
- `cfn-init` applies the configuration defined in the template
- `cfn-signal` reports success or failure back to CloudFormation

Diagnosing Problems with CFN-INIT
- User Data is simplified to primarily invoke `cfn-init` and `cfn-signal` with resolved parameters

- In addition to standard cloud-init logs, CFN-specific logs are available:
- `cfn-init.log` → overall CFN-INIT execution details
- `cfn-init-cmd.log` → output of executed commands
- `cfn-hup.log` → logs related to automatic updates via cfn-hup
- `cfn-wire.log` → communication between the instance and CloudFormation
- These logs, combined with `/var/log/cloud-init*`, provide full visibility into both bootstrapping and CFN-driven configuration processes

EC2 Instance Roles & InstanceProfile
EC2 Instance Roles – Architecture

- The recommended way for AWS services to access other AWS resources is by using IAM roles
- Services assume roles to obtain the required permissions for interacting with other services
- Reasons:
- Security
- Long-term credentials (such as access keys configured via `aws configure`) should not be stored in insecure environments
- Long-term credentials (such as access keys configured via
- Scalability
- A single IAM role can be assigned to multiple EC2 instances
- Managing and rotating long-term credentials across many instances is complex and inefficient
- Security
- An instance role is an IAM role assigned to an EC2 instance
- Any application or process running on the instance inherits the permissions defined by that role
- Temporary credentials for the role are provided through the EC2 Instance Metadata Service (IMDS)
- Accessible via: `http://169.254.169.254/latest/meta-data/iam/security-credentials/<ROLE-NAME>`
- The role name can be discovered from the metadata path
- These credentials are short-lived and automatically refreshed
- AWS handles rotation using the Secure Token Service (STS)
- Applications should periodically retrieve updated credentials or refresh cached ones
- The metadata endpoint always supplies valid credentials
- The AWS CLI on an EC2 instance automatically uses the attached role’s credentials when available
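From inside an instance, the temporary credentials can be inspected directly; a sketch (IMDSv2 token headers omitted for brevity):

```bash
# Discover the role name, then fetch its temporary credentials
# (JSON containing AccessKeyId, SecretAccessKey, Token, and Expiration)
ROLE=$(curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/)
curl -s "http://169.254.169.254/latest/meta-data/iam/security-credentials/${ROLE}"
```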
EC2 Instance Profile
- An Instance Profile acts as a container for an instance role
- It is the component actually attached to an EC2 instance to enable role usage
- When using AWS CLI or CloudFormation, the Instance Profile must be explicitly created
- In the EC2 console:
- Creating a role typically also creates a corresponding Instance Profile
- Selecting a role in the UI effectively attaches its associated Instance Profile
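A minimal CLI sketch of the explicit steps; the role, profile, and instance identifiers are placeholders.

```bash
# Create the profile container, put the role inside it, then attach it to an instance
aws iam create-instance-profile --instance-profile-name MyAppProfile
aws iam add-role-to-instance-profile --instance-profile-name MyAppProfile --role-name MyAppRole
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=MyAppProfile
```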
Credential Precedence for AWS CLI
- When multiple credential sources are available, long-term credentials take priority over role-based temporary credentials
- Even though this precedence exists, instance roles remain the recommended approach for EC2 due to improved security and easier management
SSM Parameter Store
SSM Parameter Store – Key Concepts

- AWS Systems Manager (SSM) is a centralized service for managing AWS resources and applications at scale
- Previously called Simple Systems Manager (SSM)
- Divided into four main areas: Operations Management, Application Management, Change Management, and Node Management
- Parameter Store is part of Application Management and is used to store configuration data
- Applications, EC2 instances, and Lambda functions can securely retrieve parameters using IAM and KMS
- Provides key-value storage for configuration values
- Built to be secure, highly available, and scalable
- Common use cases include database connection details, credentials, and application settings
- Supports three parameter types:
- String
- StringList (comma-separated values)
- SecureString (encrypted values)
- More secure than storing sensitive data in EC2 User Data
SSM Parameter Store Tiers
| SSM Parameter Store Tier | Number of Parameters | Parameter Value Size | Parameter Policies Available | Cost |
|---|---|---|---|---|
| Standard | Up to 10,000 | Up to 4 KB | No | Free* |
| Advanced | No limit | Up to 8 KB | Yes | Paid |
*Additional charges may apply for higher throughput usage.
SSM Parameter Store – Characteristics
- Public AWS service
- Accessible via public endpoints
- Includes AWS-provided public parameters (e.g., latest AMIs per region)
- Integrated with multiple AWS services
- Works with CloudFormation, CLI tools, and automation workflows
- Supports versioning
- Each update creates a new version of a parameter
- Supports hierarchical structure
- Uses `/` to organize parameters into paths
- Enables grouping by application, environment, or team
- Parameters can be retrieved individually or by path
- Access controlled via IAM
- Fine-grained permissions at parameter or path level
- Supports plaintext and encrypted values
- Encryption handled through AWS KMS
- Access to encrypted values requires appropriate KMS permissions
- Default AWS-managed key allows broad access; customer-managed keys provide stricter control
- Supports event triggering
- Parameter updates can initiate automated processes or notifications
Useful CLI Commands for Retrieving Parameters
- `aws ssm get-parameters-by-path --path /my-cat-app/`
- Retrieves all parameters under the specified path in JSON format
- Encrypted values are returned as ciphertext
- `aws ssm get-parameters-by-path --path /my-cat-app/ --with-decryption`
- Retrieves and decrypts parameter values
- Requires KMS permissions
- `aws ssm get-parameters --names /my-cat-app/dbstring`
- Retrieves a specific parameter by name
- AWS CloudShell can be used from the console to run these commands without local configuration
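For completeness, parameters such as those queried above could be created like this; names and values are placeholders.

```bash
# Plain string parameter under the application's path
aws ssm put-parameter --name /my-cat-app/dbstring --type String \
  --value "db.example.internal:3306"

# Encrypted parameter (uses the default AWS-managed KMS key unless --key-id is specified)
aws ssm put-parameter --name /my-cat-app/dbpassword --type SecureString \
  --value "example-password"
```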
System and Application Logging on EC2
CloudWatch Logs for EC2

- By default, CloudWatch collects external (host-level) metrics from EC2 instances, such as:
- CPU utilization
- Disk read/write activity
- Network traffic (inbound/outbound)
- However, it does not have visibility into the instance’s internal data
- Metrics and logs inside the operating system are not captured automatically
- The CloudWatch Agent can be installed within an EC2 instance to extend monitoring capabilities
- Sends metrics and log data from inside the instance to CloudWatch
- Enables collection of internal system metrics, such as:
- Memory usage
- Detailed CPU statistics (idle, system, I/O wait, etc.)
- Enables collection of application and system logs
- Each log is associated with a log group
- Each instance contributes a log stream within that group
- The unified CloudWatch Agent is the modern standard (replacing the older CloudWatch Logs agent)
- For large-scale deployments, automated setup is recommended
- Configuration can be stored in SSM Parameter Store and reused across multiple instances
- The CloudWatch Agent requires appropriate permissions:
- Permission to send logs and metrics to CloudWatch
- Permission to retrieve configuration from SSM Parameter Store (if used)
- Best practice is to assign these permissions through an EC2 instance role
Demo: CloudWatch Agent Setup for WordPress EC2 Instance
- Create and configure an IAM role
- Role type: EC2
- Attach required managed policies:
- `CloudWatchAgentServerPolicy`
- `AmazonSSMFullAccess`
- Install the CloudWatch Agent
- Command:
sudo dnf install amazon-cloudwatch-agent
- Command:
- Run the configuration wizard
- Command: `sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard`
- Accept most defaults, but choose advanced metrics when prompted
- Define log files to collect, such as:
- `/var/log/secure` → authentication and security logs
- `/var/log/httpd/access_log` → Apache access activity
- `/var/log/httpd/error_log` → Apache error events
- Save the configuration
- Stored locally at: `/opt/aws/amazon-cloudwatch-agent/bin/config.json`
- Optionally store it in SSM Parameter Store for reuse (see the sketch after this list)
- Prepare required directories and files
- Some Linux instances do not include required paths by default
- Create them manually:
- `sudo mkdir -p /usr/share/collectd/`
- `sudo touch /usr/share/collectd/types.db`
- Start the CloudWatch Agent
- Command: `sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:AmazonCloudWatch-linux -s`
- This command retrieves the configuration from SSM and starts the agent
- Command:
- This setup enables centralized monitoring and logging, providing deeper visibility into both system performance and application behavior within EC2 instances
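As a sketch of the optional reuse step above, the saved agent configuration could be pushed into Parameter Store manually so the fetch-config command can reference it; the parameter name matches the `-c ssm:AmazonCloudWatch-linux` value used above, and the `file://` path assumes the wizard's default output location.

```bash
# Upload the wizard-generated config so other instances can fetch it via ssm:AmazonCloudWatch-linux
aws ssm put-parameter --name "AmazonCloudWatch-linux" --type String \
  --value file:///opt/aws/amazon-cloudwatch-agent/bin/config.json --overwrite
```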
EC2 Placement Groups
EC2 Placement Groups – Overview
- By default, AWS decides how EC2 instances are physically placed within an Availability Zone
- Placement groups provide control over how instances are positioned on underlying hardware
- They influence whether instances are grouped closely together or kept apart
- There are three placement group types:
- Cluster → instances are placed close together
- Spread → instances are kept separate
- Partition → instances are grouped, but groups are isolated from each other
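For reference, each of the three types can be created with the AWS CLI before launching instances into it; this is a minimal sketch, and the group names, AMI ID, and partition count are placeholders.

```bash
# One placement group per strategy (names are illustrative)
aws ec2 create-placement-group --group-name hpc-cluster      --strategy cluster
aws ec2 create-placement-group --group-name critical-spread  --strategy spread
aws ec2 create-placement-group --group-name hadoop-partition --strategy partition --partition-count 5

# Launch an instance into one of them
aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type c5.large \
  --placement GroupName=hpc-cluster
```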
Cluster Placement Groups

- Instances are placed very close together to maximize performance
- Often share the same rack or even the same host
- Deployed within a single Availability Zone only
- Best results when using the same instance type and launching all instances together
- Advantages:
- Very high network throughput
- Very low latency
- High packet-per-second performance
- Disadvantages:
- Minimal fault tolerance
- Hardware failure can impact all instances
- Typical use cases:
- High Performance Computing (HPC)
- Applications requiring fast communication between instances
Spread Placement Groups

- Instances are distributed across separate hardware
- Each instance runs on isolated infrastructure
- Can span multiple Availability Zones
- Advantages:
- Strong fault isolation
- High availability
- Disadvantages:
- Limited to 7 instances per AZ
- Typical use cases:
- Critical systems requiring isolation
- Domain controllers or replicated services
Partition Placement Groups

- Instances are organized into partitions, with each partition isolated
- Instances within a partition may share hardware
- Different partitions are fully separated
- Can span multiple Availability Zones
- Advantages:
- Supports large-scale deployments
- Maintains fault isolation across partitions
- Disadvantages:
- Additional design and management complexity
- Limit of 7 partitions per AZ
- Typical use cases:
- Distributed and data-aware systems (e.g., Hadoop, Cassandra)
- Large-scale applications requiring controlled fault domains
EC2 Placement Groups – Summary Table
| Feature / Type | Cluster Placement Group | Spread Placement Group | Partition Placement Group |
|---|---|---|---|
| Placement Strategy | Instances placed very close together | Instances placed on separate hardware | Instances grouped into isolated partitions |
| Availability Zones | Single AZ only | Can span multiple AZs | Can span multiple AZs |
| Performance | Very high (low latency, high BW) | Standard | High within each partition |
| Fault Tolerance | Low | Very high | High (isolated per partition) |
| Instance Limit | No fixed limit (capacity dependent) | 7 instances per AZ | 7 partitions per AZ (many instances each) |
| Hardware Sharing | Often shared | Fully isolated | Shared within partition only |
| Best Use Cases | HPC, tightly coupled workloads | Critical isolated systems | Large distributed, topology-aware systems |
| Complexity | Low | Low | Higher (requires planning) |
EC2 Dedicated Hosts
CPU Sockets and Cores
- A CPU socket is the physical slot on a motherboard where a processor is installed
- A CPU can contain multiple cores, which act as independent processing units
- From a performance perspective, having:
- one socket with multiple cores, or
- multiple sockets with fewer cores
generally produces similar results
- However, some software licensing models depend on the number of CPU sockets
- In such cases, configuring fewer sockets may help reduce licensing costs
EC2 Dedicated Hosts – Key Concepts & Overview
- A Dedicated Host is a physical EC2 server fully allocated to a single AWS account
- Provides host affinity, meaning instances remain tied to the same host (no automatic migration)
- Designed for specific instance families and types (e.g., A1, C5, M5)
- Billing model:
- No per-instance cost
- Charged for the entire host capacity, regardless of usage
- Available as:
- On-demand (flexible usage)
- Reserved (1- or 3-year commitment for predictable workloads)
- The host hardware includes a fixed number of CPU sockets and cores
- This determines how many instances can be placed on the host
- Important for software licensed based on physical hardware
- Some enterprise applications require licensing based on total sockets/cores
- In these cases, using the full host capacity aligns better with licensing costs
Types of Dedicated Hosts
Traditional Dedicated Hosts
- Support only one instance size at a time
- Cannot mix different instance sizes on the same host
- Instance size must be defined before launching
- All instances consume portions of the available cores
- Example: A host with 1 socket and 16 cores distributes those cores across instances of the selected size

Nitro-Based Dedicated Hosts
- Built on the Nitro system, providing increased flexibility
- Allows multiple instance sizes to run on the same host simultaneously
- Instances can be mixed until the total available cores are fully utilized
- Better suited for environments with varying workload requirements

EC2 Dedicated Hosts – Considerations & Limitations
- Not supported:
- Certain operating systems (e.g., RHEL, SUSE Linux, Windows AMIs)
- Amazon RDS
- Placement groups
- Can be shared across accounts within an organization using AWS Resource Access Manager (RAM)
- The owner account has visibility of all instances on the host
- Other accounts can only see and manage their own instances
- Key considerations:
- Primarily intended for licensing compliance scenarios
- Comes with operational overhead, including capacity planning and host management
- Not ideal for general-purpose EC2 usage due to complexity and restrictions
- In most real-world scenarios, Dedicated Hosts are used only when required for specific licensing or compliance needs
EC2 Enhanced Networking & EBS-Optimized Instances
EC2 Enhanced Networking

- Enhanced Networking improves the network performance of EC2 instances
- Essential for high-performance scenarios such as cluster placement groups
- Available at no additional cost and enabled by default on most modern instance types
- Uses SR-IOV (Single Root I/O Virtualization) to optimize network operations
- Makes the network interface card (NIC) aware of virtualization
- Without SR-IOV:
- Multiple instances share a single physical NIC
- The host manages access through software
- This introduces overhead, increases CPU usage, and reduces performance under load
- With SR-IOV:
- The NIC provides multiple virtual network interfaces
- Each instance gets direct access to its own virtual interface
- Reduces reliance on host CPU and improves efficiency
- Benefits:
- Increased network throughput (higher bandwidth)
- Higher packets-per-second (PPS) performance
- Reduced CPU overhead on the host
- Lower and more consistent latency
- Particularly useful for workloads with high network demands or frequent small packet transfers
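As a quick sanity check (a sketch with a placeholder instance ID), Enhanced Networking via the Elastic Network Adapter (ENA) can be verified both inside the instance and from the CLI:

```bash
# Inside the instance: confirm the ENA driver module is present
modinfo ena

# From the CLI: check whether ENA support is flagged on the instance
aws ec2 describe-instances --instance-ids i-0123456789abcdef0 \
  --query "Reservations[].Instances[].EnaSupport"
```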
EBS-Optimized Instances
- Amazon EBS provides network-based block storage
- Historically, network bandwidth was shared between:
- General network traffic
- Storage (EBS) traffic
- This caused contention and performance degradation
- EBS-optimized instances provide dedicated bandwidth for EBS traffic
- Separates storage traffic from regular network traffic
- Prevents interference between the two
- Benefits:
- Improved storage performance
- More consistent throughput and latency
- Better overall system efficiency
- Key points:
- Typically enabled by default on modern instance types
- On older instance types, enabling this feature may incur additional cost
- Important for workloads requiring consistent performance, especially when using high-performance EBS volumes such as GP2 or IO1
Containers & ECS
Containerization 101
OS Virtualization Problems

- OS virtualization refers to running multiple operating systems on a single physical machine
- Challenges at scale:
- High storage usage
- A large portion of VM disk space is consumed by the operating system itself
- Resource duplication
- Multiple virtual machines may run identical operating systems on the same host
- Heavy resource consumption
- Each VM requires its own OS, increasing CPU, memory, and storage usage
- Operations like start, stop, and restart involve the full OS lifecycle
- For many use cases, running separate operating systems per application is unnecessary
- Containers provide a more efficient alternative
Containerization (Container Virtualization)

- Containerization enables applications to run in isolated environments without full OS virtualization
- Common tools include Docker (most widely used), with alternatives like Podman
- Architecture:
- Physical hardware
- Host operating system
- Container engine (e.g., Docker Engine)
- Containers running on top
- Key characteristics:
- Application isolation
- Each container includes its dependencies and runtime environment
- Lightweight design
- Containers share the host OS instead of running separate OS instances
- Faster startup and lower resource usage
- Portability
- Applications run consistently across environments
- High density
- Many containers can run on a single host compared to VMs
- Networking and storage handled by host
- Containers expose ports through the host system
- Applications can be composed of multiple containers
- Example: separate containers for application and database
Image Anatomy

- A container image is a template used to create containers
- A container is a running instance of an image
- Image structure:
- Built from multiple read-only layers
- Each layer stores only the differences from the previous one
- All layers are combined to appear as a single filesystem
- Dockerfile is used to define how an image is built
- Typically starts from a base image (`FROM`) or from scratch
- Images are immutable once created
Container Anatomy

- A container consists of:
- The image layers (read-only)
- An additional read/write layer unique to each container
- Key points:
- Multiple containers can be created from the same image
- Image layers are shared across containers
- Each container has its own read/write layer for changes
- Benefits:
- Efficient storage usage (shared layers)
- Minimal duplication of data
- Easy scaling with many containers using the same base image
Container Registry

- A container registry is a repository for storing and distributing container images
- Common options:
- Docker Hub (public registry)
- Amazon Elastic Container Registry (ECR)
- Images are pulled from registries to container hosts for deployment
- These hosts run the container engine and execute the containers
Amazon ECS (Elastic Container Service) 101
Amazon ECS – Key Concepts
- AWS service for running containerized workloads
- Similar to how EC2 provides virtual machines, ECS provides a managed platform for containers
- Containers run on infrastructure that AWS partially or fully manages
- Reduces administrative overhead compared to self-managing container hosts
- ECS Cluster
- Clusters are the environment where containers operate
- Deployed inside a VPC within an AWS account
- Can leverage multiple availability zones (AZs) for reliability
- Customer provides configuration via container images and instructions
- Tasks and services are deployed into ECS clusters
- Container images come from registries such as:
- Docker Hub
- AWS Elastic Container Registry (ECR) – fully integrated with AWS
- Deployment modes:
- EC2 mode
- Uses EC2 instances as container hosts
- Customers are responsible for managing the instance capacity
- Fargate mode
- Fully managed, serverless option
- AWS handles container hosts; customers define environment and task requirements
- Management components handle orchestration and scheduling
- Scheduling tasks
- Cluster management
- Container placement decisions (which host to run a container on)
- Both EC2 and Fargate modes use these components
Amazon ECS – Definitions
Container Definition
- Specifies a container’s configuration including:
- Image URI (location in container registry)
- Network ports exposed
- Additional settings if needed
- Acts as a reference for a single container with the minimum required info
Task Definition
- Represents a single application that may include one or multiple containers
- Includes:
- Container definitions that make up the application
- Resource allocation (CPU and memory)
- Network configuration
- Compatibility with EC2 or Fargate mode
- Task role (IAM role granting permissions to interact with AWS resources)
- Tasks do not automatically scale or provide high availability by themselves; a service is needed for that
- Important: Task definition is not the same as container definition
Service Definition
- Defines how tasks are deployed, scaled, and maintained for high availability
- Includes:
- Number of task instances to run (capacity and resilience)
- Optional load balancer to distribute traffic among tasks
- Monitoring and management settings
- Typically used for production workloads that require scaling and fault tolerance
- For non-critical or simple workloads, tasks can run independently without a service
ECS Cluster Modes
Amazon ECS – EC2 Mode

- EC2 instances act as container hosts
- Containers are deployed to EC2 instances via tasks and services
- ECS clusters use an Auto Scaling Group (ASG) to manage the number of EC2 instances
- ASG handles horizontal scaling of container hosts
- EC2 instances are fully visible in the EC2 console; you can connect, stop, or modify them
- Pros and considerations
- ECS handles container orchestration while customers manage the underlying EC2 hosts
- Pricing flexibility:
- Can use EC2 Reserved Instances or Spot Instances as container hosts
- Management overhead remains:
- Customer must manage capacity, scaling, and availability of the container hosts
- Not serverless; you pay for EC2 instances even if containers are idle
- ECS provides tools to simplify container host management
Amazon ECS – Fargate Mode

- AWS Fargate is a serverless compute engine for containers
- ECS tasks and services run on AWS-managed infrastructure
- Users are isolated from each other, even on shared resources, similar to EC2 isolation
- Key points
- Container hosts are fully managed by AWS; no provisioning or cluster management needed
- Customers pay only for the CPU and memory resources consumed by tasks, not the underlying hosts
- Architecture
- Tasks run within Fargate infrastructure but are attached to the customer’s VPC
- Each task gets an Elastic Network Interface (ENI) and an IP address in the VPC
- Tasks behave like any other VPC resource and can be deployed to different VPCs (Fargate only)
- High level of flexibility for task deployment within a VPC
ECS Cluster Modes – Comparison
| ECS Cluster Mode | Container Host Location | Container Host Management | Billing |
|---|---|---|---|
| EC2 | EC2 instances | Customer | Pay for full instances, regardless of container usage |
| Fargate | AWS-managed platform | AWS (Fargate) | Pay only for resources used by running tasks |
Choosing Between EC2, ECS-EC2, and ECS-Fargate
- Use plain EC2 for applications that require full VM-level features
- Use ECS for containerized applications:
- Isolate apps without full OS virtualization
- Suitable for apps with low usage or that share the same OS
- ECS-Fargate is ideal for:
- Small, bursty, or batch workloads
- Pay only for actual resource usage
- Reduces management overhead
- ECS-EC2 is suitable for:
- Large workloads where cost optimization is a priority
- Use EC2 Reserved or Spot Instances for cheaper container hosting
- Requires managing scaling, capacity, and instance faults
- Summary:
- Fargate reduces operational burden but can be more expensive
- ECS-EC2 gives pricing flexibility at the cost of more management effort
DEMO: Build, Register, and Deploy a Docker Container Image on AWS
Running Docker on an EC2 Instance (Amazon Linux 2)
- Launch a t2.micro EC2 instance using Amazon Linux 2, then connect to it once it is running
- Install Docker on the instance: `sudo dnf install docker`
- DNF is a package manager used to install, update, and remove software packages on modern Linux distributions (successor to YUM)
- Start the Docker service: `sudo service docker start`
- Verify Docker is running by listing containers: `docker ps`
- This will initially return a permission error because the current user is not allowed to interact with Docker
- Grant Docker permissions to the default user: `sudo usermod -a -G docker ec2-user`
- Adds `ec2-user` to the Docker group, allowing interaction with the Docker Engine
- Adds
- Log out of the instance and reconnect
- Required for group membership changes to take effect
- Switch to the `ec2-user` (if needed): `sudo su - ec2-user`
- Necessary when using Session Manager instead of SSH or Instance Connect
- Run the verification command again: `docker ps`
- Should now execute without errors (no containers will be listed initially)
STEP 2: Building the “Container of Cats” Docker Image
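A representative build-and-publish flow for this step, assuming a Dockerfile in the working directory and a placeholder Docker Hub username (the resulting image URI is what Step 3 references):

```bash
# Build the image from the local Dockerfile, tag it for Docker Hub, and push it
docker build -t container-of-cats .
docker tag container-of-cats:latest YOUR_DOCKERHUB_USER/container-of-cats:latest
docker login
docker push YOUR_DOCKERHUB_USER/container-of-cats:latest
```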

STEP 3: Deploying “Container of Cats” with ECS Fargate
- Create an ECS cluster and select Fargate as the launch type
- Define a task
- Add a container definition
- Provide the image URI from Docker Hub (created in Step 2)
- Run the task in the ECS cluster
- Choose a VPC where the task will be deployed
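The console flow above can also be approximated with the CLI; a sketch with placeholder cluster, task definition, and subnet values.

```bash
# Create the cluster, then run one Fargate task attached to a VPC subnet
aws ecs create-cluster --cluster-name containerofcats
aws ecs run-task --cluster containerofcats --launch-type FARGATE \
  --task-definition containerofcats:1 \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-0123456789abcdef0],assignPublicIp=ENABLED}"
```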
Container Image Registry (Amazon ECR)
Amazon ECR – Key Concepts
- AWS-managed container image registry service
- Comparable to Docker Hub, but native to AWS
- Seamlessly integrates with other AWS services
- Images stored in ECR can be used by container platforms such as Docker, ECS, and EKS
- Each AWS account includes both a public and a private ECR registry
- Public registry:
- Image pulls are open to anyone
- Pushing images requires proper permissions
- Private registry:
- All read and write actions require authorization
- Public registry:
- A registry contains multiple repositories
- Similar to repositories in version control systems like GitHub
- Each repository can store multiple images
- Images can have multiple tags
- Tags must be unique within a repository
Amazon ECR – Benefits
- Integrated with IAM
- Access control is managed through AWS Identity and Access Management
- Built-in image scanning
- Detects vulnerabilities in the operating system and software packages within container images
- Scans images layer by layer
- Supports two modes: basic and enhanced
- Enhanced scanning is powered by Amazon Inspector
- Near real-time monitoring with CloudWatch
- Tracks actions such as authentication, image pushes, and pulls
- API activity logging with CloudTrail
- Records all API interactions for auditing and tracking
- Event integration with EventBridge
- Enables event-driven automation and workflows
- Cross-region and cross-account replication
- Allows images to be copied across regions and shared between AWS accounts
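A typical push workflow to a private ECR repository, shown as a sketch; the account ID, region, and repository name are placeholders.

```bash
# Authenticate Docker to the private registry, create a repository, then tag and push
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 111122223333.dkr.ecr.us-east-1.amazonaws.com
aws ecr create-repository --repository-name container-of-cats
docker tag container-of-cats:latest 111122223333.dkr.ecr.us-east-1.amazonaws.com/container-of-cats:latest
docker push 111122223333.dkr.ecr.us-east-1.amazonaws.com/container-of-cats:latest
```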
Kubernetes Basics 101
Kubernetes (K8s) Concepts
- Kubernetes (K8s) is an open-source system for container orchestration
- Handles deployment, scaling, and management of containerized applications
- Comparable to Docker, but focused on automating and coordinating container operations at scale
- Cloud-agnostic platform
- Can run on AWS (EKS), Azure, GCP, or on-premises environments
Core Components
- Cluster
- A full Kubernetes deployment that manages and orchestrates applications
- Designed for high availability, with compute resources working together as a single system
- Includes a control plane responsible for scheduling, scaling, healing, and deployments
- Contains zero or more worker nodes
- Compute Units
- Pod
- Smallest deployable unit in Kubernetes
- Contains one or more containers (commonly one container per pod)
- Ephemeral and non-persistent by default
- Shares networking and storage within the pod
- Runs on nodes
- Node
- A virtual machine or physical server acting as a worker in the cluster
- Provides compute resources where pods are scheduled and executed
- Application Definitions
- Service
- Represents a long-running application
- Maintains access to one or more pods over time
- Job
- Used for short-lived or one-time tasks
- Creates pods that run until the task completes, then terminate
- Ingress
- Provides external access into services
- Traffic flow: Ingress → Routing → Service → Pods
- Managed by an Ingress Controller (e.g., AWS Load Balancer Controller using ALB or NLB)
- Persistent Volume (PV)
- Storage resource that exists independently of pods
- Persists even after pods are deleted
- By default, Kubernetes storage is temporary unless explicitly defined as persistent
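A few representative `kubectl` commands for the objects above; the service name and manifest file are placeholders.

```bash
kubectl get nodes                # list worker nodes in the cluster
kubectl get pods -A              # list pods across all namespaces
kubectl apply -f app.yaml        # create or update resources from a manifest
kubectl describe service my-svc  # inspect a service and the pods behind it
kubectl get ingress              # list ingress resources providing external access
```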
Kubernetes Cluster Structure
Node Components
- Container Runtime (e.g., containerd, Docker)
- Responsible for running and managing containers
- Kubelet
- Agent running on each node
- Communicates with the control plane via the Kubernetes API
- Kube-proxy
- Manages networking rules
- Enables communication between pods and external/internal services
- Supports service implementation
Control Plane Components
- Kube-apiserver
- Entry point to the Kubernetes control plane
- All components communicate through the Kubernetes API
- Can scale horizontally for high availability and performance
- Kube-scheduler
- Assigns pods to nodes
- Considers resource requirements, constraints, and placement rules
- etcd
- Distributed key-value store
- Stores the cluster’s state and configuration
- Cloud Controller Manager
- Integrates Kubernetes with cloud provider APIs (e.g., AWS in EKS)
- Kube-controller-manager
- Runs controller processes that maintain cluster state:
- Node Controller: handles node health and failures
- Job Controller: manages job execution
- Endpoint Controller: maps services to pods
- Service Account and Token Controllers: manage identities and access
- Runs controller processes that maintain cluster state:
Kubernetes Architecture Diagrams
- K8s High-level architecture diagram:

- K8s Detailed cluster diagram:

Amazon EKS (Elastic Kubernetes Service) Basics
Amazon EKS – Key Concepts
- AWS Kubernetes-as-a-Service (K8s-aaS)
- Fully managed Kubernetes service by AWS
- Provides Kubernetes functionality within the AWS ecosystem
- Kubernetes itself is open-source and cloud-agnostic; EKS is AWS’s implementation to run K8s workloads on AWS
- Ideal when you have K8s container workloads and want tight AWS integration
- Deployment Options for EKS
- AWS cloud (most common)
- AWS Outposts (private, on-premises AWS infrastructure)
- EKS Anywhere (create EKS clusters on-premises or in other environments)
- EKS Distro (AWS-provided open-source distribution of EKS)
- Integration with AWS services
- Works with ECR, ELB, IAM, VPC, and other AWS services
- Persistent storage options: EBS, EFS, FSx for Lustre, FSx for NetApp ONTAP
- EKS Cluster = Control Plane + Nodes
- Control Plane managed by AWS
- Scales automatically based on workload
- Runs across multiple availability zones (AZs) for high availability
- etcd (key-value store) is distributed across AZs
- Node Management Options:
- Self-managed nodes
- EC2 instances fully managed by customer
- Billed like regular EC2 instances
- Managed node groups
- EC2 instances managed by EKS
- AWS handles provisioning, updates, and lifecycle management
- Fargate pods
- Serverless pods on AWS Fargate
- No need to manage provisioning, scaling, or configuration
- Choosing node management depends on business requirements
- Consider OS support (Windows/Linux), GPU, Inferentia, Bottlerocket, Outposts, Local Zones
- Check available node types in your region to avoid project limitations
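As an illustrative quick start (cluster name and region are placeholders; `eksctl` is a separate open-source CLI commonly used with EKS):

```bash
# Create a Fargate-backed EKS cluster, then point kubectl at it
eksctl create cluster --name demo-cluster --region us-east-1 --fargate
aws eks update-kubeconfig --name demo-cluster --region us-east-1
kubectl get nodes
```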
Amazon EKS – Architecture

- Cluster Control Plane
- Managed by AWS
- Runs in an AWS-managed VPC spanning multiple AZs
- Worker Nodes / Pods
- Deployed in a customer-managed VPC
- EC2 nodes or Fargate pod ENIs run here
- Resource access happens via the cluster’s ingress endpoint
- Node-to-Control Plane Communication
- Two options for kube-api traffic:
- ENIs injected into the customer VPC by the control plane
- Public control plane endpoint
- Administration of the control plane is performed via the public endpoint only