Many Instance Types across 5 Instance Families
- AWS offers over 300 EC2 instance types across 5 instance families (general purpose, memory-optimized, storage-optimized, compute-optimized, and accelerated computing), each with a different resource and performance focus
- For example, within the compute-optimized family, you have C4 instance types (running on Haswell chips) and more recent C5 instance types (running on Skylake with Nitro system).
Instance Purchasing Options
- On-Demand Instances – the default option, for short-term ad-hoc requirements where the job can’t be interrupted
- On-Demand Capacity Reservations – reserve capacity in a specific AZ for as long as you need it, with no 1 or 3 year commitment (e.g. to guarantee capacity for a 9am-5pm job)
- Spot Instances – highest discount potential (50-90%) but no commitment from AWS; instances can be interrupted with a 2-minute notice. Suits interruption-tolerant workloads such as grid and high-performance computing.
- Reserved Instances – for long-term workloads, 1 or 3 year commitment in exchange for 40-60% discount
- Dedicated Instances – run on hardware dedicated to 1 customer (more $$)
- Dedicated Host – a fully dedicated, physically isolated server. Lets you use your server-bound software licenses (e.g. IBM, Oracle), helps address compliance and regulatory requirements, and can potentially reduce cost (note: billing is per host, not per instance)
- Bare metal EC2 instance – for when the workload needs access to the hardware feature set (e.g. Intel hardware)
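To make the discount ranges above concrete, here is a rough sketch that compares the monthly cost of one instance under On-Demand, Reserved and Spot pricing. The $0.10 hourly rate is a made-up placeholder, not a real AWS price, and the discount fractions are simply the endpoints quoted in the list; actual savings depend on instance type, region and term.

```python
HOURS_PER_MONTH = 730  # common approximation: 8760 hours / 12 months

# Discount ranges quoted in the notes above, as (low, high) fractions.
DISCOUNT_RANGES = {
    "on_demand": (0.0, 0.0),
    "reserved": (0.40, 0.60),
    "spot": (0.50, 0.90),
}


def monthly_cost_range(on_demand_hourly, option):
    """Return (best_case, worst_case) monthly cost in USD for one instance."""
    low, high = DISCOUNT_RANGES[option]
    full_price = on_demand_hourly * HOURS_PER_MONTH
    best = round(full_price * (1 - high), 2)   # largest discount applied
    worst = round(full_price * (1 - low), 2)   # smallest discount applied
    return best, worst


if __name__ == "__main__":
    rate = 0.10  # hypothetical On-Demand price in USD/hour (assumption)
    for option in DISCOUNT_RANGES:
        best, worst = monthly_cost_range(rate, option)
        print(f"{option:10s} ${best:.2f} to ${worst:.2f} per month")
```

The sketch ignores upfront payments and Spot interruptions, so treat the output as a rough ordering of the options rather than a bill estimate.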
Launching Instances
- Launch Configurations / Launch Templates – used to create new EC2 instances from stored parameters such as instance type, AMI, key pair and security groups. Auto Scaling groups can launch instances from either. You can’t edit a launch configuration, but you can create a new one and point the ASG to it.
- User data – pass up to 16 KB of user data at launch, such as config scripts that the instance runs on startup
- Instance metadata (e.g. instance ID, hostname, events, security groups, public keys, network interfaces) can be accessed via a direct URI or by using the Instance Metadata Query Tool
- When you launch an EC2 instance into a default VPC, it has a public and private DNS hostname and IP address. When you launch in a non-default VPC, it may not have a public hostname depending on the DNS and VPC configs.
- Errors when launching include InsufficientInstanceCapacity, InstanceLimitExceeded
- Instances terminate with no error if there are EBS problems (EBS volume limit, EBS snapshot is corrupt), or if the AMI you’re launching from is missing a required part
- Each EC2 instance that you launch has an associated root device volume, either an EBS volume or an Instance Store volume (more on these under the Storage section below). You can use block device mapping to specify additional EBS volumes or Instance Store volumes to attach at launch. You can attach additional EBS volumes to a running instance, but you can’t add Instance Store volumes after launch.
- Run Command – run from the AWS Management Console, CLI or SDK to install software, execute PowerShell commands and scripts, and configure Windows settings on live EC2 instances
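The user data limit and the metadata URI mentioned above can be sketched in a few lines. This is a minimal illustration, not an official client: the helper names are made up, the 16 KB limit is applied to the raw script before base64 encoding, and `fetch_metadata` only works from inside an EC2 instance (it assumes the token-based IMDSv2 flow is available).

```python
import base64
import urllib.request

MAX_USER_DATA_BYTES = 16 * 1024  # 16 KB limit, measured before encoding
METADATA_BASE = "http://169.254.169.254/latest"  # link-local metadata address


def encode_user_data(script):
    """Validate the size of a startup script and return it base64-encoded."""
    raw = script.encode("utf-8")
    if len(raw) > MAX_USER_DATA_BYTES:
        raise ValueError(f"user data is {len(raw)} bytes; limit is {MAX_USER_DATA_BYTES}")
    return base64.b64encode(raw).decode("ascii")


def metadata_url(path):
    """Build the URI for a metadata item such as 'instance-id'."""
    return f"{METADATA_BASE}/meta-data/{path.lstrip('/')}"


def fetch_metadata(path, timeout=2.0):
    """Fetch one metadata item; call this only on an EC2 instance."""
    # Step 1: request a short-lived session token (IMDSv2).
    token_request = urllib.request.Request(
        f"{METADATA_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()
    # Step 2: present the token when reading the metadata item.
    item_request = urllib.request.Request(
        metadata_url(path), headers={"X-aws-ec2-metadata-token": token}
    )
    with urllib.request.urlopen(item_request, timeout=timeout) as response:
        return response.read().decode()
```

On an instance, `fetch_metadata("instance-id")` would return the instance ID as a string; anywhere else the request simply times out.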
Placement Groups
- Cluster placement group = packs instances close together inside an AZ to achieve low latency, high throughput – use for HPC
- Partition placement group = separates instances into logical partitions such that instances in one partition do not share hardware with instances in another partition. Gives you control and visibility into instance placement, but not great for performance. Used by large distributed workloads such as Hadoop.
- Spread placement group = places a small number of instances each on distinct hardware to reduce correlated failures. Not great for performance
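As a quick way to remember which strategy fits which goal, the sketch below maps a workload goal to a placement strategy and builds a parameter dict. The goal labels and the function name are invented for illustration. The dict keys mirror the `GroupName`, `Strategy` and `PartitionCount` parameters of the EC2 CreatePlacementGroup API, but verify them against the current API reference before passing the result to an SDK call.

```python
# Hypothetical goal labels mapped to the three strategies described above.
STRATEGY_BY_GOAL = {
    "low_latency": "cluster",          # HPC: pack instances close together
    "large_distributed": "partition",  # e.g. Hadoop: isolate by partition
    "isolate_failures": "spread",      # few instances on distinct hardware
}


def placement_group_params(name, goal, partition_count=None):
    """Return a parameter dict for creating a placement group."""
    strategy = STRATEGY_BY_GOAL[goal]
    params = {"GroupName": name, "Strategy": strategy}
    # A partition count only makes sense for the partition strategy.
    if strategy == "partition" and partition_count is not None:
        params["PartitionCount"] = partition_count
    return params


if __name__ == "__main__":
    print(placement_group_params("hpc-group", "low_latency"))
    print(placement_group_params("hadoop-group", "large_distributed", 3))
```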
Scaling Instances
- In high-availability contexts you use an Auto Scaling group (ASG) to automatically launch and terminate instances, and an Elastic Load Balancer (ELB) to distribute traffic among the instances
  - specify which subnets the ASG should launch instances into
  - attach Target Groups to the ASG
- ASG scaling policies
  - Simple – maintain the number of instances; manually change the min/desired/max and attach/detach instances
  - Scheduled – scale based on a scheduled event or recurring schedule (e.g. if you know that you have a traffic spike every morning at 9am)
  - Dynamic – scale in response to an event or alarm
  - Step – apply different scaling adjustments depending on how far the alarm threshold is breached
  - Target Tracking – adds/removes instances to keep a chosen metric (e.g. average CPU utilization) at a target value
  - NOTE: AWS recommends using target tracking over step scaling, and step scaling over simple scaling in most cases
- Cooldown period – the pause after a scaling activity before the next one can start; reducing it terminates unneeded instances more quickly, reducing costs
- Enhanced networking provides higher bandwidth, higher packets-per-second (PPS) performance, and lower inter-instance latency. Consider it if PPS is maxed out.
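The difference between step scaling and target tracking in the scaling policies above is easier to see with numbers. The sketch below is a simplified model, not AWS's actual algorithm: the `Step` class, the function names and the proportional formula for target tracking are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    lower: float            # inclusive lower bound of (metric - threshold)
    upper: Optional[float]  # exclusive upper bound; None means unbounded
    change: int             # instances to add (negative to remove)


def step_adjustment(metric, threshold, steps: List[Step]):
    """Pick the capacity change for how far the metric breaches the threshold."""
    breach = metric - threshold
    for step in steps:
        if breach >= step.lower and (step.upper is None or breach < step.upper):
            return step.change
    return 0  # no step matched, so capacity stays the same


def target_tracking_capacity(current, metric, target):
    """Simplified proportional model: scale capacity by metric / target."""
    return max(1, math.ceil(current * metric / target))


def clamp(desired, minimum, maximum):
    """Keep the desired capacity within the ASG's min and max size."""
    return max(minimum, min(maximum, desired))


if __name__ == "__main__":
    steps = [Step(0, 10, 1), Step(10, 20, 2), Step(20, None, 4)]
    # CPU at 75% with an alarm threshold of 60% falls into the second step.
    print(step_adjustment(75, 60, steps))
    # 4 instances at 90% CPU with a 60% target, capped at a max size of 5.
    print(clamp(target_tracking_capacity(4, 90, 60), 2, 5))
```

Step scaling needs each band configured by hand, while target tracking only needs the target value, which is one reason it is the recommended default.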