A VPC is a virtual network in which you deploy resources (Region level).
A subnet is a range of IP addresses that partitions your VPC (Availability Zone level).
A public subnet is accessible from the internet; a private subnet is only accessible from within the same VPC.
Use route tables to determine where network traffic from your subnet or gateway is directed.
A gateway connects your VPC to another network. For example, use an internet gateway to connect your VPC to the internet.
Use a VPC endpoint to connect to AWS services privately, without an internet gateway or NAT device (i.e. over a private network rather than the public internet). An endpoint is for connections going out from the VPC, not for accepting connections from outside.
Use a VPC peering connection to route traffic between the resources in two VPCs. The VPCs must not have overlapping CIDR blocks (IP address ranges), and peering is not transitive.
Use a Transit Gateway, which acts as a central hub, to route traffic between your VPCs, VPN connections, and AWS Direct Connect connections.
Connect your VPCs to your on-premises networks using AWS Virtual Private Network (AWS VPN).
Use VPC Flow Logs to analyse and debug traffic issues. Flow logs can be created at three levels:
VPC Flow Logs
Subnet Flow Logs
Elastic Network Interface (ENI) Flow Logs
They capture information about the IP traffic going to and from network interfaces in your VPC.
They also include network information from AWS-managed interfaces (ELB, ElastiCache, RDS, Aurora, etc.).
Flow log data can be published to Amazon CloudWatch Logs, Amazon S3, and Kinesis Data Firehose.
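As a rough illustration of what a flow log record looks like when debugging, here is a minimal sketch that parses records in the default (version 2) space-separated format and filters the rejected ones. The sample lines are illustrative values, and `FlowRecord`, `_num`, and `parse_flow_record` are hypothetical names; custom log formats would need a different field list.

```python
from typing import NamedTuple, Optional


class FlowRecord(NamedTuple):
    """Fields of the default (version 2) flow log format, in order."""
    version: int
    account_id: str
    interface_id: str
    srcaddr: str
    dstaddr: str
    srcport: Optional[int]
    dstport: Optional[int]
    protocol: Optional[int]
    packets: Optional[int]
    byte_count: Optional[int]
    start: int
    end: int
    action: str
    log_status: str


def _num(field: str) -> Optional[int]:
    # NODATA / SKIPDATA records carry "-" in place of the traffic fields.
    return None if field == "-" else int(field)


def parse_flow_record(line: str) -> FlowRecord:
    f = line.split()
    if len(f) != 14:
        raise ValueError(f"expected 14 fields, got {len(f)}")
    return FlowRecord(
        int(f[0]), f[1], f[2], f[3], f[4],
        _num(f[5]), _num(f[6]), _num(f[7]), _num(f[8]), _num(f[9]),
        int(f[10]), int(f[11]), f[12], f[13],
    )


# Illustrative sample records (not real traffic).
sample = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
    "2 123456789010 eni-1235b8ca123456789 - - - - - - - 1431280876 1431280934 - NODATA",
]
records = [parse_flow_record(line) for line in sample]
rejected = [r for r in records if r.action == "REJECT"]
```

Filtering on `action == "REJECT"` is usually the first step when a security group or NACL is suspected of blocking traffic.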
Subnets
A VPC is housed within a Region, and each subnet lives in exactly one AZ (an AZ can hold several subnets). Therefore, for high availability, you need at least 2 subnets in your VPC so that you can span 2 AZs.
When you create a new subnet, it is automatically associated with the main route table.
Example VPC/subnet configurations recommended by AWS
VPC with single public subnet: e.g. for single-tier, public-facing web app such as a blog or a simple website
VPC with public and private subnets: e.g. for multi-tier web apps where the web servers are in the public subnet and the DBs in the private subnet
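To make the subnet layout concrete, here is a small sketch using Python's standard `ipaddress` module. It carves a VPC CIDR into public and private subnets spread over two AZs, and checks the "no overlapping CIDR" condition for VPC peering. The CIDR ranges, subnet names, and AZ labels (`az-a`, `az-b`) are assumptions for illustration only.

```python
import ipaddress
from itertools import islice

# Hypothetical VPC range; any private range would do.
vpc = ipaddress.ip_network("10.0.0.0/16")

names = ["public-a", "public-b", "private-a", "private-b"]
azs = ["az-a", "az-b", "az-a", "az-b"]

# Take the first four /24 blocks of the VPC, one per subnet.
plan = {
    name: (az, cidr)
    for name, az, cidr in zip(names, azs, islice(vpc.subnets(new_prefix=24), 4))
}


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """VPC peering requires that the two VPC CIDR blocks do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


for name, (az, cidr) in plan.items():
    print(f"{name:10} {az}  {cidr}")
```

Each tier (public, private) gets one subnet per AZ, which is the minimum needed to survive the loss of a single AZ.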
Public Subnet vs. Private Subnet
Public Subnets
has a route table that routes to an Internet Gateway (IGW) (note the Internet Gateway is attached to the VPC, not directly to the subnet)
When EC2 instances are launched in a Public Subnet, they can be auto-assigned a public IP address or given an Elastic IP
Security groups and network ACLs on Public Subnet must allow SSH traffic (on port 22) for admin config.
Private Subnets
Outbound traffic is routed to a NAT device. The NAT device sits in the Public Subnet and reaches the internet through the Internet Gateway.
NAT Gateway vs. NAT Instance: a NAT Gateway is managed for you by AWS and highly available, whereas a NAT Instance (self-managed) is a lot more manual work but can also be used as a bastion host / jump box
A NAT Gateway enables outbound internet access from the private subnet while blocking inbound connections from the internet
EC2 instances don’t have a public IP or Elastic IP
You have to use a bastion host (“jump box”) to access instances in the Private Subnet over SSH (port 22)
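The difference between the two subnet types comes down to their route tables. The sketch below models route selection as a longest-prefix match (the most specific matching route wins). The route tables and target labels (`local`, `igw`, `nat-gateway`) are hypothetical stand-ins for real route targets.

```python
import ipaddress
from typing import Dict


def next_hop(route_table: Dict[str, str], destination: str) -> str:
    """Return the target of the most specific route that matches destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {destination}")
    # Longest prefix = most specific route.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical route tables for a VPC with CIDR 10.0.0.0/16.
public_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw"}
private_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}

print(next_hop(public_rt, "93.184.216.34"))
print(next_hop(private_rt, "93.184.216.34"))
print(next_hop(private_rt, "10.0.2.15"))
```

Traffic inside the VPC always matches the more specific `local` route; only internet-bound traffic falls through to the default route, which is the IGW for a public subnet and the NAT device for a private one.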
NAT Gateway
AWS-managed NAT, higher bandwidth, high availability, no administration
NAT Gateway is resilient within a single Availability Zone
Must create multiple NAT Gateways in multiple AZs for fault-tolerance
There is no cross-AZ failover needed because if an AZ goes down it doesn’t need NAT
Pay per hour for usage and bandwidth
NATGW is created in a specific Availability Zone, uses an Elastic IP
Can’t be used by EC2 instance in the same subnet (only from other subnets)
Requires an IGW (Private Subnet => NATGW => IGW)
5 Gbps of bandwidth with automatic scaling up to 100 Gbps
No Security Groups to manage / required
| | NAT Gateway | NAT Instance |
| --- | --- | --- |
| Availability | Highly available within AZ (create in another AZ) | Use a script to manage failover between instances |
| Bandwidth | Up to 100 Gbps | Depends on EC2 instance type |
| Maintenance | Managed by AWS | Managed by you (e.g., software, OS patches, …) |
| Cost | Per hour & amount of data transferred | Per hour, EC2 instance type and size, + network $ |
| Public IPv4 | YES | YES |
| Private IPv4 | YES | YES |
| Security Groups | NO | YES |
| Use as Bastion Host? | NO | YES |
Security Groups vs. Network ACL (NACL)
A Security Group operates at the instance level; a Network ACL (a firewall that decides which traffic may enter and leave) operates at the subnet level and applies to all instances within that subnet
Security Group rules can reference IP addresses and other security groups, but Network ACL rules only work on IP addresses
Security Groups have only ALLOW rules; Network ACLs have ALLOW and DENY rules
Security Groups are stateful (return traffic is always allowed); Network ACLs are stateless (both incoming and return traffic must be validated)
Security Groups evaluate all rules together; Network ACLs process rules in order
Neither can block traffic by country
Default security groups have an inbound allow rule for traffic from within the group. Custom security groups have no inbound rules, so inbound traffic is denied by default. In both cases, all outbound traffic to all IPs is allowed by default.
NACLs function at the subnet level with separate allow/deny rules for inbound and for outbound traffic. They are stateless, so every packet is checked against the rules each time. They don’t apply within the subnet, only to traffic going in or out of the subnet.
A VPC automatically comes with a default NACL which allows all inbound/outbound traffic. A custom NACL denies all inbound/outbound traffic by default.
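The two evaluation models above can be sketched in a few lines. This is a simplified toy model (inbound only, single ports, no statefulness), and the types `NaclRule` and `SgRule` and the functions `nacl_allows` and `sg_allows` are hypothetical names, not an AWS API. It shows that NACL rules are processed in rule-number order with the first match winning, while Security Group rules are allow-only and evaluated together.

```python
import ipaddress
from typing import List, NamedTuple


class NaclRule(NamedTuple):
    number: int   # lower numbers are evaluated first
    cidr: str
    port: int
    allow: bool


class SgRule(NamedTuple):
    cidr: str
    port: int     # allow-only: there is no deny flag


def nacl_allows(rules: List[NaclRule], src_ip: str, port: int) -> bool:
    ip = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if ip in ipaddress.ip_network(rule.cidr) and port == rule.port:
            return rule.allow          # first matching rule decides
    return False                       # implicit final deny


def sg_allows(rules: List[SgRule], src_ip: str, port: int) -> bool:
    ip = ipaddress.ip_address(src_ip)
    # Any matching allow rule is enough; order is irrelevant.
    return any(ip in ipaddress.ip_network(r.cidr) and port == r.port for r in rules)


deny_first = [
    NaclRule(100, "203.0.113.0/24", 443, allow=False),
    NaclRule(200, "0.0.0.0/0", 443, allow=True),
]
deny_last = [
    NaclRule(200, "0.0.0.0/0", 443, allow=True),
    NaclRule(300, "203.0.113.0/24", 443, allow=False),
]
web_sg = [SgRule("0.0.0.0/0", 443)]
```

With `deny_first`, a client in 203.0.113.0/24 is blocked; with `deny_last`, the broader allow rule matches first and the deny rule is never reached. A Security Group cannot express the block at all, since it has no DENY rules.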
VPC Endpoints
Interface endpoints privately connect your VPC to AWS services, services hosted by other AWS accounts, and supported AWS Marketplace services as if they were in your VPC
powered by AWS PrivateLink
privately access services hosted on the AWS network without exposing your VPC to the public internet
applies to many AWS services (API Gateway, CloudFormation, CloudWatch, S3)
traffic does not go over the internet
no need to use an internet gateway, NAT device, DX connection, or VPN
An interface endpoint is an ENI with a private IP address, in the subnet that you specify, directing traffic to the service that you specify. It uses DNS to direct traffic to the service and is protected by a Security Group.
Gateway endpoints direct traffic to S3 or DynamoDB only, using private IP addresses.
do not use AWS PrivateLink
You route traffic from your VPC to the gateway endpoint using route tables. They are protected by VPC endpoint policies rather than Security Groups.
VPN Connection
Site-to-Site VPN: connects an on-premises VPN to AWS, with encrypted traffic sent over the public internet. Quick to set up (within minutes). An AWS-managed Site-to-Site VPN connection runs between a Customer Gateway on the customer side and a Virtual Private Gateway (VPG, or VPN gateway) that you create at the edge of your VPC.
Direct Connect (DX): needs physical connections, which take weeks or months to establish. Traffic goes over a private network.
Amazon Route 53
Domain Name System (DNS)
Domain Registrar
DNS Records
Zone File: contains DNS records
Name Server: resolves DNS queries (Authoritative or Non-Authoritative)
Top Level Domain (TLD)
Second Level Domain (SLD)
A highly available, scalable, fully managed and Authoritative DNS
Authoritative = the customer (you) can update the DNS records
The only AWS service which provides 100% availability SLA
Health Checks
HTTP Health Checks are only for public resources
Health Check => Automated DNS Failover
Pass with 2xx/3xx status codes
Can be set up to pass or fail based on text in the first 5120 bytes of the response
Health Checks are integrated with CW metrics
Calculated Health Checks combine the results of multiple health checks into one
up to 256 Child Health Checks
Health checkers are outside the VPC
can’t access private endpoints (private VPC or on-premises resource)
You can create a CloudWatch Metric and associate a CloudWatch Alarm, then create a Health Check that checks the alarm itself
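The pass/fail rules listed above can be sketched as plain functions. This is a conceptual model only, not how Route 53 is implemented: `is_healthy` applies the 2xx/3xx status rule plus the optional string match within the first 5120 bytes, and `calculated_healthy` models a calculated health check that passes when enough child checks pass. The function names and the threshold parameter are assumptions for illustration.

```python
from typing import Iterable, Optional

SEARCH_WINDOW_BYTES = 5120  # only the start of the response body is searched


def is_healthy(status_code: int, body: bytes = b"",
               search_string: Optional[str] = None) -> bool:
    """Endpoint check: 2xx/3xx status, plus optional text match."""
    if not 200 <= status_code < 400:
        return False
    if search_string is None:
        return True
    return search_string.encode() in body[:SEARCH_WINDOW_BYTES]


def calculated_healthy(children: Iterable[bool], threshold: int) -> bool:
    """Calculated check: healthy when at least `threshold` children pass."""
    return sum(1 for ok in children if ok) >= threshold


children = [
    is_healthy(200),
    is_healthy(503),
    is_healthy(200, b"status: OK", search_string="OK"),
]
print(calculated_healthy(children, threshold=2))
```

Note that a match string located after the first 5120 bytes is treated as missing, so health endpoints should put the expected text near the top of the response.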
Routing Policies
Routing policies do not route any traffic; Route 53 only responds to DNS queries
Simple
Can specify multiple values in the same record
if multiple values are returned, a random one is chosen by the client
no Health Check
Weighted
Weights don’t need to sum up to 100
Set a record’s weight to 0 to stop sending traffic to that resource; if all records have weight 0, all resources are returned with equal probability
Latency based
Latency is based on traffic between users and AWS Regions
Geolocation
based on the location of the user
Should create a “Default” record (in case there’s no match on location)
Geoproximity
based on the geographic proximity of users and resources
Ability to shift more traffic to resources based on the defined bias
Failover
IP-based Routing
based on clients’ IP addresses
a list of CIDRs for your clients
Multi-Value
Can be associated with Health Checks (return only values for healthy resources)
Up to 8 healthy records are returned for each Multi-Value query
Multi-Value is not a substitute for having an ELB
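For the weighted policy described earlier in this list, the share of DNS responses a record receives is its weight divided by the sum of all weights. The sketch below is a minimal model of that arithmetic, including the two zero-weight cases; the record names and the functions `weighted_probabilities` and `pick` are hypothetical.

```python
import random
from typing import Dict


def weighted_probabilities(weights: Dict[str, int]) -> Dict[str, float]:
    """Probability that each record is returned for a single DNS query."""
    total = sum(weights.values())
    if total == 0:
        # All weights zero: every record is returned with equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: w / total for name, w in weights.items()}


def pick(weights: Dict[str, int], rng: random.Random) -> str:
    """Simulate the answer to one DNS query."""
    probs = weighted_probabilities(weights)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


# Weights do not need to sum to 100.
print(weighted_probabilities({"blue": 3, "green": 1}))
print(weighted_probabilities({"blue": 0, "green": 5}))
print(weighted_probabilities({"blue": 0, "green": 0}))
```

This is why a small weight on a new deployment works as a canary: with weights 3 and 1, roughly one query in four is answered with the second record.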
Traffic flow
Visual editor to manage complex routing decision trees
Configurations can be saved as Traffic Flow Policy
Can be applied to different Route 53 Hosted Zones (different domain names)
Supports versioning
Configurations
active/passive: in case of failure, return backup resource. Requires failover policy. Manual intervention can be required to then cause a fail-back to the active site.
active/active: return >1 resource. Requires latency policy, weighted policy, or some other policy besides failover. In the case of failover, returns only the healthy resource
combination: multiple policies are combined into a tree for more complex DNS failover
Routing Records
Best practice is to use DNS names/URLs whenever possible rather than IP addresses. Some exceptions include pointing ELBs directly to the IP address of a peered VPC, or an on-prem resource linked via DX or VPN connection.
Alias records provide a Route 53–specific extension to DNS functionality.
They let you route traffic to selected AWS resources: ELBs, APIs, CloudFront distributions, S3 buckets, Elastic Beanstalk, VPC interface endpoints, etc.
Unlike a CNAME record, they also let you route traffic from one record in a hosted zone (usually the zone apex / naked domain name, such as “example.com”) to another record (e.g. “www.example.com”)
When Route 53 receives a DNS query for an alias record, it responds with 1 or more IP addresses that the record maps to
Works for ROOT DOMAIN and NON ROOT DOMAIN (aka mydomain.com)
Alias Record is always of type A/AAAA for AWS resources (IPv4 / IPv6)
You can’t set the TTL
You cannot set an ALIAS record for an EC2 DNS name
CNAME records (canonical name records) redirect DNS queries to any DNS record. For example, you can create a CNAME record that redirects queries from acme.example.com to zenith.example.com or acme.example.org.
Points a hostname to any other hostname
You don’t need to use Route 53.
Unlike Alias records, they can’t be used for resolving apex domain names
ONLY FOR NON ROOT DOMAIN (aka. something.mydomain.com)
PTR records = reverse lookup where you map an IP address to a DNS name
CloudFront
CloudFront distributes files from an origin as CDN (Content Delivery Network).
The origin
S3 bucket (with Origin Access Control, OAC)
Secure with Origin Access Control (OAC)
Can also work as an ingress (to upload files to S3)
Custom Origin (HTTP)
public Application Load Balancer (ALB)
public EC2 instance
S3 static website
Any other HTTP backend
CloudFront vs S3 Cross Region Replication
CloudFront cache with TTL
CloudFront is global
S3 Cross Region Replication updates files in near real time, and the replicas are read-only
S3 Cross Region Replication is good for low latency in a few regions
Cache
stored in Edge locations
Cache key: usually the domain name plus the resource portion of the URL
Use CloudFront Cache Policies to enhance the Cache Key with custom data and to control the TTL (0s – 1yr)
HTTP Header
Cookies
Query String
A CloudFront Origin Request Policy defines which information is forwarded in requests to the origin (without being included in the Cache Key)
HTTP Headers; CloudFront headers and custom headers can be added
Cookies
Query String
Aim for a high Cache Hit Ratio to minimise direct traffic to origins
Use CreateInvalidation to manually expire cached objects (i.e. a CloudFront Invalidation)
The default Cache Behaviour is the last to be executed, and is always /*
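The effect of a cache policy on the hit ratio can be sketched with a toy cache key builder. This is a conceptual model, not CloudFront's actual key format: the key is the domain plus the path, extended only with the query strings and headers that the (hypothetical) policy whitelists. Cookies are left out for brevity, and `cache_key` and its parameters are assumed names.

```python
from typing import Dict, Iterable, Tuple
from urllib.parse import parse_qsl, urlsplit


def cache_key(url: str, headers: Dict[str, str],
              include_headers: Iterable[str] = (),
              include_query: Iterable[str] = ()) -> Tuple:
    """Build a cache key from the URL plus whitelisted query strings/headers."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    wanted_query = set(include_query)
    wanted_headers = {h.lower() for h in include_headers}
    lowered = {k.lower(): v for k, v in headers.items()}
    return (
        parts.netloc,
        parts.path,
        tuple(sorted((k, v) for k, v in query.items() if k in wanted_query)),
        tuple(sorted((k, v) for k, v in lowered.items() if k in wanted_headers)),
    )


# Hypothetical requests: same object, different tracking parameter.
a = cache_key("https://example.com/img/cat.png?utm_source=mail&lang=en",
              {"User-Agent": "A"}, include_query=["lang"])
b = cache_key("https://example.com/img/cat.png?utm_source=ads&lang=en",
              {"User-Agent": "B"}, include_query=["lang"])
c = cache_key("https://example.com/img/cat.png?lang=fr",
              {"User-Agent": "A"}, include_query=["lang"])
```

Requests `a` and `b` share one cached object because `utm_source` and `User-Agent` are not part of the key, while `c` is cached separately. Keeping the key as small as possible is what raises the cache hit ratio.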
CloudFront Signed URLs/Cookies, for limited sharing, with a policy that specifies
URL expiration
IP ranges allowed to access the content
Trusted signers
The most effective way to control unauthorized access to content (e.g. photos) and manage data transfer costs is to use a CloudFront web distribution with signed URLs or signed cookies.
Removing public read access from the S3 bucket and using pre-signed URLs with expiry dates is not scalable for large numbers of objects: it would require generating pre-signed URLs for potentially thousands or millions of objects.
Blocking the IP addresses of the offending websites using a Network ACL is ineffective because a quick change of IP address would easily bypass it.
One CloudFront Signed URL gives access to only one file, whereas one set of CloudFront Signed Cookies can give access to multiple files.
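The policy behind a signed URL or signed cookie is a small JSON document. The sketch below builds a custom policy with an expiry time and an optional source IP range; it deliberately stops before the signing step, which needs the private key of a trusted signer. The distribution domain, path, and CIDR are hypothetical, and `custom_policy` is an assumed helper name.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def custom_policy(resource_url: str, expires: datetime,
                  source_cidr: Optional[str] = None) -> str:
    """Return a CloudFront custom policy statement as compact JSON."""
    condition = {"DateLessThan": {"AWS:EpochTime": int(expires.timestamp())}}
    if source_cidr is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_cidr}
    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}
    # Whitespace is stripped before the policy is signed.
    return json.dumps(policy, separators=(",", ":"))


policy_json = custom_policy(
    "https://d111111abcdef8.cloudfront.net/photos/*",   # hypothetical distribution
    datetime(2030, 1, 1, tzinfo=timezone.utc),
    source_cidr="192.0.2.0/24",
)
print(policy_json)
```

A wildcard in the resource is what lets one policy cover many files, which matches the signed-cookie use case above.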
Reduce cost by choosing a Price Class
Price Class ALL
Price Class 200 – exclude Oceania and South America
Price Class 100 – USA + Europe
Multiple Origin, using path pattern
Origin Group is for failover and high availability, as one Primary and one Secondary
Field Level Encryption, adds extra layer of security along with HTTPS, using asymmetric encryption
HTTPS enforcement
Viewers <> CF: set “Viewer Protocol Policy” to “Redirect HTTP to HTTPS” or “HTTPS Only”.
CF <> origins, using AWS Certificate Manager (ACM) (or 3rd party ssl certificates imported)
CloudFront Real Time Logs, can send all requests to Kinesis Data Streams
Sampling Rate, decide the percentage to be recorded
Specific field
Specific Cache Behaviour (path patterns)
Lambda@Edge is a feature of CloudFront that lets you run code closer to users of your application, which improves performance and reduces latency
Can be configured to load an error page (“content not found”) for operationally simple error handling
Geo restriction (whitelist/blacklist access to content by country, e.g. due to copyright restrictions)
Headers for CloudFront Function (similar to CloudFlare Worker)
| | CloudFront Functions | Lambda@Edge |
| --- | --- | --- |
| CloudFront KeyValueStore support | Yes (CloudFront KeyValueStore only supports JavaScript runtime 2.0) | No |
| Scale | 10,000,000 requests per second or more | Up to 10,000 requests per second per Region |
| Function duration | Submillisecond | Up to 5 seconds (viewer request and viewer response); up to 30 seconds (origin request and origin response) |
| Maximum function memory size | 2 MB | 128 MB (viewer request and viewer response); 10,240 MB (10 GB) (origin request and origin response) |
| Maximum size of the function code and included libraries | 10 KB | 50 MB (viewer request and viewer response); 50 MB (origin request and origin response) |
| Network access | No | Yes |
| File system access | No | Yes |
| Access to the request body | No | Yes |
| Access to geolocation and device data | Yes | No (viewer request and viewer response); Yes (origin request and origin response) |
| Can build and test entirely within CloudFront | Yes | No |
| Function logging and metrics | Yes | Yes |
Using CloudFront so that the whole path from viewer to origin uses end-to-end SSL/TLS connections:
A viewer submits an HTTPS request to CloudFront.
If the object is in the CloudFront edge cache, CloudFront encrypts the response and returns it to the viewer, and the viewer decrypts it.
If the object is not in the CloudFront cache, CloudFront performs SSL/TLS negotiation with your origin
Your origin decrypts the request, encrypts the requested object, and returns the object to CloudFront.
CloudFront decrypts the response, re-encrypts it, and forwards the object to the viewer. CloudFront also saves the object in the edge cache so that the object is available the next time it’s requested.
The viewer decrypts the response.
Elastic Load Balancers
ELBs send traffic to AWS and on-prem resources. Unlike Route 53, they use resource IP addresses and you don’t get to specify policies such as a weighted policy. VPC flow logs show traffic going to/from an ELB
A health check is a way to check the status of instances behind an ELB, usually a port with a route (like /health)
A Classic Load Balancer (CLB) operates using TCP, SSL, HTTP and HTTPS.
An Application Load Balancer (ALB) makes routing decisions at the application layer aka Layer 7 (HTTP/HTTPS & WebSocket)
supports path-based routing and host-based routing (i.e. based on the content of the request in the host field), even QueryStrings and Headers
can route requests to one or more ports on each ECS container instance in a cluster, using Dynamic Host Port Mapping
support Lambda Functions as target
support private IPs (on-prem resources)
support redirects (from HTTP to HTTPS)
supports authentication from OIDC compliant IdPs (OpenID) such as Google and Facebook via an integration with Cognito (with HTTPS listener, port 443)
periodically sends messages to its targets to check their status (health checks) and routes only to healthy targets
access logs can be enabled and pushed to S3. They log info on requester, IP, request type, etc.
client information is passed to targets in inserted HTTP headers: X-Forwarded-For (IP), X-Forwarded-Port (port), and X-Forwarded-Proto (protocol)
Target Group Weighting
Specify weight for each Target Group on a single Rule
Example: multiple versions of your app, blue/green deployment
Allows you to control the distribution of the traffic to your applications
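The host-based and path-based routing described above can be sketched as a priority-ordered rule list. This is a simplified model: it uses Python's `fnmatchcase` as a stand-in for the listener's wildcard matching, which is an approximation, and `Rule`, `route`, and the target group names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    priority: int                 # lower number = evaluated first
    host_pattern: Optional[str]   # None means "any host"
    path_pattern: Optional[str]   # None means "any path"
    target_group: str


def route(rules: List[Rule], host: str, path: str,
          default: str = "default-tg") -> str:
    """Return the target group of the first rule whose conditions all match."""
    for rule in sorted(rules, key=lambda r: r.priority):
        host_ok = rule.host_pattern is None or fnmatchcase(host.lower(), rule.host_pattern)
        path_ok = rule.path_pattern is None or fnmatchcase(path, rule.path_pattern)
        if host_ok and path_ok:
            return rule.target_group
    return default                # the default rule catches everything else


rules = [
    Rule(10, "api.example.com", None, "api-tg"),
    Rule(20, None, "/images/*", "images-tg"),
]
print(route(rules, "api.example.com", "/images/logo.png"))
print(route(rules, "www.example.com", "/images/logo.png"))
print(route(rules, "www.example.com", "/"))
```

Because rules are evaluated in priority order, the first request goes to `api-tg` even though it also matches the `/images/*` path rule.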
A Network Load Balancer (NLB) makes routing decisions at the transport layer aka Layer 4 (TCP, TLS & UDP). It can handle millions of requests per second with extremely low latency. It doesn’t support path-based routing or host-based routing the way ALB does.
One static IP per AZ, and support assigning Elastic IP (good for whitelisting specific IP)
The target groups can be EC2, private IPs, and ALB
Health checks support TCP, HTTP, and HTTPS protocols
Does NOT support target group weighting
Gateway Load Balancer (GWLB) works at Layer 3 (IP packets), mostly for firewalls, Intrusion Detection and Prevention Systems, Deep Packet Inspection Systems, and payload manipulation; uses the GENEVE protocol on port 6081
Transparent Network Gateway – single entry/exit for external traffic
Load Balancer
Target groups can be EC2 and private IPs
Sticky Sessions (Session Affinity), same client is always redirected to the same instance behind a load balancer
Available for CLB, ALB, and NLB
Two types of cookies
Application-based Cookies, with default AWSALBAPP or custom (not naming with AWSALB, AWSALBAPP, or AWSALBTG)
Duration-based Cookies, with AWSALB for ALB, AWSELB for CLB
With Cross-Zone Load Balancing, each load balancer node distributes traffic evenly across the registered instances in all AZs, rather than only across the instances in its own AZ
Default enabled for ALB, but disabled for NLB, GWLB and CLB
free of inter AZ data transfer for ALB and CLB
To isolate the unhealthy instance in a specific zone
Disable Cross-Zone Load Balancing on the ALB and use Amazon Route 53 Application Recovery Controller to initiate a zonal shift to avoid the impacted zone entirely.
Route 53 ARC allows you to implement highly reliable routing controls to manage traffic during application recovery scenarios. These routing controls act as simple on/off switches hosted on a high-availability cluster. To handle failover, you can set one routing control to ON and another to OFF. This action reroutes traffic from the impacted Availability Zone to a healthy one, ensuring continuous application availability.
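The effect of Cross-Zone Load Balancing is easiest to see with numbers. The sketch below computes the fraction of total traffic that each instance receives, under the simplifying assumption that clients are spread evenly over one load balancer node per AZ. The AZ labels and instance counts are made up, and `per_instance_share` is a hypothetical helper.

```python
from typing import Dict


def per_instance_share(instances_per_az: Dict[str, int],
                       cross_zone: bool) -> Dict[str, float]:
    """Fraction of total traffic received by ONE instance in each AZ."""
    azs = [az for az, n in instances_per_az.items() if n > 0]
    total_instances = sum(instances_per_az[az] for az in azs)
    if cross_zone:
        # Every node spreads its traffic over all instances in all AZs.
        return {az: 1 / total_instances for az in azs}
    # Each node only serves its own AZ, so its share is split locally.
    node_share = 1 / len(azs)
    return {az: node_share / instances_per_az[az] for az in azs}


fleet = {"az-a": 2, "az-b": 8}
print(per_instance_share(fleet, cross_zone=False))
print(per_instance_share(fleet, cross_zone=True))
```

With 2 instances in one AZ and 8 in the other, disabling cross-zone balancing gives each instance in the small AZ 25% of the traffic and each instance in the large AZ 6.25%; enabling it gives every instance 10%.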
SSL certificate support
only 1 per CLB
multiple on ALB and NLB, also with Server Name Indication (SNI)
Connection Draining for CLB, Deregistration Delay for ALB/NLB
Time to wait for “in-flight requests” completion while the instance is de-registering or unhealthy
Stops sending new requests to the EC2 instance which is de-registering
the value can be 0 (disabled, i.e. no draining) or between 1 and 3600 seconds; the default is 300 seconds
Set to a low value if your requests are short
DualStack Networking
Allows clients to communicate with the ELB using both IPv4 and IPv6
Supports both ALB and NLB
ALB and NLB can have mixed IPv4 and IPv6 targets in separate target groups.
ELB DualStack ensures compatibility between client and target IP versions
IPv4 clients communicate with IPv4 targets, and IPv6 clients communicate with IPv6 targets.
If you only have IPv4 targets, the ELB automatically converts requests from IPv6 to IPv4
Note: AZ must be added/enabled for instances to receive traffic
With Auto-Scaling Group (ASG)
In high-availability contexts you use an Auto-Scaling Group (ASG) to automatically launch and stop instances, and an Elastic Load Balancer (ELB) to distribute traffic among the instances
specify which subnets the ASG should launch instances into
attach Target Groups to the ASG
ASG Launch Template, with min/max/init capacity, also scale-in(decrease)/scale-out(increase) policies
AMI + InstanceType
EC2 User Data
EBSVolumes
Security Groups
SSH Key Pair
IAM Roles for EC2
Network + Subnets
Load Balancer
ASG scaling policies
Dynamic
Target Tracking – keeps a chosen metric (predefined or custom) at a target value by adding/removing instances
Simple / Step Scaling
Scheduled – based on known usage patterns
Predictive – continuously forecast load and schedule scaling ahead
Metrics to watch: CPUUtilization, RequestCountPerTarget, Average Network In / Out, and custom
Cooldown period (default 300 seconds) – reducing the cooldown period will more quickly terminate unneeded instances, reducing costs
Instance Refresh, to recreate all instances with an updated Launch Template
minimum healthy percentage
warm-up time (how long until the new instance is ready to use)
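The idea behind target tracking can be approximated with a proportional rule: scale the current capacity by the ratio of the observed metric to its target, then clamp to the group's min/max. This is a simplified mental model and not AWS's exact algorithm; `desired_capacity` and its parameters are assumed names.

```python
import math


def desired_capacity(current: int, metric_value: float, target_value: float,
                     min_size: int, max_size: int) -> int:
    """Proportional estimate of the capacity that brings the metric to target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current * metric_value / target_value)
    # Never go below the group's minimum or above its maximum.
    return max(min_size, min(max_size, raw))


# 4 instances at 75% average CPU with a 50% target: scale out.
print(desired_capacity(4, 75.0, 50.0, min_size=2, max_size=10))
# 4 instances at 20% average CPU: scale in, but not below min_size.
print(desired_capacity(4, 20.0, 50.0, min_size=2, max_size=10))
```

In practice the cooldown and warm-up settings listed above limit how quickly such adjustments are applied.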
AWS Direct Connect (DX) Gateway
You can use Direct Connect (DX) to connect an on-prem data centre to one or multiple VPCs
DX can take > 1 month to setup
For resilience, add a 2nd DX connection. As this can take time to setup and is costly, in the short term consider also adding an IPSec VPN connection (with the same BGP prefix) for resiliency.
You must create one of the following virtual interfaces to begin using DX:
Private virtual interface (private VIF): access a VPC using private IP addresses
Public virtual interface (public VIF): access all AWS public services using public IP addresses
Transit virtual interface (transit VIF): access one or more VPC Transit Gateways associated with DX gateways, within a Region.
A hosted virtual interface (hosted VIF) allows another AWS account to access your DX
Use AWS DataSync to copy large amounts of data from on-prem to S3, EFS, FSx, NFS shares, SMB shares, or AWS Snowcone (via Direct Connect). To copy databases, use DMS.
AWS Global Accelerator
a service for improving the availability and performance of applications by routing user traffic through the AWS global network infrastructure.
primarily used for accelerating traffic to specific application endpoints like EC2 instances or Elastic Load Balancers, not for accelerating data transfers to Amazon S3
(So what’s the difference among Direct Connect, Transit Gateway, and Global Accelerator??)
3 Types of Network Adapters
ENI – basic type
ENA – for enhanced networking, high bandwidth and low latency
EFA (fabric adapter) – for high performance computing
AWS Services Calling into a VPC
To enable AWS serverless services such as Lambda to access resources inside your private VPC, you provide it with VPC-specific info such as your subnet IDs and security group IDs.
AWS Transit Gateway
Central Hub connecting on-prem networks and VPCs.
Reduces operational complexity as you can easily add more VPCs, VPN capacity, Direct Connect gateways, without complex routing tables.
Provides additional features over-and-above VPC peering
A transit virtual interface is used to access VPC Transit Gateways
Pattern for connecting 1 DX to multiple VPCs in the same Region is to associate the DX with a transit gateway