02. VPC & Network

VPC, SUBNETS, NETWORKING

  • A VPC is a virtual network in which you deploy resources (Region level).
  • A subnet is a range of IP addresses that partitions your VPC (Availability Zone level).
  • A public subnet is accessible from the internet; a private subnet is only accessible from within the same VPC.
  • Use route tables to determine where network traffic from your subnet or gateway is directed. 
  • A gateway connects your VPC to another network. For example, use an internet gateway to connect your VPC to the internet.
  • Use a VPC endpoint to connect to AWS services privately, without an internet gateway or NAT device (i.e. the connection stays on a private network, not the public internet). This is for the VPC to connect out, not for the VPC to accept connections from outside.
  • Use a VPC peering connection to route traffic between the resources in two VPCs, on one condition: the CIDR blocks (IP address ranges) must not overlap. Peering is also not transitive.
  • Use a Transit Gateway, which acts as a central hub, to route traffic between your VPCs, VPN connections, and AWS Direct Connect connections.
  • Connect your VPCs to your on-premises networks using AWS Virtual Private Network (AWS VPN).
  • Use VPC Flow Logs to analyse and debug traffic issues. Flow logs can be captured at three levels:
    • VPC Flow Logs
    • Subnet Flow Logs
    • Elastic Network Interface (ENI) Flow Logs
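
The no-overlapping-CIDR condition for VPC peering can be checked before a peering request is made. The following is a minimal sketch using only Python's standard `ipaddress` module; the function names and the sample CIDR blocks are illustrative assumptions, not anything AWS provides.

```python
import ipaddress


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def can_peer(vpc_a_cidrs, vpc_b_cidrs) -> bool:
    """Two VPCs are peering candidates only if none of their CIDRs overlap."""
    return not any(
        cidrs_overlap(a, b) for a in vpc_a_cidrs for b in vpc_b_cidrs
    )


# Hypothetical VPC address ranges
print(can_peer(["10.0.0.0/16"], ["10.1.0.0/16"]))    # distinct ranges
print(can_peer(["10.0.0.0/16"], ["10.0.128.0/17"]))  # second range sits inside the first
```

The check is pairwise across every CIDR of both VPCs, because a VPC can have more than one CIDR block.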

Subnets

  • A VPC is housed within a Region, and each subnet lives in exactly one AZ (an AZ can hold several subnets). Therefore, for high availability, you need at least 2 subnets in your VPC so that you can span 2 AZs.
  • When you create a new subnet, it is automatically associated to the main route table. 
  • Example VPC/subnet configurations recommended by AWS
    • VPC with single public subnet: e.g. for single-tier, public-facing web app such as a blog or a simple website
    • VPC with public and private subnets: e.g. for multi-tier web apps where the web servers are in the public subnet and the DBs in the private subnet
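
As a rough illustration of the public-and-private layout across two AZs, the sketch below carves a VPC CIDR into equal subnets and assigns one public and one private block per AZ. The CIDR, the /24 subnet size, and the AZ names are assumptions chosen for the example.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, azs, new_prefix: int = 24):
    """Assign one public and one private subnet CIDR to each AZ."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = {}
    for az in azs:
        # Consecutive blocks: first public, then private, for each AZ
        plan[az] = {"public": str(next(blocks)), "private": str(next(blocks))}
    return plan


# Hypothetical VPC spanning two AZs for high availability
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
for az, subnets in plan.items():
    print(az, subnets)
```

Each subnet belongs to a single AZ, so spanning two AZs with both tiers needs four subnets in total.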

Public Subnet vs. Private Subnet

  • Public Subnets
    • has a route table that routes to an Internet Gateway (IGW) (note the Internet Gateway is attached to the VPC, not directly to the subnet)
    • EC2 instances launched in a Public Subnet can be auto-assigned a public IP address, or be given an Elastic IP
    • Security groups and network ACLs on Public Subnet must allow SSH traffic (on port 22) for admin config.
  • Private Subnets
    • outbound traffic is routed to a NAT device. The NAT device is installed in the Public Subnet and connected to an Internet Gateway for outbound access to the internet.
      • NAT Gateway vs. NAT Instance: a NAT Gateway is managed for you by AWS and highly available, whereas a NAT Instance (self-managed) takes a lot more manual work but can also be used as a bastion host / jump box
    • EC2 instances don’t have a public IP or Elastic IP
    • You have to use a bastion host (“jump box”) to access instances in the Private Subnet over SSH (port 22)

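
The difference between the two subnet types comes down to where the default route points. A minimal sketch, assuming a simplified route representation (a list of dicts with `destination` and `target` keys, which is a hypothetical shape and not the AWS API response format):

```python
def is_public_subnet(routes) -> bool:
    """A subnet counts as public if its default route targets an Internet Gateway."""
    return any(
        route["destination"] in ("0.0.0.0/0", "::/0")
        and route["target"].startswith("igw-")
        for route in routes
    )


# Hypothetical route tables; the resource IDs are made up
public_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "igw-0abc1234"},
]
private_routes = [
    {"destination": "10.0.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "nat-0def5678"},
]

print(is_public_subnet(public_routes))
print(is_public_subnet(private_routes))
```

The private route table still has a default route, but it points at the NAT device, so instances can reach out while nothing on the internet can initiate a connection to them.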
Security Groups vs. Network ACL (NACL)

  • A Security Group works at the instance level; a Network ACL (a firewall that controls traffic to and from a subnet) works at the subnet level and applies to all instances within that subnet
  • Security Group rules can reference IP addresses and other security groups, but Network ACL rules reference IP addresses only
  • Security Groups have only ALLOW rules; Network ACLs have ALLOW and DENY rules
  • Security Groups are stateful (return traffic is always allowed); Network ACLs are stateless (inbound and return traffic are each checked against the rules)
  • Security Groups evaluate all rules together; Network ACLs process rules in order
  • Neither can block traffic by country
  • Default security groups have inbound allow rules for traffic from within the group, whereas custom security groups don’t allow any inbound traffic by default. All outbound traffic is allowed by default.
  • Custom Security Group default state: the outbound rule allows all traffic to all IPs, but inbound has no rules, so inbound traffic is denied by default
  • NACLs function at the subnet level with separate allow/deny rules for inbound and for outbound traffic. They are stateless, so every packet is checked against the rules each time. They don’t apply to traffic within the subnet, only to traffic in and out of the subnet.
  • VPC automatically comes with a default NACL which allows all inbound/outbound traffic. A custom NACL denies all inbound/outbound traffic by default.
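
The two evaluation styles (ordered rules with DENY for NACLs, allow-only rules evaluated together for Security Groups) can be contrasted in a small simulation. This is a simplified model under stated assumptions: rules match on source CIDR and a single port only, and the rule numbers and addresses are invented for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class NaclRule:
    number: int   # lower numbers are evaluated first
    cidr: str
    port: int
    action: str   # "ALLOW" or "DENY"


def nacl_decision(rules, src_ip: str, port: int) -> str:
    """First matching rule (in rule-number order) wins; otherwise deny."""
    ip = ipaddress.ip_address(src_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if ip in ipaddress.ip_network(rule.cidr) and port == rule.port:
            return rule.action
    return "DENY"


def sg_decision(allow_rules, src_ip: str, port: int) -> str:
    """Allow-only rules: traffic passes if any rule matches, in any order."""
    ip = ipaddress.ip_address(src_ip)
    matched = any(
        ip in ipaddress.ip_network(cidr) and port == rule_port
        for cidr, rule_port in allow_rules
    )
    return "ALLOW" if matched else "DENY"


nacl = [
    NaclRule(100, "203.0.113.0/24", 22, "DENY"),   # block one range first
    NaclRule(200, "0.0.0.0/0", 22, "ALLOW"),       # then allow everyone else
]
sg = [("0.0.0.0/0", 22)]

print(nacl_decision(nacl, "203.0.113.7", 22), sg_decision(sg, "203.0.113.7", 22))
print(nacl_decision(nacl, "198.51.100.1", 22), sg_decision(sg, "198.51.100.1", 22))
```

Because a Security Group has no DENY rules, blocking one specific range while allowing everything else is something only the NACL in this sketch can express.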

VPC Flow Logs

  • Capture traffic logs at three levels:
    • VPC Flow Logs
    • Subnet Flow Logs
    • Elastic Network Interface (ENI) Flow Logs
  • also include network information for AWS-managed interfaces (ELB, ElastiCache, RDS, Aurora, etc.)
  • data can be sent to S3, Kinesis Data Firehose, and CloudWatch Logs
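
When debugging with flow logs, each record is a space-separated line. The sketch below parses a record in the default (version 2) field order; the sample line is made-up illustrative data, and custom log formats with other fields are not handled.

```python
# Field order of the default flow log record format (version 2)
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_log(line: str) -> dict:
    """Split one default-format record into a field-name -> value mapping."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


# Illustrative sample record (not real traffic)
sample = (
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
record = parse_flow_log(sample)
print(record["srcaddr"], "->", record["dstaddr"], record["dstport"], record["action"])
```

Filtering parsed records on `action == "REJECT"` is a common first step when a Security Group or NACL is suspected of blocking traffic.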

VPC Endpoints

  • Interface endpoints privately connect your VPC to AWS services, services hosted by other AWS accounts, and supported AWS Marketplace services as if they were in your VPC
    • powered by AWS PrivateLink
    • apply to many AWS services (API Gateway, CloudFormation, CloudWatch, S3)
    • An interface endpoint is an ENI with a private IP address, in the subnet that you specify, directing traffic to the service that you specify. It uses DNS to direct traffic to the service and is protected by a Security Group.
  • Gateway endpoints direct traffic to S3 or DynamoDB only, using private IP addresses.
    • do not use AWS PrivateLink
    • You route traffic from your VPC to the gateway endpoint using route tables. Protected by VPC endpoint policies rather than Security Groups.
  • With either type, traffic does not go over the internet
  • With either type, there is no need for an internet gateway, NAT device, DX connection, or VPN

VPN Connection

  • Site-to-Site VPN: connects an on-premises network to AWS, with encrypted traffic sent over the public internet. Quick to set up (within minutes). An AWS managed Site-to-Site VPN connection runs between a Customer Gateway on the customer side and a Virtual Private Gateway (VPG, or VPN gateway) that you create at the edge of your VPC.
  • Direct Connect (DX): needs physical connections, which take weeks or months to establish. Traffic goes over a private network.

Amazon CloudFront

  • CloudFront is a CDN (Content Delivery Network) that distributes files from an origin.
  • The origin
    • S3 bucket (with Origin Access Control, OAC)
      • Secured with Origin Access Control (OAC)
      • can also work as ingress (uploading files to S3)
    • Custom Origin (HTTP)
      • public Application Load Balancer (ALB)
      • public EC2 instance
      • S3 static website
      • Any other HTTP backend
  • CloudFront vs S3 Cross Region Replication
    • CloudFront cache with TTL
    • CloudFront is global
    • S3 Cross Region Replication updates files in near real-time, and the replicas are read-only
    • S3 Cross Region Replication is good for low latency in a few regions
  • Cache
    • stored in Edge locations
    • Cache key, usually “domain” + “resource portion of the URL”
    • Use CloudFront Cache Policies to extend the Cache Key with custom data and to control the TTL (0s – 1yr)
      • HTTP Header
      • Cookies
      • Query String
    • A CloudFront Origin Request Policy defines which information is forwarded in requests to the origin (without being included in the Cache Key)
      • HTTP Header; CloudFront Headers and custom Headers can be appended
      • Cookies
      • Query String
    • Aim for a high Cache Hit Ratio to minimise direct traffic to origins
    • Use CreateInvalidation to expire cached objects manually (i.e. CloudFront Invalidation)
    • The default Cache Behaviour is always evaluated last, and its path pattern is always * (match everything)
  • CloudFront Signed URL/Cookies, for limited sharing, with a policy of
    • URL expiration
    • IP range allow to access
    • Trusted signers
  • 1 CloudFront Signed URL gives access to only 1 file, but 1 set of CloudFront Signed Cookies can give access to multiple files.
  • reduce cost by choosing a Price Class
    • Price Class All
    • Price Class 200 – excludes Oceania and South America
    • Price Class 100 – North America + Europe
  • Multiple Origin, using path pattern
  • An Origin Group provides failover and high availability, with one primary and one secondary origin
  • Field Level Encryption adds an extra layer of security on top of HTTPS, using asymmetric encryption
  • CloudFront Real Time Logs can send requests to Kinesis Data Streams, configured by
    • Sampling Rate, decide the percentage to be recorded
    • Specific field
    • Specific Cache Behaviour (path patterns)
  • Lambda@Edge is a feature of CloudFront that lets you run code closer to users of your application, which improves performance and reduces latency
  • Can be configured to load an error page (“content not found”) for operationally simple error handling
  • Geo restriction (whitelist/blacklist access to content by country, e.g. due to copyright restrictions) 
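
To make the Cache Key idea concrete, the sketch below builds a key from the domain and path, and then optionally adds selected query strings, headers, and cookies in the way a Cache Policy would. It is a conceptual model only: the key format, function name, and parameter names are assumptions, not how CloudFront represents keys internally.

```python
from urllib.parse import urlsplit, parse_qsl


def cache_key(url, headers=None, cookies=None,
              key_headers=(), key_cookies=(), key_query=()):
    """Build a cache key: domain + path, plus any whitelisted extras."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    lowered = {k.lower(): v for k, v in (headers or {}).items()}
    cookies = cookies or {}

    pieces = [parts.netloc.lower(), parts.path or "/"]
    # Only values named by the (hypothetical) cache policy enter the key
    pieces += [f"q:{n}={query[n]}" for n in sorted(key_query) if n in query]
    pieces += [f"h:{n.lower()}={lowered[n.lower()]}"
               for n in sorted(key_headers) if n.lower() in lowered]
    pieces += [f"c:{n}={cookies[n]}" for n in sorted(key_cookies) if n in cookies]
    return "|".join(pieces)


a = cache_key("https://example.com/img/cat.png?size=large&utm=x")
b = cache_key("https://example.com/img/cat.png?size=small&utm=y")
print(a == b)  # same key: query strings are not part of the default key

c = cache_key("https://example.com/img/cat.png?size=large&utm=x", key_query=("size",))
d = cache_key("https://example.com/img/cat.png?size=small&utm=y", key_query=("size",))
print(c == d)  # different key once "size" is added to the policy
```

Every value added to the key splits the cache further, which is why only the values that actually change the response should be included if a high Cache Hit Ratio is the goal.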

Elastic Load Balancers

  • ELBs send traffic to AWS and on-prem resources. Unlike Route 53, they use resource IP addresses and you don’t get to specify policies such as a weighted policy. VPC flow logs show traffic going to/from an ELB
  • A health check is how an ELB checks the status of the instances behind it, usually a port with a route (like /health)
  • Classic Load Balancer (CLB) operates using TCP, SSL, HTTP and HTTPS.
  • An Application Load Balancer (ALB) makes routing decisions at the application layer aka Layer 7 (HTTP/HTTPS & WebSocket)
    • supports path-based routing and host-based routing (i.e. based on the content of the request in the host field), even QueryStrings and Headers
    • can route requests to one or more ports on each ECS container instance in a cluster, using Dynamic Host Port Mapping
    • support Lambda Functions as target
    • support private IPs (on-prem resources)
    • support redirects (from HTTP to HTTPS)
    • supports authentication from OIDC compliant IDPs such as Google and Facebook via an integration with Cognito
    • periodically sends messages to its targets to check their status (health checks) and routes only to healthy targets
    • enable access logs which can get pushed to S3. They log info on requester, IP, request type, etc.
    • client information is passed to targets in inserted HTTP headers: X-Forwarded-For (IP), X-Forwarded-Port (port), and X-Forwarded-Proto (protocol)
  • A Network Load Balancer (NLB) makes routing decisions at the transport layer aka Layer 4 (TCP, TLS & UDP). It can handle millions of requests per second with extremely low latency. It doesn’t support path-based routing or host-based routing the way ALB does.
    • One static IP per AZ, and support assigning Elastic IP (good for whitelisting specific IP)
    • The target groups can be EC2, private IPs, and ALB
    • The health check supports TCP, HTTP, and HTTPS protocols
  • Gateway Load Balancer (GWLB) works on Layer 3 (IP packets) and is mostly used for firewalls, Intrusion Detection and Prevention Systems, Deep Packet Inspection Systems, and payload manipulation; it uses the GENEVE protocol on port 6081
    • Transparent Network Gateway – single entry/exit for external traffic
    • Load Balancer
    • Target groups can be EC2 and private IPs
  • Sticky Sessions (Session Affinity), same client is always redirected to the same instance behind a load balancer
    • Available for CLB, ALB, and NLB
    • Two types of cookies
      • Application-based Cookies: either generated by the load balancer (named AWSALBAPP) or a custom cookie generated by the target (its name must not be AWSALB, AWSALBAPP, or AWSALBTG)
      • Duration-based Cookies, named AWSALB for ALB and AWSELB for CLB
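  • With Cross-Zone Load Balancing, each load balancer node spreads traffic evenly across all registered instances in all AZs, rather than only across the instances in its own AZ
    • Enabled by default for ALB, but disabled by default for NLB, GWLB and CLB
    • no charge for inter-AZ data transfer for ALB and CLB
  • SSL certificate support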
    • only 1 per CLB
    • multiple on ALB and NLB, also with Server Name Indication (SNI)
  • Connection Draining for CLB, Deregistration Delay for ALB/NLB
    • Time to wait for “in-flight requests” completion while the instance is de-registering or unhealthy
    • Stops sending new requests to the EC2 instance which is de-registering
    • the value can be 0 (draining disabled) or between 1 and 3600 seconds; the default is 300 seconds
    • Set to a low value if your requests are short
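
The host-based and path-based routing that an ALB performs can be pictured as a list of rules evaluated in priority order, with a default action at the end. A minimal sketch, assuming hypothetical host names and target group names, and using shell-style wildcards from Python's `fnmatch` as a stand-in for ALB pattern matching:

```python
from fnmatch import fnmatchcase

# (priority, host pattern, path pattern, target group) - all values are made up
RULES = [
    (10, "api.example.com", "*", "tg-api"),
    (20, "*", "/images/*", "tg-images"),
]
DEFAULT_TARGET = "tg-web"


def route(host: str, path: str, rules=RULES, default=DEFAULT_TARGET) -> str:
    """Return the target group of the first rule (lowest priority number) that matches."""
    for _priority, host_pat, path_pat, target in sorted(rules):
        if fnmatchcase(host.lower(), host_pat) and fnmatchcase(path, path_pat):
            return target
    return default  # the default rule catches everything else


print(route("api.example.com", "/v1/users"))
print(route("www.example.com", "/images/logo.png"))
print(route("www.example.com", "/"))
```

An NLB has no equivalent of this table because it works at Layer 4 and never sees the host header or the URL path.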

Auto-Scaling Group (ASG)

  • In high-availability contexts you use an Auto-Scaling Group (ASG) to automatically launch and terminate instances, and an Elastic Load Balancer (ELB) to distribute traffic among the instances
    • specify which subnets the ASG should launch instances into
    • attach Target Groups to the ASG 
  • An ASG has min/max/initial (desired) capacity and scale-in (decrease) / scale-out (increase) policies; its Launch Template specifies
    • AMI + InstanceType
    • EC2 User Data
    • EBS Volumes
    • Security Groups
    • SSH Key Pair
    • IAM Roles for EC2
    • Network + Subnets
    • Load Balancer
  • ASG scaling policies
    • Dynamic
      • Target Tracking – keeps a chosen metric (predefined or custom) near a target value by adding/removing instances
      • Simple / Step Scaling
    • Scheduled – based on known usage patterns
    • Predictive – continuously forecast load and schedule scaling ahead
  • Metrics to watch: CPUUtilization, RequestCountPerTarget, Average Network In / Out, and custom metrics
  • Cooldown period (default 300 seconds) – reducing the cooldown period will more quickly terminate unneeded instances, reducing costs
  • Instance Refresh recreates all instances using the updated Launch Template
    • minimum healthy percentage
    • warm-up time (how long until the new instance is ready to use)
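
The idea behind a target tracking policy can be approximated with proportional arithmetic: if the metric is twice the target, roughly twice the capacity is needed. The sketch below is a simplification for intuition only and is not the algorithm AWS actually runs; the function name and the sample numbers are assumptions.

```python
import math


def desired_capacity(current: int, metric_value: float, target_value: float,
                     min_size: int, max_size: int) -> int:
    """Scale capacity in proportion to metric/target, clamped to the ASG limits."""
    proposed = math.ceil(current * metric_value / target_value)
    return max(min_size, min(max_size, proposed))


# Hypothetical ASG: 4 instances, target of 40% average CPU, limits 1..10
print(desired_capacity(4, 80.0, 40.0, min_size=1, max_size=10))   # scale out
print(desired_capacity(4, 20.0, 40.0, min_size=1, max_size=10))   # scale in
print(desired_capacity(4, 200.0, 40.0, min_size=1, max_size=10))  # capped at max
```

The clamp at the end reflects that a scaling policy never takes the group outside its configured min and max capacity.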

3 Types of Network Adapters

  1. ENI – basic type
  2. ENA – for enhanced networking, high bandwidth and low latency
  3. EFA (fabric adapter) – for high performance computing

Amazon Route 53

  • Geolocation routing is by location of the user, geoproximity routing is by proximity of the resources
  • weighted routing = split traffic by %
  • Health check = check the health of your resources and only return healthy resources in response to DNS queries
  • apply a routing policy such as latency, weighted, failover
  • Configurations
    • active/passive: in case of failure, return backup resource. Requires failover policy. Manual intervention can be required to then cause a fail-back to the active site. 
    • active/active: return >1 resource. Requires latency policy, weighted policy, or some other policy besides failover. In the case of a failure, returns only the healthy resource
    • combination: multiple policies are combined into a tree for more complex DNS failover

Routing Records

  • Best practice is to use DNS names/URLs whenever possible rather than IP addresses. Some exceptions include pointing ELBs directly to the IP address of a peered VPC, or an on-prem resource linked via DX or VPN connection.
  • Alias records provide a Route 53–specific extension to DNS functionality.
    • They let you route traffic to selected AWS resources: ELBs, APIs, CloudFront distributions, S3 buckets, Elastic Beanstalk, VPC interface endpoints, etc.
    • Unlike a CNAME record, they also let you route traffic from one record in a hosted zone (usually the zone apex / naked domain name, such as “example.com”) to another record (e.g. “www.example.com”)
    • When Route 53 receives a DNS query for an alias record, it responds with 1 or more IP addresses that the record maps to
  • CNAME records (canonical name records) redirect DNS queries to any DNS record. For example, you can create a CNAME record that redirects queries from acme.example.com to zenith.example.com or acme.example.org.
    • You don’t need to use Route 53.
    • Unlike Alias records, they can’t be used for resolving apex domain names
  • PTR records = reverse lookup where you map an IP address to a DNS name
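
For PTR records, the name that is looked up is the IP address written in reverse under a special domain. Python's standard `ipaddress` module can produce that name, as in this short sketch (the addresses are documentation-range examples):

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the DNS name a PTR record for this IP would live under."""
    return ipaddress.ip_address(ip).reverse_pointer


print(ptr_name("192.0.2.5"))    # IPv4: octets reversed under in-addr.arpa
print(ptr_name("2001:db8::1"))  # IPv6: nibbles reversed under ip6.arpa
```

A reverse lookup queries the PTR record at that name and gets back the DNS name mapped to the address.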

AWS Services Calling into a VPC

  • To enable AWS serverless services such as Lambda to access resources inside your private VPC, you provide it with VPC-specific info such as your subnet IDs and security group IDs.

AWS Direct Connect (DX) Gateway

  • You can use Direct Connect (DX) to connect an on-prem data centre to one or multiple VPCs
  • DX can take > 1 month to set up
  • For resilience, add a 2nd DX connection. As this can take time to set up and is costly, in the short term consider also adding an IPSec VPN connection (with the same BGP prefix) for resiliency.
  • You must create one of the following virtual interfaces to begin using DX:
    • Private virtual interface (private VIF): access a VPC using private IP addresses
    • Public virtual interface (public VIF): access all AWS public services using public IP addresses
    • Transit virtual interface (transit VIF): access one or more VPC Transit Gateways associated with DX gateways, within a Region.
  • hosted virtual interface (hosted VIF) allows another AWS account to access your DX
  • Use AWS DataSync to copy large amounts of data from on-prem to S3, EFS, FSx, NFS shares, SMB shares, AWS Snowcone (via Direct Connect). To copy databases, use DMS.

AWS Transit Gateway

  • Central Hub connecting on-prem networks and VPCs.
    • Reduces operational complexity as you can easily add more VPCs, VPN capacity, Direct Connect gateways, without complex routing tables. 
    • Provides additional features over-and-above VPC peering
  • A transit virtual interface is used to access VPC Transit Gateways
  • Pattern for connecting 1 DX to multiple VPCs in the same Region is to associate the DX with a transit gateway
    • on-prem -> DX -> DX location -> transit virtual interface -> transit gateway association -> Transit Gateway -> multiple VPCs
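
One way to see why a central hub reduces operational complexity: because VPC peering is not transitive, fully connecting n VPCs with peering needs a connection for every pair, whereas a Transit Gateway needs one attachment per VPC. A small arithmetic sketch (function names are illustrative):

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """Attachments needed when every VPC connects to one Transit Gateway."""
    return vpc_count


for n in (3, 10, 50):
    print(n, full_mesh_peerings(n), transit_gateway_attachments(n))
```

The peering count grows quadratically while the attachment count grows linearly, so the gap widens quickly as VPCs are added.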