AWS Step Functions
- Workflow in JSON
- Start an execution with an SDK call, an API Gateway call, or an EventBridge rule
- Task: Invoke 1 AWS service or Run 1 Activity
- States: Choice, Fail-or-Succeed, Pass, Wait, Map, Parallel
- Error handling belongs in the state machine definition, not in the Task code; use Retry and Catch. Retriers and catchers are evaluated from top to bottom, and only the first one whose ErrorEquals matches is applied (an “OR”, not a sequence)
- Wait for task token: append .waitForTaskToken to the Resource; the execution pauses until it receives a SendTaskSuccess or SendTaskFailure API call (PUSH mechanism)
- Activity Task: an Activity Worker on EC2/Lambda/.. polls with the GetActivityTask API call and responds with a SendTaskSuccess or SendTaskFailure API call (PULL mechanism); use SendTaskHeartbeat + HeartbeatSeconds to keep long-running tasks alive
- Standard vs Express (Express workflows can run asynchronously or synchronously)
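The Retry/Catch and task-token notes above can be sketched as a workflow definition built in Python and serialized to JSON. This is a minimal sketch, not a production workflow: the state names, the function name `charge-card`, the queue URL, and the error name `CardDeclined` are all hypothetical, and `first_matching_rule` is only a simplified illustration of top-to-bottom, first-match evaluation (the real service has extra rules for some `States.*` errors).

```python
import json
from typing import Dict, List, Optional

# Hypothetical workflow: every name, ARN suffix, and URL here is made up for illustration.
state_machine = {
    "Comment": "Hypothetical order workflow",
    "StartAt": "ChargeCard",
    "States": {
        "ChargeCard": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "charge-card", "Payload.$": "$"},
            # Retriers are checked top to bottom; the first match applies.
            "Retry": [
                {"ErrorEquals": ["States.Timeout"], "IntervalSeconds": 2,
                 "MaxAttempts": 3, "BackoffRate": 2.0},
                {"ErrorEquals": ["States.ALL"], "MaxAttempts": 1},
            ],
            # Catchers work the same way once retries are exhausted.
            "Catch": [
                {"ErrorEquals": ["CardDeclined"], "Next": "NotifyCustomer"},
                {"ErrorEquals": ["States.ALL"], "Next": "Failed"},
            ],
            "Next": "WaitForApproval",
        },
        "WaitForApproval": {
            "Type": "Task",
            # The .waitForTaskToken suffix pauses the execution until a
            # SendTaskSuccess or SendTaskFailure call returns the token.
            "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
            "Parameters": {
                "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
                "MessageBody": {"TaskToken.$": "$$.Task.Token"},
            },
            "HeartbeatSeconds": 300,
            "End": True,
        },
        "NotifyCustomer": {"Type": "Pass", "End": True},
        "Failed": {"Type": "Fail", "Error": "OrderFailed"},
    },
}


def first_matching_rule(error_name: str, rules: List[Dict]) -> Optional[Dict]:
    """Simplified model: return the first retrier/catcher that matches the error."""
    for rule in rules:
        if error_name in rule["ErrorEquals"] or "States.ALL" in rule["ErrorEquals"]:
            return rule
    return None


definition_json = json.dumps(state_machine, indent=2)
```

Calling `first_matching_rule("CardDeclined", state_machine["States"]["ChargeCard"]["Catch"])` picks the first catcher, while any other error name falls through to the `States.ALL` catcher.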
Cloud Development Kit (CDK)
- CloudFormation uses JSON/YAML, but CDK uses JavaScript/TypeScript, Python, Java, .NET
- Contains higher-level components called constructs
- Constructs encapsulate everything needed to create the final CloudFormation stack
- AWS Construct Library or Construct Hub
- Layer 1 (L1): CloudFormation (CFN) resources, prefixed with “Cfn”; all resource properties must be configured explicitly
- Layer 2 (L2): intent-based API resources with defaults and boilerplate; they also expose helper methods
- Layer 3 (L3): aka Patterns, each representing multiple related resources (for example, API Gateway + Lambda, or Fargate cluster + Application Load Balancer)
- The code is compiled (synthesized) into a CloudFormation template
- Useful for Lambda and ECS/EKS, because infrastructure and application runtime code are implemented together
- SAM focuses on serverless and is good for Lambda, but supports only JSON/YAML
- Bootstrapping: the process of provisioning resources for CDK before deploying into an AWS environment (Account+Region)
- CDKToolkit (CloudFormation stack), with an S3 bucket to store files and IAM roles
- Error: “Policy contains a statement with one or more invalid principal”, raised when a new environment has not been bootstrapped and therefore lacks the required IAM roles
- Unit tests use the CDK assertions module with Jest (JavaScript) or Pytest (Python)
- Fine-grained assertions (common): check a certain property of a certain resource
- Snapshot tests: test against a baseline template
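The difference between the two test styles above can be illustrated without installing CDK. The sketch below works on a plain dictionary that stands in for a synthesized template; `synthesized_template`, `has_resource_properties`, and `matches_snapshot` are hypothetical helpers written for this example, not CDK functions. In a real project the CDK assertions module would do this work against the actual stack.

```python
import json
from typing import Any, Dict


def synthesized_template() -> Dict[str, Any]:
    """Stand-in for the template a hypothetical stack would synthesize to."""
    return {
        "Resources": {
            "OrdersQueue1234": {
                "Type": "AWS::SQS::Queue",
                "Properties": {"VisibilityTimeout": 300, "FifoQueue": True},
            }
        }
    }


def has_resource_properties(template: Dict[str, Any], resource_type: str,
                            expected: Dict[str, Any]) -> bool:
    """Fine-grained check: some resource of this type has these property values."""
    for resource in template.get("Resources", {}).values():
        if resource.get("Type") != resource_type:
            continue
        properties = resource.get("Properties", {})
        if all(properties.get(key) == value for key, value in expected.items()):
            return True
    return False


def matches_snapshot(template: Dict[str, Any], snapshot_json: str) -> bool:
    """Snapshot check: the whole template equals a previously stored baseline."""
    return template == json.loads(snapshot_json)


template = synthesized_template()
baseline = json.dumps(template)  # in practice this would be stored in a file
```

The fine-grained check keeps passing when unrelated resources change, whereas the snapshot check fails on any difference from the baseline, which is why snapshots suit refactoring and fine-grained assertions suit regression tests.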
AWS Serverless Application Model (SAM)
- configured via JSON/YAML, transformed into a CloudFormation stack
- uses CodeDeploy to deploy Lambda functions
- Traffic shifting
- Pre- and post-traffic hooks for testing during traffic shifting
- rollback triggered by AWS CloudWatch alarms
- run Lambda, API Gateway, DynamoDB locally
- Lambda: sam local start-lambda / sam local invoke
- API Gateway: sam local start-api
- AWS Events: sam local generate-event (sample payloads for event sources)
- SAM Recipe
- Transform Header – template
- Write Code
- Package and Deploy – into S3 Bucket
- SAM Accelerate (sam sync) – reduce latency
- update existing SAM template
- use the “--code” option to sync code changes without updating infrastructure (calls service APIs directly and bypasses CloudFormation)
- SAM Policy Templates
- apply permissions to Lambda Functions
- SAM Multiple Environments, using “samconfig.toml”
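The SAM pieces above (Transform header, CodeDeploy traffic shifting with hooks and alarms, and a policy template) can be sketched as one template. SAM templates are JSON/YAML; the example builds the JSON form in Python to stay in one language. The logical IDs, handler, code path, table name, and the referenced alarm and hook functions are all hypothetical, and the referenced resources are not defined in this fragment.

```python
import json

FUNCTION_ID = "OrdersFunction"  # hypothetical logical ID

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # Transform header: tells CloudFormation to expand SAM resource types.
    "Transform": "AWS::Serverless-2016-10-31",
    "Resources": {
        FUNCTION_ID: {
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Handler": "app.handler",   # assumed module and function name
                "Runtime": "python3.12",
                "CodeUri": "src/",          # assumed source directory
                "AutoPublishAlias": "live",  # traffic shifting works on an alias
                # SAM policy template: scoped read access to one table.
                "Policies": [{"DynamoDBReadPolicy": {"TableName": "orders"}}],
                "DeploymentPreference": {
                    # Shift 10% of traffic, then the rest after 5 minutes.
                    "Type": "Canary10Percent5Minutes",
                    # Roll back if this (hypothetical) alarm fires.
                    "Alarms": [{"Ref": "OrdersErrorsAlarm"}],
                    # Hypothetical Lambda functions run before and after shifting.
                    "Hooks": {
                        "PreTraffic": {"Ref": "PreTrafficCheck"},
                        "PostTraffic": {"Ref": "PostTrafficCheck"},
                    },
                },
            },
        }
    },
}

template_json = json.dumps(template, indent=2)
```

Writing `template_json` to a template file would give the starting point for the package-and-deploy step, once the referenced alarm and hook functions are added.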
AWS CLI & SDK
- xxxxxx
- xxxxx
- xxxx
===========
AWS CloudFormation
- provision infrastructure using a text-based template that describes exactly what resources are provisioned and their settings. Can use scripts to automate the creation of member accounts and VPCs.
- manages the template history similar to how code is managed in source control
- 2 methods of updating a stack
- direct update – CloudFormation immediately deploys your changes
- change sets – preview your changes first, then decide if you want to deploy
- AWS SAM (Serverless Application Model) is an extension of CloudFormation for packaging, testing and deploying serverless applications
AWS Elastic Beanstalk
- deploy and scale web applications by uploading your code; Elastic Beanstalk provisions and manages the underlying infrastructure (EC2 instances, load balancing, Auto Scaling) for you
Disaster Recovery (DR)
- DR approaches
- Backup and restore = lowest cost, just create backups
- Pilot Light = small part of core services that is running and syncing data or documents
- Warm Standby = scaled down version of a fully functional environment that is actively running
- Multi-site = on-prem and in AWS in an active-active configuration
- For disaster recovery in a different region, create an AMI from your EC2 instance and copy it into a 2nd region.
Amazon SQS
- ideal for solutions that must be durable and loosely coupled
- pull-based (use SNS for pushing messages, especially broadcasting to multiple services)
- Standard vs. FIFO: FIFO guarantees strict ordering and exactly-once processing, whereas Standard gives best-effort ordering with at-least-once delivery. The trade-off is that Standard has nearly unlimited throughput (transactions per sec), while FIFO throughput is limited.
- Short polling vs. Long polling = how long a receive request waits for messages before it returns
- Short polling is the default. When you poll the SQS queue, it doesn’t wait for messages to be available in the queue to respond. It checks a subset of servers for messages and may respond that nothing is available yet.
- Long polling waits (up to 20 seconds) for a message to be in the queue before responding, so it uses fewer total requests and reduces cost.
- batching adds efficiency
- SQS doesn’t prioritize items in the queue. If you need to prioritize, use multiple queues, one for each priority level
- Max message size is 256 KB (for larger payloads, store the data in S3 and send a reference in the message), and max retention time is 14 days
- When a reader picks a message from the queue, the message stays in the queue but is invisible until the job is processed. If the visibility timeout occurs (job is not processed in time), then the message reappears in the queue for another reader to take.
- To use industry standards with Apache ActiveMQ, use Amazon MQ instead of SQS (this is similar to using EKS instead of ECS, the industry-standard version of containers rather than the Amazon proprietary version)
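The visibility-timeout behaviour described above can be modelled with a small in-memory queue. This is a toy model for intuition only, not the SQS API: the class `InMemoryQueue` and its methods are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class _Message:
    body: str
    visible_at: float = 0.0  # earliest time a consumer may receive the message


class InMemoryQueue:
    """Toy model of SQS visibility-timeout behaviour (not the real service)."""

    def __init__(self, visibility_timeout: float = 30.0) -> None:
        self.visibility_timeout = visibility_timeout
        self._messages: Dict[int, _Message] = {}
        self._ids = itertools.count(1)

    def send(self, body: str) -> int:
        message_id = next(self._ids)
        self._messages[message_id] = _Message(body)
        return message_id

    def receive(self, now: float) -> Optional[Tuple[int, str]]:
        """Return one visible message and hide it for the visibility timeout."""
        for message_id, message in self._messages.items():
            if message.visible_at <= now:
                # The message stays in the queue but becomes invisible.
                message.visible_at = now + self.visibility_timeout
                return message_id, message.body
        return None

    def delete(self, message_id: int) -> None:
        """Consumers delete a message only after processing it successfully."""
        self._messages.pop(message_id, None)
```

A consumer that receives a message at time 0 and never deletes it would see the same message become available again once 30 seconds have passed, which is how another reader gets a chance to process it.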
Amazon SNS
- fully managed messaging service for pushing async notifications, especially used for broadcasting to multiple services
Amazon Kinesis
- for use cases that require ingestion of real-time data (e.g. IoT sensor data)
- A Kinesis data stream is made up of shards; each shard holds data records, and each record has a sequence number. Each record's partition key (for example, a device ID) determines which shard the record is written to.
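How a partition key ends up on a shard can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch assumes the hash space is split evenly across shards, which is an assumption of this example (real shard ranges can be uneven after splits and merges); the function name is hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_partition_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would own this partition key,
    assuming shard hash key ranges split the hash space evenly."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    # Scale the hash into [0, shard_count) to find the owning range.
    return hash_value * shard_count // HASH_SPACE
```

Because the mapping depends only on the key, all records from one device (one partition key) land on the same shard and keep their relative order there.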
Amazon API Gateway
- Throttling limits: you can configure a server-side throttling limit, a per-method throttling limit, a per-client throttling limit, and an account-level throttling limit.
- API caching is enabled per stage, with a TTL (time-to-live) that defaults to 300 seconds.
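The throttling limits above are expressed as a steady-state rate plus a burst size, which follows the token bucket algorithm. The sketch below shows that algorithm in isolation; the class name and the explicit `now` argument are choices made for this example, not part of API Gateway.

```python
class TokenBucket:
    """Minimal token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate: float, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous request

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` is allowed, False if throttled."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never beyond the burst capacity.
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # a real API would answer with 429 Too Many Requests
```

With `rate=1` and `burst=2`, two requests at the same instant pass, a third is throttled, and one more is allowed a second later.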
Amazon CloudFront
- CloudFront distributes files from an origin. The origin can be an S3 bucket, EC2 instance, ELB, or an external (custom) HTTP server.
- CloudFront+S3
- S3 can host a static website but not over HTTPS. For HTTPS use CloudFront+S3 instead.
- To prevent users accessing S3 content directly, create an origin access identity (OAI) which is a special CloudFront user and change S3 bucket permissions so that only the OAI can access. This is specific to CloudFront+S3.
- Lambda@Edge is a feature of CloudFront that lets you run code closer to users of your application, which improves performance and reduces latency
- Field-level encryption is a feature that applies extra encryption at edge locations to ensure sensitive data provided by the user (e.g. PII) is secured end-to-end
- Can be configured to load an error page (“content not found”) for operationally simple error handling
- Not just for static content, CloudFront is used for streaming content too
- Geo restriction (whitelist/blacklist access to content by country, e.g. due to copyright restrictions)
- Set the price class to US, Canada, Europe, etc. to determine where the content will be cached
- To only allow specific IP addresses to access content, CloudFront can use signed URLs or signed cookies which include an expiration timestamp, and the range of IP addresses of users who can access the content.
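The expiration timestamp and IP range mentioned above live in the policy document that a signed URL or signed cookie carries. The sketch below only builds that custom policy; signing it with the CloudFront key pair is left out because it needs a private key and a crypto library. The function name, URL, and CIDR range are hypothetical.

```python
import json
from datetime import datetime, timezone


def custom_policy(resource_url: str, expires_at: datetime, source_cidr: str) -> str:
    """Build a CloudFront custom policy limiting access by expiry time and client IP."""
    policy = {
        "Statement": [
            {
                "Resource": resource_url,
                "Condition": {
                    # Access is denied after this Unix timestamp.
                    "DateLessThan": {"AWS:EpochTime": int(expires_at.timestamp())},
                    # Only clients in this IP range may use the URL.
                    "IpAddress": {"AWS:SourceIp": source_cidr},
                },
            }
        ]
    }
    # Compact separators keep the policy free of extra whitespace.
    return json.dumps(policy, separators=(",", ":"))


expires = datetime(2030, 1, 1, tzinfo=timezone.utc)
policy_json = custom_policy(
    "https://d111111abcdef8.cloudfront.net/private/*",  # placeholder distribution
    expires,
    "192.0.2.0/24",  # documentation-only address range
)
```

The resulting JSON string is what would then be signed and attached to the URL or cookie.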
AWS Global Accelerator
- increases availability and performance
- can be expensive
- runs over AWS global network
- directs traffic to optimal endpoints across multiple regions
- By default, provides you with 2 static IP addresses that are anycast from the AWS edge network. You can migrate existing IPv4 (/24) IPs rather than creating new.
AWS STS (Security Token Service)
- request temporary limited-privilege credentials for IAM users, or for users that you authenticate such as federated users from an on-prem directory
- Federation: STS can be used for federation (typically with Azure AD). It uses SAML 2.0 for authentication to grant temporary access based on the AD creds
- Single Sign-On: STS can be used to develop a custom identity broker for SSO to a service such as the AWS management console:
- Verify that the user is authenticated on the local IDP (AD)
- Call STS AssumeRole or GetFederationToken API to get temp credentials
- Pass the temp creds to AWS federation endpoint to request a sign-in token
- Construct a URL to the service that includes the token which can be provided to the user
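The last two steps of the identity broker flow above come down to building two URLs against the AWS federation endpoint. The sketch below builds them without making any network calls; it assumes the temporary credentials have already been obtained from STS, and the function names and example issuer are hypothetical.

```python
import json
from urllib.parse import urlencode

FEDERATION_ENDPOINT = "https://signin.aws.amazon.com/federation"


def signin_token_request_url(access_key_id: str, secret_access_key: str,
                             session_token: str) -> str:
    """URL the broker calls to exchange temporary credentials for a sign-in token."""
    session = json.dumps({
        "sessionId": access_key_id,
        "sessionKey": secret_access_key,
        "sessionToken": session_token,
    })
    query = urlencode({"Action": "getSigninToken", "Session": session})
    return f"{FEDERATION_ENDPOINT}?{query}"


def console_login_url(signin_token: str, issuer: str,
                      destination: str = "https://console.aws.amazon.com/") -> str:
    """URL handed to the user; opening it signs them in to the console."""
    query = urlencode({
        "Action": "login",
        "Issuer": issuer,            # e.g. the broker's own URL (assumed value)
        "Destination": destination,
        "SigninToken": signin_token,
    })
    return f"{FEDERATION_ENDPOINT}?{query}"
```

In a real broker, the response to the first URL is a JSON document whose sign-in token is passed to `console_login_url`; the credentials themselves never reach the user's browser.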
Amazon ECS
- Container management service for Docker containers
- Highly scalable / high performance, lets you run applications on an EC2 cluster
- ECS Launch Types
- Fargate Launch Type is serverless, managed by AWS
- EC2 Launch Type gives you direct access to the instances, but you have to manage them
- ECS uses the ECS Service Auto Scaling (aka Application Auto Scaling) service to scale tasks using a scaling policy that you configure.
- ECS is about tasks. You pay for the running time of tasks. For example, you can’t add container instances to an IAM group; instead, you associate tasks with IAM roles.