27. AI – Governance and QA

Bedrock Prompt Management

Agent Tracing

Evaluation Techniques

Responsible AI

CloudWatch

  • Log groups: arbitrary name, usually representing an application
  • Log stream: instances within application / log files / containers
  • Can define log expiration policies (never expire, 1 day to 10 years…)
  • CloudWatch Logs can send logs to:
    • Amazon S3 (exports)
    • Kinesis Data Streams
    • Kinesis Data Firehose
    • AWS Lambda
    • OpenSearch
  • Logs are encrypted by default
  • Can set up KMS-based encryption with your own keys
  • Sources
    • SDK, CloudWatch Logs Agent, CloudWatch Unified Agent
    • Elastic Beanstalk: collection of logs from application
    • ECS: collection from containers
    • AWS Lambda: collection from function logs
    • VPC Flow Logs: VPC specific logs
    • API Gateway
    • CloudTrail based on filter
    • Route 53: logs DNS queries
  • CloudWatch Logs Insights
    • Search and analyze log data stored in CloudWatch Logs
    • Example: find a specific IP inside a log, count occurrences of “ERROR” in your logs…
    • Provides a purpose-built query language
    • Automatically discovers fields from AWS services and JSON log events
    • Fetch desired event fields, filter based on conditions, calculate aggregate statistics, sort events, limit number of events…
    • Can save queries and add them to CloudWatch Dashboards
    • Can query multiple Log Groups in different AWS accounts
    • It’s a query engine, not a real-time engine
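The "count occurrences of ERROR" example above could be run programmatically along these lines. This is a minimal sketch, not a definitive implementation: it assumes `logs_client` behaves like `boto3.client("logs")` (its `start_query` and `get_query_results` calls), and the function name `count_errors` and the `errorCount` alias are made up for illustration.

```python
import time

# Logs Insights query: count log events whose message contains "ERROR".
ERROR_COUNT_QUERY = "filter @message like /ERROR/ | stats count(*) as errorCount"


def count_errors(logs_client, log_group, start_epoch, end_epoch, poll_seconds=1.0):
    """Run the query against one log group and return the count as an int.

    Assumption: `logs_client` behaves like boto3.client("logs").
    """
    started = logs_client.start_query(
        logGroupName=log_group,
        startTime=start_epoch,  # seconds since the Unix epoch
        endTime=end_epoch,
        queryString=ERROR_COUNT_QUERY,
    )
    query_id = started["queryId"]

    # Insights is a query engine, not a stream: poll until the query finishes.
    while True:
        response = logs_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            break
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query ended with status {status}")
        time.sleep(poll_seconds)

    # Each result row is a list of {"field": ..., "value": ...} pairs.
    for row in response["results"]:
        for cell in row:
            if cell["field"] == "errorCount":
                return int(cell["value"])
    return 0
```

Passing the client in as a parameter keeps the sketch independent of credentials and makes it easy to exercise with a stub.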
  • S3 Export
    • Log data can take up to 12 hours to become available for export
    • The API call is CreateExportTask
    • Not near-real time or real-time… use Logs Subscriptions instead
  • CloudWatch Logs Subscriptions
    • Get real-time log events from CloudWatch Logs for processing and analysis
    • Send to Kinesis Data Streams, Kinesis Data Firehose, or Lambda
    • Subscription Filter – filter which log events are delivered to your destination
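When the subscription destination is Lambda, the log events arrive base64-encoded and gzip-compressed under `event["awslogs"]["data"]`. The sketch below shows one way a handler might unpack them; the "ERROR" filtering rule and the function names are illustrative assumptions, not part of the original notes.

```python
import base64
import gzip
import json


def decode_subscription_event(event):
    """Return the decoded CloudWatch Logs payload from a Lambda event."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def handler(event, context):
    """Lambda entry point: collect the messages that contain 'ERROR'."""
    payload = decode_subscription_event(event)
    # CONTROL_MESSAGE payloads are delivery checks and carry no application logs.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [e["message"] for e in payload["logEvents"] if "ERROR" in e["message"]]
```

In practice the subscription filter pattern itself can do the first round of filtering, so the handler only sees the events it needs.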
  • CloudWatch Alarms
    • Alarms are used to trigger notifications for any metric
    • Various options (sampling, %, max, min, etc…)
    • Alarm States:
      • OK
      • INSUFFICIENT_DATA
      • ALARM
    • Period:
      • Length of time in seconds to evaluate the metric
      • High resolution custom metrics: 10 sec, 30 sec or multiples of 60 sec
    • Targets
      • Stop, Terminate, Reboot, or Recover an EC2 Instance
      • Trigger Auto Scaling Action
      • Send notification to SNS (from which you can do pretty much anything)
    • Composite Alarms
      • Composite alarms monitor the states of multiple other alarms
      • AND and OR conditions
      • Helpful to reduce “alarm noise” by creating complex composite alarms
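A composite alarm's AND / OR conditions are written as a rule expression over child alarms, for example `ALARM("a") AND ALARM("b")`. The helper below is a small sketch that builds such an expression and the arguments for CloudWatch's PutCompositeAlarm call; the helper names, alarm names, and SNS topic ARN are hypothetical.

```python
def alarm_rule(operator, alarm_names):
    """Join child alarms into a rule such as ALARM("a") AND ALARM("b")."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    if not alarm_names:
        raise ValueError("at least one child alarm is required")
    return f" {operator} ".join(f'ALARM("{name}")' for name in alarm_names)


def composite_alarm_request(name, operator, alarm_names, sns_topic_arn):
    """Build keyword arguments for a PutCompositeAlarm request."""
    return {
        "AlarmName": name,
        "AlarmRule": alarm_rule(operator, alarm_names),
        "AlarmActions": [sns_topic_arn],  # notified when the composite enters ALARM
    }
```

With boto3 installed, the result could be passed as `boto3.client("cloudwatch").put_composite_alarm(**request)`. Requiring both a CPU alarm and a latency alarm before notifying is one way this cuts alarm noise.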
    • CASE: EC2 Instance Recovery
      • Status Check:
        • Instance status = check the EC2 VM
        • System status = check the underlying hardware
        • Attached EBS status = check attached EBS volumes
      • Recovery: Same Private, Public, Elastic IP, metadata, placement group
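The recovery case can be sketched as an alarm on the system status check whose action is the EC2 recover automation. The function below only builds the request arguments; the alarm name pattern, period, and evaluation count are illustrative assumptions rather than recommended values.

```python
def recovery_alarm_request(instance_id, region):
    """Build keyword arguments for a PutMetricAlarm request that recovers
    an instance when its system status check fails."""
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per evaluation period (assumed)
        "EvaluationPeriods": 2,  # consecutive failing periods (assumed)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }
```

With boto3 installed, the dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**request)`.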
    • Alarms can be created based on CloudWatch Logs Metrics Filters
    • To test alarms and notifications, set the alarm state to ALARM using the CLI:
      aws cloudwatch set-alarm-state --alarm-name "myalarm" --state-value ALARM --state-reason "testing purposes"
  • CloudWatch and GenAI
    • Testing prompt regression
    • CloudWatch logs
      • Prompt inputs and model responses
      • Foundational to monitoring and troubleshooting
    • Monitor KPIs
      • Prompt effectiveness / response quality
      • Latency
      • Error rates
    • Other Monitors
      • Foundation model interaction tracing
      • Business impact metrics
      • Prompt effectiveness
      • Hallucination rates
      • Anomaly detection
        • Token burst patterns
        • Response drift
      • Bedrock model invocation logs
      • Cost anomaly detection
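KPIs such as latency and error rate can be published as custom CloudWatch metrics and then alarmed on. The sketch below turns a batch of invocation records into `MetricData` entries; the record shape (`latency_ms`, `error`), the metric names, and the `ModelId` dimension are assumptions made for illustration.

```python
def invocation_metrics(model_id, records):
    """Build MetricData entries (average latency, error rate) for one model.

    Assumption: each record looks like {"latency_ms": float, "error": bool}.
    """
    if not records:
        raise ValueError("records must not be empty")
    dimensions = [{"Name": "ModelId", "Value": model_id}]
    avg_latency = sum(r["latency_ms"] for r in records) / len(records)
    error_rate = 100.0 * sum(1 for r in records if r["error"]) / len(records)
    return [
        {
            "MetricName": "AppInvocationLatency",  # hypothetical metric name
            "Dimensions": dimensions,
            "Value": avg_latency,
            "Unit": "Milliseconds",
        },
        {
            "MetricName": "AppErrorRate",  # hypothetical metric name
            "Dimensions": dimensions,
            "Value": error_rate,
            "Unit": "Percent",
        },
    ]
```

With boto3 installed, the list could be sent with `boto3.client("cloudwatch").put_metric_data(Namespace="Hypothetical/GenAI", MetricData=metrics)`, where the namespace is again a placeholder.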
    • CloudWatch Real User Monitoring (RUM)
      • Monitors client-side performance of web and mobile apps (iOS or Android)
        • Measures page load times, errors, app launch times, etc.
        • From a real user session
      • Integrates with Application Signals
        • View results in X-Ray traces
      • Relevant for measuring end to end performance of your mobile GenAI apps

CloudTrail

  • Provides governance, compliance and audit for your AWS Account
  • CloudTrail is enabled by default!
  • Get a history of events / API calls made within your AWS Account by:
    • Console
    • SDK
    • CLI
    • AWS Services
  • Can put logs from CloudTrail into CloudWatch Logs or S3
  • A trail can be applied to All Regions (default) or a single Region.
  • If a resource is deleted in AWS, investigate CloudTrail first!
  • CloudTrail Events
    • Management Events:
      • Operations that are performed on resources in your AWS account
      • Examples:
        • Configuring security (IAM AttachRolePolicy)
        • Configuring rules for routing data (Amazon EC2 CreateSubnet)
        • Setting up logging (AWS CloudTrail CreateTrail)
      • By default, trails are configured to log management events.
      • Can separate Read Events (that don’t modify resources) from Write Events (that may modify resources)
    • Data Events:
      • By default, data events are not logged (because they are high-volume operations)
      • Amazon S3 object-level activity (ex: GetObject, DeleteObject, PutObject): can separate Read and Write Events
      • AWS Lambda function execution activity (the Invoke API)
    • Retention
      • Events are stored for 90 days in CloudTrail
      • To keep events beyond this period, log them to S3 and use Athena
  • CloudTrail Insights
    • Enable CloudTrail Insights to detect unusual activity in your account:
      • Inaccurate resource provisioning
      • Hitting service limits
      • Bursts of AWS IAM actions
      • Gaps in periodic maintenance activity
    • CloudTrail Insights analyzes normal management events to create a baseline
    • And then continuously analyzes write events to detect unusual patterns
      • Anomalies appear in the CloudTrail console
      • Event is sent to Amazon S3
      • An EventBridge event is generated (for automation needs)
  • CloudTrail and GenAI
    • CloudTrail can track all API calls to Amazon Bedrock
      • Audit trails of which prompts were used
      • When and by whom
    • This is often a compliance requirement
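For a quick audit of who called Bedrock and when, the CloudTrail event history can be queried by event source. This is a minimal sketch under stated assumptions: `cloudtrail_client` is assumed to behave like `boto3.client("cloudtrail")`, the event source value `bedrock.amazonaws.com` is assumed to be the one Bedrock uses, and the function name is made up for illustration.

```python
def bedrock_api_calls(cloudtrail_client, start_time, end_time):
    """Return (event time, user name, event name) tuples for Bedrock API calls.

    Assumption: `cloudtrail_client` behaves like boto3.client("cloudtrail").
    """
    calls = []
    kwargs = {
        "LookupAttributes": [
            # Assumed event source for Amazon Bedrock.
            {"AttributeKey": "EventSource", "AttributeValue": "bedrock.amazonaws.com"}
        ],
        "StartTime": start_time,
        "EndTime": end_time,
    }
    # Results are paginated: follow NextToken until it is absent.
    while True:
        page = cloudtrail_client.lookup_events(**kwargs)
        for event in page.get("Events", []):
            calls.append((event["EventTime"], event.get("Username"), event["EventName"]))
        token = page.get("NextToken")
        if not token:
            return calls
        kwargs["NextToken"] = token
```

This lookup only covers the 90-day event history; older events would need to be queried from the S3 copy, for example with Athena.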

X-Ray

Lake Formation
