AI Agents on AWS
A Beginner's Guide to Building Production-Grade AI Agents
The complete guide to building, deploying, and operating autonomous AI agents on Amazon Web Services — from your first Hello World agent to governed multi-agent systems in production.
Is This Book For You?
This book is for you if:
- You've built chatbots or RAG apps and want to go further — into agents that reason, plan, and act autonomously
- You want to deploy agents to production on AWS, not just run them in a notebook
- You need to understand MCP, A2A, and multi-agent patterns for real enterprise work
- You care about governance, observability, and security — not just making things work
This book is NOT for you if:
- You want a high-level overview of AI without writing code
- You're looking for a general machine learning or LLM fine-tuning book
- You don't have basic Python and AWS familiarity
What's Inside
- Chapter 1 Understanding AI Agents on AWS
This chapter introduces the foundational concepts of AI agents, explaining how they differ from traditional LLM applications that follow rigid logic and RAG systems that enhance responses with retrieved context. You will learn about the core components that define an agent — reasoning, memory, tool use, and planning — and how these capabilities enable autonomous, goal-driven behavior. The chapter maps out the AWS Agentic Stack spanning Amazon Q, Amazon Bedrock, and Amazon SageMaker, introduces the MCP and A2A communication protocols for tool discovery and agent collaboration, and walks you through setting up your development environment and building your first Hello World agent with the Strands SDK.
- Chapter 2 Building Agents with Tools
This chapter teaches you how to give agents the ability to interact with the real world through tools. You will learn to transform Python functions into agent-accessible tools using the @tool decorator, work with pre-built tools from the Strands ecosystem for file operations, web search, and AWS service integration, and connect to external services through MCP. The chapter covers multi-tool orchestration where the agent autonomously determines which tools to call and in what order, and addresses advanced patterns including shared database connections via class-based tools and async tools for parallel execution.
- Chapter 3 Agent Memory
This chapter explores how to give agents the ability to retain information across interactions. You will learn about short-term memory for maintaining conversation context and long-term memory for learning user preferences and retaining knowledge across sessions. The chapter covers memory implementation using storage backends such as Amazon DynamoDB, Redis, and Amazon Bedrock AgentCore Memory, and walks you through building a personalized agent that remembers individual user preferences using Mem0 and the Strands SDK. It also covers context management strategies, memory optimization techniques, and the challenges of implementing memory at scale.
- Chapter 4 Advanced Agent Architecture Patterns
This chapter moves beyond single agents to multi-agent systems where specialized agents collaborate to solve complex problems. You will learn how to coordinate multiple agents through supervisor-worker patterns where a manager delegates to specialists, swarm patterns for decentralized collaboration, and graph patterns for structured workflows. The chapter explores the agent-as-tool pattern where one agent wraps another as a callable tool, and includes hands-on examples culminating in a research supervisor system that orchestrates multiple agents to accomplish complex, multi-step tasks.
- Chapter 5 Agent Communication
This chapter takes a deep dive into the protocols that enable agents to discover tools and collaborate with other agents at scale. You will build MCP servers and clients from scratch using FastMCP, learning how tools, resources, and prompt templates work within the protocol. The chapter then covers the A2A protocol in detail, including agent cards for capability advertisement and the four execution modes (synchronous, asynchronous, streaming, and push notifications). It closes by implementing a multi-agent system in which a travel orchestrator coordinates weather and flight agents using A2A with the Strands SDK.
- Chapter 6 Production Deployment and Enterprise Integration
This chapter bridges the gap between a working prototype and a production system. You will deploy agents on AWS Lambda for event-driven workloads, Amazon ECS with Fargate for always-on multi-turn agents, and Amazon Bedrock AgentCore Runtime for managed, session-isolated deployments with built-in identity, policy, and gateway services. The chapter provides a clear decision framework for choosing the right deployment path and covers enterprise integration patterns for databases, external APIs, and event-driven architectures, with real-world use cases from financial services, healthcare, and manufacturing.
- Chapter 7 Evaluation, Observability, and AI Governance
This chapter covers the final pieces needed to run agents responsibly in production. The evaluation section introduces agent evaluation concepts, including built-in and custom evaluators and LLM-as-a-judge scoring, and walks through a complete hands-on lab using Amazon Bedrock AgentCore Evaluations. The observability section covers structured tracing using OpenTelemetry with Jaeger for local development, Langfuse and LangSmith for production analysis, and AgentCore Observability for platform-level monitoring. The governance section covers Amazon Bedrock Guardrails for content filtering, AgentCore Policy with Cedar for deterministic tool-call authorization that prompt injection cannot bypass, and organizational frameworks including model cards, audit trails, and incident response plans.
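To give a flavor of two ideas named in the chapter summaries, registering plain Python functions as tools and authorizing tool calls deterministically, here is a minimal sketch in standard-library Python. It does not use the Strands SDK, Cedar, or AgentCore Policy: the `tool` decorator, the `POLICY` allow-list, the role names, and the `call_tool` helper are all hypothetical stand-ins invented for illustration. The point it tries to show is that the authorization check runs in ordinary code outside the model, so text in a prompt cannot change its outcome.

```python
from typing import Any, Callable, Dict, Set

# Hypothetical registry mapping a tool name to a plain Python function.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as a tool (illustrative only, not the Strands decorator)."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_balance(account_id: str) -> str:
    """Read-only example tool."""
    return f"balance for {account_id}: 100.00"


@tool
def transfer_funds(account_id: str, amount: float) -> str:
    """State-changing example tool that should be restricted."""
    return f"transferred {amount:.2f} from {account_id}"


# Hypothetical allow-list policy: which role may call which tools.
# A real deployment would express this in a policy language, not a dict.
POLICY: Dict[str, Set[str]] = {
    "viewer": {"get_balance"},
    "operator": {"get_balance", "transfer_funds"},
}


def call_tool(role: str, tool_name: str, **kwargs: Any) -> Any:
    """Run a tool only if the policy allows it for this role."""
    if tool_name not in TOOLS:
        raise KeyError(f"unknown tool: {tool_name!r}")
    # Deterministic check: depends only on role and tool name,
    # never on anything the model generated as free text.
    if tool_name not in POLICY.get(role, set()):
        raise PermissionError(f"role {role!r} may not call {tool_name!r}")
    return TOOLS[tool_name](**kwargs)


if __name__ == "__main__":
    print(call_tool("viewer", "get_balance", account_id="A1"))
    try:
        call_tool("viewer", "transfer_funds", account_id="A1", amount=5.0)
    except PermissionError as exc:
        print(f"denied: {exc}")
```

In this sketch, an agent loop would route every model-requested tool call through `call_tool`, so a denied call fails the same way no matter how the request was phrased.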