How To Deploy with CDK

Note: All paths in this guide are relative to the repository root unless otherwise noted. Configuration files are located in the iac-cdk/ directory (e.g., iac-cdk/bin/config.yaml).

Pre-requisites

AWS Account Setup:
- AWS account with appropriate permissions
- Permission to bootstrap CDK for the account in the deployment region (cd iac-cdk && cdk bootstrap)

Local Requirements:
- Node.js (version 20 recommended)
- AWS CLI configured with valid credentials
- zip command available (pre-installed on macOS and most Linux distributions)

Note: Docker and Finch are not required. All container image builds, pip layer builds, and the React frontend build run remotely on AWS CodeBuild.

Deployment

ℹ️ For detailed CDK commands, refer to the official documentation.

Makefile

⚠️ Use the Makefile shortcuts so that the required scripts run before deployment:

make deploy PROFILE=user-profile

The make deploy command runs a three-phase deployment:

- Phase 1 (BuilderStack): Deploys CodeBuild projects, ECR repositories, and S3 artifact buckets.
- Phase 2 (CodeBuild builds): Triggers all CodeBuild builds in parallel (Docker images, pip layers, React frontend) and waits for completion. Only projects whose source has changed are rebuilt.
- Phase 3 (AcaStack): Deploys the application stack, consuming the pre-built artifacts from Phase 2.
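
The ordering of the three phases can be modelled as a small planning function. This is only an illustrative sketch: the BuildProject shape, the hash fields, and planDeployment are hypothetical names, and comparing source hashes is an assumed way of detecting "source has changed", not necessarily what the Makefile does.

```typescript
// Hypothetical model of a CodeBuild project; field names are illustrative only.
interface BuildProject {
  name: string;
  sourceHash: string;     // assumed fingerprint of the current source
  lastBuiltHash?: string; // assumed fingerprint recorded at the last successful build
}

type Step =
  | { phase: 1; action: "deploy-stack"; stack: "BuilderStack" }
  | { phase: 2; action: "run-builds"; projects: string[] }
  | { phase: 3; action: "deploy-stack"; stack: "AcaStack" };

// Returns the three phases in order. Phase 2 lists only the projects whose
// source fingerprint differs from the one recorded at the last build.
function planDeployment(projects: BuildProject[]): Step[] {
  const stale = projects
    .filter((p) => p.sourceHash !== p.lastBuiltHash)
    .map((p) => p.name);
  return [
    { phase: 1, action: "deploy-stack", stack: "BuilderStack" },
    { phase: 2, action: "run-builds", projects: stale },
    { phase: 3, action: "deploy-stack", stack: "AcaStack" },
  ];
}
```

A project that has never been built has no lastBuiltHash, so it is always included in Phase 2.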

To destroy the stacks:

make destroy PROFILE=user-profile

CDK Configuration

The CDK stack is configured through the SystemConfig interface defined in iac-cdk/lib/shared/types.ts. Configuration is loaded from iac-cdk/bin/config.yaml, with fallback to default values in iac-cdk/bin/config.ts.
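
One way to read the fallback described above is as a per-key merge of the YAML values over the defaults. The sketch below assumes that reading. ConfigSubset, DEFAULTS, and resolveConfig are hypothetical names covering only three of the documented keys, and the real loader may behave differently (for example, falling back only when the YAML file is missing).

```typescript
// Hypothetical subset of SystemConfig; the real interface is defined in
// iac-cdk/lib/shared/types.ts and has more fields.
interface ConfigSubset {
  prefix: string;
  enableGeoRestrictions: boolean;
  allowedGeoRegions: string[];
}

// Illustrative defaults, standing in for those in iac-cdk/bin/config.ts.
const DEFAULTS: ConfigSubset = {
  prefix: "dev",
  enableGeoRestrictions: false,
  allowedGeoRegions: [],
};

// Assumed semantics: with no YAML config, use the defaults as they are;
// otherwise each top-level key from the YAML overrides its default.
function resolveConfig(fromYaml: Partial<ConfigSubset> | undefined): ConfigSubset {
  if (fromYaml === undefined) {
    return { ...DEFAULTS };
  }
  return { ...DEFAULTS, ...fromYaml };
}
```

Under this assumption, a config.yaml that sets only prefix would still pick up the default geo-restriction settings.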

Configuration Structure

- `prefix`: Resource naming prefix (e.g., "dev", "prod")
- `enableGeoRestrictions`: Boolean flag for geographic access restrictions
- `allowedGeoRegions`: Array of geographic regions allowed when restrictions are enabled
- `dataProcessingParameters`: (Optional) Configuration for data processing workflows, including file prefixes and language settings. If omitted, the data processing pipeline will not be deployed.
- `knowledgeBaseParameters`: (Optional) Knowledge base configuration, including chunking strategies (FIXED_SIZE, HIERARCHICAL, SEMANTIC, NONE), embedding models, and descriptions. If omitted, the Knowledge Base feature will not be deployed.
- `supportedModels`: Map of foundation model names to the Bedrock model identifiers used by agents. The [REGION-PREFIX] placeholder marks identifiers that use cross-region inference profiles.
- `rerankingModels`: (Optional) Map of reranking model names to Bedrock model identifiers for improving knowledge base retrieval relevance. Supported models include Cohere Rerank 3.5 and Amazon Rerank 1.0.
- `toolRegistry`: Array of available tools with name, description, and sub-agent invocation flags
- `mcpServerRegistry`: Array of available MCP servers with name, description, and URL
- `ingestionLambdaProps`: Lambda function configuration for chatbot message ingestion, including timeout in minutes and (optionally) reserved concurrency
- `agentCoreObservability`: (Optional) Configuration for X-Ray distributed tracing of agent invocations, including transaction search and trace indexing percentage. If omitted, observability features are not enabled. See Observability & Insights.
- `agentRuntimeConfig`: (Optional) Default agent runtime configuration to deploy via CDK. If provided, an AgentCore runtime is created automatically during deployment with the specified settings. If omitted, agent runtimes must be created manually through the Agent Factory UI.
- `evaluatorConfig`: (Optional) Configuration for the LLM-based evaluation framework, including supported models, pass threshold, and default rubrics. Defaults are provided in config.ts.
- `experimentsConfig`: (Optional) Configuration for synthetic data generation, including supported models, VPC settings, and Batch infrastructure toggle. See Experiments Configuration for details. Defaults are provided in config.ts.
- `bedrockAccessRoleArn`: (Optional) IAM role ARN for cross-account Amazon Bedrock access.
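
To illustrate how the [REGION-PREFIX] placeholder in supportedModels might be expanded, here is a minimal sketch. The function names are hypothetical, and the region-to-prefix mapping (us, eu, apac for commercial regions) is an assumption based on common Bedrock inference profile naming. The project's actual substitution logic may cover more cases.

```typescript
// Assumed mapping from a commercial AWS region to an inference profile prefix.
// GovCloud and other partitions are deliberately rejected in this sketch.
function regionPrefix(region: string): string {
  const parts = region.split("-");
  const geo = parts[0];
  if (parts[1] !== "gov") {
    if (geo === "us") return "us";
    if (geo === "eu") return "eu";
    if (geo === "ap") return "apac";
  }
  throw new Error(`No inference profile prefix assumed for region ${region}`);
}

// Replaces the placeholder in every model identifier that contains it.
// Identifiers without the placeholder are returned unchanged.
function resolveModelIds(
  models: Record<string, string>,
  region: string
): Record<string, string> {
  const placeholder = "[REGION-PREFIX]";
  const resolved: Record<string, string> = {};
  for (const name of Object.keys(models)) {
    const id = models[name];
    if (id === undefined) continue;
    resolved[name] =
      id.indexOf(placeholder) !== -1
        ? id.replace(placeholder, regionPrefix(region))
        : id;
  }
  return resolved;
}
```

The prefix is looked up only for identifiers that contain the placeholder, so a map of plain model identifiers resolves in any region.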

Example of Configuration File

prefix: temp
enableGeoRestrictions: false
allowedGeoRegions: []
dataProcessingParameters:
    inputPrefix: inputs
    dataSourcePrefix: knowledge-base-data-source
    processingPrefix: processing
    stagingMidfix: input
    transcribeMidfix: transcribe
    languageCode: en-US
knowledgeBaseParameters:
    chunkingStrategy:
        type: HIERARCHICAL
        hierarchicalChunkingProps:
            overlapTokens: 60
            maxParentTokenSize: 1500
            maxChildTokenSize: 300
    embeddingModel:
        modelId: amazon.titan-embed-text-v2:0
        vectorDimension: 1024
    dataSourcePrefix: knowledge-base-data-source
    description: Knowledge Base that contains resources on AWS services.
supportedModels:
    Claude Sonnet 4.6: "[REGION-PREFIX].anthropic.claude-sonnet-4-6"
    Claude Haiku 4.5: "[REGION-PREFIX].anthropic.claude-haiku-4-5-20251001-v1:0"
    Nova 2 Lite: "[REGION-PREFIX].amazon.nova-2-lite-v1:0"
    GPT OSS 20B: "openai.gpt-oss-20b-1:0"
rerankingModels:
    Cohere Rerank 3.5: cohere.rerank-v3-5:0
    Amazon Rerank 1.0: amazon.rerank-v1:0
toolRegistry:
    - name: "get_current_time"
      description: "Get the current date and time in the specified timezone. Helpful when user refers to relative time (yesterday, today, this year, now, etc.)"
      invokesSubAgent: false
    - name: "invoke_subagent"
      description: "Invoke a sub-agent to handle specialized tasks or domain-specific queries that require dedicated processing"
      invokesSubAgent: true
ingestionLambdaProps:
    timeoutInMinutes: 3
    reservedConcurrency: 20
agentCoreObservability:
    enableTransactionSearch: false
    indexingPercentage: 10
mcpServerRegistry:
    - name: pubmed_mcp
      runtimeId: mcp_pubmed_server-yourid
      qualifier: DEFAULT
      description: A Model Context Protocol (MCP) server that provides tools for searching, retrieving, and exploring biomedical literature from PubMed via NCBI E-utilities.
agentRuntimeConfig:
    modelInferenceParameters:
        modelId: us.amazon.nova-2-lite-v1:0
        parameters:
            maxTokens: 2000
            temperature: 0.9
    instructions: |
        You are an agent who is great at making jokes.
        Your answer should contain the joke inside <final></final> XML tags.
        If the user does not specify a topic, ask for it before generating the joke.
    description: Testing Agent CDK deployment with a joke maker
    tools: []
    toolParameters: {}
    mcpServers: []
    conversationManager: sliding_window

⚠️ If transaction search is already enabled in the account where you want to deploy the stack, set enableTransactionSearch to false; otherwise the deployment will fail.

Deployment Scenarios

The Agentic Chatbot Accelerator supports flexible deployment configurations based on your use case:

Full Deployment (with Knowledge Base)

Include both dataProcessingParameters and knowledgeBaseParameters in your configuration to deploy the complete solution with document processing and knowledge base capabilities. This enables:

- Document upload and processing pipeline
- Knowledge base creation and management from the UI
- RAG (Retrieval-Augmented Generation) capabilities for agents

Minimal Deployment (without Knowledge Base)

Omit both dataProcessingParameters and knowledgeBaseParameters from your configuration to deploy a lightweight version focused only on agent management:

prefix: dev
enableGeoRestrictions: false
allowedGeoRegions: []
# dataProcessingParameters: omitted
# knowledgeBaseParameters: omitted
supportedModels:
    Claude Haiku 3.5: "[REGION-PREFIX].anthropic.claude-3-5-haiku-20241022-v1:0"
    # ... other models
toolRegistry:
  - name: "get_current_time"
    description: "Get the current date and time"
    invokesSubAgent: false
ingestionLambdaProps:
    timeoutInMinutes: 3

This configuration:

- Deploys only the Agent Factory and chatbot interface
- Hides Knowledge Base-related navigation items in the UI
- Reduces deployment complexity and resource footprint
- Is ideal for use cases that don't require RAG capabilities or that use external knowledge sources via MCP servers
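
The relationship between optional configuration keys and deployed features can be summarised in a few lines. This is a hedged sketch, not the stack's real logic: DeploymentInput, DeploymentPlan, and planFeatures are hypothetical names, and the only rule encoded is the documented one that omitting a key skips the corresponding feature.

```typescript
// Hypothetical view of the optional top-level keys; contents are not inspected.
interface DeploymentInput {
  dataProcessingParameters?: object;
  knowledgeBaseParameters?: object;
  agentRuntimeConfig?: object;
}

interface DeploymentPlan {
  dataProcessingPipeline: boolean;
  knowledgeBase: boolean;
  preconfiguredRuntime: boolean;
}

// A feature is planned only when its configuration key is present.
function planFeatures(cfg: DeploymentInput): DeploymentPlan {
  return {
    dataProcessingPipeline: cfg.dataProcessingParameters !== undefined,
    knowledgeBase: cfg.knowledgeBaseParameters !== undefined,
    preconfiguredRuntime: cfg.agentRuntimeConfig !== undefined,
  };
}
```

For a minimal deployment, planFeatures({}) reports all three features as not deployed.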

Pre-configured Agent Runtime (via CDK)

Include agentRuntimeConfig in your configuration to automatically deploy an agent runtime during CDK deployment. This approach:

- Eliminates manual configuration through the Agent Factory UI after deployment
- Suits standardized deployments and CI/CD pipelines
- Creates the runtime automatically when the stack is deployed
- Protects CDK-owned runtimes from deletion by the cleanup handler

agentRuntimeConfig:
    modelInferenceParameters:
        modelId: "[REGION-PREFIX].anthropic.claude-3-5-haiku-20241022-v1:0"
        parameters:
            temperature: 0.5
            maxTokens: 4096
    instructions: "Your system prompt here"
    tools: ["get_current_time"]
    toolParameters: {}
    mcpServers: []
    conversationManager: "sliding_window"
    description: "Optional description"
    memoryCfg:
        retentionDays: 30
    lifecycleCfg:
        idleRuntimeSessionTimeoutInMinutes: 30
        maxLifetimeInHours: 24

The agentRuntimeConfig supports the following properties:

| Property | Required | Description |
| --- | --- | --- |
| `modelInferenceParameters` | Yes | Model configuration including `modelId` and `parameters` (temperature, maxTokens, stopSequences) |
| `instructions` | Yes | System prompt defining the agent's behavior |
| `tools` | Yes | Array of tool names from `toolRegistry` |
| `toolParameters` | Yes | Tool-specific configuration (can be empty `{}`) |
| `mcpServers` | Yes | Array of MCP server names (can be empty `[]`) |
| `conversationManager` | Yes | Strategy: `"sliding_window"`, `"summarization"`, or `"none"` |
| `description` | No | Optional description of the runtime |
| `memoryCfg` | No | Memory persistence configuration with `retentionDays` |
| `lifecycleCfg` | No | Lifecycle settings with `idleRuntimeSessionTimeoutInMinutes` and `maxLifetimeInHours` |
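
The required properties listed above lend themselves to a quick pre-deployment check. The following is a minimal sketch under stated assumptions: validateAgentRuntimeConfig is a hypothetical helper (not part of the repository), it checks only presence of required keys, the conversationManager value, and that each tool name appears in the tool registry.

```typescript
const REQUIRED_KEYS = [
  "modelInferenceParameters",
  "instructions",
  "tools",
  "toolParameters",
  "mcpServers",
  "conversationManager",
];

const CONVERSATION_MANAGERS = ["sliding_window", "summarization", "none"];

// Returns a list of human-readable problems; an empty list means the
// checks in this sketch passed. It does not validate nested structures.
function validateAgentRuntimeConfig(
  cfg: Record<string, unknown>,
  toolRegistryNames: string[]
): string[] {
  const errors: string[] = [];
  for (const key of REQUIRED_KEYS) {
    if (cfg[key] === undefined || cfg[key] === null) {
      errors.push(`missing required property: ${key}`);
    }
  }
  const manager = cfg["conversationManager"];
  if (manager !== undefined && manager !== null) {
    if (typeof manager !== "string" || CONVERSATION_MANAGERS.indexOf(manager) === -1) {
      errors.push(`unsupported conversationManager: ${String(manager)}`);
    }
  }
  const tools = cfg["tools"];
  if (Array.isArray(tools)) {
    for (const tool of tools) {
      if (typeof tool !== "string" || toolRegistryNames.indexOf(tool) === -1) {
        errors.push(`tool not in toolRegistry: ${String(tool)}`);
      }
    }
  }
  return errors;
}
```

Running a check like this against the parsed YAML before make deploy could surface a missing key earlier than a failed stack deployment would.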

Experiments Configuration (VPC & Batch)

The experiments feature uses AWS Batch (Fargate) to run synthetic test case generation. By default, this creates a new VPC with NAT gateways. If your account has limited VPC permissions, you can either provide an existing VPC or disable Batch infrastructure entirely.

| `deployBatchInfrastructure` | `vpcId` | Result |
| --- | --- | --- |
| `true` (default) | omitted | New VPC + Batch created automatically |
| `true` (default) | provided | Batch uses the existing VPC; no new VPC created |
| `false` | any | Batch infrastructure not deployed; experiment CRUD still works but automated generation is disabled; the Experiments page is hidden in the UI |
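
The decision table above can be expressed as a small function. This is an illustrative sketch only: ExperimentsInfra and planExperimentsInfra are hypothetical names, and the function encodes just the three documented rows.

```typescript
interface ExperimentsInfra {
  createVpc: boolean;           // a new VPC is created
  deployBatch: boolean;         // AWS Batch (Fargate) resources are deployed
  automatedGeneration: boolean; // automated test case generation is available
}

// deployBatchInfrastructure defaults to true when omitted, as in the table.
function planExperimentsInfra(cfg: {
  deployBatchInfrastructure?: boolean;
  vpcId?: string;
}): ExperimentsInfra {
  const deployBatch = cfg.deployBatchInfrastructure ?? true;
  if (!deployBatch) {
    // vpcId is ignored when Batch infrastructure is disabled.
    return { createVpc: false, deployBatch: false, automatedGeneration: false };
  }
  return {
    createVpc: cfg.vpcId === undefined,
    deployBatch: true,
    automatedGeneration: true,
  };
}
```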

Using an existing VPC:

experimentsConfig:
    supportedModels:
        Claude Haiku 4.5: "[REGION-PREFIX].anthropic.claude-haiku-4-5-20251001-v1:0"
    vpcId: "vpc-0123456789abcdef0"

⚠️ The provided VPC must have private subnets with NAT gateway access so Fargate tasks can pull container images and reach AWS APIs.

Disabling Batch infrastructure:

experimentsConfig:
    supportedModels:
        Claude Haiku 4.5: "[REGION-PREFIX].anthropic.claude-haiku-4-5-20251001-v1:0"
    deployBatchInfrastructure: false

With this configuration:

- VPC and AWS Batch resource creation is skipped entirely
- Experiment CRUD operations (create, list, update, delete) remain functional
- The runExperiment mutation (automated test case generation) is unavailable
- The "Experiments Generator" page is hidden from the UI navigation

Post-Deployment Steps

- Note the outputs: CDK displays important information, such as the CloudFront URL where the web application is hosted
- Create Cognito user: Add users to the generated User Pool
- Access application: Use the CloudFront URL from the deployment outputs
