temporalio · donald-pinckney · Feb 13, 2026
diff --git a/references/core/ai-integration.md b/references/core/ai-integration.md
@@ -4,7 +4,9 @@
 
 Temporal provides durable execution for AI/LLM applications, handling retries, rate limits, and long-running operations automatically. These patterns apply across languages, with Python being the most mature for AI integration.
 
-For Python-specific implementation details and code examples, see `references/python/ai-patterns.md`.
+For Python-specific implementation details and code examples, see `references/python/ai-patterns.md`. Temporal's Python SDK also provides pre-built integrations with several LLM and agent SDKs, which you can use to build agentic workflows with minimal effort.
+
+The remainder of this document describes general principles to follow when building AI/LLM applications in Temporal, particularly when building from scratch rather than with an integration.
 
 ## Why Temporal for AI?
 
@@ -19,28 +21,24 @@ For Python-specific implementation details and code examples, see `references/py
 
 ## Core Patterns
 
-### Pattern 1: Generic LLM Activity
-
-Create flexible, reusable activities for LLM calls:
+### Pattern 1: Activities Should Wrap LLM Calls
 
-```
-Activity: call_llm_generic(
-    model: string,
-    system_instructions: string,
-    user_input: string,
-    tools?: list,
-    response_format?: schema
-) -> response
-```
+- activity: call_llm
+  - inputs:
+    - model_id: the activity can route to different models internally, so there is no need for one activity per model.
+    - prompt / chat history
+    - tools
+    - etc.
+  - returns: the model response, as a typed structured output
 
 **Benefits**:
 - Single activity handles multiple use cases
 - Consistent retry handling
 - Centralized configuration
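
As a minimal sketch of the `call_llm` shape described above, the following plain-Python function takes a typed request, routes on `model_id`, and returns a typed response. The backend functions, the model names, and the dataclass fields are hypothetical stand-ins rather than part of any SDK; in a real Temporal application the function would be registered as an activity (for example with `@activity.defn` in the Python SDK) and the backends would call an actual LLM client.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class LLMRequest:
    model_id: str
    messages: list[dict[str, str]]
    tools: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class LLMResponse:
    model_id: str
    text: str


# Hypothetical backends: stand-ins for real LLM client calls.
def _small_backend(request: LLMRequest) -> LLMResponse:
    return LLMResponse(request.model_id, "small: " + request.messages[-1]["content"])


def _large_backend(request: LLMRequest) -> LLMResponse:
    return LLMResponse(request.model_id, "large: " + request.messages[-1]["content"])


# One routing table instead of one activity per model (model names are made up).
_BACKENDS: dict[str, Callable[[LLMRequest], LLMResponse]] = {
    "small-model": _small_backend,
    "large-model": _large_backend,
}


def call_llm(request: LLMRequest) -> LLMResponse:
    """Route to the backend for request.model_id and return a typed response."""
    backend = _BACKENDS.get(request.model_id)
    if backend is None:
        raise ValueError(f"unknown model_id: {request.model_id!r}")
    return backend(request)
```

Keeping the request and response as dataclasses gives the workflow a typed contract with the activity, regardless of which model handled the call.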
 
-### Pattern 2: Activity-Based Separation
+### Pattern 2: Non-deterministic / heavy tools in Activities
 
-Isolate each operation in its own activity:
+Tools that are non-deterministic or perform heavy operations (file system access, API calls, etc.) should run in activities:
 
 ```
 Workflow:
@@ -55,55 +53,32 @@ Workflow:
 - Easier testing and mocking
 - Failure isolation
 
-### Pattern 3: Centralized Retry Management
+### Pattern 3: Tools that Mutate Agent State can be in the Workflow directly
 
-**Critical**: Disable retries in LLM client libraries, let Temporal handle retries.
+Generally, agent state maps one-to-one onto workflow state. Thus, tools that mutate agent state and are deterministic (like TODO tools that just update a hash map) typically belong in workflow code rather than in an activity.
 
 ```
-LLM Client Config:
-  max_retries = 0  ← Disable client retries
-
-Activity Retry Policy:
-  initial_interval = 1s
-  backoff_coefficient = 2.0
-  maximum_attempts = 5
-  maximum_interval = 60s
+Workflow:
+  ├── Activity: call_llm (tool selection: todos_write tool)
+  ├── Write new TODOs to workflow state (not in activity)
+  └── Activity: call_llm (continuing agent flow...)
 ```
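
The flow above can be sketched in plain Python, under the assumption that the agent state is a simple dataclass held by the workflow and that the LLM's tool call arrives as a dict with `name` and `arguments` keys. The names `AgentState`, `apply_todos_write`, and `handle_tool_call` are hypothetical; the point is only that the TODO tool is a pure in-memory update, so it needs no activity.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    # todo id -> status; in a real workflow this would live on the workflow instance
    todos: dict[str, str] = field(default_factory=dict)


def apply_todos_write(state: AgentState, arguments: dict[str, str]) -> str:
    """Deterministic tool: no I/O, clock, or randomness, only a dict update."""
    state.todos.update(arguments)
    return f"{len(state.todos)} todos tracked"


def handle_tool_call(state: AgentState, tool_call: dict) -> str:
    """Run state-mutating tools inline; anything else belongs in an activity."""
    if tool_call["name"] == "todos_write":
        return apply_todos_write(state, tool_call["arguments"])
    raise NotImplementedError("non-deterministic or heavy tools go in activities")
```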
 
+### Pattern 4: Centralized Retry Management
+
+Disable retries in LLM client libraries and let Temporal handle them.
+
+- LLM Client Config:
+  - max_retries = 0  ← Disable client retries at the LLM client level
+
+Use either the default activity retry policy, or customize it as needed for the situation.
+
 **Why**:
 - Temporal retries are durable (survive crashes)
 - Single retry configuration point
 - Better visibility into retry attempts
 - Consistent backoff behavior
 
-### Pattern 4: Tool-Calling Agent
-
-Three-phase workflow for LLM agents with tools:
-
-```
-┌─────────────────────────────────────────────┐
-│ Phase 1: Tool Selection                      │
-│   Activity: Present tools to LLM             │
-│   LLM returns: tool_name, arguments          │
-└─────────────────────────────────────────────┘
-                    │
-                    ▼
-┌─────────────────────────────────────────────┐
-│ Phase 2: Tool Execution                      │
-│   Activity: Execute selected tool            │
-│   (Separate activity per tool type)          │
-└─────────────────────────────────────────────┘
-                    │
-                    ▼
-┌─────────────────────────────────────────────┐
-│ Phase 3: Result Interpretation               │
-│   Activity: Send results back to LLM         │
-│   LLM returns: final response or next tool   │
-└─────────────────────────────────────────────┘
-                    │
-                    ▼
-        Loop until LLM returns final answer
-```
 
 ### Pattern 5: Multi-Agent Orchestration
 
@@ -127,21 +102,6 @@ Deep Research Example:
 
 **Key Pattern**: Use parallel execution with `return_exceptions=True` to continue with partial results when some searches fail.
 
-### Pattern 6: Structured Outputs
-
-Define schemas for LLM responses:
-
-```
-Input: Raw LLM prompt
-Schema: { action: string, confidence: float, reasoning: string }
-Output: Validated, typed response
-```
-
-**Benefits**:
-- Type safety
-- Automatic validation
-- Easier downstream processing
-
 ## Timeout Recommendations
 
 | Operation Type | Recommended Timeout |
@@ -165,27 +125,14 @@ Output: Validated, typed response
 
 Parse rate limit info from API responses:
 
-```
-Response Headers:
-  Retry-After: 30
-  X-RateLimit-Remaining: 0
-
-Activity:
-  If rate limited:
-    Raise retryable error with retry_after hint
-    Temporal handles the delay
-```
-
-### Retry Policy Configuration
+- Response Headers:
+  - Retry-After: 30
+  - X-RateLimit-Remaining: 0
 
-```
-Retry Policy:
-  initial_interval: 1s (or from Retry-After header)
-  backoff_coefficient: 2.0
-  maximum_interval: 60s
-  maximum_attempts: 10
-  non_retryable_errors: [InvalidAPIKey, InvalidInput]
-```
+- Activity:
+  - If rate limited:
+    - Raise retryable error with retry_after hint
+    - Temporal handles the delay
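
A minimal sketch of the header parsing described above, assuming the provider signals rate limiting with HTTP status 429 and a `Retry-After` value in whole seconds. `RateLimitedError` is a hypothetical exception defined here for illustration; in the Python SDK, `ApplicationError` accepts a `next_retry_delay` argument that can carry the same hint, but check the SDK reference for your version before relying on it.

```python
from datetime import timedelta
from typing import Mapping, Optional


class RateLimitedError(Exception):
    """Hypothetical retryable error that carries the server's retry hint."""

    def __init__(self, message: str, retry_after: Optional[timedelta]) -> None:
        super().__init__(message)
        self.retry_after = retry_after


def parse_retry_after(headers: Mapping[str, str]) -> Optional[timedelta]:
    """Return the Retry-After delay, or None if absent or not an integer.

    Assumes the header key is spelled exactly "Retry-After"; the HTTP-date
    form of the header is not handled in this sketch.
    """
    raw = headers.get("Retry-After")
    if raw is None:
        return None
    try:
        seconds = int(raw.strip())
    except ValueError:
        return None
    return timedelta(seconds=max(seconds, 0))


def check_rate_limit(status_code: int, headers: Mapping[str, str]) -> None:
    """Raise a retryable error with the retry hint when the API rate-limits us."""
    if status_code == 429:  # assumption: provider uses 429 for rate limiting
        raise RateLimitedError("rate limited by LLM provider", parse_retry_after(headers))
```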
 
 ## Error Handling
 
@@ -209,15 +156,13 @@ Retry Policy:
 4. **Use structured outputs** - For type safety and validation
 5. **Handle partial failures** - Continue with available results
 6. **Monitor costs** - Track LLM calls at activity level
-7. **Version prompts** - Track prompt changes in code
-8. **Test with mocks** - Mock LLM responses in tests
+7. **Test with mocks** - Mock LLM responses in tests
 
 ## Observability
 
-- **Activity duration**: Track LLM latency
-- **Retry counts**: Monitor rate limiting
-- **Token usage**: Log in activity output
-- **Cost attribution**: Tag workflows with cost centers
+See `references/python/observability.md` (or the equivalent file for the language you are working in) for documentation on observability in Temporal. It is generally recommended to add observability for:
+- Token usage, via activity logging
+- Anything else that helps track LLM usage and debug agentic flows, in moderation.
 
 ## Language-Specific Resources
 

diff --git a/references/core/common-gotchas.md b/references/core/common-gotchas.md
@@ -2,9 +2,7 @@
 
 Common mistakes and anti-patterns in Temporal development. Learning from these saves significant debugging time.
 
-## Idempotency Issues
-
-### Non-Idempotent Activities
+## Non-Idempotent Activities
 
 **The Problem**: Activities may execute more than once due to retries or Worker failures. If an activity calls an external service without an idempotency key, you may charge a customer twice, send duplicate emails, or create duplicate records.
 
@@ -14,38 +12,30 @@ Common mistakes and anti-patterns in Temporal development. Learning from these s
 
 **The Fix**: Always use idempotency keys when calling external services. Use the workflow ID, activity ID, or a domain-specific identifier (like order ID) as the key.
 
-### Local Activities
-
-Local Activities skip the task queue for lower latency, but they're still subject to retries. The same idempotency rules apply.
-
-## Replay Safety Violations
+**Note:** Local Activities skip the task queue for lower latency, but they're still subject to retries. The same idempotency rules apply.
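
The idempotency-key fix described above can be illustrated with a toy example. `FakePaymentService` is a hypothetical stand-in for an external API that deduplicates on the key it receives; the key is built from the workflow ID and activity ID, which are assumed to stay the same across retries of the same activity.

```python
class FakePaymentService:
    """Stand-in for an external API that deduplicates on an idempotency key."""

    def __init__(self) -> None:
        self._charges: dict[str, int] = {}

    def charge(self, idempotency_key: str, amount_cents: int) -> int:
        # The first call with a given key wins; repeats return the stored charge.
        return self._charges.setdefault(idempotency_key, amount_cents)

    @property
    def charge_count(self) -> int:
        return len(self._charges)


def make_idempotency_key(workflow_id: str, activity_id: str) -> str:
    """Build a key that is identical on every retry of the same activity."""
    return f"{workflow_id}:{activity_id}"


def charge_customer(
    service: FakePaymentService, workflow_id: str, activity_id: str, amount_cents: int
) -> int:
    """Activity body: safe to run more than once because of the key."""
    key = make_idempotency_key(workflow_id, activity_id)
    return service.charge(key, amount_cents)
```

Calling `charge_customer` twice with the same workflow and activity IDs records a single charge, which is the behavior a retried activity needs.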
 
-### Side Effects in Workflow Code
+## Side Effects & Non-Determinism in Workflow Code
 
-**The Problem**: Code in workflow functions runs on first execution AND on every replay. Any side effect (logging, notifications, metrics) will happen multiple times.
+**The Problem**: Code in workflow functions runs on first execution AND on every replay. Any side effect (logging, notifications, metrics, etc.) will happen multiple times, and non-deterministic code (I/O, current time, random numbers, threading, etc.) won't replay correctly.
 
 **Symptoms**:
+- Non-determinism errors
+- Sandbox violations, depending on SDK language
 - Duplicate log entries
 - Multiple notifications for the same event
 - Inflated metrics
 
 **The Fix**:
-- Use the SDK's replay-aware logger (only logs on first execution)
-- Put all side effects in Activities
-
-### Non-Deterministic Time
+- Use Temporal's replay-aware APIs for common side effects that are not business logic:
+    - Temporal workflow logging
+    - Temporal date/time (`workflow.now()` in Python; `Date.now()` is auto-replaced in TypeScript)
+    - Temporal UUID generation
+    - Temporal random number generation
+- Put all other side effects in Activities
 
-**The Problem**: Using system time (`datetime.now()`, `Date.now()`) in workflow code returns different values on replay, causing non-determinism errors.
+See `references/core/determinism.md` for more info.
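
To see why a replay-aware random source matters, here is a toy illustration in plain Python that does not use the Temporal SDK. It assumes the replay-safe generator behaves like a PRNG seeded once per workflow run, so re-running the same logic yields the same values; the function names and the seed are hypothetical. In the Python SDK, `workflow.random()` plays the role of the seeded generator.

```python
import random


def pick_unsafe() -> int:
    # Process-global RNG: a replay would very likely see a different value.
    return random.randint(0, 10**9)


def pick_replay_safe(rng: random.Random) -> int:
    # Seeded RNG passed in explicitly: the same seed gives the same sequence.
    return rng.randint(0, 10**9)


def run_workflow_logic(seed: int) -> list[int]:
    """Toy 'workflow body' whose random decisions depend only on the seed."""
    rng = random.Random(seed)
    return [pick_replay_safe(rng) for _ in range(3)]


# Simulated replay: running the logic again with the same seed matches exactly.
original_run = run_workflow_logic(seed=1234)
replayed_run = run_workflow_logic(seed=1234)
```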
 
-**Symptoms**:
-- Non-determinism errors mentioning time-based decisions
-- Workflows that worked once but fail on replay
-
-**The Fix**: Use the SDK's deterministic time function (`workflow.now()` in Python, `Date.now()` is auto-replaced in TypeScript).
-
-## Worker Management Issues
-
-### Multiple Workers with Different Code
+## Multiple Workers with Different Code
 
 **The Problem**: If Worker A runs part of a workflow with code v1, then Worker B (with code v2) picks it up, replay may produce different Commands.
 
@@ -55,70 +45,45 @@ Local Activities skip the task queue for lower latency, but they're still subjec
 
 **The Fix**:
 - Use Worker Versioning for production deployments
+- Use the SDK's patching APIs when changing workflow code that running workflows may replay
 - During development: kill old workers before starting new ones
 - Ensure all workers run identical code
 
-### Stale Workflows During Development
-
-**The Problem**: Workflows started with old code continue running after you change the code.
+**Note:** Workflows started with old code continue running after you change the code, which can then induce the above issues. During development (NOT production), you may want to terminate stale workflows (`temporal workflow terminate --workflow-id <id>`), or use `find-stalled-workflows.sh` included in this skill to detect stuck workflows.
 
-**Symptoms**:
-- Workflows behave unexpectedly after code changes
-- Non-determinism errors on previously-working workflows
-
-**The Fix**:
-- Terminate stale workflows: `temporal workflow terminate --workflow-id <id>`
-- Use `find-stalled-workflows.sh` to detect stuck workflows
-- In production, use versioning for backward compatibility
-
-## Workflow Design Anti-Patterns
-
-### The Mega Workflow
-
-**The Problem**: Putting too much logic in a single workflow.
-
-**Issues**:
-- Hard to test and maintain
-- Event history grows unbounded
-- Single point of failure
-- Difficult to reason about
-
-**The Fix**:
-- Keep workflows focused on a single responsibility
-- Use Child Workflows for sub-processes
-- Use Continue-as-New for long-running workflows
+See `references/core/versioning.md` for more info.
 
-### Failing Too Quickly
+## Failing Activities Too Quickly
 
-**The Problem**: Using aggressive retry policies that give up too easily.
+**The Problem**: Using aggressive activity retry policies that give up too easily.
 
 **Symptoms**:
 - Workflows failing on transient errors
 - Unnecessary workflow failures during brief outages
 
-**The Fix**: Use appropriate retry policies. Let Temporal handle transient failures with exponential backoff. Reserve `maximum_attempts=1` for truly non-retryable operations.
+**The Fix**: Use appropriate activity retry policies. Let Temporal handle transient failures with exponential backoff. Reserve `maximum_attempts=1` for truly non-retryable operations.
 
-## Query Handler Mistakes
+## Query Handler & Update Validator Mistakes
 
-### Modifying State in Queries
+### Modifying State in Queries & Update Validators
 
-**The Problem**: Queries are read-only. Modifying state in a query handler causes non-determinism on replay because queries don't generate history events.
+**The Problem**: Queries and update validators are read-only. Modifying state in either one causes non-determinism on replay and must be strictly avoided.
 
 **Symptoms**:
 - State inconsistencies after workflow replay
 - Non-determinism errors
 
-**The Fix**: Queries must only read state. Use Updates for operations that need to modify state AND return a result.
+**The Fix**: Queries and update validators must only read state. Use Updates for operations that need to modify state AND return a result.
 
-### Blocking in Queries
+### Blocking in Queries & Update Validators
 
-**The Problem**: Queries must return immediately. They cannot await activities, child workflows, timers, or conditions.
+**The Problem**: Queries and update validators must return immediately. They cannot await activities, child workflows, timers, or conditions.
 
 **Symptoms**:
-- Query timeouts
+- Query or update validator timeouts
 - Deadlocks
 
-**The Fix**: Queries return current state only. Use Signals or Updates to trigger async operations.
+**The Fix**: Queries and update validators must only look at current state. Use Signals or Updates to trigger async operations.
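
The read-only rule can be sketched with a toy class that carries no Temporal decorators. `CartState` and its methods are hypothetical; in the Python SDK the three methods would roughly correspond to a `@workflow.query` handler, an update validator, and a `@workflow.update` handler.

```python
class CartState:
    """Toy workflow state with a query, an update validator, and an update."""

    def __init__(self) -> None:
        self._items: list[str] = []

    def get_items(self) -> tuple[str, ...]:
        # Query: returns an immutable snapshot immediately, mutates nothing.
        return tuple(self._items)

    def validate_add_item(self, item: str) -> None:
        # Validator: inspects its input and current state only; rejects by raising.
        if not item:
            raise ValueError("item must be non-empty")

    def add_item(self, item: str) -> int:
        # Update: the only place where state changes, and it returns a result.
        self.validate_add_item(item)
        self._items.append(item)
        return len(self._items)
```

Returning a tuple from the query keeps callers from mutating workflow state through the returned value.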
 
 ### Query vs Signal vs Update
 

diff --git a/references/core/determinism.md b/references/core/determinism.md
@@ -8,7 +8,7 @@ Temporal workflows must be deterministic because of **history replay** - the mec
 
 ### The Replay Mechanism
 
-When a Worker needs to restore workflow state (after crash, cache eviction, or continuing after a long timer), it **re-executes the workflow code from the beginning**. But instead of re-running activities, it uses results stored in the Event History.
+When a Worker needs to restore workflow state (after crash, cache eviction, or continuing after a long timer), it **re-executes the workflow code from the beginning**. But instead of re-running external actions, it uses results stored in the Event History.
 
 ```
 Initial Execution:
@@ -22,7 +22,7 @@ Replay (Recovery):
 
 ### Commands and Events
 
-Every workflow operation generates a Command that becomes an Event:
+Every workflow operation generates a Command that becomes an Event. Some examples:
 
 | Workflow Code | Command Generated | Event Stored |
 |--------------|-------------------|--------------|
@@ -95,6 +95,24 @@ Math.random()   // Returns seeded PRNG value
 new Date()      // Deterministic
 ```
 
+### Go `workflowcheck` static analyzer
+The Go SDK provides a `workflowcheck` CLI tool that:
+- Statically analyzes registered Workflow Definitions and their call graph
+- Flags common sources of non-determinism (e.g., `time.Now`, `time.Sleep`, goroutines, channels, map iteration, global `math/rand`, stdio)
+- Helps catch invalid constructs early in development, but cannot detect all issues (e.g., global variable mutation, some reflection)
+
+```bash
+# Install
+go install go.temporal.io/sdk/contrib/tools/workflowcheck@latest
+
+# Run from your module root to scan all packages
+workflowcheck ./...
+
+# Optional: configure overrides / skips in workflowcheck.config.yaml
+# (e.g., mark a function as deterministic or skip files)
+workflowcheck -config workflowcheck.config.yaml ./...
+```
+
 ## Detecting Non-Determinism
 
 ### During Execution