17 changes: 12 additions & 5 deletions .claude/skills/update-source-references/SKILL.md
@@ -4,19 +4,26 @@ For all mdx/md files in @docs/canary-checker/ and @docs/mission-control/ that ha

 2. For each documented struct, compare ALL public fields from the Go source against the documentation and:
 - Add any missing fields
-- Fix incorrect field names (check json/yaml tags - use the json/yaml tag name, not the Go field name)
-- If json/yaml tag differ from each other, warn user
+- Fix incorrect field names (check json/yaml tags - use the json tag name, not the Go field name)
+- If json/yaml tags differ from each other, prefer the json tag and warn user
 - Fix incorrect schemes/types (e.g., `Duration` vs `int`, `bool` vs `string`)
 - Fix incorrect nested structures (check if fields are inline or nested under a parent key)
 - Remove fields that don't exist in the Go struct
 - For inline embedded structs, verify which fields they provide

-3. For \_canary-spec.mdx, ensure all check types from CanarySpec are listed with correct field names matching the json/yaml tags
+3. **For nested struct types (like `ExecConnections`, `GitConnection`, etc.), you MUST:**
+- Find the actual struct definition in the codebase (may be in different packages like `duty/connection/`)
+- Document ALL fields from that struct, not just the ones currently in docs
+- Follow type references across packages to get complete field lists
+
+4. For _canary-spec.mdx, ensure all check types from CanarySpec are listed with correct field names matching the json tags

 Pay attention to:

-- yaml tags like `yaml:"env"` mean the field name in docs should be `env`, not the Go field name
+- Use json tags as the canonical field name (e.g., `json:"env"` means field name in docs should be `env`)
+- If yaml and json tags differ, use json tag and warn the user about the discrepancy
 - Inline embedded structs (e.g., `Connection`, `Description`, `Templatable`) - their fields appear at the same level
 - Pointer vs value types for nested structs
 - Deprecated fields should be marked as such
-- ignore private fields
+- Ignore private fields
+- Connection types may be defined in `modules/duty/connection/` not just in the check's own file - always trace the import path to find the actual struct definition
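
The tag rule in this skill (use the json tag as the field name, and warn when the yaml tag disagrees) can be sketched in a few lines of Python. This is only an illustration of the rule, not part of the skill file: `tag_names` and `canonical_name` are hypothetical helpers, and falling back to the yaml tag and then to the Go field name when no json tag is present is an assumption.

```python
import re
from typing import Optional

# Matches key:"value" pairs inside a Go struct tag, e.g. json:"env,omitempty"
_TAG_PAIR = re.compile(r'(\w+):"([^"]*)"')


def tag_names(struct_tag: str) -> dict[str, str]:
    """Map each tag key to its name, dropping options such as ",omitempty"."""
    return {key: value.split(",")[0] for key, value in _TAG_PAIR.findall(struct_tag)}


def canonical_name(go_field: str, struct_tag: str) -> tuple[str, Optional[str]]:
    """Return (documented field name, warning or None) for one struct field."""
    names = tag_names(struct_tag)
    json_name = names.get("json")
    yaml_name = names.get("yaml")
    # Assumption: prefer json, then yaml, then the Go field name itself.
    name = json_name or yaml_name or go_field
    warning = None
    if json_name and yaml_name and json_name != yaml_name:
        warning = (
            f"{go_field}: json tag {json_name!r} differs from "
            f"yaml tag {yaml_name!r}; using json"
        )
    return name, warning
```

The sketch does not handle skipped fields (`json:"-"`) or inline embedded structs, which the skill treats separately.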
106 changes: 106 additions & 0 deletions canary-checker/docs/concepts/distributed-canaries.md
@@ -0,0 +1,106 @@
---
title: Distributed Canaries
sidebar_custom_props:
  icon: network
sidebar_position: 6
---

Member Author

Add that this will only work for mission control

Distributed canaries allow you to define a check once and have it automatically run on multiple agents. This is useful for monitoring services from different locations, clusters, or network segments.

:::info
This feature is only available in [Mission Control](https://flanksource.com/docs), because Canary Checker does not support agents.
:::

## How It Works

When you specify an `agentSelector` on a canary:

1. The canary does **not** run locally on the server
2. A copy of the canary is created for each matched agent
3. Each agent runs the check independently and reports results back
4. The copies are kept in sync with the parent canary

A background job syncs agent selector canaries every 5 minutes. When agents are added or removed, the derived canaries are automatically created or cleaned up.
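
The create-and-clean-up behavior described above amounts to comparing two sets on each sync: the agents the selector currently matches, and the agents that already have a derived canary. The Python sketch below illustrates that comparison only. `plan_sync` and its arguments are hypothetical names, and the real background job runs inside Mission Control and may work differently.

```python
def plan_sync(
    matched_agents: set[str], existing_copies: set[str]
) -> tuple[set[str], set[str]]:
    """Return (agents needing a new derived canary, agents whose copy is stale)."""
    # Agents that match the selector but have no derived canary yet.
    to_create = matched_agents - existing_copies
    # Derived canaries whose agent no longer matches (or no longer exists).
    to_delete = existing_copies - matched_agents
    return to_create, to_delete
```

For example, if the selector matches `eu-1` and `eu-2` while copies exist for `eu-2` and `us-1`, this plan creates a copy for `eu-1` and cleans up the one for `us-1`.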
Comment on lines +16 to +23
@coderabbitai coderabbitai bot Jan 29, 2026

⚠️ Potential issue | 🟡 Minor

Rewrite passive voice in the “How It Works” steps.

Lines 14–17 use passive voice. Please switch to active voice to match the documentation style guide.

Suggested rewrite
-1. The canary does **not** run locally on the server
-2. A copy of the canary is created for each matched agent
-3. Each agent runs the check independently and reports results back
-4. The copies are kept in sync with the parent canary
+1. The server does **not** run the canary locally.
+2. The system creates a copy of the canary for each matched agent.
+3. Each agent runs the check independently and reports results back.
+4. The system keeps the copies in sync with the parent canary.

As per coding guidelines, use active voice; never use passive voice.

🤖 Prompt for AI Agents
In `@canary-checker/docs/concepts/distributed-canaries.md` around lines 12 - 19,
Rewrite the four numbered "How It Works" bullets to use active voice: change
"The canary does not run locally on the server" to "Specifying an agentSelector
prevents the canary from running locally on the server" (or similar active
phrasing), change "A copy of the canary is created for each matched agent" to
"The system creates a copy of the canary for each matched agent," change "Each
agent runs the check independently and reports results back" to "Each matched
agent runs the check independently and reports the results," and change "The
copies are kept in sync with the parent canary" to "The system keeps the derived
copies in sync with the parent canary"; also rewrite the next sentence about
syncing to active voice (e.g., "A background job syncs agent selector canaries
every 5 minutes" is already active—ensure consistency).

Member

Use Mission Control, not server and system

@moshloop, thank you for the clarification! I'll update the suggestion to use "Mission Control" consistently.

Revised suggestion:

When you specify an `agentSelector` on a canary:

1. Mission Control does **not** run the canary locally.
2. Mission Control creates a copy of the canary for each matched agent.
3. Each agent runs the check independently and reports results back.
4. Mission Control keeps the copies in sync with the parent canary.

A background job syncs agent selector canaries every 5 minutes. When agents are added or removed, the derived canaries are automatically created or cleaned up.

✏️ Learnings added
Learnt from: moshloop
Repo: flanksource/docs PR: 469
File: canary-checker/docs/concepts/distributed-canaries.md:12-19
Timestamp: 2026-01-29T13:01:15.936Z
Learning: In flanksource/docs documentation, use "Mission Control" to refer to the central server/system component, not generic terms like "server" or "system".


🧠 Learnings used
Learnt from: CR
Repo: flanksource/docs PR: 0
File: .cursor/rules/global.mdc:0-0
Timestamp: 2025-12-05T13:02:08.910Z
Learning: Applies to **/*.{md,markdown,txt} : Use active voice; never use passive voice


## Agent Selector Patterns

The `agentSelector` field accepts a list of patterns to match agent names:

| Pattern | Description |
| ------------------- | ------------------------------------ |
| `agent-1` | Exact match |
| `eu-west-*` | Prefix match (glob) |
| `*-prod` | Suffix match (glob) |
| `!staging` | Exclude agents matching this pattern |
| `team-*`, `!team-b` | Match all `team-*` except `team-b` |
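
One way to read the table above is as glob matching in which `!` exclusions take precedence. The Python sketch below illustrates that reading; it is not the Mission Control implementation. `matches` is a hypothetical helper, and two of its behaviors are assumptions: a selector containing only exclusions matches every other agent (as the Exclude Specific Agents example suggests), and an empty selector matches nothing.

```python
from fnmatch import fnmatchcase


def matches(agent: str, selector: list[str]) -> bool:
    """Report whether an agent name is selected by a list of patterns."""
    if not selector:
        # Assumption: no selector means the canary is not distributed at all.
        return False
    includes = [p for p in selector if not p.startswith("!")]
    excludes = [p[1:] for p in selector if p.startswith("!")]
    # Exclusions win over any include pattern.
    if any(fnmatchcase(agent, p) for p in excludes):
        return False
    # Assumption: with only exclusions, every remaining agent is selected.
    if not includes:
        return True
    return any(fnmatchcase(agent, p) for p in includes)
```

With this reading, `matches("team-a", ["team-*", "!team-b"])` is true and `matches("team-b", ["team-*", "!team-b"])` is false. Note that `fnmatchcase` also interprets `?` and `[...]`, which the table does not mention, and the sketch gives the `local` name no special treatment.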

## Example: HTTP Check on All Agents

This example defines an HTTP check against a Kubernetes service and runs it on every agent that matches the selector:

```yaml title="distributed-http-check.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: api-health
  namespace: monitoring
spec:
  schedule: '@every 1m'
  http:
    - name: api-endpoint
      url: http://api-service.default.svc.cluster.local:8080/health
      responseCodes: [200]
      test:
        expr: json.status == 'healthy'
  agentSelector:
    - '*' # Run on all agents
```

When this canary is created:

1. The check is executed locally only when `local` agent is provided in selector
2. A derived canary is created for each registered agent
3. Each agent executes the HTTP check against `api-service.default.svc.cluster.local:8080/health` in its own cluster
4. Results from all agents are aggregated and visible in the UI
Comment on lines +59 to +64
⚠️ Potential issue | 🟡 Minor

Use active voice in the execution steps.

Line 57 uses passive voice. Please switch to active voice.

Suggested rewrite
-1. The check is executed locally only when `local` agent is provided in selector
+1. The system executes the check locally only when you include the `local` agent in the selector.

As per coding guidelines, use active voice; never use passive voice.

🤖 Prompt for AI Agents
In `@canary-checker/docs/concepts/distributed-canaries.md` around lines 55 - 60,
Change the passive sentence "The check is executed locally only when `local`
agent is provided in selector" to active voice; locate the step that begins with
that phrase and rewrite it to something like "Execute the check locally only
when the `local` agent is specified in the selector" (or "The agent executes the
check locally only when the `local` agent is specified in the selector") so the
sentence is active while preserving meaning and the `local`/selector tokens.


## Example: Regional Monitoring

Monitor an external API from specific regions:

```yaml title="regional-monitoring.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: external-api-latency
spec:
  schedule: '@every 5m'
  http:
    - name: payment-gateway
      url: https://api.payment-provider.com/health
      responseCodes: [200]
      maxResponseTime: 500
  agentSelector:
    - 'eu-*' # All EU agents
    - 'us-*' # All US agents
    - '!us-test' # Exclude test agent
    - 'local' # Run on local instance as well
```

## Example: Exclude Specific Agents

Run checks on all agents except those in a specific environment:

```yaml title="production-only.yaml"
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: production-checks
spec:
  schedule: '@every 2m'
  http:
    - name: internal-service
      url: http://internal.example.com/status
  agentSelector:
    - '!*-dev' # Exclude all dev agents
    - '!*-staging' # Exclude all staging agents
```