Skip to content

feat: make chainlink aware of validator mode#1227

Closed
thlorenz wants to merge 57 commits into
masterfrom
thlorenz/replica-aware-chainlink
Closed

feat: make chainlink aware of validator mode#1227
thlorenz wants to merge 57 commits into
masterfrom
thlorenz/replica-aware-chainlink

Conversation

@thlorenz
Copy link
Copy Markdown
Collaborator

@thlorenz thlorenz commented May 20, 2026

Summary

Make Chainlink and RPC behavior coordination-mode aware so replica/standby validators stay local-only until they become primary, and fully tie Chainlink remote runtime startup/shutdown to primary lifecycle transitions.

  • Reject non-primary transaction RPCs before transaction preparation, signature-cache mutation, account ensuring, or scheduler simulation.
  • Keep account reads available in replica mode without triggering remote Chainlink work.
  • Defer Chainlink remote runtime construction until primary enablement instead of constructing providers during validator startup.
  • Shut down Chainlink remote runtime state when leaving primary mode, including fetch cloning, pubsub/submux work, remote provider subscriptions, pending waiters, and removed-account eviction handling.
  • Move account-bank reset and Chainlink enablement to primary-readiness transitions.
  • Switch out of primary mode before standby setup during demotion, and roll back Chainlink enablement if publishing primary mode to the scheduler fails.
  • Add coverage for RPC side-effect prevention, Chainlink inactive startup behavior, runtime shutdown/restart, lower-layer pubsub/provider teardown, and replication lifecycle ordering.

ADDRESSES: #1203

Details

This PR tightens the boundary between primary and non-primary validator behavior. Transaction RPCs are rejected while a validator is non-primary, account read RPCs remain local-only, and Chainlink remote work now exists only while the validator is explicitly enabled as primary.

Steps when Switching Modes

Switching to Primary Mode

For standalone startup, the validator performs account-bank cleanup before enabling Chainlink and publishing primary scheduler mode. For replicated validators, the same cleanup runs during promotion after replica replay work is idle.

  1. During validator initialization, Chainlink stores the configuration needed to build its remote runtime, but does not start the remote account provider, pubsub clients, FetchCloner listener, removed-account subscription, or subscription bookkeeping.
  2. During promotion, wait for the scheduler to become idle so no replica/standby work is still being processed.
  3. Reset the account bank as primary-readiness cleanup. This removes non-delegated, non-blacklisted accounts from the local bank while preserving delegated, undelegating, ephemeral, and protected accounts.
  4. Call chainlink.enable_primary(). This builds and installs the active Chainlink runtime: remote account provider, WebSocket/gRPC/Laserstream pubsub clients, FetchCloner, removed-account subscription handling, and subscription management.
  5. Publish Primary mode to the scheduler/global coordination state. At this point primary traffic can be accepted and Chainlink remote paths are active.
  6. If publishing primary mode fails after Chainlink was enabled, disable Chainlink again so remote runtime state is not left active while the scheduler remains non-primary.
  7. Once primary mode is visible, Chainlink remote paths can fetch/clone missing state, subscribe to monitored accounts, set up program subscriptions, process clock/account updates, reconcile active account subscriptions, and submit eviction/undelegation-related work when those paths are requested.

Switching from Primary Mode

When a primary demotes into standby/replica behavior, the order is intentionally reversed around externally visible behavior: stop being primary first, then perform Chainlink shutdown and standby setup.

  1. Switch the scheduler/global coordination state to Replica immediately at the start of standby transition.
  2. Call chainlink.disable() to take and stop the active Chainlink runtime.
  3. Disablement aborts the removed-account subscription task, shuts down the FetchCloner, cancels its subscription listener, clears/fails pending fetch waiters, stops remote account provider/submux/pubsub work, and leaves Chainlink in the disabled lifecycle state.
  4. From that point, Chainlink public entry points are inactive: account ensure returns no-op results, fetches read local AccountsDb, undelegation subscription setup is skipped, removed-account eviction notifications cannot submit eviction transactions, and public calls do not recreate remote subscriptions.
  5. Transaction RPCs (sendTransaction and simulateTransaction) reject while non-primary, so they cannot trigger transaction preparation, account ensure, signature-cache mutation, or scheduler simulation.
  6. Account read RPCs remain local-only and do not invoke remote Chainlink ensure.
  7. Standby setup then creates the replica consumer and lock watcher. If that setup blocks or retries, the validator is already non-primary and Chainlink remote runtime state has been disabled.
  8. Promotion after shutdown creates a fresh Chainlink runtime, so delegated accounts can be fetched again after becoming primary.

magicblock-aperture

RPC dispatch checks the current coordination mode. sendTransaction is rejected while non-primary before it can affect the transaction cache, and simulateTransaction is rejected before transaction preparation, account ensuring, or scheduler simulation. Account reads remain usable against local state. Tests cover non-primary transaction RPC rejection and local account reads.

magicblock-chainlink

Chainlink now owns an explicit lifecycle state (Disabled, Starting, Enabled, Stopping) and an optional active runtime. Runtime construction is deferred until enable_primary(), while disable() tears down the active runtime asynchronously. The previous broad defensive remote-sync gates inside lower-level Chainlink paths were removed in favor of this active/inactive runtime boundary.

The shutdown path covers FetchCloner cancellation, pending waiter failure, remote account provider shutdown, deterministic pubsub/submux cleanup, deferred pubsub task shutdown, and removed-account subscription task cancellation. Tests cover inactive startup modes, repeated enable/disable cycles, public calls after disable, provider account updates after disable, removed-account notifications after disable, and fresh runtime startup after shutdown.

magicblock-api

Standalone validators now reset the account bank only as part of primary readiness, then enable Chainlink before publishing primary scheduler mode. If sending primary mode to the scheduler fails, Chainlink enablement is rolled back. Replicated validators skip startup reset and defer cleanup until promotion.

magicblock-replicator

Promotion ordering is documented and covered: wait for scheduler idle, reset account bank, enable Chainlink runtime, then enter primary mode. Demotion switches to replica mode before disabling Chainlink and before standby setup can block or retry, preventing stale primary behavior from continuing during transition. Tests cover promotion failure ordering, Chainlink enable/disable ordering, and replica-only behavior that must not reach the primary path.

Summary by CodeRabbit

  • Bug Fixes

    • Write-like RPCs (send/simulate transactions) are rejected when the node cannot act as primary; replica-mode reads return local account data without remote gating.
  • Refactor

    • Startup and role-transition ordering tightened to enforce primary readiness before promotion and consumer creation.
  • New Features

    • Added graceful shutdown and lifecycle controls for remote-sync, subscriptions, and background tasks; new routing/availability signals for RPC handling.
  • Tests

    • Expanded integration/unit tests for non-primary gating, lifecycle enable/disable, and shutdown behavior.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

📝 Walkthrough

Walkthrough

Adds coordination-mode checks to HTTP RPC handling to reject or short-circuit ensure/transaction flows based on coordination mode and Chainlink primary runtime readiness. Refactors Chainlink into a lifecycle/runtime model with enable_primary/disable and runtime_fetch_cloner gating, implements coordinated shutdown for FetchCloner, RemoteAccountProvider, and SubMuxClient (including test mocks), updates replicator/validator primary/standby sequencing to enforce ordering, and expands tests and metrics to cover routing decisions and shutdown behavior.

Suggested reviewers

  • GabrielePicco
  • snawaz
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch thlorenz/replica-aware-chainlink

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
magicblock-api/src/magic_validator.rs (1)

879-901: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't treat mode_tx.send(...) as a completed Primary transition.

This only queues SchedulerMode::Primary. The new HTTP/chainlink gates read CoordinationMode::current(), so startup can continue past enable_primary() while the old mode is still visible. Please make the coordination-mode switch synchronous here, or wait for an acknowledgment before enabling primary-only behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@magicblock-api/src/magic_validator.rs` around lines 879 - 901, The code
currently only enqueues SchedulerMode::Primary via mode_tx.send(...) which
doesn't guarantee CoordinationMode::current() has switched before calling
chainlink.enable_primary(); change this to perform a synchronous
coordination-mode transition or wait for an explicit acknowledgment: either (a)
invoke the coordination-mode switch synchronously (so
CoordinationMode::current() reflects Primary before calling
chainlink.enable_primary()), or (b) extend the scheduler handshake to
return/await an ack after processing SchedulerMode::Primary (e.g., send a
request that the scheduler replies to) and only call chainlink.enable_primary()
after receiving that ack; update the logic around mode_tx.send,
SchedulerMode::Primary, and chainlink.enable_primary to ensure the coordination
mode is visible before enabling primary-only behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@magicblock-replicator/src/service/context.rs`:
- Around line 244-249: The promotion/demotion path currently enqueues
SchedulerMode inside enter_primary_mode()/enter_replica_mode() but silently
drops send failures, so the rest of
run_primary_readiness_sequence()/run_replica_readiness_sequence() can proceed
while CoordinationMode::current() is stale or the receiver is gone; change
enter_primary_mode and enter_replica_mode to return Result<(), Error> (or a
suitable error type) and have them propagate the failure of mode_tx.send(...)
instead of ignoring it, then update the calls in run_primary_readiness_sequence
and the analogous replica sequence to invoke those methods through the fallible
closures you already pass (e.g. replace || self.enter_primary_mode() with a
closure that returns the Result), so the readiness sequence aborts on
send/send-failure and the mode switch is externally visible; ensure
SchedulerMode send failures are converted into the same error type you return so
callers can handle them.

---

Outside diff comments:
In `@magicblock-api/src/magic_validator.rs`:
- Around line 879-901: The code currently only enqueues SchedulerMode::Primary
via mode_tx.send(...) which doesn't guarantee CoordinationMode::current() has
switched before calling chainlink.enable_primary(); change this to perform a
synchronous coordination-mode transition or wait for an explicit acknowledgment:
either (a) invoke the coordination-mode switch synchronously (so
CoordinationMode::current() reflects Primary before calling
chainlink.enable_primary()), or (b) extend the scheduler handshake to
return/await an ack after processing SchedulerMode::Primary (e.g., send a
request that the scheduler replies to) and only call chainlink.enable_primary()
after receiving that ack; update the logic around mode_tx.send,
SchedulerMode::Primary, and chainlink.enable_primary to ensure the coordination
mode is visible before enabling primary-only behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 43480cb3-ed7b-4cd8-8ece-e9efe4b990f5

📥 Commits

Reviewing files that changed from the base of the PR and between 25df83c and fc9b68b.

📒 Files selected for processing (10)
  • magicblock-aperture/src/requests/http/mod.rs
  • magicblock-aperture/src/requests/http/send_transaction.rs
  • magicblock-aperture/src/requests/http/simulate_transaction.rs
  • magicblock-api/src/magic_validator.rs
  • magicblock-chainlink/src/chainlink/mod.rs
  • magicblock-chainlink/tests/10_non_primary_gating.rs
  • magicblock-chainlink/tests/utils/test_context.rs
  • magicblock-replicator/src/error.rs
  • magicblock-replicator/src/service/context.rs
  • magicblock-replicator/src/service/mod.rs

Comment thread magicblock-replicator/src/service/context.rs Outdated
@thlorenz thlorenz requested a review from bmuddha May 21, 2026 01:30
@thlorenz thlorenz changed the title feat: gate chainlink by validator mode feat: make chainlink aware of validator mode May 21, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@magicblock-aperture/src/requests/http/mod.rs`:
- Around line 207-210: The ensure_routing_decision calls are using hardcoded
labels instead of the passed RPC method parameter, which drops per-RPC context;
update each call site to pass the method variable rather than a string literal
(e.g., replace ensure_routing_decision("read_account") with
ensure_routing_decision(method)), and if the parameter is named _method rename
it to method (or remove the leading underscore) so it can be used; apply this
fix for all functions that accept a method label and currently call
ensure_routing_decision with fixed strings (the three occurrences shown).

In `@magicblock-metrics/src/metrics/mod.rs`:
- Around line 216-223: Replace the .unwrap() on the metric construction with
explicit error handling: call IntCounterVec::new(...) and match on the Result,
returning the created IntCounterVec on Ok or invoking panic! with a clear,
descriptive message including the error on Err. Apply the same change for the
other metric initialization referenced around lines 257-265; reference the
static names APERTURE_ENSURE_ROUTING_COUNT (and the other metric static) so you
replace .unwrap() with a match that panics with a contextual message and the
underlying error.

In `@magicblock-replicator/src/service/context.rs`:
- Around line 246-255: After calling chainlink.enable_primary().await (in the
block using chainlink.enable_primary()), ensure we rollback by calling
chainlink.disable_primary().await when subsequent steps fail: if
chainlink.primary_runtime_readiness() does not allow promotion or if
mode_tx.send(SchedulerMode::Primary).await fails, call disable_primary() and
propagate an error; for the mode_tx send failure ensure disable_primary() is
awaited even if send returns Err, and convert any disable error into/log
alongside the original error so Chainlink is not left enabled when the function
returns an Err.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 192b3c12-314f-401a-bc9d-a4a58900cc75

📥 Commits

Reviewing files that changed from the base of the PR and between e00e015 and 53d1653.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (20)
  • magicblock-aperture/Cargo.toml
  • magicblock-aperture/src/error.rs
  • magicblock-aperture/src/requests/http/get_account_info.rs
  • magicblock-aperture/src/requests/http/get_balance.rs
  • magicblock-aperture/src/requests/http/get_delegation_status.rs
  • magicblock-aperture/src/requests/http/get_multiple_accounts.rs
  • magicblock-aperture/src/requests/http/get_token_account_balance.rs
  • magicblock-aperture/src/requests/http/mod.rs
  • magicblock-aperture/src/requests/http/send_transaction.rs
  • magicblock-aperture/src/requests/http/simulate_transaction.rs
  • magicblock-api/src/magic_validator.rs
  • magicblock-chainlink/Cargo.toml
  • magicblock-chainlink/src/chainlink/errors.rs
  • magicblock-chainlink/src/chainlink/mod.rs
  • magicblock-chainlink/tests/10_non_primary_gating.rs
  • magicblock-chainlink/tests/11_lifecycle_shutdown.rs
  • magicblock-metrics/src/metrics/mod.rs
  • magicblock-replicator/Cargo.toml
  • magicblock-replicator/src/service/context.rs
  • magicblock-replicator/src/service/mod.rs

Comment on lines +207 to +210
_method: &'static str,
pubkey: &Pubkey,
) -> Option<AccountSharedData> {
let mark_empty_if_not_found = [*pubkey];
let _timer = ENSURE_ACCOUNTS_TIME
.with_label_values(&["account"])
.start_timer();
let _ = self
.chainlink
.ensure_accounts(
&[*pubkey],
Some(&mark_empty_if_not_found),
AccountFetchOrigin::GetAccount,
None,
)
.await
.inspect_err(|e| {
// There is nothing we can do if fetching the account fails
// Log the error and return whatever is in the accounts db
debug!(error = ?e, "Failed to ensure account");
});
self.accountsdb.get_account(pubkey)
) -> RpcResult<Option<AccountSharedData>> {
match self.ensure_routing_decision("read_account").await {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use the passed RPC method instead of hardcoded ensure labels.

Line 207, Line 249, and Line 354 accept a method label, but Line 210, Line 252, and Line 357 call ensure_routing_decision(...) with fixed strings. This drops per-RPC context, so call-site method names are ignored for routing metrics/log labels.

Suggested fix
 async fn read_account_with_ensure(
     &self,
-    _method: &'static str,
+    method: &'static str,
     pubkey: &Pubkey,
 ) -> RpcResult<Option<AccountSharedData>> {
-    match self.ensure_routing_decision("read_account").await {
+    match self.ensure_routing_decision(method).await {
         ...
     }
 }

 async fn read_accounts_with_ensure(
     &self,
-    _method: &'static str,
+    method: &'static str,
     pubkeys: &[Pubkey],
 ) -> RpcResult<Vec<Option<AccountSharedData>>> {
-    match self.ensure_routing_decision("read_accounts").await {
+    match self.ensure_routing_decision(method).await {
         ...
     }
 }

 async fn ensure_transaction_accounts(
     &self,
-    _method: &'static str,
+    method: &'static str,
     transaction: &SanitizedTransaction,
 ) -> RpcResult<()> {
-    match self.ensure_routing_decision("transaction").await {
+    match self.ensure_routing_decision(method).await {
         ...
     }
 }

Also applies to: 249-252, 354-357

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@magicblock-aperture/src/requests/http/mod.rs` around lines 207 - 210, The
ensure_routing_decision calls are using hardcoded labels instead of the passed
RPC method parameter, which drops per-RPC context; update each call site to pass
the method variable rather than a string literal (e.g., replace
ensure_routing_decision("read_account") with ensure_routing_decision(method)),
and if the parameter is named _method rename it to method (or remove the leading
underscore) so it can be used; apply this fix for all functions that accept a
method label and currently call ensure_routing_decision with fixed strings (the
three occurrences shown).

Comment on lines +216 to +223
pub static ref APERTURE_ENSURE_ROUTING_COUNT: IntCounterVec = IntCounterVec::new(
Opts::new(
"aperture_ensure_routing_count",
"Number of aperture ensure-routing decisions by method and decision",
),
&["method", "decision"],
).unwrap();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove .unwrap() from production metric initialization.

Both new metric constructors use .unwrap() in production code. Please switch to explicit error handling (or an explicit invariant path without unwrap/expect) to satisfy the repository rule.

Example adjustment pattern
+fn build_int_counter_vec(name: &str, help: &str, labels: &[&str]) -> IntCounterVec {
+    match IntCounterVec::new(Opts::new(name, help), labels) {
+        Ok(metric) => metric,
+        Err(err) => panic!("metric definition invariant violated for `{name}`: {err}"),
+    }
+}
...
-    pub static ref APERTURE_ENSURE_ROUTING_COUNT: IntCounterVec = IntCounterVec::new(
-        Opts::new(
-            "aperture_ensure_routing_count",
-            "Number of aperture ensure-routing decisions by method and decision",
-        ),
-        &["method", "decision"],
-    ).unwrap();
+    pub static ref APERTURE_ENSURE_ROUTING_COUNT: IntCounterVec = build_int_counter_vec(
+        "aperture_ensure_routing_count",
+        "Number of aperture ensure-routing decisions by method and decision",
+        &["method", "decision"],
+    );

As per coding guidelines {magicblock-*,programs,storage-proto}/**: Treat any usage of .unwrap() or .expect() in production Rust code as a MAJOR issue.

Also applies to: 257-265

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@magicblock-metrics/src/metrics/mod.rs` around lines 216 - 223, Replace the
.unwrap() on the metric construction with explicit error handling: call
IntCounterVec::new(...) and match on the Result, returning the created
IntCounterVec on Ok or invoking panic! with a clear, descriptive message
including the error on Err. Apply the same change for the other metric
initialization referenced around lines 257-265; reference the static names
APERTURE_ENSURE_ROUTING_COUNT (and the other metric static) so you replace
.unwrap() with a match that panics with a contextual message and the underlying
error.

Comment on lines +246 to +255
let outcome = chainlink.enable_primary().await.map_err(Error::from)?;
let readiness = chainlink.primary_runtime_readiness().await;
if !readiness.allows_primary_ensure() {
return Err(Error::Internal(format!(
"chainlink primary runtime not ready after enable: outcome={outcome:?}, readiness={readiness:?}"
)));
}
mode_tx.send(SchedulerMode::Primary).await.map_err(|e| {
Error::Internal(format!("failed to enter primary mode: {e}"))
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Rollback Chainlink if promotion fails after enable_primary().

After Line 246 succeeds, failures on Line 248 (readiness gate) or Line 253 (mode publish) return early without disabling Chainlink. That can leave remote runtime active while still non-primary.

Suggested rollback pattern
 let outcome = chainlink.enable_primary().await.map_err(Error::from)?;
 let readiness = chainlink.primary_runtime_readiness().await;
 if !readiness.allows_primary_ensure() {
+    let _ = chainlink.disable().await;
     return Err(Error::Internal(format!(
         "chainlink primary runtime not ready after enable: outcome={outcome:?}, readiness={readiness:?}"
     )));
 }
-mode_tx.send(SchedulerMode::Primary).await.map_err(|e| {
-    Error::Internal(format!("failed to enter primary mode: {e}"))
-})
+if let Err(e) = mode_tx.send(SchedulerMode::Primary).await {
+    let _ = chainlink.disable().await;
+    return Err(Error::Internal(format!("failed to enter primary mode: {e}")));
+}
+Ok(())
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let outcome = chainlink.enable_primary().await.map_err(Error::from)?;
let readiness = chainlink.primary_runtime_readiness().await;
if !readiness.allows_primary_ensure() {
return Err(Error::Internal(format!(
"chainlink primary runtime not ready after enable: outcome={outcome:?}, readiness={readiness:?}"
)));
}
mode_tx.send(SchedulerMode::Primary).await.map_err(|e| {
Error::Internal(format!("failed to enter primary mode: {e}"))
})
let outcome = chainlink.enable_primary().await.map_err(Error::from)?;
let readiness = chainlink.primary_runtime_readiness().await;
if !readiness.allows_primary_ensure() {
let _ = chainlink.disable().await;
return Err(Error::Internal(format!(
"chainlink primary runtime not ready after enable: outcome={outcome:?}, readiness={readiness:?}"
)));
}
if let Err(e) = mode_tx.send(SchedulerMode::Primary).await {
let _ = chainlink.disable().await;
return Err(Error::Internal(format!("failed to enter primary mode: {e}")));
}
Ok(())
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@magicblock-replicator/src/service/context.rs` around lines 246 - 255, After
calling chainlink.enable_primary().await (in the block using
chainlink.enable_primary()), ensure we rollback by calling
chainlink.disable_primary().await when subsequent steps fail: if
chainlink.primary_runtime_readiness() does not allow promotion or if
mode_tx.send(SchedulerMode::Primary).await fails, call disable_primary() and
propagate an error; for the mode_tx send failure ensure disable_primary() is
awaited even if send returns Err, and convert any disable error into/log
alongside the original error so Chainlink is not left enabled when the function
returns an Err.

@thlorenz
Copy link
Copy Markdown
Collaborator Author

Superseeded by simpler PR

@thlorenz thlorenz closed this May 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant