Handle Telegram rate limits without stuck turns and harden bot restarts after binary upgrades by ai-da-stas · Pull Request #9 · Headcrab/telecodex

ai-da-stas · 2026-05-11T17:27:11Z

Summary

This PR fixes a failure mode where a Telegram 429 Too Many Requests response could leave a Telecodex
session stuck as busy.

What changed:

- Recover interrupted turns on startup:
  - mark stale running turns as failed;
  - clear stale sessions.busy.
- If Telegram rate-limits the initial placeholder message before Codex starts:
  - mark the turn as failed;
  - clear busy;
  - after Telegram's retry_after backoff, send a user-facing notice.
- Add a Retry inline button and /retry <turn_id> command for failed/cancelled turns without attachments.
- Avoid failing a completed Codex turn when Telegram rate-limits extra output chunks.

Context

This came from real usage with multiple long-running Telecodex topics. One topic got stuck as busy: /status showed it as busy, but /stop said there was no active turn. The root cause was a Telegram 429
while creating the initial placeholder message, before Codex had actually started.

Telegram's Bot FAQ recommends respecting per-chat/group/global limits and the Bot API returns
retry_after for 429 responses, so the fix follows that signal instead of retrying immediately.
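
The "follow retry_after instead of retrying immediately" rule can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `TelegramApiError` and `backoff_for` are hypothetical names, and the one-second fallback for a 429 that arrives without `retry_after` is an assumption. The only Bot API facts relied on are the 429 error code and the `retry_after` value expressed in seconds.

```rust
use std::time::Duration;

/// Hypothetical, simplified view of a failed Bot API call.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TelegramApiError {
    /// Error code reported by Telegram (429 = Too Many Requests).
    pub error_code: u16,
    /// Seconds to wait before retrying, when Telegram provides it.
    pub retry_after: Option<u64>,
}

/// Returns how long to wait before the next attempt, or `None` when the
/// error is not a rate limit and should be treated as an ordinary failure.
pub fn backoff_for(err: &TelegramApiError) -> Option<Duration> {
    if err.error_code != 429 {
        return None;
    }
    // Assumption: fall back to one second if a 429 carries no retry_after.
    let seconds = err.retry_after.unwrap_or(1);
    Some(Duration::from_secs(seconds))
}
```

A caller would mark the turn as failed and clear busy first, then wait for the returned duration before sending the user-facing notice.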

Testing

cargo test
cargo clippy --all-targets --all-features -- -D warnings
cargo build --release

Follow-up after local rollout testing:

- Fixed /restart_bot when the running process points to an old replaced binary inode (current_exe() ending up as a deleted executable after cargo build --release).
- Under systemd, /restart_bot now exits cleanly and lets the configured service restart policy bring the bot back instead of spawning a detached duplicate process.
- The SQLite instance lock is released explicitly before restart, and SIGTERM is handled as a normal shutdown signal.

coderabbitai · 2026-05-11T17:27:24Z

Warning

Rate limit exceeded

@ai-da-stas has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 41 minutes and 13 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 574c86e2-522a-4f6c-b212-abb98c90cd38

📥 Commits

Reviewing files that changed from the base of the PR and between 5776939 and 030f147.

📒 Files selected for processing (3)

src/app.rs
src/app/tests.rs
src/store.rs

Walkthrough

Adds a Telegram /retry <turn_id> bridge command, storage-backed retry construction, pre-Codex Telegram rate-limit retry notices with a “Retry” keyboard, instance heartbeat and lock lifecycle (claim/heartbeat/release/recover), a graceful shutdown helper, and related tests and README updates.

Changes

Retry Turn + Instance Management

| Layer / File(s) | Summary |
| --- | --- |
| Command Definition & Parsing `src/commands.rs` | Adds `BridgeCommand::RetryTurn { turn_id: i64 }`. Parser accepts `/retry`, `/retry_turn`, `/retry-turn` with numeric arg. Adds help and default bot command entry. Tests updated for parsing and default-command coverage. |
| App Handler & Shutdown `src/app.rs` | Handles `BridgeCommand::RetryTurn` in `handle_command`: fetches retryable request via store, rejects non-retryable turns, fills `override_search_mode` when needed, and enqueues retry. Adds `shutdown_signal()` helper and uses `tokio::select!` for shutdown. Spawns `run_instance_heartbeat_loop` task and attempts to release instance lock on restart path. |
| Command Auth / Context Rules `src/app/support.rs` | Marks `RetryTurn` as requiring session context and Codex auth in command-usage checks. `spawn_restarted_process` now respects `SYSTEMD_EXEC_PID` and resolves executable via `restart_executable()` helper. |
| Storage: Instance Lock, Recovery, Retry Construction `src/store.rs` | `Store::open` generates per-process `instance_id`, claims `app_instance_lock`, runs `recover_interrupted_turns` (stale `running` → `failed`, clears stale session busy flags), and records recovery in audit. Adds `claim_instance_lock`, `heartbeat_instance`, `release_instance_lock`, Drop impl to release lock. Adds `retry_request_for_turn` returning a constructed `TurnRequest` only for `failed`/`cancelled` turns and parsing `review_json`. Schema gains `app_instance_lock`. Tests added/updated for recovery, lock behavior, and retry construction. |
| Turn Processing & Pre-Codex Rate-Limit Handling `src/app/turns.rs` | `process_turn` reads latest limits snapshot and degrades on lookup failure. Placeholder creation now uses `send_message` and on failure runs `finish_pre_codex_turn_failure`, cancels chat-action, clears busy, records failed turn, and may send a delayed rate-limit retry notice. `LiveTurnSink::flush` now suppresses Telegram send errors that include `retry_after` by backing off. Adds helpers: `finish_pre_codex_turn_failure`, `notify_pre_codex_rate_limit`, `render_pre_codex_rate_limit_notice`, and `rate_limit_retry_keyboard` (produces inline keyboard with callback `cmd:/retry <turn_id>`). |
| Tests & Docs `src/app/tests.rs`, `README.md`, `README.ru.md` | Adds `builds_rate_limit_retry_keyboard` test verifying callback data `cmd:/retry 42`. README and README.ru updated to document `/retry <turn_id>` as a bridge-handled command retrying failed/cancelled turns without attachments. |

sequenceDiagram
    participant User as User / Telegram
    participant App as App Handler
    participant Store as Store (DB)
    participant Queue as Turn Queue
    participant Processor as Turn Processor

    User->>App: /retry 42
    App->>Store: retry_request_for_turn(42, session_key, user_id)
    alt Not retryable
        Store-->>App: None
        App->>User: "Turn is not retryable"
    else Retryable
        Store-->>App: TurnRequest { prompt, review_mode, ... }
        App->>Queue: enqueue_turn(TurnRequest)
        Queue-->>Processor: process_turn()
        alt Pre-Codex rate limit
            Processor->>App: pre-Codex send fails w/ retry-after
            App->>User: Send delayed retry notice with rate_limit_retry_keyboard
            User->>App: presses Retry button (cmd:/retry <turn_id>)
        else Success
            Processor->>User: assistant response
        end
    end
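
The command and button round trip in the walkthrough can be illustrated with a small parser. This is a sketch under assumptions, not the parser in `src/commands.rs`: the function names are hypothetical, and rejecting trailing tokens is a choice made here for illustration. The accepted aliases and the `cmd:/retry <turn_id>` callback format come from the summary table above.

```rust
/// Parses `/retry <turn_id>` and its aliases; returns the numeric turn id.
pub fn parse_retry_command(text: &str) -> Option<i64> {
    let mut parts = text.split_whitespace();
    let command = parts.next()?;
    if !matches!(command, "/retry" | "/retry_turn" | "/retry-turn") {
        return None;
    }
    let turn_id = parts.next()?.parse::<i64>().ok()?;
    // Assumption: reject trailing tokens so `/retry 42 extra` is not accepted.
    if parts.next().is_some() {
        return None;
    }
    Some(turn_id)
}

/// Builds the callback payload carried by the inline "Retry" button.
pub fn retry_callback_data(turn_id: i64) -> String {
    format!("cmd:/retry {}", turn_id)
}

/// Extracts the turn id from a button callback by reusing the command parser.
pub fn parse_retry_callback(data: &str) -> Option<i64> {
    parse_retry_command(data.strip_prefix("cmd:")?)
}
```

Routing the button payload through the same parser as the typed command keeps the two entry points from drifting apart.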

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~40 minutes

Possibly related PRs

Headcrab/telecodex#8: Related to Telegram turn-notification/finalization flow and post-completion notify changes.
Headcrab/telecodex#6: Related command-handling changes and addition of BridgeCommand variants.

🐰 I stitched a retry button with gentle care,
A failed turn hops back into the air.
Press /retry, and watch it mend,
The bridge will hop and try again—my friend! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.38% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly summarizes the main changes: handling Telegram rate limits to prevent stuck turns and hardening bot restarts after binary upgrades. |
| Description check | ✅ Passed | The description is directly related to the changeset, providing clear context for the fixes (Telegram 429 rate-limit handling, startup recovery, retry UX, and bot restart improvements). |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/app/turns.rs (1)

537-549: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Backoff on a new chunk can permanently suppress the final reply.

If send_message() for a later chunk hits 429, defer_after_retry_after() sets edit_backoff_until and flush() returns early. The later finish(true) call hits the same guard and also returns Ok(()), so the turn no longer fails, but the remaining output may never be delivered unless the user manually runs /copy.

Also applies to: 578-593, 669-676
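
One way to avoid the lost-output scenario described above is to keep rate-limited chunks queued and let the final flush wait out the backoff instead of returning early. The sketch below is an assumption-heavy illustration, not `LiveTurnSink`: `ChunkSink` and its methods are hypothetical, delivery is simulated by pushing into a vector, and a blocking `std::thread::sleep` stands in for whatever async timer the real code would use.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a sink that streams output chunks to Telegram.
pub struct ChunkSink {
    pending: VecDeque<String>,
    backoff_until: Option<Instant>,
    /// Chunks that were "sent"; a real sink would call the Bot API instead.
    pub delivered: Vec<String>,
}

impl ChunkSink {
    pub fn new() -> Self {
        ChunkSink { pending: VecDeque::new(), backoff_until: None, delivered: Vec::new() }
    }

    /// Queues a chunk for delivery.
    pub fn push(&mut self, chunk: &str) {
        self.pending.push_back(chunk.to_string());
    }

    /// Records a rate limit: nothing is sent until `retry_after` has elapsed.
    pub fn defer(&mut self, retry_after: Duration) {
        self.backoff_until = Some(Instant::now() + retry_after);
    }

    fn in_backoff(&self) -> bool {
        self.backoff_until.map_or(false, |until| Instant::now() < until)
    }

    /// Mid-turn flush: skips sending during backoff but keeps chunks queued.
    pub fn flush(&mut self) {
        if self.in_backoff() {
            return;
        }
        self.drain();
    }

    /// Final flush: waits out any remaining backoff, then delivers everything.
    pub fn finish(&mut self) {
        if let Some(until) = self.backoff_until.take() {
            std::thread::sleep(until.saturating_duration_since(Instant::now()));
        }
        self.drain();
    }

    fn drain(&mut self) {
        while let Some(chunk) = self.pending.pop_front() {
            self.delivered.push(chunk);
        }
    }
}
```

The difference from the reported behavior is only in `finish`: it shares the queue with `flush` but does not share the early-return guard, so the turn neither fails nor drops its tail.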

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/app/turns.rs`:
- Around line 32-45: The Err branch of latest_limits_snapshot_from_shared(...)
is aborting the entire turn (clearing cancel_slot, cancel.cancel(), awaiting
chat_action_task, calling finish_pre_codex_turn_failure and finish_turn_cleanup)
which hides prompts; change this to recover gracefully by setting limits_inline
to None (or an appropriate empty banner), remove the calls that cancel and
return (do not call finish_pre_codex_turn_failure or finish_turn_cleanup), and
instead log the error (or record it) so the turn proceeds to Codex; keep
references to cancel_slot, cancel, chat_action_task only if you must clean up
non-fatal resources but do not abort the turn flow in the
latest_limits_snapshot_from_shared error path and let format_limits_inline be
optional.

In `@src/commands.rs`:
- Around line 228-230: Help text for the "/retry" command is inaccurate
(mentions only "failed" turns) while the command supports retrying both failed
and cancelled turns; update the text passed to text_help for the "/retry" match
arm and the corresponding bot-command description (the other occurrence around
the handler near line 266) to say "failed or cancelled" (e.g., "Usage: /retry
<turn_id>\n\nRetries a failed or cancelled turn without attachments.") so users
know both states are eligible.

In `@src/store.rs`:
- Around line 448-488: retry_request_for_turn is rebuilding retryable turns with
attachments: vec![] which drops any original attachments; modify the function to
query the persisted attachments for the given turn_id (e.g., select attachment
metadata/path from the attachments table where turn_id = ?) and populate
TurnRequest.attachments with that list instead of always using an empty vec, so
the returned TurnRequest preserves the original files when reconstructing a
failed/cancelled turn.
- Around line 57-74: The recovery currently clears all `running` turns and
`busy` sessions on `Store::open()` via `recover_interrupted_turns()` which can
corrupt a live instance; change the logic so recovery only touches state that is
stale (e.g., last_updated/heartbeat older than a configurable cutoff) and do not
clear anything until this process first claims ownership of the DB (implement a
DB-backed instance/lock row or an exclusive SQLite advisory lock in
`Store::open()` to prove this process is sole owner). Update
`recover_interrupted_turns()` to accept/compute a cutoff timestamp and only mark
turns/sessions older than that as recovered, and alter `Store::open()` to
acquire the lock/claim, verify no other live instance (or that other instance is
stale by the same cutoff) before calling `recover_interrupted_turns()`, leaving
`seed_admins()` and `audit()` behavior unchanged.
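
The "claim ownership first, recover only stale state" rule from the last comment can be sketched as a pure decision function. This is an in-memory illustration under assumptions: `InstanceLock`, `Claim`, and `claim_lock` are hypothetical names, timestamps are plain Unix seconds, and the real implementation would make the same decision inside a SQLite transaction against the lock row.

```rust
/// Hypothetical lock row: who holds the database and when it last heartbeated.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InstanceLock {
    pub instance_id: String,
    /// Last heartbeat, in Unix seconds.
    pub heartbeat_at: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Claim {
    /// Lock taken; the caller may now recover interrupted turns.
    Acquired { recovered_stale_owner: bool },
    /// Another instance is still heartbeating; its state must stay untouched.
    HeldByLiveInstance,
}

/// Tries to claim the lock for `me` at time `now`. A foreign lock is taken
/// over only if its heartbeat is older than `stale_after` seconds.
pub fn claim_lock(
    slot: &mut Option<InstanceLock>,
    me: &str,
    now: u64,
    stale_after: u64,
) -> Claim {
    let recovered_stale_owner = match slot.as_ref() {
        None => false,
        Some(lock) if lock.instance_id == me => false,
        Some(lock) if now.saturating_sub(lock.heartbeat_at) > stale_after => true,
        Some(_) => return Claim::HeldByLiveInstance,
    };
    *slot = Some(InstanceLock { instance_id: me.to_string(), heartbeat_at: now });
    Claim::Acquired { recovered_stale_owner }
}
```

Recovery of `running` turns and `busy` sessions would run only after `Claim::Acquired`, so a second process started by mistake cannot reset the state of a live one.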

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 82b35fa5-6216-4779-9934-a9f013550dc5

📥 Commits

Reviewing files that changed from the base of the PR and between dbda356 and 78578ad.

📒 Files selected for processing (8)

README.md
README.ru.md
src/app.rs
src/app/support.rs
src/app/tests.rs
src/app/turns.rs
src/commands.rs
src/store.rs

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

src/store.rs (2)

784-791: 💤 Low value

Stale cutoff parsing is sensible; consider a max bound.

unwrap_or(120) with a > 0 filter is good. Worth adding a sanity upper bound (e.g., a few hours) to prevent a typo like TELECODEX_INSTANCE_STALE_SECONDS=8640000 from effectively disabling recovery while still passing the > 0 check. Optional.

♻️ Optional refinement

 fn stale_instance_cutoff() -> String {
+    const MAX_STALE_SECONDS: i64 = 24 * 60 * 60;
     let seconds = std::env::var("TELECODEX_INSTANCE_STALE_SECONDS")
         .ok()
         .and_then(|value| value.parse::<i64>().ok())
-        .filter(|value| *value > 0)
+        .filter(|value| *value > 0 && *value <= MAX_STALE_SECONDS)
         .unwrap_or(120);
     (Utc::now() - Duration::seconds(seconds)).to_rfc3339()
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/store.rs` around lines 784 - 791, The stale_instance_cutoff function
currently accepts any positive TELECODEX_INSTANCE_STALE_SECONDS value; add a
sanity upper bound to avoid extremely large values (e.g., limit to a few hours)
by clamping the parsed seconds to a max (e.g., MAX_STALE_SECONDS) before using
it to compute the cutoff. Update stale_instance_cutoff to parse the env var as
now, then apply a min(parsed_seconds, MAX_STALE_SECONDS) (or similar clamp) so
typos like huge numbers cannot disable recovery; reference
TELECODEX_INSTANCE_STALE_SECONDS and the stale_instance_cutoff function when
implementing the clamp.

700-706: 💤 Low value

Drop-based release is fine but explicitly best-effort.

Drop uses conn.get_mut() which fails if the mutex is poisoned (panicked while holding the lock), and std::process::exit skips destructors entirely. That's why the RestartBot path in src/app.rs explicitly calls release_instance_lock() before exit(0) — good belt-and-braces. Consider a brief doc comment here pointing out that callers needing deterministic release must call release_instance_lock() explicitly.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/store.rs` around lines 700 - 706, Add a brief doc comment above the Drop
impl for Store explaining that Store::drop uses conn.get_mut() and performs a
best-effort release via release_instance_lock_for, which can fail if the mutex
is poisoned and will be skipped by std::process::exit; callers that need
deterministic release must call release_instance_lock() explicitly (see the
RestartBot path in src/app.rs for an example). Ensure the comment
names the methods release_instance_lock_for and release_instance_lock and
mentions the conn.get_mut() failure mode so future maintainers know to call the
explicit release before exiting.

src/app/support.rs (2)

200-215: ⚡ Quick win

restart_executable silently returns a non-existent path.

If current_exe() doesn’t exist and either the path has no (deleted) suffix or the stripped replacement is also missing, this falls through to Ok(exe) and returns the original (non-existent) path. The downstream Command::new(exe).spawn().context("failed to spawn restarted bot")? will then fail with a generic spawn error, hiding the real cause (binary not found).

Fail fast with a clear error so the user sees why the restart can't proceed.

♻️ Suggested fix

 fn restart_executable() -> Result<PathBuf> {
     let exe = std::env::current_exe().context("failed to resolve current executable")?;
     if exe.exists() {
         return Ok(exe);
     }

     let exe_text = exe.as_os_str().to_string_lossy();
     if let Some(stripped) = exe_text.strip_suffix(" (deleted)") {
         let replacement = PathBuf::from(stripped);
         if replacement.exists() {
             return Ok(replacement);
         }
     }

-    Ok(exe)
+    bail!(
+        "restart executable not found on disk: {} (original binary may have been removed without a replacement)",
+        exe.display()
+    )
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/support.rs` around lines 200 - 215, The function restart_executable
currently may return a non-existent PathBuf; update restart_executable to fail
fast when no existing executable is found: after obtaining exe from
std::env::current_exe() (and after the existing checks for the " (deleted)"
suffix and the replacement PathBuf), if neither exe nor replacement exists
return an Err with a clear context message (e.g., "executable not found for
restart" including the attempted path text) instead of falling through to
Ok(exe); ensure the error uses the same error type/context mechanism used
elsewhere (the Result/anyhow::Context pattern) so callers (e.g.,
Command::new(exe).spawn()) receive a descriptive failure.
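
To make the fail-fast behavior easy to unit test, the path resolution can be separated from the filesystem check. The variant below is a sketch, not the function in `src/app/support.rs`: `resolve_restart_path` is a hypothetical name, the `exists` predicate is injected as an assumption for testability, and a plain `String` error replaces the `anyhow` error used in the suggested fix.

```rust
use std::path::{Path, PathBuf};

/// Resolves the binary to re-exec. `exists` is injected so the logic can be
/// tested without touching the real filesystem.
pub fn resolve_restart_path(
    exe: &Path,
    exists: impl Fn(&Path) -> bool,
) -> Result<PathBuf, String> {
    if exists(exe) {
        return Ok(exe.to_path_buf());
    }

    // A replaced binary shows up with a " (deleted)" suffix; try the path
    // without it, which is where the new build would have been written.
    let text = exe.to_string_lossy();
    if let Some(stripped) = text.strip_suffix(" (deleted)") {
        let replacement = PathBuf::from(stripped);
        if exists(&replacement) {
            return Ok(replacement);
        }
    }

    // Fail fast instead of handing a missing path to the spawn call.
    Err(format!("restart executable not found on disk: {}", exe.display()))
}
```

In production the predicate would simply be `|p| p.exists()`; a test can pass a closure that recognizes one fixed path.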

196-198: 💤 Low value

Match SYSTEMD_EXEC_PID against std::process::id() to identify directly-managed processes.

SYSTEMD_EXEC_PID is inherited by all descendant processes from a systemd-managed unit. A bare presence check will prevent restart even when the bot is run as a grandchild of a systemd-managed process (e.g., shell script launched by systemd). Comparing the environment value against the current process ID ensures only directly-managed instances skip restart.

♻️ Optional refinement

-fn is_systemd_managed_process() -> bool {
-    std::env::var_os("SYSTEMD_EXEC_PID").is_some()
-}
+fn is_systemd_managed_process() -> bool {
+    let Some(value) = std::env::var_os("SYSTEMD_EXEC_PID") else {
+        return false;
+    };
+    value
+        .to_str()
+        .and_then(|raw| raw.parse::<u32>().ok())
+        .map(|pid| pid == std::process::id())
+        .unwrap_or(false)
+}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/support.rs` around lines 196 - 198, The current
is_systemd_managed_process() only checks presence of SYSTEMD_EXEC_PID; change it
to read the SYSTEMD_EXEC_PID env var and compare it to the current process id
(std::process::id()) so it returns true only when SYSTEMD_EXEC_PID == current
PID; update the function to parse the env value to an integer (u32/usize as
appropriate), handle parse errors by treating them as non-matching, and return
false when the env var is absent or does not equal the current process id.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/app.rs`:
- Around line 299-309: The heartbeat loop currently logs any error from
heartbeat_instance() and keeps running, which allows split-brain when the DB
lock is lost; modify run_instance_heartbeat_loop so that when
self.shared.store.heartbeat_instance() returns an Err whose message indicates
the lock loss (exact text "Telecodex database instance lock was lost" from
heartbeat_instance), it triggers a graceful shutdown instead of continuing:
e.g., send a shutdown signal via the existing CancellationToken or oneshot in
shared (or call the existing shutdown path) so the main tokio::select! handles
cleanup and exit; for other errors keep logging and continue. Ensure you
reference run_instance_heartbeat_loop and heartbeat_instance and use the shared
shutdown mechanism rather than calling process::exit directly.
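
The requested behavior (stop on lock loss, keep going on other errors) can be sketched with standard-library primitives. Everything here is an assumption made for illustration: `HeartbeatError` and `run_heartbeat_loop` are hypothetical, an `AtomicBool` stands in for the shared cancellation mechanism, a blocking loop replaces the tokio task, and a typed `LockLost` variant is used in place of matching on the error message text that the comment mentions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

/// Hypothetical error type separating a lost lock from transient failures.
#[derive(Debug, PartialEq, Eq)]
pub enum HeartbeatError {
    /// Another instance took over the database lock.
    LockLost,
    /// Any other failure, for example a busy database.
    Transient(String),
}

/// Runs heartbeats until shutdown is requested or the lock is lost.
/// Returns the number of transient failures that were logged and skipped.
pub fn run_heartbeat_loop<F>(mut heartbeat: F, shutdown: &AtomicBool, interval: Duration) -> u32
where
    F: FnMut() -> Result<(), HeartbeatError>,
{
    let mut transient_failures = 0;
    while !shutdown.load(Ordering::SeqCst) {
        match heartbeat() {
            Ok(()) => {}
            Err(HeartbeatError::LockLost) => {
                // Request a graceful shutdown rather than exiting the process here.
                shutdown.store(true, Ordering::SeqCst);
                break;
            }
            Err(HeartbeatError::Transient(reason)) => {
                eprintln!("instance heartbeat failed: {}", reason);
                transient_failures += 1;
            }
        }
        std::thread::sleep(interval);
    }
    transient_failures
}
```

The main loop would observe the same flag and run its normal cleanup path, which is what keeps two instances from operating on one database after a takeover.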

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ce95d40e-9759-42fc-b38e-cd303b076b57

📥 Commits

Reviewing files that changed from the base of the PR and between 78578ad and 5776939.

📒 Files selected for processing (5)

src/app.rs
src/app/support.rs
src/app/turns.rs
src/commands.rs
src/store.rs

🚧 Files skipped from review as they are similar to previous changes (2)

src/commands.rs
src/app/turns.rs

ai-da-stas · 2026-05-11T19:46:36Z

I pushed a fix in commit 030f147, please review it.

ai-da-stas · 2026-05-14T13:20:20Z

Closing this PR for now. I want to do another pass on the implementation and send the related changes together as one combined PR, so the review surface is cleaner. Thanks for the review so far.

Handle Telegram rate limits without stuck turns

78578ad

coderabbitai Bot reviewed May 11, 2026

Comment thread src/app/turns.rs Outdated

Comment thread src/commands.rs

Comment thread src/store.rs Outdated

Comment thread src/store.rs

ai-da-stas added 3 commits May 11, 2026 21:49

Keep turns running when limits snapshot fails

c132dde

Clarify retry help for cancelled turns

c64f867

Guard startup recovery with DB instance lock

fc7bb08

ai-da-stas marked this pull request as draft May 11, 2026 18:19

Make bot restarts work after binary upgrades

5776939

ai-da-stas changed the title ~~Handle Telegram rate limits without stuck turns~~ Handle Telegram rate limits without stuck turns and harden bot restarts after binary upgrades May 11, 2026

ai-da-stas marked this pull request as ready for review May 11, 2026 18:29

coderabbitai Bot reviewed May 11, 2026

Comment thread src/app.rs

Stop gracefully when DB lock heartbeat is lost

030f147

ai-da-stas marked this pull request as draft May 11, 2026 18:48

ai-da-stas marked this pull request as ready for review May 11, 2026 18:48

ai-da-stas mentioned this pull request May 11, 2026

Add per-session Codex fast mode control #10

Closed

ai-da-stas closed this May 14, 2026

ai-da-stas mentioned this pull request May 14, 2026

Add Telegram delivery pacing and session runtime controls #12

Merged

ai-da-stas wants to merge 6 commits into Headcrab:master from codex-k8s:codex/telegram-rate-limit-recovery
