Add Telegram delivery pacing and session runtime controls by ai-da-stas · Pull Request #12 · Headcrab/telecodex

ai-da-stas · 2026-05-14T13:41:16Z

Summary

This replaces the now-closed #9 and #10 so the related runtime and Telegram delivery changes can be reviewed together.

add Telegram pre-Codex 429 recovery with retry guidance and retry buttons
queue/pause outbound Telegram deliveries per chat_id, use a safer group/topic cadence, and honor Bot API retry_after
use sendMessageDraft previews in private chats while keeping group/topic previews on throttled editMessageText
keep turns running when Codex limit snapshot collection fails
guard startup recovery with a database instance lock and stop gracefully if that lock is lost
make bot restarts work after binary upgrades
add per-session /fast mode and surface fast mode in turn-start status
document the Telegram delivery pacing and draft/edit preview behavior in both READMEs

Notes

Transparency note: this is an honest vibe-coded contribution. I mostly work in Go and used Codex heavily for the Rust implementation, so I kept the changes focused and added tests around the touched behavior.

Validation

cargo fmt
cargo test
cargo clippy --all-targets --all-features -- -D warnings
cargo build --release

coderabbitai · 2026-05-14T13:41:24Z

Warning

Rate limit exceeded

@ai-da-stas has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 30 minutes and 12 seconds before requesting another review.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 08dc9378-8f86-471d-ab56-294cac13ca89

📥 Commits

Reviewing files that changed from the base of the PR and between 3772b8b and cbb8f98.

📒 Files selected for processing (1)

src/store.rs

Walkthrough

This PR hardens Telecodex against Telegram rate limits and bot restarts. It introduces database instance locking and interrupted turn recovery on startup to handle stale sessions; implements outbound rate limiting per chat with retry_after backoff; adds /retry <turn_id> command and retry buttons for failed turns; implements /fast [on|off|status] service tier toggling; and fixes /restart_bot to handle deleted binaries and systemd-managed processes.

Changes

Database Instance Safety & Turn Recovery

Layer / File(s)	Summary
Instance lock, heartbeat, and recovery mechanisms `src/store.rs`, `src/app.rs`	`Store::open` now claims an `app_instance_lock` to prevent concurrent instances, assigns `instance_id`, recovers stale `running` turns as `failed` and clears stale `sessions.busy`. Heartbeat task checks lock health and, if the lock is lost, cancels the shutdown token so the app stops gracefully. `release_instance_lock` explicitly releases on restart via `Drop` implementation.
Turn retry request reconstruction `src/store.rs`, `src/app/turns.rs`	`retry_request_for_turn` rebuilds a `TurnRequest` from a failed/cancelled turn by joining stored `turns` and `sessions` data. Used by the retry flow to re-enqueue turns without user attachments.

Telegram Rate Limit Resilience & Retry Flow

Layer / File(s)	Summary
Outbound rate limiter and backoff `src/telegram.rs`	New `OutboundRateLimiter` gates chat-scoped API calls (`sendMessage`, `editMessageText`, topic/draft operations) with per-chat + global spacing and respects Telegram's `retry_after` backoff. `post_outbound` helper wraps API calls to apply limiter and extract retry-after delays.
Draft-aware placeholder and pre-Codex failure handling `src/app/turns.rs`, `src/models.rs`, `src/store.rs`	`process_turn` attempts to send draft placeholder for private chats (via `LiveTurnSink::new_draft`), falling back to preview sink on failure. Pre-Codex Telegram rate-limit failures are caught, the turn marked `failed`, and a user-facing retry notice scheduled with backoff.
Retry button, command parsing, and enqueue `src/commands.rs`, `src/app.rs`, `src/app/tests.rs`	`/retry <turn_id>` command parses turn IDs, and `rate_limit_retry_keyboard(turn_id)` builds inline buttons. `BridgeCommand::RetryTurn` loads a retry request from the store, auto-sets search mode, and enqueues the turn for re-execution.
Message draft payload and serialization `src/telegram.rs`	New `SendMessageDraft` struct with `draft_id` field replaces separate parameters in `send_message_draft`. Constructor `html(...)` builds HTML payloads for serialization into the Telegram API.
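
The per-chat plus global spacing and `retry_after` handling summarized in this table can be sketched roughly as follows. This is a minimal illustration, not the PR's `OutboundRateLimiter`: the `ChatPacer` type, its method names, and the interval values are all hypothetical, and the caller is assumed to sleep for the returned duration before issuing the Telegram call.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical pacing state (an assumption, not the PR's `OutboundRateLimiter`).
pub struct ChatPacer {
    per_chat_interval: Duration,
    global_interval: Duration,
    next_global: Option<Instant>,
    next_per_chat: HashMap<i64, Instant>,
}

impl ChatPacer {
    pub fn new(per_chat_interval: Duration, global_interval: Duration) -> Self {
        Self {
            per_chat_interval,
            global_interval,
            next_global: None,
            next_per_chat: HashMap::new(),
        }
    }

    /// Reserves the next send slot for `chat_id` and returns how long the
    /// caller should wait (measured from `now`) before sending.
    pub fn reserve(&mut self, chat_id: i64, now: Instant) -> Duration {
        let chat_ready = self.next_per_chat.get(&chat_id).copied().unwrap_or(now);
        let global_ready = self.next_global.unwrap_or(now);
        // The send may happen only once both the chat and the global slot are free.
        let send_at = now.max(chat_ready).max(global_ready);
        self.next_per_chat
            .insert(chat_id, send_at + self.per_chat_interval);
        self.next_global = Some(send_at + self.global_interval);
        send_at.duration_since(now)
    }

    /// Records a 429 response: the chat stays blocked for at least
    /// `retry_after_secs` seconds from `now`.
    pub fn record_retry_after(&mut self, chat_id: i64, now: Instant, retry_after_secs: u64) {
        let blocked_until = now + Duration::from_secs(retry_after_secs);
        let slot = self.next_per_chat.entry(chat_id).or_insert(blocked_until);
        if *slot < blocked_until {
            *slot = blocked_until;
        }
    }
}
```

Passing `now` explicitly keeps the sketch deterministic in tests; a real limiter would more likely read the clock itself and await the delay inside an async helper.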

Service Tier & Bot Restart Improvements

Layer / File(s)	Summary
Service tier field, session storage, and propagation `src/models.rs`, `src/store.rs`, `src/codex.rs`, `src/app/presentation.rs`	`SessionRecord` gains optional `service_tier: Option<String>` field, stored in `sessions` table and read/written by `set_session_service_tier` and recovery logic. `build_review_command` and `build_turn_start_params` pass `service_tier` to Codex via CLI config and JSON params. Session status display shows `fast: on/off`.
Fast mode command and session state updates `src/commands.rs`, `src/app.rs`	New `FastMode` enum (`Status`, `On`, `Off`) and `/fast [on
Restart hardening: systemd detection and deleted binary recovery `src/app/support.rs`, `src/app.rs`	`spawn_restarted_process` now checks `SYSTEMD_EXEC_PID` and exits cleanly if managed by systemd. `restart_executable` resolves `current_exe()` and falls back to stripped path if inode is marked `(deleted)` after binary upgrades. `RestartBot` spawns delayed task to release instance lock before exit. `shutdown_signal` helper unifies Ctrl-C, SIGTERM (Unix), and `CancellationToken` shutdown.
Test coverage for recovery, retry, and service tier `src/app/tests.rs`, `src/store.rs`, `src/codex.rs`, `src/app/turns.rs`	Tests updated to initialize `service_tier: None` and new `shutdown` token. Added `builds_rate_limit_retry_keyboard`, `turn_start_banner` "fast" mode tests, and `retry_request_for_turn` tests. Store tests validate instance lock claim/release and interrupted turn recovery.
Documentation: rate limits, retry, and service tier `README.md`, `README.ru.md`	Updated to describe progress streaming via drafts/edits, per-chat rate-limit pacing with `retry_after` backoff, `/retry <turn_id>` command, `/fast [on
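
The deleted-binary fallback described for `restart_executable` can be illustrated with a small path helper. This is a sketch under assumptions: on Linux, the resolved executable path of a replaced binary typically ends in ` (deleted)`, and the function name `strip_deleted_suffix` is hypothetical rather than taken from the PR.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: if the resolved executable path carries the
/// " (deleted)" marker, return the path without it; otherwise return the
/// path unchanged. Non-UTF-8 paths are returned unchanged as well.
pub fn strip_deleted_suffix(path: &Path) -> PathBuf {
    match path.to_str().and_then(|s| s.strip_suffix(" (deleted)")) {
        Some(stripped) => PathBuf::from(stripped),
        None => path.to_path_buf(),
    }
}
```

A caller would presumably apply this to the result of `std::env::current_exe()` and check that the stripped path exists before spawning the new process.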

Sequence Diagram(s)

sequenceDiagram
    participant App as App Startup
    participant Store as Store
    participant DB as SQLite DB
    participant Heartbeat as Heartbeat Loop
    
    App->>Store: Store::open()
    activate Store
    Store->>Store: Generate instance_id
    Store->>DB: Claim app_instance_lock
    alt Lock held by live instance
        DB-->>Store: Lock error
        Store-->>App: Fail (concurrent instance)
    else Lock acquired
        Store->>DB: Query stale running turns
        DB-->>Store: Stale turn IDs
        Store->>DB: Mark stale turns failed
        Store-->>App: Success, instance_id
        deactivate Store
        
        App->>Heartbeat: Spawn heartbeat loop
        activate Heartbeat
        loop Every 30 seconds
            Heartbeat->>Store: heartbeat_instance()
            Store->>DB: UPDATE lock heartbeat_at
            DB-->>Store: OK
            alt Lock lost
                Heartbeat-->>App: Cancel shutdown token (stop gracefully)
                deactivate Heartbeat
            else Lock held
                Heartbeat-->>Heartbeat: Continue
            end
        end
    end
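
The claim and heartbeat decisions in the diagram above can be modeled without a database. The sketch below is an in-memory approximation only: `LockRow`, `claim`, `heartbeat`, and the staleness threshold are assumptions for illustration, while the real implementation stores this state in SQLite.

```rust
/// Hypothetical in-memory stand-in for the single lock row.
#[derive(Debug, Clone, PartialEq)]
pub struct LockRow {
    pub instance_id: String,
    pub heartbeat_at: u64, // seconds since some epoch
}

#[derive(Debug, PartialEq)]
pub enum ClaimError {
    HeldByLiveInstance,
}

/// Claims the lock unless another instance has sent a heartbeat within
/// the last `stale_after` seconds.
pub fn claim(
    row: &mut Option<LockRow>,
    instance_id: &str,
    now: u64,
    stale_after: u64,
) -> Result<(), ClaimError> {
    if let Some(existing) = row.as_ref() {
        let live = now.saturating_sub(existing.heartbeat_at) < stale_after;
        if live && existing.instance_id != instance_id {
            return Err(ClaimError::HeldByLiveInstance);
        }
    }
    *row = Some(LockRow {
        instance_id: instance_id.to_string(),
        heartbeat_at: now,
    });
    Ok(())
}

/// Refreshes the heartbeat. Returns `false` when the lock no longer belongs
/// to `instance_id`, which is the signal to begin a graceful shutdown.
pub fn heartbeat(row: &mut Option<LockRow>, instance_id: &str, now: u64) -> bool {
    match row.as_mut() {
        Some(existing) if existing.instance_id == instance_id => {
            existing.heartbeat_at = now;
            true
        }
        _ => false,
    }
}
```

With the 30-second heartbeat loop shown in the diagram, a staleness threshold of a few missed heartbeats (for example 90 seconds) would be one plausible choice; the actual value is not stated here.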

sequenceDiagram
    participant User as User/Telegram
    participant App as App
    participant Limiter as Outbound<br/>Limiter
    participant TG as Telegram API
    
    User->>App: /turn request
    alt Private chat + use_message_drafts
        App->>Limiter: Wait per-chat interval
        Limiter->>TG: sendMessageDraft
        alt 429 Too Many Requests
            TG-->>Limiter: Error + retry_after
            Limiter->>Limiter: Record backoff
            Limiter-->>App: Fail
            App->>App: Mark turn failed
            App->>User: Rate-limit notice + Retry button
        else Success
            TG-->>App: draft_id
            App->>App: Create LiveTurnSink
        end
    else Group/Topic
        App->>App: Create preview sink
    end

sequenceDiagram
    participant User as User
    participant App as App
    participant Store as Store
    participant Codex as Codex
    
    User->>App: /retry 42
    App->>Store: retry_request_for_turn(42)
    alt Turn is failed/cancelled
        Store-->>App: TurnRequest
        App->>App: Set search_mode
        App->>Codex: Execute turn
        Codex-->>App: Response
        App->>Store: Update turn status
        App-->>User: Results
    else Turn not retryable
        App-->>User: Not retryable
    end
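
A rough sketch of the two checks in the retry flow above: parsing `/retry <turn_id>` and deciding whether a turn may be retried. The function names are hypothetical, the `Completed` variant is an assumption (only `running`, `failed`, and cancelled turns are named in this PR), and the sketch ignores Telegram's optional `@botname` command suffix.

```rust
/// Hypothetical turn states; `Completed` is assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnStatus {
    Running,
    Completed,
    Failed,
    Cancelled,
}

/// Parses "/retry <turn_id>" and returns the turn id, or `None` when the
/// text is not a well-formed retry command.
pub fn parse_retry_command(text: &str) -> Option<i64> {
    let mut parts = text.split_whitespace();
    if parts.next()? != "/retry" {
        return None;
    }
    let turn_id = parts.next()?.parse::<i64>().ok()?;
    // Reject trailing arguments rather than silently ignoring them.
    if parts.next().is_some() {
        return None;
    }
    Some(turn_id)
}

/// Only failed or cancelled turns are eligible for a retry.
pub fn is_retryable(status: TurnStatus) -> bool {
    matches!(status, TurnStatus::Failed | TurnStatus::Cancelled)
}
```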

sequenceDiagram
    participant User as User
    participant App as App
    participant Store as Store
    
    User->>App: /fast on
    App->>Store: set_session_service_tier(key, "fast")
    Store-->>App: OK
    App-->>User: Fast mode enabled
    Note over App,Store: Subsequent turns include<br/>service_tier in Codex config
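
The `/fast [on|off|status]` toggle can be sketched as a parse step plus a mapping onto the stored `service_tier`. The `FastMode` variants and the `"fast"` value come from the walkthrough; the helper names, the choice to treat a missing argument as `status`, and clearing the tier on `off` are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FastMode {
    Status,
    On,
    Off,
}

/// Parses the optional argument of `/fast`. A missing or empty argument is
/// treated as a status query (an assumption in this sketch).
pub fn parse_fast_argument(arg: Option<&str>) -> Option<FastMode> {
    match arg.map(str::trim) {
        None | Some("") | Some("status") => Some(FastMode::Status),
        Some("on") => Some(FastMode::On),
        Some("off") => Some(FastMode::Off),
        Some(_) => None,
    }
}

/// Computes the `service_tier` value to store for the session.
/// `Status` leaves the current value untouched.
pub fn next_service_tier(mode: FastMode, current: Option<&str>) -> Option<String> {
    match mode {
        FastMode::On => Some("fast".to_string()),
        FastMode::Off => None,
        FastMode::Status => current.map(str::to_string),
    }
}
```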

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A rabbit rejoices at resilience won,
With locks and heartbeats, no more stale turns undone—
Rate limits respected, retries flow smooth,
Fast mode toggled, /restart runs true.
From Telegram's 429 to systemd's care,
Telecodex bounces back, aware! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 34.01% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main changes: adding Telegram delivery pacing (rate limiting per chat_id, honoring retry_after) and session runtime controls (per-session /fast mode).
Description check	✅ Passed	The description is well-related to the changeset, outlining the key objectives including Telegram recovery, delivery pacing, draft-based previews, instance locking, binary upgrade handling, /fast mode, and documentation updates.
Linked Issues check	✅ Passed	The PR addresses all major requirements from `#9`: startup recovery for stuck turns, pre-Codex 429 handling with retry notices, /retry command support, graceful handling of rate-limited chunks, SQLite instance locking, binary upgrade hardening, and systemd integration.
Out of Scope Changes check	✅ Passed	All changes are within scope: documentation updates, Telegram rate-limiting infrastructure, database recovery/locking, session service_tier field, /fast and /retry command support, and related test coverage align with stated objectives.


ai-da-stas · 2026-05-14T15:50:19Z

I tested this branch on my real Telecodex instance connected to a Telegram forum chat. After building the release binary and restarting the service, I ran multiple concurrent tasks in separate Telegram topics/sessions. During the test, I also started one additional turn while the others were still active to check behavior under live concurrent load.

What I checked:

concurrent turns across different sessions;
live update delivery to Telegram;
no stuck busy sessions after completion;
SQLite state after completion;
fresh service logs;
database instance lock / heartbeat.

Result: all turns completed successfully, the final state was running=0 and busy=0, the lock heartbeat kept updating, and there were no new WARN, ERROR, or Telegram 429 entries during the test. For my real-world setup, these changes work correctly.

ai-da-stas · 2026-05-14T15:52:06Z

And thanks again for a great project! You urgently need to set up monetization; I would be one of the first ready to support it with money =)

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/app/turns.rs (1)

624-633: 💤 Low value

Redundant guard in flush().

The first guard (!force && last_flushed_text == visible_text && elapsed < min_flush_interval) is strictly subsumed by the second (!force && elapsed < min_flush_interval). Whenever the first would return, the second would too, so the text-equality check has no effect. Either drop the first guard, or — if the intent was to throttle unchanged text more aggressively than changed text — give the two checks distinct intervals.

♻️ Proposed simplification (remove the dead branch)

         let visible_text = self.visible_text(force);
         let min_flush_interval = self.min_flush_interval();
-        if !force
-            && self.last_flushed_text == visible_text
-            && self.last_flush_at.elapsed() < min_flush_interval
-        {
-            return Ok(());
-        }
         if !force && self.last_flush_at.elapsed() < min_flush_interval {
             return Ok(());
         }
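
The subsumption claim can be checked mechanically. The sketch below models both versions of the early-return logic as pure functions and compares them over every combination of inputs that matters; the function names are hypothetical and the real `flush()` reads these values from `self`.

```rust
use std::time::Duration;

/// Early-return decision with both guards, as in the code under review.
pub fn skip_with_both_guards(
    force: bool,
    text_unchanged: bool,
    elapsed: Duration,
    min_interval: Duration,
) -> bool {
    if !force && text_unchanged && elapsed < min_interval {
        return true;
    }
    if !force && elapsed < min_interval {
        return true;
    }
    false
}

/// Early-return decision with only the second guard.
pub fn skip_with_single_guard(force: bool, elapsed: Duration, min_interval: Duration) -> bool {
    !force && elapsed < min_interval
}

/// Returns true when both versions agree for every flag combination and for
/// elapsed times below, at, and above the threshold.
pub fn guards_agree(min_interval: Duration) -> bool {
    let elapsed_samples = [
        Duration::ZERO,
        min_interval,
        min_interval + Duration::from_millis(1),
    ];
    for force in [false, true] {
        for text_unchanged in [false, true] {
            for elapsed in elapsed_samples {
                if skip_with_both_guards(force, text_unchanged, elapsed, min_interval)
                    != skip_with_single_guard(force, elapsed, min_interval)
                {
                    return false;
                }
            }
        }
    }
    true
}
```

Because `text_unchanged` never changes the outcome, removing the first guard preserves behavior.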

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/turns.rs` around lines 624 - 633, The flush() method contains a
redundant guard: the first if-check using min_flush_interval, last_flushed_text
== visible_text, last_flush_at.elapsed() and !force is subsumed by the
subsequent if-check that also tests !force and last_flush_at.elapsed() <
min_flush_interval; remove the dead branch to simplify. Locate the flush()
function and eliminate the entire first if block that references
min_flush_interval(), self.last_flushed_text == visible_text, and
self.last_flush_at.elapsed(), or alternatively implement distinct intervals
(e.g., a shorter min_flush_interval and a separate unchanged_text_interval) and
adjust the two guards so they check different thresholds; update any comments
referencing the old behavior accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/store.rs`:
- Around line 531-552: The query currently reads turns.review_json into a String
which will fail on NULL; change the row mapping to get review_json as
Option<String> (e.g., row.get::<_, Option<String>>(1)) and then handle the None
case before deserializing: only call serde_json::from_str when
review_json.is_some(), otherwise set review_mode = None (or appropriate default)
so retry reconstruction for failed/cancelled turns does not error; update
references to review_json and review_mode accordingly in the surrounding logic
(look for variables review_json, review_mode, and the row mapping closure).

---

Nitpick comments:
In `@src/app/turns.rs`:
- Around line 624-633: The flush() method contains a redundant guard: the first
if-check using min_flush_interval, last_flushed_text == visible_text,
last_flush_at.elapsed() and !force is subsumed by the subsequent if-check that
also tests !force and last_flush_at.elapsed() < min_flush_interval; remove the
dead branch to simplify. Locate the flush() function and eliminate the entire
first if block that references min_flush_interval(), self.last_flushed_text ==
visible_text, and self.last_flush_at.elapsed(), or alternatively implement
distinct intervals (e.g., a shorter min_flush_interval and a separate
unchanged_text_interval) and adjust the two guards so they check different
thresholds; update any comments referencing the old behavior accordingly.
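
The NULL-handling fix requested for `src/store.rs` follows a common pattern: read the column as an `Option` and deserialize only when a value is present. The sketch below shows that pattern with the standard library alone; `decode_optional` is a hypothetical helper, and the `parse` argument stands in for whatever deserializer the real code uses (the review comment mentions `serde_json::from_str`).

```rust
/// Hypothetical helper: decodes an optional column value.
/// `None` (SQL NULL) maps to `Ok(None)` instead of an error; a present value
/// is passed to `parse`, and a parse failure is propagated.
pub fn decode_optional<T, E>(
    raw: Option<&str>,
    parse: impl Fn(&str) -> Result<T, E>,
) -> Result<Option<T>, E> {
    raw.map(parse).transpose()
}
```

For example, `decode_optional(None, |s: &str| s.parse::<u32>())` yields `Ok(None)`, so a turn without review metadata can still be reconstructed for retry.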


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dc69dd61-a8db-493a-bd68-27cde1c4b395

📥 Commits

Reviewing files that changed from the base of the PR and between dbda356 and 3772b8b.

📒 Files selected for processing (12)

README.md
README.ru.md
src/app.rs
src/app/presentation.rs
src/app/support.rs
src/app/tests.rs
src/app/turns.rs
src/codex.rs
src/commands.rs
src/models.rs
src/store.rs
src/telegram.rs

ai-da-stas · 2026-05-14T16:23:40Z

@Headcrab, Alex, I'm done =)

Headcrab · 2026-05-14T16:27:15Z

🥳🥳🥳

ai-da-stas added 8 commits May 11, 2026 21:21

Handle Telegram rate limits without stuck turns

78578ad

Keep turns running when limits snapshot fails

c132dde

Clarify retry help for cancelled turns

c64f867

Guard startup recovery with DB instance lock

fc7bb08

Make bot restarts work after binary upgrades

5776939

Stop gracefully when DB lock heartbeat is lost

030f147

Add per-session fast mode command

34a1df3

Queue Telegram outbound deliveries by chat

3772b8b

ai-da-stas marked this pull request as ready for review May 14, 2026 15:50

coderabbitai Bot reviewed May 14, 2026

Comment thread src/store.rs Outdated

Handle null review metadata in retry requests

cbb8f98

Headcrab merged commit cc19b6d into Headcrab:master May 14, 2026
2 checks passed

Headcrab merged 9 commits into Headcrab:master from codex-k8s:codex/telegram-outbox-rate-limiter
