
Add Telegram delivery pacing and session runtime controls#12

Merged
Headcrab merged 9 commits into Headcrab:master from codex-k8s:codex/telegram-outbox-rate-limiter on May 14, 2026

Conversation

@ai-da-stas
Contributor

Summary

This replaces the now-closed #9 and #10 so the related runtime and Telegram delivery changes can be reviewed together.

  • add Telegram pre-Codex 429 recovery with retry guidance and retry buttons
  • queue/pause outbound Telegram deliveries per chat_id, use a safer group/topic cadence, and honor Bot API retry_after
  • use sendMessageDraft previews in private chats while keeping group/topic previews on throttled editMessageText
  • keep turns running when Codex limit snapshot collection fails
  • guard startup recovery with a database instance lock and stop gracefully if that lock is lost
  • make bot restarts work after binary upgrades
  • add per-session /fast mode and surface fast mode in turn-start status
  • document the Telegram delivery pacing and draft/edit preview behavior in both READMEs
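
The split between draft previews and throttled edits in the list above could be sketched roughly like this. It is a minimal illustration, not the PR's code: `ChatKind`, `PreviewMode`, and `preview_mode` are hypothetical names, and `drafts_enabled` is assumed to stand for a setting such as `use_message_drafts`.

```rust
/// Hypothetical chat classification; the PR's real types are not shown here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChatKind {
    Private,
    Group,
    ForumTopic,
}

/// Hypothetical description of how live progress is streamed to Telegram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PreviewMode {
    /// Stream progress through sendMessageDraft.
    Draft,
    /// Edit a placeholder message with editMessageText at a throttled cadence.
    ThrottledEdit,
}

/// Drafts are used only in private chats, and only when enabled;
/// groups and forum topics always stay on throttled edits.
fn preview_mode(kind: ChatKind, drafts_enabled: bool) -> PreviewMode {
    match kind {
        ChatKind::Private if drafts_enabled => PreviewMode::Draft,
        _ => PreviewMode::ThrottledEdit,
    }
}
```

Keeping the decision in one pure function makes the group/topic behavior easy to test without touching the Bot API.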

Notes

Transparency note: this is an honest vibe-coded contribution. I mostly work in Go and used Codex heavily for the Rust implementation, so I kept the changes focused and added tests around the touched behavior.

Validation

  • cargo fmt
  • cargo test
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo build --release

@coderabbitai

coderabbitai Bot commented May 14, 2026

Warning

Rate limit exceeded

@ai-da-stas has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 30 minutes and 12 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 08dc9378-8f86-471d-ab56-294cac13ca89

📥 Commits

Reviewing files that changed from the base of the PR and between 3772b8b and cbb8f98.

📒 Files selected for processing (1)
  • src/store.rs

Walkthrough

This PR hardens Telecodex against Telegram rate limits and bot restarts. It introduces database instance locking and interrupted turn recovery on startup to handle stale sessions; implements outbound rate limiting per chat with retry_after backoff; adds /retry <turn_id> command and retry buttons for failed turns; implements /fast [on|off|status] service tier toggling; and fixes /restart_bot to handle deleted binaries and systemd-managed processes.
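
As a rough sketch of how the two new commands might be parsed: the `FastMode` variants (Status, On, Off) come from the walkthrough, while `Command`, `parse_command`, the `i64` turn ID type, and the choice to treat a bare `/fast` as a status query are assumptions made for illustration.

```rust
/// Variants named in the walkthrough: Status, On, Off.
#[derive(Debug, PartialEq, Eq)]
enum FastMode {
    Status,
    On,
    Off,
}

/// Hypothetical command type covering only the two commands discussed here.
#[derive(Debug, PartialEq, Eq)]
enum Command {
    Fast(FastMode),
    Retry { turn_id: i64 },
}

/// Parse `/fast [on|off|status]` and `/retry <turn_id>`.
/// Assumption: `/fast` with no argument behaves like `/fast status`.
fn parse_command(text: &str) -> Option<Command> {
    let mut parts = text.split_whitespace();
    match parts.next()? {
        "/fast" => {
            let mode = match parts.next() {
                None | Some("status") => FastMode::Status,
                Some("on") => FastMode::On,
                Some("off") => FastMode::Off,
                Some(_) => return None,
            };
            Some(Command::Fast(mode))
        }
        "/retry" => {
            // A missing or non-numeric turn ID is rejected rather than guessed.
            let turn_id = parts.next()?.parse::<i64>().ok()?;
            Some(Command::Retry { turn_id })
        }
        _ => None,
    }
}
```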

Changes

Database Instance Safety & Turn Recovery

| Layer | File(s) | Summary |
|---|---|---|
| Instance lock, heartbeat, and recovery mechanisms | src/store.rs, src/app.rs | Store::open now claims an app_instance_lock to prevent concurrent instances, assigns an instance_id, marks stale running turns as failed, and clears stale sessions.busy flags. A heartbeat task checks lock health and cancels the shutdown token, triggering a graceful stop, if the lock is lost. release_instance_lock explicitly releases the lock on restart via the Drop implementation. |
| Turn retry request reconstruction | src/store.rs, src/app/turns.rs | retry_request_for_turn rebuilds a TurnRequest from a failed/cancelled turn by joining stored turns and sessions data. Used by the retry flow to re-enqueue turns without user attachments. |
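
The claim-and-heartbeat logic summarized above might look roughly like the following. This is an in-memory sketch under stated assumptions: `LockRow`, `ClaimError`, `claim_lock`, and `heartbeat` are hypothetical names, the real lock lives in SQLite, and the staleness threshold is illustrative rather than taken from the PR.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical in-memory stand-in for the single row in `app_instance_lock`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LockRow {
    instance_id: String,
    heartbeat_at: SystemTime,
}

#[derive(Debug, PartialEq, Eq)]
enum ClaimError {
    /// Another instance holds the lock and its heartbeat is still fresh.
    HeldByLiveInstance(String),
}

/// Claim the lock if it is free, already ours, or the holder's heartbeat is stale.
fn claim_lock(
    slot: &mut Option<LockRow>,
    instance_id: &str,
    now: SystemTime,
    stale_after: Duration,
) -> Result<(), ClaimError> {
    if let Some(row) = slot.as_ref() {
        // A heartbeat "in the future" (clock skew) is treated as age zero.
        let age = now.duration_since(row.heartbeat_at).unwrap_or(Duration::ZERO);
        if row.instance_id != instance_id && age < stale_after {
            return Err(ClaimError::HeldByLiveInstance(row.instance_id.clone()));
        }
    }
    *slot = Some(LockRow {
        instance_id: instance_id.to_string(),
        heartbeat_at: now,
    });
    Ok(())
}

/// Refresh the heartbeat. Returns false when this instance no longer owns the
/// lock, which is the caller's cue to begin a graceful shutdown.
fn heartbeat(slot: &mut Option<LockRow>, instance_id: &str, now: SystemTime) -> bool {
    match slot.as_mut() {
        Some(row) if row.instance_id == instance_id => {
            row.heartbeat_at = now;
            true
        }
        _ => false,
    }
}
```

In a database-backed version the same checks would presumably run inside a single transaction so that two starting instances cannot both pass the staleness test.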

Telegram Rate Limit Resilience & Retry Flow

| Layer | File(s) | Summary |
|---|---|---|
| Outbound rate limiter and backoff | src/telegram.rs | New OutboundRateLimiter gates chat-scoped API calls (sendMessage, editMessageText, topic/draft operations) with per-chat + global spacing and respects Telegram's retry_after backoff. post_outbound helper wraps API calls to apply limiter and extract retry-after delays. |
| Draft-aware placeholder and pre-Codex failure handling | src/app/turns.rs, src/models.rs, src/store.rs | process_turn attempts to send draft placeholder for private chats (via LiveTurnSink::new_draft), falling back to preview sink on failure. Pre-Codex Telegram rate-limit failures are caught, the turn marked failed, and a user-facing retry notice scheduled with backoff. |
| Retry button, command parsing, and enqueue | src/commands.rs, src/app.rs, src/app/tests.rs | /retry <turn_id> command parses turn IDs, and rate_limit_retry_keyboard(turn_id) builds inline buttons. BridgeCommand::RetryTurn loads a retry request from the store, auto-sets search mode, and enqueues the turn for re-execution. |
| Message draft payload and serialization | src/telegram.rs | New SendMessageDraft struct with draft_id field replaces separate parameters in send_message_draft. Constructor html(...) builds HTML payloads for serialization into the Telegram API. |
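
To make the "per-chat + global spacing" and `retry_after` handling more concrete, here is a small scheduling sketch. It is not the PR's `OutboundRateLimiter`: `OutboundPacer`, its methods, and the intervals are hypothetical, and it assumes a 429 pauses only the affected chat.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical pacing state; intervals are illustrative, not the PR's values.
struct OutboundPacer {
    per_chat_interval: Duration,
    global_interval: Duration,
    next_global: Option<Instant>,
    next_per_chat: HashMap<i64, Instant>,
}

impl OutboundPacer {
    fn new(per_chat_interval: Duration, global_interval: Duration) -> Self {
        Self {
            per_chat_interval,
            global_interval,
            next_global: None,
            next_per_chat: HashMap::new(),
        }
    }

    /// Reserve the next send slot for `chat_id` and return how long the
    /// caller should wait before issuing the API call.
    fn reserve(&mut self, chat_id: i64, now: Instant) -> Duration {
        let chat_ready = self.next_per_chat.get(&chat_id).copied().unwrap_or(now);
        let global_ready = self.next_global.unwrap_or(now);
        // The send happens at the latest of: now, the chat slot, the global slot.
        let send_at = now.max(chat_ready).max(global_ready);
        self.next_per_chat
            .insert(chat_id, send_at + self.per_chat_interval);
        self.next_global = Some(send_at + self.global_interval);
        send_at.duration_since(now)
    }

    /// After a 429, pause the chat until `retry_after_secs` have elapsed.
    /// An existing later reservation is never shortened.
    fn record_retry_after(&mut self, chat_id: i64, retry_after_secs: u64, now: Instant) {
        let resume_at = now + Duration::from_secs(retry_after_secs);
        let entry = self.next_per_chat.entry(chat_id).or_insert(resume_at);
        if *entry < resume_at {
            *entry = resume_at;
        }
    }
}
```

Passing `now` in explicitly keeps the pacing arithmetic deterministic in tests; an async caller would sleep for the returned duration before sending.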

Service Tier & Bot Restart Improvements

| Layer | File(s) | Summary |
|---|---|---|
| Service tier field, session storage, and propagation | src/models.rs, src/store.rs, src/codex.rs, src/app/presentation.rs | SessionRecord gains optional service_tier: Option<String> field, stored in sessions table and read/written by set_session_service_tier and recovery logic. build_review_command and build_turn_start_params pass service_tier to Codex via CLI config and JSON params. Session status display shows fast: on/off. |
| Fast mode command and session state updates | src/commands.rs, src/app.rs | New FastMode enum (Status, On, Off) and `/fast [on |
| Restart hardening: systemd detection and deleted binary recovery | src/app/support.rs, src/app.rs | spawn_restarted_process now checks SYSTEMD_EXEC_PID and exits cleanly if managed by systemd. restart_executable resolves current_exe() and falls back to stripped path if inode is marked (deleted) after binary upgrades. RestartBot spawns delayed task to release instance lock before exit. shutdown_signal helper unifies Ctrl-C, SIGTERM (Unix), and CancellationToken shutdown. |
| Test coverage for recovery, retry, and service tier | src/app/tests.rs, src/store.rs, src/codex.rs, src/app/turns.rs | Tests updated to initialize service_tier: None and new shutdown token. Added builds_rate_limit_retry_keyboard, turn_start_banner "fast" mode tests, and retry_request_for_turn tests. Store tests validate instance lock claim/release and interrupted turn recovery. |
| Documentation: rate limits, retry, and service tier | README.md, README.ru.md | Updated to describe progress streaming via drafts/edits, per-chat rate-limit pacing with retry_after backoff, /retry <turn_id> command, `/fast [on |
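
The "(deleted)" fallback in the restart row can be illustrated with a small path helper. On Linux, the target of `/proc/self/exe` gains a ` (deleted)` suffix once the running binary has been unlinked or replaced, and `std::env::current_exe()` can report that suffixed path. The helper name `strip_deleted_suffix` is hypothetical; the PR's `restart_executable` may differ in detail.

```rust
use std::path::{Path, PathBuf};

/// Suffix Linux appends to the `/proc/self/exe` target after the file is unlinked.
const DELETED_SUFFIX: &str = " (deleted)";

/// Return the path a restart should exec: the reported path with a trailing
/// " (deleted)" marker removed, or the path unchanged if no marker is present.
fn strip_deleted_suffix(exe: &Path) -> PathBuf {
    match exe.to_str().and_then(|s| s.strip_suffix(DELETED_SUFFIX)) {
        Some(stripped) => PathBuf::from(stripped),
        // Non-UTF-8 paths and paths without the marker are passed through as-is.
        None => exe.to_path_buf(),
    }
}
```

A caller would typically feed it the result of `std::env::current_exe()` and check that the stripped path exists before spawning the new process.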

Sequence Diagram(s)

sequenceDiagram
    participant App as App Startup
    participant Store as Store
    participant DB as SQLite DB
    participant Heartbeat as Heartbeat Loop
    
    App->>Store: Store::open()
    activate Store
    Store->>Store: Generate instance_id
    Store->>DB: Claim app_instance_lock
    alt Lock held by live instance
        DB-->>Store: Lock error
        Store-->>App: Fail (concurrent instance)
    else Lock acquired
        Store->>DB: Query stale running turns
        DB-->>Store: Stale turn IDs
        Store->>DB: Mark stale turns failed
        Store-->>App: Success, instance_id
        deactivate Store
        
        App->>Heartbeat: Spawn heartbeat loop
        activate Heartbeat
        loop Every 30 seconds
            Heartbeat->>Store: heartbeat_instance()
            Store->>DB: UPDATE lock heartbeat_at
            DB-->>Store: OK
            alt Lock lost
                Heartbeat-->>App: Cancel shutdown
                deactivate Heartbeat
            else Lock held
                Heartbeat-->>Heartbeat: Continue
            end
        end
    end
sequenceDiagram
    participant User as User/Telegram
    participant App as App
    participant Limiter as Outbound<br/>Limiter
    participant TG as Telegram API
    
    User->>App: /turn request
    alt Private chat + use_message_drafts
        App->>Limiter: Wait per-chat interval
        Limiter->>TG: sendMessageDraft
        alt 429 Too Many Requests
            TG-->>Limiter: Error + retry_after
            Limiter->>Limiter: Record backoff
            Limiter-->>App: Fail
            App->>App: Mark turn failed
            App->>User: Rate-limit notice + Retry button
        else Success
            TG-->>App: draft_id
            App->>App: Create LiveTurnSink
        end
    else Group/Topic
        App->>App: Create preview sink
    end
sequenceDiagram
    participant User as User
    participant App as App
    participant Store as Store
    participant Codex as Codex
    
    User->>App: /retry 42
    App->>Store: retry_request_for_turn(42)
    alt Turn is failed/cancelled
        Store-->>App: TurnRequest
        App->>App: Set search_mode
        App->>Codex: Execute turn
        Codex-->>App: Response
        App->>Store: Update turn status
        App-->>User: Results
    else Turn not retryable
        App-->>User: Not retryable
    end
sequenceDiagram
    participant User as User
    participant App as App
    participant Store as Store
    
    User->>App: /fast on
    App->>Store: set_session_service_tier(key, "fast")
    Store-->>App: OK
    App-->>User: Fast mode enabled
    Note over App,Store: Subsequent turns include<br/>service_tier in Codex config

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Poem

🐰 A rabbit rejoices at resilience won,
With locks and heartbeats, no more stale turns undone—
Rate limits respected, retries flow smooth,
Fast mode toggled, /restart runs true.
From Telegram's 429 to systemd's care,
Telecodex bounces back, aware! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.01% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding Telegram delivery pacing (rate limiting per chat_id, honoring retry_after) and session runtime controls (per-session /fast mode). |
| Description check | ✅ Passed | The description is well-related to the changeset, outlining the key objectives including Telegram recovery, delivery pacing, draft-based previews, instance locking, binary upgrade handling, /fast mode, and documentation updates. |
| Linked Issues check | ✅ Passed | The PR addresses all major requirements from #9: startup recovery for stuck turns, pre-Codex 429 handling with retry notices, /retry command support, graceful handling of rate-limited chunks, SQLite instance locking, binary upgrade hardening, and systemd integration. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope: documentation updates, Telegram rate-limiting infrastructure, database recovery/locking, session service_tier field, /fast and /retry command support, and related test coverage align with stated objectives. |

Contributor Author

I tested this branch on my real Telecodex instance connected to a Telegram forum chat. After building the release binary and restarting the service, I ran multiple concurrent tasks in separate Telegram topics/sessions. During the test, I also started one additional turn while the others were still active to check behavior under live concurrent load.

What I checked:

  • concurrent turns across different sessions;
  • live update delivery to Telegram;
  • no stuck busy sessions after completion;
  • SQLite state after completion;
  • fresh service logs;
  • database instance lock / heartbeat.

Result: all turns completed successfully, the final state was running=0 and busy=0, the lock heartbeat kept updating, and there were no new WARN, ERROR, or Telegram 429 entries during the test. For my real-world setup, these changes work correctly.

@ai-da-stas ai-da-stas marked this pull request as ready for review May 14, 2026 15:50
@ai-da-stas
Contributor Author

And thanks again for the great project! You really need to set up monetization soon; I would be one of the first ready to chip in =)


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/app/turns.rs (1)

624-633: 💤 Low value

Redundant guard in flush().

The first guard (!force && last_flushed_text == visible_text && elapsed < min_flush_interval) is strictly subsumed by the second (!force && elapsed < min_flush_interval). Whenever the first would return, the second would too, so the text-equality check has no effect. Either drop the first guard, or — if the intent was to throttle unchanged text more aggressively than changed text — give the two checks distinct intervals.

♻️ Proposed simplification (remove the dead branch)
         let visible_text = self.visible_text(force);
         let min_flush_interval = self.min_flush_interval();
-        if !force
-            && self.last_flushed_text == visible_text
-            && self.last_flush_at.elapsed() < min_flush_interval
-        {
-            return Ok(());
-        }
         if !force && self.last_flush_at.elapsed() < min_flush_interval {
             return Ok(());
         }
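
If the intent really was to throttle unchanged text more aggressively, the alternative mentioned in this comment could be sketched as a pure decision function. The interval names follow the reviewer's suggestion (`min_flush_interval`, `unchanged_text_interval`); the function `should_flush` and its signature are hypothetical and not taken from the PR.

```rust
use std::time::Duration;

/// Decide whether a live preview should be flushed to Telegram.
/// Changed text waits for `min_flush_interval`; unchanged text waits for the
/// (assumed longer) `unchanged_text_interval`. A forced flush always goes out.
fn should_flush(
    force: bool,
    text_changed: bool,
    elapsed: Duration,
    min_flush_interval: Duration,
    unchanged_text_interval: Duration,
) -> bool {
    if force {
        return true;
    }
    let required = if text_changed {
        min_flush_interval
    } else {
        unchanged_text_interval
    };
    elapsed >= required
}
```

With two distinct thresholds neither guard subsumes the other, which is the property the original pair of checks lacked.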
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/app/turns.rs` around lines 624 - 633, The flush() method contains a
redundant guard: the first if-check using min_flush_interval, last_flushed_text
== visible_text, last_flush_at.elapsed() and !force is subsumed by the
subsequent if-check that also tests !force and last_flush_at.elapsed() <
min_flush_interval; remove the dead branch to simplify. Locate the flush()
function and eliminate the entire first if block that references
min_flush_interval(), self.last_flushed_text == visible_text, and
self.last_flush_at.elapsed(), or alternatively implement distinct intervals
(e.g., a shorter min_flush_interval and a separate unchanged_text_interval) and
adjust the two guards so they check different thresholds; update any comments
referencing the old behavior accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/store.rs`:
- Around line 531-552: The query currently reads turns.review_json into a String
which will fail on NULL; change the row mapping to get review_json as
Option<String> (e.g., row.get::<_, Option<String>>(1)) and then handle the None
case before deserializing: only call serde_json::from_str when
review_json.is_some(), otherwise set review_mode = None (or appropriate default)
so retry reconstruction for failed/cancelled turns does not error; update
references to review_json and review_mode accordingly in the surrounding logic
(look for variables review_json, review_mode, and the row mapping closure).
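
A dependency-free sketch of the NULL-safe handling described above. In the real code the column would presumably be read with `row.get::<_, Option<String>>(1)` and decoded with `serde_json::from_str`, as the comment suggests; here `ReviewMode`, `parse_review_mode`, and `review_mode_from_column` are hypothetical stand-ins so the example compiles with the standard library alone.

```rust
/// Hypothetical stand-in for the PR's review-mode type.
#[derive(Debug, PartialEq, Eq)]
struct ReviewMode {
    raw: String,
}

/// Stand-in for `serde_json::from_str`; it only rejects blank input.
fn parse_review_mode(json: &str) -> Result<ReviewMode, String> {
    if json.trim().is_empty() {
        return Err("empty review_json".to_string());
    }
    Ok(ReviewMode {
        raw: json.to_string(),
    })
}

/// Map a nullable `review_json` column to an optional review mode.
/// NULL becomes `Ok(None)`; only a present value is parsed, and a parse
/// failure is still reported as an error.
fn review_mode_from_column(review_json: Option<String>) -> Result<Option<ReviewMode>, String> {
    review_json.as_deref().map(parse_review_mode).transpose()
}
```

`Option::transpose` turns the intermediate `Option<Result<..>>` into `Result<Option<..>>`, so a NULL column no longer surfaces as an error during retry reconstruction.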

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dc69dd61-a8db-493a-bd68-27cde1c4b395

📥 Commits

Reviewing files that changed from the base of the PR and between dbda356 and 3772b8b.

📒 Files selected for processing (12)
  • README.md
  • README.ru.md
  • src/app.rs
  • src/app/presentation.rs
  • src/app/support.rs
  • src/app/tests.rs
  • src/app/turns.rs
  • src/codex.rs
  • src/commands.rs
  • src/models.rs
  • src/store.rs
  • src/telegram.rs

Comment thread src/store.rs Outdated
@ai-da-stas
Contributor Author

@Headcrab, Alex, I'm done =)

@Headcrab Headcrab merged commit cc19b6d into Headcrab:master May 14, 2026
2 checks passed
@Headcrab
Owner

🥳🥳🥳


2 participants