⚡ Bolt: optimize backend concurrency for LLM recommendations#41
Changed the /api/recommend endpoint in backend/main.py from 'async def' to 'def'. This allows FastAPI to use its internal thread pool for the blocking LLM recommendation logic, preventing the main event loop from being blocked by synchronous operations.

📊 Impact:
- Baseline: 14.86s total time for 5 concurrent requests (sequential processing).
- Optimized: 6.85s total time for 5 concurrent requests (parallel processing).
- Performance gain: ~54% reduction in total latency for concurrent users.

Also includes repository hygiene cleanup (removing accidental __pycache__ files).

Co-authored-by: LVT-ENG <214667862+LVT-ENG@users.noreply.github.com>
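The effect described above can be reproduced without FastAPI. The following is a minimal standard-library sketch, not the code from backend/main.py: `blocking_llm_call`, `handler_blocking`, `handler_offloaded`, and the 0.2s duration are all hypothetical stand-ins. It compares calling blocking code directly inside a coroutine with offloading it to a worker thread via `asyncio.to_thread`, which is analogous to (but not identical to) how a plain 'def' endpoint is run in a thread pool.

```python
import asyncio
import time

BLOCKING_SECONDS = 0.2  # assumed duration of a synchronous LLM call


def blocking_llm_call() -> str:
    """Hypothetical synchronous call; time.sleep blocks the calling thread."""
    time.sleep(BLOCKING_SECONDS)
    return "recommendation"


async def handler_blocking() -> str:
    # Like an 'async def' endpoint that calls blocking code directly:
    # the event loop cannot switch to other tasks while this sleeps.
    return blocking_llm_call()


async def handler_offloaded() -> str:
    # Analogous to a plain 'def' endpoint: the blocking work runs in a
    # worker thread, so the event loop stays free for other requests.
    return await asyncio.to_thread(blocking_llm_call)


async def timed(handler, n: int = 5) -> float:
    """Run n handler calls concurrently and return total wall-clock seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start


def run_demo(n: int = 5) -> tuple[float, float]:
    sequential = asyncio.run(timed(handler_blocking, n))
    concurrent = asyncio.run(timed(handler_offloaded, n))
    return sequential, concurrent


if __name__ == "__main__":
    seq, conc = run_demo()
    print(f"blocking in event loop: {seq:.2f}s, offloaded to threads: {conc:.2f}s")
```

With five calls, the first figure should be close to five times the blocking duration and the second close to a single blocking duration, which mirrors the sequential versus parallel behaviour reported in the impact numbers.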
Code Review
This pull request documents a performance optimization regarding FastAPI event loop blocking and applies it to the recommend_garment endpoint by switching from async def to def. This change prevents synchronous blocking I/O from stalling the main event loop. A review comment suggests moving the optimization details from the function's docstring into a standard comment block to better align with Python documentation conventions and maintainability.
    """
    ⚡ BOLT OPTIMIZATION: Removed 'async' from the endpoint handler.
    Why: The handler performs synchronous, blocking operations (HMAC auth, LLM calls).
    FastAPI handles 'def' (non-async) endpoints by running them in a thread pool,
    preventing the main event loop from being blocked and allowing true concurrency.
    Performance Impact: Total latency for 5 concurrent requests reduced by ~54% (from 14.86s to 6.85s).
    """
The explanation for the async removal is valuable context. However, placing it in a docstring is not standard practice. Docstrings are for describing the function's public API (what it does, its parameters, and what it returns) for use by developers and documentation tools. Implementation details and optimization notes are better suited for regular comments.
I suggest restructuring this to have a proper docstring and a separate comment for the optimization note. This improves long-term maintainability and adherence to Python conventions.
-    """
-    ⚡ BOLT OPTIMIZATION: Removed 'async' from the endpoint handler.
-    Why: The handler performs synchronous, blocking operations (HMAC auth, LLM calls).
-    FastAPI handles 'def' (non-async) endpoints by running them in a thread pool,
-    preventing the main event loop from being blocked and allowing true concurrency.
-    Performance Impact: Total latency for 5 concurrent requests reduced by ~54% (from 14.86s to 6.85s).
-    """
+    """Recommends a garment based on user scan data.
+    This endpoint authenticates the user, calculates the fit for a given garment,
+    and provides AI-generated styling advice.
+    """
+    # ⚡ BOLT OPTIMIZATION: Removed 'async' from the endpoint handler.
+    # Why: The handler performs synchronous, blocking operations (HMAC auth, LLM calls).
+    # FastAPI handles 'def' (non-async) endpoints by running them in a thread pool,
+    # preventing the main event loop from being blocked and allowing true concurrency.
+    # Performance Impact: Total latency for 5 concurrent requests reduced by ~54% (from 14.86s to 6.85s).
The /api/recommend endpoint was incorrectly defined as async def, which caused it to block the FastAPI event loop during synchronous LLM calls and HMAC authentication checks. This resulted in concurrent requests being processed sequentially.

By changing the handler to a standard def, FastAPI offloads these requests to a worker thread pool, so the blocking calls run concurrently instead of one after another.

Verification:
backend/repro_bottleneck.py.

PR created automatically by Jules for task 2684778380134735661 started by @LVT-ENG
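The contents of backend/repro_bottleneck.py are not shown in this PR, so the following is only a hedged sketch of how a measurement of this kind could be made from the client side. `measure_concurrent`, `reduction_percent`, and the `send_request` callable are hypothetical names; in practice `send_request` would wrap an authenticated request to /api/recommend, whose payload and auth details are not given here.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def measure_concurrent(send_request: Callable[[], object], n: int = 5) -> float:
    """Fire n calls at once and return the total wall-clock time in seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(send_request) for _ in range(n)]
        for future in futures:
            future.result()  # re-raises any error from a failed request
    return time.perf_counter() - start


def reduction_percent(baseline: float, optimized: float) -> float:
    """Relative reduction in total latency, as a percentage of the baseline."""
    return (baseline - optimized) / baseline * 100
```

Applied to the figures reported in this PR, `reduction_percent(14.86, 6.85)` evaluates to roughly 53.9, which is consistent with the stated ~54% reduction.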