
feat: optimize LLM batch processing and improve async handling#10

Open
kunalbhardwaj2006 wants to merge 1 commit into AOSSIE-Org:main from kunalbhardwaj2006:feature/async-batching-clean-2

Conversation

Contributor

@kunalbhardwaj2006 kunalbhardwaj2006 commented Mar 22, 2026

🚀 Overview

This PR improves LLM batch processing by optimizing async execution and reducing redundant calls.

🔧 Changes

  • Improved async handling for LLM requests
  • Optimized batch processing logic
  • Reduced redundant computations
  • Improved logging and error handling

⚡ Impact

  • Faster LLM response generation
  • Better performance for batch processing
  • Cleaner, more maintainable code

🧪 Testing

  • Verified batch processing flow
  • No breaking changes observed

📌 Notes

This PR focuses only on LLM optimization. Classification pipeline improvements will be submitted separately.

Summary by CodeRabbit

  • New Features

    • Added concurrent text generation capabilities for batch processing multiple prompts
    • Implemented concurrency rate limiting to ensure system stability
  • Improvements

    • Streamlined error handling with improved logging


coderabbitai bot commented Mar 22, 2026

Walkthrough

The module adds async and concurrent batch processing support to text generation. Three new async functions enable concurrent prompt processing with a concurrency limit of 5 simultaneous requests. The existing generate_text() function was simplified by removing redundant logging and error handling boilerplate.

Changes

Cohort / File(s) / Summary

  • Async Functions & Concurrency (generator/src/llm_utils.py)
    Added generate_text_async() to run blocking calls in a thread executor, controlled_generate_text() to wrap calls with semaphore-based concurrency limiting (max 5 concurrent), and process_batch() for concurrent generation across multiple prompts. Introduced a module-level asyncio.Semaphore(5) and updated imports to include asyncio and partial.
  • Error Handling & Control Flow Simplification (generator/src/llm_utils.py)
    Removed the inner logging/validation block with the empty-response warning and length logs; removed the outer parsing try/except; the code now directly parses resp.json() and returns data.get("response", ""). Consolidated error logging into a single "LLM error: ..." catch-all message while retaining the empty-string fallback.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant process_batch as process_batch()
    participant controlled_gen as controlled_generate_text()
    participant Semaphore
    participant generate_async as generate_text_async()
    participant Executor as Thread Executor
    participant generate_text as generate_text()
    participant LLM as LLM Service

    Client->>process_batch: process_batch(prompts)
    Note over process_batch: Creates tasks via asyncio.gather
    process_batch->>controlled_gen: multiple concurrent calls
    controlled_gen->>Semaphore: acquire()
    activate Semaphore
    Semaphore-->>controlled_gen: granted
    controlled_gen->>generate_async: generate_text_async(prompt)
    generate_async->>Executor: run_in_executor(generate_text)
    activate Executor
    Executor->>generate_text: generate_text(prompt)
    generate_text->>LLM: HTTP request
    LLM-->>generate_text: response
    generate_text-->>Executor: parsed response
    deactivate Executor
    Executor-->>generate_async: result
    generate_async-->>controlled_gen: awaited result
    Semaphore->>Semaphore: release()
    deactivate Semaphore
    controlled_gen-->>process_batch: response
    process_batch-->>Client: all responses via gather

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested labels

Python Lang

Poem

🐰 Async hops across the meadow bright,
Five at a time, a batching delight,
No more blocking in single-thread's night,
Semaphore gates keep the load just right,
Concurrency dances—oh what a sight!

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped: CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately describes the main changes: adding async support and batch processing optimization to the LLM utilities module.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@github-actions github-actions bot added size/S and removed size/S labels Mar 22, 2026
@kunalbhardwaj2006
Contributor Author

Hi @imxade , @Zahnentferner

All checks have completed successfully. Please let me know if any improvements are needed.

Thanks!


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@generator/src/llm_utils.py`:
- Around line 38-40: In the except block that currently calls logger.error(f"LLM
error: {e}", exc_info=True) (inside the LLM call/error handling in
generator/src/llm_utils.py), replace that call with logger.exception("LLM error:
%s", e) to use the idiomatic logger.exception() which automatically includes the
traceback and also switch to %s formatting for lazy evaluation; keep the return
"" behavior unchanged.
- Around line 54-56: The process_batch coroutine uses asyncio.gather(...,
return_exceptions=True) which mixes Exception objects with successful string
results; add a docstring to process_batch explaining that results may be
Exception instances (from controlled_generate_text), callers must check each
result with isinstance(result, Exception) before treating it as a valid
response, and show an example pattern for handling errors vs. successful strings
so consumers know to inspect and handle exceptions returned as values.
- Around line 43-46: Replace the deprecated asyncio.get_event_loop() call in
generate_text_async with asyncio.get_running_loop() so the coroutine uses the
current running loop; in function generate_text_async (which is invoked from
controlled_generate_text → process_batch) call loop =
asyncio.get_running_loop(), keep the rest (partial(generate_text, ...),
loop.run_in_executor) unchanged, and ensure this surfaces a RuntimeError if no
loop is running.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 51d0382c-b3f0-4cb0-b47b-050568815762

📥 Commits

Reviewing files that changed from the base of the PR and between f0dcda6 and 56cf47f.

📒 Files selected for processing (1)
  • generator/src/llm_utils.py

Comment on lines 38 to 40
except Exception as e:
logger.error(f"Error parsing Ollama response: {e}", exc_info=True)
logger.error(f"LLM error: {e}", exc_info=True)
return ""

🧹 Nitpick | 🔵 Trivial

Use logger.exception() instead of logger.error() with exc_info=True.

Per static analysis hint, logger.exception() is the idiomatic way to log exceptions with traceback. Also consider using %s formatting for lazy evaluation.

♻️ Proposed fix
     except Exception as e:
-        logger.error(f"LLM error: {e}", exc_info=True)
+        logger.exception("LLM error: %s", e)
         return ""
🧰 Tools
🪛 Ruff (0.15.6)

[warning] 39-39: Logging .exception(...) should be used instead of .error(..., exc_info=True)

(G201)


[warning] 39-39: Logging statement uses f-string

(G004)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@generator/src/llm_utils.py` around lines 38 - 40, In the except block that
currently calls logger.error(f"LLM error: {e}", exc_info=True) (inside the LLM
call/error handling in generator/src/llm_utils.py), replace that call with
logger.exception("LLM error: %s", e) to use the idiomatic logger.exception()
which automatically includes the traceback and also switch to %s formatting for
lazy evaluation; keep the return "" behavior unchanged.

Comment on lines +43 to +46
async def generate_text_async(prompt, model=None, url_override=None):
loop = asyncio.get_event_loop()
func = partial(generate_text, prompt, model, url_override)
return await loop.run_in_executor(None, func)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

asyncio.get_event_loop deprecated Python 3.10 get_running_loop

💡 Result:

Yes, asyncio.get_event_loop is deprecated in Python 3.10 when called from coroutines or callbacks without a running event loop, or in certain contexts where no current event loop exists. Instead, use asyncio.get_running_loop inside async functions (coroutines or callbacks), as it raises RuntimeError if no loop is running, ensuring safer usage. asyncio.get_running_loop (added in Python 3.7) returns the running event loop in the current OS thread and can only be called from a coroutine or callback. For top-level entry points, prefer asyncio.run(main), which handles loop creation and cleanup automatically, avoiding manual loop management altogether. Example migration: - Old (deprecated in async context): loop = asyncio.get_event_loop - New: loop = asyncio.get_running_loop Deprecation warnings appear in Python 3.10+ when get_event_loop is used improperly, and in later versions (e.g., 3.14+), it may raise RuntimeError if no current loop exists. Multiple projects (Langchain, Kingpin) have updated codebases to replace it with get_running_loop.



🏁 Script executed:

# Find project configuration files to determine Python version requirement
git ls-files | grep -E "(setup\.py|pyproject\.toml|requirements|tox\.ini|\.python-version)" | head -20

Repository: AOSSIE-Org/LibrEd

Length of output: 87


🏁 Script executed:

# Search for usages of generate_text_async to understand the calling context
rg "generate_text_async" --type py -B 2 -A 2

Repository: AOSSIE-Org/LibrEd

Length of output: 625


🏁 Script executed:

# Check the context around the function to see if there are any comments or tests
cat -n generator/src/llm_utils.py | head -60

Repository: AOSSIE-Org/LibrEd

Length of output: 1956


🏁 Script executed:

cat generator/requirements.txt

Repository: AOSSIE-Org/LibrEd

Length of output: 221


🏁 Script executed:

git ls-files | grep -E "setup\.(py|cfg)|pyproject\.toml|\.python-version"

Repository: AOSSIE-Org/LibrEd

Length of output: 43


🏁 Script executed:

git ls-files | grep -E "workflow|github|\.yml|\.yaml" | head -10

Repository: AOSSIE-Org/LibrEd

Length of output: 397


🏁 Script executed:

cat .github/workflows/ci.yml | head -50

Repository: AOSSIE-Org/LibrEd

Length of output: 1162


Use asyncio.get_running_loop() instead of asyncio.get_event_loop().

asyncio.get_event_loop() is deprecated in Python 3.10+ when called from a coroutine and may return an unexpected loop. Since this is an async function always called from an async context (controlled_generate_text → process_batch), use asyncio.get_running_loop(), which correctly returns the running loop or raises RuntimeError if none is running.

Proposed fix
 async def generate_text_async(prompt, model=None, url_override=None):
-    loop = asyncio.get_event_loop()
+    loop = asyncio.get_running_loop()
     func = partial(generate_text, prompt, model, url_override)
     return await loop.run_in_executor(None, func)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@generator/src/llm_utils.py` around lines 43 - 46, Replace the deprecated
asyncio.get_event_loop() call in generate_text_async with
asyncio.get_running_loop() so the coroutine uses the current running loop; in
function generate_text_async (which is invoked from controlled_generate_text →
process_batch) call loop = asyncio.get_running_loop(), keep the rest
(partial(generate_text, ...), loop.run_in_executor) unchanged, and ensure this
surfaces a RuntimeError if no loop is running.

Comment on lines +54 to +56
async def process_batch(prompts):
tasks = [controlled_generate_text(p) for p in prompts]
return await asyncio.gather(*tasks, return_exceptions=True)

⚠️ Potential issue | 🟡 Minor

Add docstring clarifying that exceptions are returned as values.

With return_exceptions=True, failed requests return Exception objects mixed with successful str results. Callers must check isinstance(result, Exception) to distinguish errors from valid responses.

📝 Proposed fix with docstring
 async def process_batch(prompts):
+    """Process multiple prompts concurrently with limited concurrency.
+    
+    Args:
+        prompts: List of prompt strings to process.
+        
+    Returns:
+        List of results where each element is either:
+        - str: The generated text response
+        - Exception: If the request failed
+        
+    Callers should check isinstance(result, Exception) to handle errors.
+    """
     tasks = [controlled_generate_text(p) for p in prompts]
     return await asyncio.gather(*tasks, return_exceptions=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@generator/src/llm_utils.py` around lines 54 - 56, The process_batch coroutine
uses asyncio.gather(..., return_exceptions=True) which mixes Exception objects
with successful string results; add a docstring to process_batch explaining that
results may be Exception instances (from controlled_generate_text), callers must
check each result with isinstance(result, Exception) before treating it as a
valid response, and show an example pattern for handling errors vs. successful
strings so consumers know to inspect and handle exceptions returned as values.
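The caller-side pattern this comment asks for might look like the following sketch; process_batch is stubbed here to simulate one failing prompt, since with return_exceptions=True failures arrive as Exception values rather than raised errors:

```python
import asyncio

async def process_batch(prompts):
    # Stand-in for the PR's process_batch: one prompt deliberately fails.
    async def one(prompt):
        if prompt == "bad":
            raise ValueError("LLM request failed")
        return f"ok: {prompt}"
    # return_exceptions=True folds failures into the results list.
    return await asyncio.gather(*(one(p) for p in prompts), return_exceptions=True)

prompts = ["good", "bad"]
results = asyncio.run(process_batch(prompts))
for prompt, result in zip(prompts, results):
    if isinstance(result, Exception):
        # Failed request: result is the Exception object itself.
        print(f"{prompt}: error ({result})")
    else:
        print(f"{prompt}: {result}")
```

The isinstance check is essential: without it, an Exception object would silently flow into downstream code expecting a string.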
