-
Notifications
You must be signed in to change notification settings - Fork 120
feat(weaviate): implement persistent connection management #234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(weaviate): implement persistent connection management #234
Conversation
Enhance Weaviate client to reuse connections across LLM responses instead of reconnecting on each request. - Add _connected flag to track connection state - Add ensure_connected() to reuse existing connections - Add close_weaviate_client() for proper cleanup on shutdown - Integrate cleanup in FastAPI lifespan handler Closes AOSSIE-Org#227
|
Warning Rate limit exceeded@vaishcodescape has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 13 minutes and 52 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughPersistent connection management for the Weaviate client was added: Changes
Sequence Diagram(s)sequenceDiagram
participant App as Application
participant Lifespan as FastAPI lifespan
participant EC as ensure_connected()
participant WC as Weaviate Client
participant State as Global state (_client/_connected)
App->>Lifespan: Run application
App->>EC: First DB operation -> ensure_connected()
EC->>State: Check _connected (False)
EC->>WC: client.connect()
WC-->>EC: Connected
EC->>State: Set _client and _connected = True
EC-->>App: Return connected client
App->>EC: Subsequent DB operation -> ensure_connected()
EC->>State: Check _connected (True)
EC-->>App: Return existing client
Lifespan->>WC: Shutdown -> close_weaviate_client()
WC->>State: Check _client/_connected
alt Connected
WC->>WC: client.close()
WC->>State: Reset _client and _connected = False
end
WC-->>Lifespan: Closed
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @backend/app/database/weaviate/client.py:
- Around line 20-30: ensure_connected() has a race where multiple coroutines may
call client.connect() concurrently; add a module-level asyncio.Lock (e.g.,
_connect_lock) and wrap the connection initialization with "async with
_connect_lock:" so only one coroutine performs the connect. Inside the locked
section, re-check the conditions (use get_client(), _connected, and
client.is_connected()) before calling client.connect(), set _connected = True
and log on success, and let other coroutines simply return the client after the
lock. Ensure the lock is awaited with async with to release on exceptions and
reference the existing symbols ensure_connected, get_client, client.connect,
_connected, and _client.
In @backend/main.py:
- Around line 103-104: The shutdown currently awaits close_weaviate_client()
before awaiting app_instance.stop_background_tasks(), which can cause background
workers to hit a closed Weaviate connection; reverse the calls so you await
app_instance.stop_background_tasks() first (letting workers finish or be
stopped) and then call await close_weaviate_client(); ensure
stop_background_tasks() completes or signals worker termination before tearing
down the Weaviate client.
🧹 Nitpick comments (3)
backend/main.py (1)
5-5: Consider consolidating imports from the same module.Both
close_weaviate_clientandget_weaviate_client(line 15) are imported from the same module. Consider combining them:-from app.database.weaviate.client import close_weaviate_client ... -from app.database.weaviate.client import get_weaviate_client +from app.database.weaviate.client import close_weaviate_client, get_weaviate_clientbackend/app/database/weaviate/client.py (2)
39-41: Uselogging.exceptionto include traceback automatically.Per static analysis,
logging.exceptionis preferred overlogging.errorwhen re-raising, as it automatically includes the traceback:except Exception as e: - logger.error("Weaviate client error: %s", e) + logger.exception("Weaviate client error: %s", e) raise
44-57: LGTM with minor observation.The shutdown logic correctly handles the happy path. The broad
Exceptioncatch is appropriate here since shutdown cleanup should be resilient. The redundant_connected = Falsein both the try block and finally is harmless.One edge case: if
_clientisNonebut_connectedis somehowTrue(shouldn't happen with current code), the flag won't be reset. Consider resetting unconditionally:async def close_weaviate_client(): """Close the Weaviate client. Call this on application shutdown.""" global _client, _connected if _client is not None and _connected: try: await _client.close() - _connected = False logger.info("Weaviate client closed") except Exception as e: logger.warning("Error closing Weaviate client: %s", e) finally: _client = None - _connected = False + + _client = None + _connected = False
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/app/database/weaviate/client.pybackend/main.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/main.py (1)
backend/app/database/weaviate/client.py (1)
close_weaviate_client(44-57)
🪛 Ruff (0.14.10)
backend/app/database/weaviate/client.py
40-40: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
53-53: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (1)
backend/app/database/weaviate/client.py (1)
9-9: Good addition of explicit connection tracking.The
_connectedflag combined withclient.is_connected()check provides defense-in-depth for detecting stale connections.
Refactor the lifespan handler in FastAPI to ensure the Weaviate client is properly closed during application shutdown. Introduce a connection lock in the Weaviate client to prevent race conditions when establishing connections. - Move close_weaviate_client() to the correct position in the lifespan handler - Implement connection locking in ensure_connected() to manage concurrent access This change improves the stability and reliability of the Weaviate client connection management.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @backend/app/database/weaviate/client.py:
- Around line 50-63: close_weaviate_client() can race with ensure_connected();
acquire the shared client lock (e.g., _client_lock) at function start, then
check if _client is not None and attempt to close it regardless of _connected (a
partially initialized client still needs cleanup), await _client.close() inside
the locked section or atomically set a flag to prevent reconnects, set _client =
None and _connected = False under the same lock, and release the lock; also
ensure ensure_connected() uses the same _client_lock so connection attempts and
shutdown are serialized.
- Around line 10-11: Module-level asyncio.Lock() (_connect_lock) ties the lock
to the event loop at import time and can raise RuntimeError; change to lazy
initialization by removing the module-level asyncio.Lock() creation and add a
helper function _get_connect_lock() that creates and returns a global
_connect_lock = asyncio.Lock() if not set (binding it to the current loop on
first use), then update ensure_connected() to call and await the lock returned
by _get_connect_lock() instead of using the former module-level _connect_lock;
keep the _connected boolean logic intact and ensure _get_connect_lock() is used
anywhere the lock was previously referenced.
🧹 Nitpick comments (1)
backend/app/database/weaviate/client.py (1)
45-47: Uselogging.exceptionto preserve stack trace.When catching and re-raising exceptions,
logger.exception()automatically includes the traceback, which aids debugging. Per static analysis hint (TRY400).Proposed fix
except Exception as e: - logger.error("Weaviate client error: %s", e) + logger.exception("Weaviate client error: %s", e) raise
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/app/database/weaviate/client.pybackend/main.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/main.py
🧰 Additional context used
🪛 Ruff (0.14.10)
backend/app/database/weaviate/client.py
46-46: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
59-59: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (1)
backend/app/database/weaviate/client.py (1)
22-36: Double-checked locking pattern looks correct, but consider edge case handling.The implementation correctly uses double-checked locking for thread-safe connection establishment. However, if the connection drops unexpectedly (i.e.,
is_connected()returnsFalsebut_connectedis stillTrue), the reconnection succeeds but_connectedwas never set toFalsefirst. This works but could be cleaner:Optional: Reset flag before reconnecting for clarity
if not _connected or not client.is_connected(): + _connected = False # Reset before attempting reconnect await client.connect() _connected = True logger.info("Weaviate client connected")
Replace the existing connection lock with a dedicated client lock to enhance concurrency control. Introduce a helper function to manage the client lock and ensure proper locking during connection establishment and closure. - Replace _connect_lock with _client_lock for better clarity - Implement _get_client_lock() to initialize the client lock as needed - Update ensure_connected() and close_weaviate_client() to use the new locking mechanism This change aims to further stabilize the Weaviate client connection management.
|
Hey! @smokeyScraper @chandansgowda I have added the required enhancements now this should work properly and the connection should stay stable also I took care of the lock issues and race conditions which were constantly occurring have a look ! |
Closes #227
📝 Description
This PR fixes the issue where the Weaviate client automatically connects and disconnects when the LLM gives a response. The previous implementation was creating a new connection and closing it on every request, which was inefficient and could cause performance issues
🔧 Changes Made
_connectedflag andensure_connected()to reuse active connectionsget_weaviate_client()to avoid closing connections per requestclose_weaviate_client()for graceful shutdown cleanup%formattingbackend/main.py)📷 Screenshots or Visual Changes (if applicable)
None
🤝 Collaboration
None
✅ Checklist
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.