Open
Description
We are using an async worker system that forks its worker processes, which is already quite problematic with gRPC on macOS.
We are careful to make sure that no weaviate client is initialized before the fork and that each worker process gets its own client.
Even so, dynamic batching still gets stuck:
import weaviate

def my_func():
    # Do some processing
    client = weaviate.connect_to_weaviate_cloud(...)
    collection = client.collections.get(...)  # assumed: lookup omitted from the original snippet
    with collection.batch.dynamic() as batch:
        for row in data_rows:
            _ = batch.add_object(properties=row['properties'], uuid=row['uuid'])  # pyright: ignore[reportAny]

I call this using:
async def my_async_func():
    # Do some async stuff here
    await asyncio.to_thread(my_func)

If I do not fork and just use a single worker process, there is no issue. But if I fork, there is a high chance of the batch add_object calls getting stuck.
I think there is some global initialization that runs when I import a weaviate code path, because I notice one call that always happens outside of my OTel span:
https://<our_url>.aws.weaviate.cloud/v1/nodes
It returns:
{'nodes': [
    {'batchStats': {'ratePerSecond': 0}, 'gitHash': '15ca21c', 'name': 'weaviate-0', 'shards': None, 'status': 'HEALTHY', 'version': '1.32.16'},
    {'batchStats': {'ratePerSecond': 0}, 'gitHash': '15ca21c', 'name': 'weaviate-1', 'shards': None, 'status': 'HEALTHY', 'version': '1.32.16'},
    {'batchStats': {'ratePerSecond': 0}, 'gitHash': '15ca21c', 'name': 'weaviate-2', 'shards': None, 'status': 'HEALTHY', 'version': '1.32.16'}
]}
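One way to test the suspicion about import-time initialization is to inspect the parent process just before it forks. The sketch below is an assumption-laden diagnostic, not something from the weaviate client: `prefork_report` is a hypothetical helper that lists which modules under a given prefix are already imported and which non-main threads are alive. Background threads that exist before a fork are not copied into the child, which is a common source of hangs.

```python
import sys
import threading
from typing import Dict, List


def prefork_report(module_prefix: str = "weaviate") -> Dict[str, List[str]]:
    """Summarize parent-process state that can make a later fork unsafe."""
    # Modules under the prefix that were already imported in this process.
    loaded = sorted(
        name
        for name in list(sys.modules)
        if name == module_prefix or name.startswith(module_prefix + ".")
    )
    # Any thread other than the main one will not exist in a forked child.
    extra_threads = [
        t.name
        for t in threading.enumerate()
        if t is not threading.main_thread()
    ]
    return {"loaded_modules": loaded, "extra_threads": extra_threads}
```

Calling `prefork_report("weaviate")` and `prefork_report("grpc")` in the parent right before the workers are forked would show whether either package was imported, or started threads, earlier than expected.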