-
Notifications
You must be signed in to change notification settings - Fork 2.6k
feat: Add default (de-)serialization for init parameter objects that define to_dict/from_dict
#10421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 1 Skipped Deployment
|
Pull Request Test Coverage Report for Build 21257841866Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
|
@sjrl While this PR is draft, I'd already like to get your opinion on it. In particular, do you think |
haystack/core/serialization.py
Outdated
| serialized_params = {} | ||
| for key, value in init_parameters.items(): | ||
| if isinstance(value, (Secret, ComponentDevice)): | ||
| if isinstance(value, (Secret, ComponentDevice)) or _is_document_store(value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apologies if this is a silly question, but why not just check if the value has the method to_dict? We already do this type of checking in our other serialization methods like used in for State and in PipelineSnapshot.
Then we trust the user to have serialized the object properly. This would also be a nifty way to support even custom objects or components made by users.
haystack/core/serialization.py
Outdated
| def _is_serialized_document_store(value: Any) -> bool: | ||
| """ | ||
| Check if a value is a serialized DocumentStore dictionary. | ||
| A dictionary is considered a serialized DocumentStore if: | ||
| - It has a "type" key with a string value that is a fully qualified class name | ||
| - It has an "init_parameters" key with a dict value | ||
| - The class can be imported and has a "from_dict" method | ||
| This matches the structure produced by DocumentStore.to_dict(). | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a parallel comment to this one https://github.com/deepset-ai/haystack/pull/10421/files#r2717180576 I wonder if we should be a little more lax here and just looks to see if value is a dict with at least key type and check if the loaded class has a from_dict method. Then if it does we go ahead and try to run cls.from_dict. So again putting trust into the user that they are following the expected serialization sturcture of a dict with type and init_parameters as keys and that the imported cls has the from_dict method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick feedback! Makes sense. I'll update the draft PR accordingly.
…haystack into default-serde-document-store
to_dict/from_dict
Related Issues
Proposed Changes:
default_to_dictso that we callto_dict()for any init parameters that have a callableto_dictdefault_from_dictso that we callfrom_dictfor any dict values wheretypeis a class that can be imported and that has a callablefrom_dict. Or we calldefault_from_dictif thetypeis a class that can be imported but that has no callablefrom_dict.component_to_dict/component_from_dict, which internally usedefault_to_dict/default_from_dictkeywhen deserialization failsSecret.from_dictinstead ofdeserialize_secrets_inplaceHow did you test it?
unit tests
Notes for the reviewer
Different DocumentStores don't have any init parameters in common. For example, ChromaDocumentStore does not haveindexas init parameter. I don't have a good idea how to better check that a dict is a serialized DocumentStore other than trying to import the class and checking if it looks like a DocumentStore. I am unsure if the complexity added by_is_serialized_document_storeis worth it.@sjrl suggested to simply check if to_dict/from_dict is defined and can be called.
Checklist
fix:,feat:,build:,chore:,ci:,docs:,style:,refactor:,perf:,test:and added!in case the PR includes breaking changes.