-
Notifications
You must be signed in to change notification settings - Fork 57
feat: add __copy__, __deepcopy__, and __replace__ for FFI objects
#438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @junrushao, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the interoperability between C++ FFI objects and Python by introducing robust copying mechanisms. It allows users to perform shallow and deep copies, as well as field-based replacements, on C++-backed objects directly from Python, aligning their behavior with standard Python objects. This change improves the usability and flexibility of FFI objects within Python workflows. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces __copy__, __deepcopy__, and __replace__ for FFI objects, which is a valuable addition for better Python integration. The implementation is well-designed, with a C++ ObjectDeepCopier that correctly handles complex object graphs using a two-phase, topological sort-based approach. The Python bindings correctly wire up the new functionality for both @register_object and @c_class paths, and the new tests are comprehensive.
I have one suggestion to improve the robustness of the C++ deep copy implementation by making container traversal iterative to prevent potential stack overflows with deeply nested structures.
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces __copy__, __deepcopy__, and __replace__ methods for FFI objects, which is a valuable addition for Python users. The implementation includes a C++ deep copier that correctly handles object graphs with cycles and shared references, and the Python-side wiring is well-executed. The new functionality is supported by a comprehensive test suite.
I've identified a critical issue in the C++ deep copy logic that could lead to errors when copying objects with read-only fields. I've also suggested adding documentation for a design limitation that requires object fields to be writable for deep copy to work. Apart from these points, the changes look solid.
| def _get_deep_copy_func() -> core.Function: | ||
| global _deep_copy_func_cache # noqa: PLW0603 | ||
| if _deep_copy_func_cache is None: | ||
| _deep_copy_func_cache = get_global_func("ffi.DeepCopy") | ||
| return _deep_copy_func_cache | ||
|
|
||
|
|
||
| _deep_copy_func_cache: core.Function | None = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably don't need this - just call get_global_func every time
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces __copy__, __deepcopy__, and __replace__ functionality for FFI objects, which is a great feature for improving the Python ergonomics. The implementation is robust, featuring a C++ refl::enable_copy() helper for opting into copy support, a well-designed deep copy algorithm in C++ that correctly handles cycles and shared references, and the necessary Python bindings to integrate with the standard copy module. The new functionality is also well-covered by a comprehensive suite of new tests. The overall quality of the code is high. I have a couple of minor suggestions to improve the clarity of comments in the C++ deep copy implementation.
36a33e8 to
c666a90
Compare
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces __copy__, __deepcopy__, and __replace__ functionality for FFI objects, which is a great feature for improving the ergonomics of working with these objects in Python. The implementation is robust, handling object graphs with shared references and cycles correctly through a multi-pass deep copy algorithm in C++. The Python-side integration is clean, dynamically adding the appropriate copy methods based on whether the C++ type has copy support enabled. The new tests are comprehensive and cover a wide range of scenarios, including basic copies, nested objects, cycles, and error conditions. I have one suggestion to improve the efficiency of the C++ type registration logic.
| TVM_FFI_INLINE ObjectDef& def([[maybe_unused]] enable_copy ec, Extra&&... extra) { | ||
| // Register __ffi_shallow_copy__ as an instance method | ||
| RegisterMethod(kShallowCopyMethodName, false, &enable_copy::template shallow_copy<Class>, | ||
| std::forward<Extra>(extra)...); | ||
| // Also register as a type attribute for generic deep copy lookup | ||
| Function copy_fn = GetMethod(std::string(type_key_) + "." + kShallowCopyMethodName, | ||
| &enable_copy::template shallow_copy<Class>); | ||
| TVMFFIByteArray attr_name = {kShallowCopyMethodName, | ||
| std::char_traits<char>::length(kShallowCopyMethodName)}; | ||
| TVMFFIAny attr_value = AnyView(copy_fn).CopyToTVMFFIAny(); | ||
| TVM_FFI_CHECK_SAFE_CALL(TVMFFITypeRegisterAttr(type_index_, &attr_name, &attr_value)); | ||
| return *this; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation creates the Function object for the shallow copy method twice: once within RegisterMethod on line 657 and again explicitly via GetMethod on line 660. This is slightly inefficient as it involves redundant work during type registration.
You can avoid this by retrieving the method that was just registered using TVMFFITypeGetMethod and then using that to register the type attribute. This makes the code more efficient and avoids the redundant object creation.
TVM_FFI_INLINE ObjectDef& def([[maybe_unused]] enable_copy ec, Extra&&... extra) {
// Register __ffi_shallow_copy__ as an instance method
RegisterMethod(kShallowCopyMethodName, false, &enable_copy::template shallow_copy<Class>,
std::forward<Extra>(extra)...);
// Also register as a type attribute for generic deep copy lookup
TVMFFIMethodInfo method_info;
TVMFFIByteArray method_name = {kShallowCopyMethodName,
std::char_traits<char>::length(kShallowCopyMethodName)};
TVM_FFI_CHECK_SAFE_CALL(TVMFFITypeGetMethod(type_index_, &method_name, &method_info));
TVM_FFI_CHECK_SAFE_CALL(TVMFFITypeRegisterAttr(type_index_, &method_name, &method_info.method));
return *this;
}0be24b8 to
6a51521
Compare
| * \brief Reflection-based object copy utilities | ||
| */ | ||
| #ifndef TVM_FFI_EXTRA_COPY_H_ | ||
| #define TVM_FFI_EXTRA_COPY_H_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deep_copy.h?
| * \param value The value to deep copy. | ||
| * \return The deep copied value. | ||
| */ | ||
| TVM_FFI_EXTRA_CXX_API Any DeepCopyValue(const Any& value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel DeepCopy as an API name is fine here
| * | ||
| * \note The object type must have a copy constructor. | ||
| */ | ||
| struct enable_copy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be enable_deep_copy to be explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's for shallow copy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i understand, seems the confusion part is that it enables a shallow copy API that was required for deep copy?
src/ffi/extra/copy.cc
Outdated
| * | ||
| * Algorithm: | ||
| * Phase 1: Walk the entire object graph (objects, arrays, maps) using BFS, | ||
| * collecting all reachable nodes in a vector. An unordered_set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely this algorithm can be simplified to a single pass algorithm with memo_ that maps previously mapped Object into a new one.
The RunCopy will internally look up the memo and return copied objects if found. If not, it will then attempt to run the deep copy recursively and store the result into the memo. There is no need to record things like node order, visited will be part of the memo_ map that dedup double visits
ccfc66c to
abd9ddd
Compare
|
Still - implication of cyclic reference and CoW containers (Array, Map) makes it pretty challenging to have an absolutely correct implementation of |
abd9ddd to
1aaade0
Compare
Summary
refl::enable_copy()to the C++ reflection API, which registers a__ffi_shallow_copy__method using the copy constructorObjectDeepCopierinsrc/ffi/extra/copy.ccthat walks the object graph (DFS with explicit stack), then copies in topological order (leaves first) with field replacement__copy__,__deepcopy__, and__replace__for both@register_objectand@c_classpaths; types withoutenable_copy()raiseTypeErrorTest plan
17 new tests in
tests/python/test_copy.pycovering:copy.copy(): basic fields, new object identity, mutable fields, mutation isolation, derived type preservation, unsupported type errorcopy.deepcopy(): basic fields, nested object copying, shared reference preservation, mutation isolation, unsupported type error__replace__(): single/multiple field replacement, no-kwargs copy, original unchanged, readonly field error, unsupported type errorMisc
--error-on-warningto thetypre-commit hook