-
Notifications
You must be signed in to change notification settings - Fork 24
Named pipelines, | operator and reverse pipeline #173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Named pipelines, | operator and reverse pipeline #173
Conversation
Sub-pipelines are now virtually stored inside the pipeline steps names. This ensure that merging pipelines using | operator works and preserve the names.
Pull Request Test Coverage Report for Build 18666409006Details
💛 - Coveralls |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
I'll need to do some more diagnosis I guess. I was able to reproduce your pipeline notebook just fine. With the .reversed, I think it might be better to make it .reversed() if it's returning a new reversed pipeline rather than being the same pipeline with a .reversed attribute. It depends on how you expect people to use it. I'm happy with it staying as an attribute if the ideas is that people will use it that way. Like if someone modifies the state of the pipeline somehow, what happens to the reversed one? e.g. if they insert a new step in the middle, what should the reversed pipeline do? I don't think there's any "best" answer, it just needs to be documented. If .reversed is an attribute which will walk the steps applying undo, that's fine. But if you expect to get a copy of the pipeline with separate state, then it should be a method. I don't have an opinion about which one to choose. |
This reverts commit d8ca63f. Save nested pipelines in a dictionary.
| return graph, prior_step | ||
|
|
||
|
|
||
| class NotReversiblePipeline(RuntimeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call this "ForwardOnly"? Just a thought
…r that" This reverts commit d075868. Trying to reverse a pipeline by reversing each underlying operation is more complicated than expected, due to operations that loose their state when transposed using `.T`. One example is the ToNumpy operation, which has multiple private members that won't be initialised after transposition, leading to a non-functional reversed operation. As multiple operations might have the save behavior, it seems easier to reverse the whole pipeline as a black box instead of each step.
Treat the ReversePipeline class as an opaque class, not meant to expose steps of of the underlying pipeline, etc. The Operation class provides the right API to capture the minimum operations this class needs (apply and undo) and makes it easy to integrate the ReversedPipeline in an existing pipeline.
|
@tennlee I think the PR is almost ready. Few things that we could change before merging:
|
|
So, it turned out that the All in all, it means that a preprocessing pipeline (including data sources) can be reversed without the need to carefully split and reassemble the reversible steps. I have updated the example notebook to show this at the end with a simple persistence-like model. I think this also means that users would need a good understanding of the undo behavior of each steps to avoid bad surprises. |
CoordinateFlatten only supports sequence of strings as input parameters now.
|
The API and notebook docs have been updated, @tennlee this PR is now ready :). |
This PR implements the following changes:
Few things need to be done to finish this PR: