-
Notifications
You must be signed in to change notification settings - Fork 12
feat: introduce the recoverable crate #18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #18 +/- ##
=======================================
Coverage 100.0% 100.0%
=======================================
Files 90 91 +1
Lines 6497 6554 +57
=======================================
+ Hits 6497 6554 +57 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a new recoverable crate that provides standardized types for error classification and recovery behavior in resilience patterns. The crate enables consistent determination of whether conditions are recoverable (transient) or non-recoverable (permanent/successful).
- Core recovery classification system with
Recovery,RecoveryKind, andRecovertrait - Support for retry timing hints through delay metadata
- Service outage detection capabilities for widespread failures
- Comprehensive documentation and examples for proper usage patterns
Reviewed Changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/recoverable/src/lib.rs | Main implementation with Recovery struct, RecoveryKind enum, Recover trait, and comprehensive test suite |
| crates/recoverable/Cargo.toml | Package configuration for the new recoverable crate |
| crates/recoverable/README.md | Documentation and usage examples for the crate |
| crates/recoverable/CHANGELOG.md | Empty changelog file for future version tracking |
| crates/recoverable/logo.png | Git LFS tracked logo image for the crate |
| Cargo.toml | Workspace updates to include recoverable crate and static_assertions dependency |
| README.md | Updated main README to reference the new recoverable crate |
| CHANGELOG.md | Updated main changelog to reference recoverable's changelog |
|
I think the names of the types aren't quite right English-wise. I would recommend the following: RecoveryMetadata - type for classifying errors with recovery metadata As a first step. And then, I'm still not sure why the retry delay can't be incorporated directly into the RecoveryKind enum which would eliminate a separate type. You could add a delay function for this enum which would return an updated RecoveryKind with the given delay, so it would match the semantics that the current delay function has. So what scenario does this complicate? |
Regarding the type names, I brainstormed these a while back with @ralfbiedert. I feel that
This is what I had in previous iteration and it introduced friction. For example, one komponent evaluates the recovery kind, while the other extracts the let mut recovery = detect_recovery(...);
// ...
// little later
if let Some(delay) = extract_delay(...) {
recovery = recovery.delay(delay);
}In addition, I plan to introduce a Important part, that most consumers won't care about, is evaluation of recovery metadata. This will be done mostly once by individual resilience middleware. Here, the inspection is flattened out, and middleware looks at each property individually. (so check I tried the current simplified model in internal project and it indeed simplifies how the recovery metadata are consumed. Of course, this can still change as we will go through more usage patterns. But currently, I would like to try this latest approach. |
|
Well, even if you think my suggested names are too long, the existing names aren't right. "Recover" is a verb, it's not appropriate as a trait name. |
|
What about: Recover -> Recoverable (declares capability) |
Yeah, that works :-) |
That's not true, there are several lenses to look at trait naming, compare API Guidelines: Linguistically:
Functionally:
However, as a universal convention trait names are generally short, unless compound With that said I agree About |
| ServiceUnavailable { retry_after: Option<Duration> }, | ||
| } | ||
|
|
||
| impl Recoverable for NetworkError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be Recovery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@geeknoid Would you agree with the rename?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I don't think it's right.
This doesn't do anything about recovery, it merely says that an error can be recovered from. When I see "recovery", I think "action", this IS the recovery to the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This trait only tells that particular error or result is recoverable but does not nor it is able to do the recovery action.
It's up to the caller to do the action itself based on the conditions.
For example, caller might decide ahead of time that particular action can be retried and he needs to put aside all information/parameters required to do that recovery action. This trait only gives the information that such recovery is possible.
Our names should reflect that. With this context in mind, I find the Recoverable more appropriate than Recovery.
(capability vs action)
db34508 to
088ba4f
Compare
088ba4f to
e5b4ce9
Compare
|
b88f39c to
821d37d
Compare
eef2bfc to
1e5a1d4
Compare
1e5a1d4 to
fd326dd
Compare
| //! | ||
| //! The recovery information describes whether recovering from an operation might help, not whether | ||
| //! the operation succeeded or failed. Both successful operations and permanent failures | ||
| //! should use [`RecoveryInfo::never`][RecoveryInfo::never] since recovery won't change the outcome. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| //! should use [`RecoveryInfo::never`][RecoveryInfo::never] since recovery won't change the outcome. | |
| //! should use [`RecoveryInfo::never`] since recovery is not necessary or desirable. |
| // conventions because setters are used much more frequently than getters in typical usage patterns. | ||
| // The `get_` prefix on getters helps distinguish them from their corresponding setters. | ||
|
|
||
| /// Represents the recovery information associated with an operation or condition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// Represents the recovery information associated with an operation or condition. | |
| /// The recovery information associated with an operation or condition. |
| /// let recovery = RecoveryInfo::retry(); | ||
| /// assert_eq!(recovery.kind(), RecoveryKind::Retry); | ||
| /// ``` | ||
| #[derive(Debug, PartialEq, Clone, Eq)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you ever want a hash table of HashMap<RecoveryInfo, xxx>? (Hash trait)
| /// | ||
| /// To retrieve the recovery kind from a `RecoveryInfo` instance, use the [`RecoveryInfo::kind`] method. | ||
| /// | ||
| /// # Examples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have a recommendation of what code is expected to do in general when it receives these different kinds. In particular, what should code do/assume when it gets Unknown? And what should it do if it gets a kind that not currently in the enum (given the non_exhaustive attribute).
| #![doc(html_logo_url = "https://media.githubusercontent.com/media/microsoft/oxidizer/refs/heads/main/crates/recoverable/logo.png")] | ||
| #![doc(html_favicon_url = "https://media.githubusercontent.com/media/microsoft/oxidizer/refs/heads/main/crates/recoverable/favicon.ico")] | ||
|
|
||
| //! Recovery information and classification for resilience patterns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some instances where a service fails and provides a hint to an alternate endpoint to connect to instead. Just like delay(), would it make sense to have a fallback(url) hint?
| /// assert_eq!(recovery.kind(), RecoveryKind::Retry); | ||
| /// ``` | ||
| #[derive(Debug, PartialEq, Clone, Eq)] | ||
| #[non_exhaustive] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need non_exhaustive here? It has no effect given the private fields.
Add a new
recoverablecrate that provides standardized types for classifyingerror conditions as recoverable or non-recoverable, enabling consistent retry
behavior across different error types and resilience middleware.
Core features:
RecoveryInfotype for classifying errors with recovery metadataRecoverabletrait for types that can determine their recoverabilityRecoveryKindenum distinguishing between retry, outage, never, and unknowndelay()method