-
Notifications
You must be signed in to change notification settings - Fork 31
feat: Implement optional driver GetMachineStatus method #334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
b8e2bff to
ae88578
Compare
ae88578 to
5e609d1
Compare
aaronfern
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
Can you please explain a little more about the backing reason for this PR?
Will GetmachineStatus now need to be implemented for all providers for GEP-28?
pkg/driver/executor/executor.go
Outdated
| func (ex *Executor) getMachineByID(ctx context.Context, serverID string) (*servers.Server, error) { | ||
| klog.V(2).Infof("finding server with [ID=%q]", serverID) | ||
| // GetMachineByID fetches the data for a server based on a provider-encoded ID. | ||
| func (ex *Executor) GetMachineByID(ctx context.Context, providerID string) (*servers.Server, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please change the name of this method to GetMachineByProviderID so that it's clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please also run integration tests and add those logs here? (Please reach out if you need assistance with that)
| // to store tags in a respective field and do a server-side filtering. To avoid incompatibility with older versions | ||
| // we will continue making the filtering clientside. | ||
| func (ex *Executor) getMachineByName(ctx context.Context, machineName string) (*servers.Server, error) { | ||
| func (ex *Executor) GetMachineByName(ctx context.Context, machineName string) (*servers.Server, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please update this method to also return providerID? You can construct it using encodeProviderID()
You can then use this provider id in the GetMachineStatusResponse of GetMachineStatus()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I get the exact change you want, GetMachineByName returns the machine (similar to GetMachineByProviderID) from which I in return extract the providerID of the machine that was found, so it is already part of the returned Server object, and used in the GetMachineStatusResponse.
|
Hi @aaronfern! Thanks for the first review. I'll check your comments and do the testing. For the motivation: Working on GEP-28 (self-hosted shoots, managed infra scenario) we noticed that the control plane worker node never became ready due to the logic in the mcm not adding the required node label on the I noticed that I can also update the PR description soon. |
5e609d1 to
ed75679
Compare
|
Hey @jamand
Yes, please, that would be really helpful |
|
@aaronfern I added comments and updated the PR description. I'll be on holidays the next weeks. I haven't been able to run the integration tests on our environment yet unfortunately, is there a possibility to run it on the |
|
Thank you @jamand, I will re-review your PR
Sure, not a problem, if there are still some more concerns or changes, you can take it up when you're back
Yes, but it is still a manual process. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jamand I am actually kinda opposed to the PR.
When I made the initial commits, I decided explicitly to skip enabling the optional GetMachineStatus. The reason was that back then we didn't have the InitializeMachine and because of that, things that need to be done post server creation had the chance that if they failed, then a GetMachineStatus like this implementation would leave machines potentially undeleted but also uninitialized.
You would have to refactor everything that happens after the server creation to the InitializeMachine call and even that I consider inferior in the sense that with the GetMachineStatus unimplemented, the CreateMachine call is a full reconciliation of the machine rather than an imperative call for certain operations. I would still think it is okay if you decide to go the InitializeMachine for consistency reasons but not sure if it's an improvement. 🤷
|
With regards to
is it that MCM calls the list everytime before making the create call or is there some kind of list call there that I forgot 😅 |
How to categorize this PR?
/area quality
/kind enhancement
/platform openstack
What this PR does / why we need it:
This PR implements the GetMachineStatus method of the mcm-provider-openstack, which is optional as per the Machine Error code handling (but which is implemented for other providers like aws or gcp).
Working on GEP-28 (self-hosted shoots, managed infra scenario) we noticed that the control plane worker node never became ready due to the logic in the mcm not adding the required node label on the Machine objects. Would the GetMachineStatus have been implemented it would have done so. As the implementation is optional, @maboehm implemented a fix in the gardener/machine-controller-manager#1050 itself. The implementation of the
GetMachineStatusis now more of a "nice to have" which could potentially reduce some loads on the OpenStack API Server by doing a lookup by provider ID first if provided and then by name instead of having to callCreateof the driver which has to fetch the list of all Servers in the project every time.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Tested in Self-hosted Managed infrastructure flow. Requires running e2e.
Release note: