allocation checks: check branch.status instead of making call to k8s#427
allocation checks: check branch.status instead of making call to k8s#427boddumanohar wants to merge 1 commit intomainfrom
Conversation
9244f7d to
ae12ecb
Compare
|
Do you have numbers on how long the current approach takes? |
|
Generally create branch API takes around 10-15 secs to return 201 response. |
|
|
|
if a project has 100 branches, the function From the branch status, we exclude the branches with status So during branch create, we make 100*6 = 600 network calls for the sake of validations. This seems very costly. |
|
I see the point. I think 100 branches is on the upper limit of what we can expect, and we should be able to parallelize this significantly. Still, it's a lot. Do you think it'd be feasible to modify the pod to include a readiness/liveness probe? I'm trying to find a way where we don't have to implement everything ourselves but can use builtins instead. |
noctarius
left a comment
There was a problem hiding this comment.
The comment should be brought up tomorrow morning on the standup. I think, otherwise, this PR is ok, since the branch status is already persisted in the database.
| .join(Branch) | ||
| .where( | ||
| Branch.project_id == project_id, | ||
| not_(status_column.in_([BranchServiceStatus.STOPPED, BranchServiceStatus.DELETING])), |
There was a problem hiding this comment.
This is dangerous (and I guess it was also wrong with the old code). Stopped branches don't count towards CPU / RAM allocation, but I think storage (db, object) still count, even when stopped. Only deleted branches are out of the game. WDYT @boddumanohar?
ae12ecb to
4b71d2a
Compare
Currently during branch create we have checks to check against provisioning limits. These changes takes time because it goes through all the branches and queries k8s for the resources information.
This is very expensive operation. So removed
_collect_branch_statusesfunction call and replaced it with logic to get the same from branch.statusInstead we could query branch.status. The tradeoff here is branch.status is around 1 minute old.