
Conversation

@4censord (Contributor) commented May 2, 2025

Basically just changes the loops in Increase and Decrease to run in parallel using goroutines.

So now, instead of waiting until one instance is completely created on the cloudscale side (Servers.WaitFor) before continuing on, it creates all instances at the same time.

I'd assume one might run into rate limits with huge setups, but for us that hasn't happened yet.
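
For illustration, a rough sketch of that shape, not the actual plugin code: createAndWait is a placeholder standing in for the cloudscale create call followed by Servers.WaitFor. Each instance gets its own goroutine, so Increase no longer blocks on one WaitFor before starting the next create.

```go
// Rough sketch only, not the actual plugin code.
package fleeting

import "sync"

// createAndWait stands in for the real cloudscale sequence:
// POST the server, then poll until it is running (Servers.WaitFor).
func createAndWait(name string) error {
	return nil
}

func increase(names []string) {
	var wg sync.WaitGroup
	for _, name := range names {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			_ = createAndWait(n) // all instances are provisioned concurrently
		}(name)
	}
	wg.Wait()
}
```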

@4censord force-pushed the 4c/create-in-parallel branch from 72b064c to dffe954 on May 2, 2025 13:52
@href (Contributor) commented May 6, 2025

Thank you for your PR. I took a look at it, and I think the direction is right, but there are some issues with the approach, due to backend constraints you could not have been aware of:

The server API DELETE and POST calls are largely serial in the backend. If they arrive at the endpoint at the same time, they will mostly be worked on in a serialized fashion. This is partly an implementation detail, but it also has to do with scheduling: the algorithm that places a VM in OpenStack considers one VM at a time, as scheduling multiple VMs in one call would be orders of magnitude harder from an algorithmic standpoint.

So if you were to time it, you would find that you do not gain much with concurrency.

What does matter is waiting for servers to be actually started (via the WaitFor call). So I suggest updating this PR as follows:

  • Create servers serially, then await them concurrently. Maybe the main thread can create the servers in a loop, send the resulting server instances to workers via a channel, then wait for all the workers to await the servers (see the sketch after this list).
  • Limit how many servers are created concurrently. Maybe if a channel is used to await the servers, the channel could be size-constrained, so that the main loop would block if too many workers are already awaiting servers. I mean this mainly as some kind of upper bound for sanity that should never be breached (e.g., I don't expect there to ever be more than 10 servers launched in parallel).
  • Do not use concurrency for delete requests. As noted, those are serial anyway, and in our experience deleting a number of servers serially vs. concurrently does not make a difference in time. So the additional code complexity there is not worth it in my opinion.
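
A hedged sketch of the suggested shape, not the actual plugin code: the server type and the createServer/waitForServer helpers are placeholders rather than the real cloudscale-go-sdk API. Creation stays serial; a small, fixed pool of workers awaits readiness, and the buffered channel blocks the main loop once too many servers are already being awaited.

```go
// Hedged sketch under the assumptions above; helper names are placeholders.
package fleeting

import "sync"

type server struct{ uuid string }

func createServer(name string) (*server, error) { return &server{uuid: name}, nil }
func waitForServer(s *server) error             { return nil }

func increase(names []string) error {
	const maxWaiters = 10 // sanity bound, never expected to be reached

	pending := make(chan *server, maxWaiters) // main loop blocks when full
	var wg sync.WaitGroup

	// Fixed pool of waiters that do nothing but await server readiness.
	for i := 0; i < maxWaiters; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for s := range pending {
				_ = waitForServer(s) // error handling omitted for brevity
			}
		}()
	}

	// Serial creation: the backend works on POSTs one at a time anyway.
	for _, name := range names {
		s, err := createServer(name)
		if err != nil {
			close(pending)
			wg.Wait()
			return err
		}
		pending <- s
	}

	close(pending)
	wg.Wait()
	return nil
}
```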

Let me know what you think.

@4censord (Contributor, Author) commented May 6, 2025

So I've implemented the “create in serial, await in parallel” approach, and reverted the deletion to serial.

But instead of having a pre-created, limited number of awaiters, it dynamically creates goroutines to wait for servers to become ready.

Mostly because I dislike having an upper bound in the plugin, as these limits are already available as part of the GitLab runner configuration (max_instances combined with the scale_throttle.limit and scale_throttle.burst settings [1] [2]).

Also, some of our pipelines currently have more than 10 jobs that run in parallel, so such a cap would slow them down.

[2]: https://docs.gitlab.com/runner/configuration/advanced-configuration/#the-runnersautoscalerscale_throttle-section
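
For comparison, a minimal sketch of this variant, again with placeholder createServer/waitForServer helpers rather than the real cloudscale API: servers are created serially, and a waiter goroutine is spawned per server on demand, with no fixed cap.

```go
// Minimal sketch of the unbounded-waiter variant; helper names are placeholders.
package fleeting

import (
	"errors"
	"sync"
)

type server struct{ uuid string }

func createServer(name string) (*server, error) { return &server{uuid: name}, nil }
func waitForServer(s *server) error             { return nil }

func increase(names []string) error {
	var wg sync.WaitGroup
	errs := make([]error, len(names))

	for i, name := range names {
		srv, err := createServer(name) // creation remains serial
		if err != nil {
			errs[i] = err
			continue
		}
		wg.Add(1)
		go func(i int, s *server) { // one waiter per server, spawned dynamically
			defer wg.Done()
			errs[i] = waitForServer(s)
		}(i, srv)
	}

	wg.Wait()
	return errors.Join(errs...) // nil when every server became ready
}
```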

@href (Contributor) commented May 7, 2025

Thanks, I think the approach looks fair now. Setting an upper bound borders a bit on the paranoid, so I think you make a good argument. Since server creation is serial now, I don't foresee the API being a bottleneck either: Checking the server state is fairly fast.

I didn't do a functional test yet, as I would like to do that in one go, together with the other PR, once that is ready.

@href (Contributor) left a comment


I had some trouble getting this to work. It looks like in its current form the Increase call never returns.

@4censord force-pushed the 4c/create-in-parallel branch from 42c7958 to 311753d on May 9, 2025 06:46
@4censord force-pushed the 4c/create-in-parallel branch from 311753d to 6d34e38 on May 9, 2025 06:48
@href (Contributor) commented May 13, 2025

Thank you. I'll go ahead and merge this PR, and the other one once the tests pass. Afterward I want to create a beta release and integrate it into https://github.com/cloudscale-ch/gitlab-runner. This will give me some additional testing mileage and would also make it possible for you to run this already, pending a proper release.

However, quay.io currently throws HTTP 500 whenever I push a container, which is slowing me down a bit. I'll wait and see if this resolves itself before maybe publishing it elsewhere. Either way, I'm aiming to have at least the beta release this week, and a proper release by next week.

In the meantime, thank you for your contributions and your patience 😅

@href merged commit 411ea6b into cloudscale-ch:main on May 13, 2025
3 checks passed