Stream POST request in order to handle large files by maelle-le-bon · Pull Request #161 · CTFd/ctfcli

maelle-le-bon · 2024-11-10T20:25:35Z

The bug

I ran into a problem last year: when I tried to create or synchronize a challenge containing a large file (e.g. a forensics challenge with a 15 GB disk image), the entire file was loaded into memory before the request started.

This causes crashes since I only have 16GB of RAM in my computer.

The cause

Although the requests module supports body streaming when you pass a file pointer to the data parameter, it is not capable of streaming form-data.

When the requests module prepares the headers, it tries to calculate the Content-Length. As a result, the entire body will be stored in memory.
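The buffering described above can be observed without sending anything over the network. The snippet below is a minimal sketch, assuming the requests package is installed; the URL and the form field are placeholders, and a 1 KiB in-memory buffer stands in for the large file.

```python
import io

import requests

# A small in-memory stand-in for a large challenge file.
payload = io.BytesIO(b"x" * 1024)

# Prepare (but do not send) a multipart/form-data request.
# The URL is a placeholder; no connection is opened by prepare().
prepared = requests.Request(
    "POST",
    "http://localhost/api/v1/files",
    files={"file": ("file.ova", payload)},
    data={"type": "challenge"},
).prepare()

# The whole multipart body now exists as one bytes object...
body_is_bytes = isinstance(prepared.body, bytes)
# ...its length is what ends up in the Content-Length header...
content_length = int(prepared.headers["Content-Length"])
# ...and the file object has been read to the end.
bytes_consumed = payload.tell()

print(body_is_bytes, content_length == len(prepared.body), bytes_consumed)
```

With a 15 GB file in place of the 1 KiB buffer, the same bytes object would have to hold the full file content.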

The fix

One solution would be to switch to another HTTP client, capable of streaming form-data.

I chose to modify as little code as possible, so I delegated the body encoding to the MultipartEncoder from the requests-toolbelt module. This requires a few modifications to the API class, since the MultipartEncoder takes parameters differently from requests.

As a result, files must be sent with a filename hint:

# Before
api.post("/api/v1/files", files=[
    ( "file", open("./file.ova") )
], data={ ... })

# After
api.post("/api/v1/files", files={
    "file": ( "file.ova", open("./file.ova"))
}, data={ ... })

# To send multiple files under the same key "file", pass a list (or tuple) of pairs instead of a dict
api.post("/api/v1/files", files=[
    ("file", ( "file.ova", open("./file.ova"))),
    ("file", ( "description.txt", open("./description.txt")))
], data={ ... })

The MultipartEncoder lets requests upload large files without reading the entire file into memory.
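To illustrate the idea behind a streaming encoder, here is a minimal standard-library sketch. It is not the requests-toolbelt implementation: the StreamingMultipart class, its length property, and the chunk size are hypothetical names chosen for this example. It shows that the total body length can be computed from the part headers and file sizes, so the file content only needs to be read chunk by chunk while the body is being sent.

```python
import io
import os
import uuid


class StreamingMultipart:
    """Illustrative only: produces a multipart/form-data body in chunks."""

    def __init__(self, fields, files, chunk_size=64 * 1024):
        self.boundary = uuid.uuid4().hex
        self.chunk_size = chunk_size
        self._parts = []  # (header bytes, readable stream, payload size)
        for name, value in fields:
            payload = str(value).encode("utf-8")
            self._parts.append((self._header(name), io.BytesIO(payload), len(payload)))
        for name, (filename, fileobj) in files:
            # Measure the remaining size by seeking, without reading content.
            start = fileobj.tell()
            fileobj.seek(0, os.SEEK_END)
            size = fileobj.tell() - start
            fileobj.seek(start)
            self._parts.append((self._header(name, filename), fileobj, size))

    def _header(self, name, filename=None):
        disposition = f'form-data; name="{name}"'
        if filename is not None:
            disposition += f'; filename="{filename}"'
        return (
            f"--{self.boundary}\r\nContent-Disposition: {disposition}\r\n\r\n"
        ).encode("utf-8")

    @property
    def content_type(self):
        return f"multipart/form-data; boundary={self.boundary}"

    @property
    def length(self):
        # Each part is header + payload + CRLF, followed by the closing boundary.
        closing = len(f"--{self.boundary}--\r\n")
        return sum(len(h) + size + 2 for h, _, size in self._parts) + closing

    def __iter__(self):
        for header, stream, _ in self._parts:
            yield header
            while chunk := stream.read(self.chunk_size):
                yield chunk
            yield b"\r\n"
        yield f"--{self.boundary}--\r\n".encode("ascii")


# Demo with an in-memory "file"; files are assumed to be opened in binary mode.
fake_file = io.BytesIO(b"A" * 200_000)
encoder = StreamingMultipart(
    fields=[("type", "challenge"), ("challenge_id", 1)],
    files=[("file", ("file.ova", fake_file))],
)
expected_length = encoder.length      # known before any file content is read
chunks = list(encoder)                # a real client would send these one by one
body = b"".join(chunks)               # joined here only to check the length
```

The point of the sketch is only that the length is known up front while no chunk held in memory is larger than chunk_size.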

ColdHeat · 2024-11-26T01:03:19Z

Did you try compressing the file? I feel like streaming is okay but I think really the issue is that disk images are really huge.

maelle-le-bon · 2024-11-26T01:08:11Z

Even with compressed assets, it is really common to upload very large files, for example an .ova or an Android disk image.

Loading a large file into memory just to send an HTTP request is not OK. It's a known limitation of the requests module, and it's a shame we need another module to get what should be normal behavior.

(Sorry if I made grammar mistakes it's 2am in France)

ColdHeat · 2024-11-26T01:19:13Z

My point is more that instead of switching out the behavior in ctfcli, it probably would have been better to just compress your file.

While I am roughly okay with the PR and using streaming, I am not sure if ctfcli should support behavior like this. No one really wants to download a 16 GB file.

maelle-le-bon · 2024-11-26T01:26:07Z

I understand but sometimes we have simply no choice 😅

For example, this is a CTF we organize each year. In the last edition, we had files of up to 2.5 GB. During deployments, sending these files caused big spikes in RAM usage and sometimes OOM errors.
https://github.com/BreizhCTF/breizhctf-2024/blob/main/Forensic/Tampered/dist/bzhctf.ova

Forensic challenges can be really big. 16 GB was the largest archive I've ever seen (it was a disk dump from a Windows server, if I remember correctly).

pl4nty · 2024-11-30T01:57:33Z

This would help me too, I often have compressed forensics artefacts in the 1-5GB range. I upload to CTFd Cloud using beefy CI runners which don't crash, but synchronous reading into memory makes the upload pretty slow. My other workaround was manually uploading to external blob storage, then linking from CTFd

Zeecka · 2026-05-17T08:07:00Z

I ran into the same issue. Regarding "No one really wants to download a 16 GB file": compressed disk images over 16 GB for local forensics training are definitely a use case. I'm already aware of the importance of using high-ratio compression on such files, but CTFd is not only intended for remote CTFs with low bandwidth.

ColdHeat · 2026-05-18T16:37:03Z

I think that's a stronger rationale, and I think in truth limits need to be truly enforced versus how they are ultimately hit.

I will fix the merge conflicts and accept.

Copilot

Pull request overview

This PR replaces requests' built-in multipart encoding with requests_toolbelt.MultipartEncoder in ctfcli.core.api.API.request, so that large file uploads (e.g. multi-GB challenge artifacts) are streamed instead of being buffered in memory to compute Content-Length. The new encoder takes fields in a different shape, so callers must now pass each file as (filename, fileobj) tuples, and int form values are coerced to strings since MultipartEncoder rejects non-string values.

Changes:

Rewrite API.request to detect form-data calls and route them through MultipartEncoder, setting Content-Type from the encoder.
Update file-upload call sites (_create_file, _create_all_files) to the new (filename, fileobj) value shape.
Add requests-toolbelt==1.0.0 dependency and regenerate uv.lock.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.

File	Description
ctfcli/core/api.py	Builds multipart body via MultipartEncoder for streaming uploads
ctfcli/core/challenge.py	Updates `_create_file`/`_create_all_files` to the new files= shape
pyproject.toml	Adds `requests-toolbelt==1.0.0` dependency
uv.lock	Regenerated lockfile (revision bump and upload_time fields)

Note: _create_solution in ctfcli/core/challenge.py (around line 587–594) still passes ("file", open(...)) to files=, which is incompatible with the new MultipartEncoder contract; this is outside the diff regions so a comment could not be attached there, but it should be fixed in the same PR.

Comments suppressed due to low confidence (1)

ctfcli/core/api.py:67

Only dict is handled here, but the outer guard on line 58 also admits any Mapping. A non-dict Mapping (e.g. a types.MappingProxyType or a custom collections.abc.Mapping implementation) would skip the value-stringification path, and its form values would be missing from fields. Use isinstance(data, Mapping) for consistency with the outer check.

            if isinstance(data, dict):
                # int are not allowed as value in MultipartEncoder
                fields = list(map(lambda v: (v[0], str(v[1]) if isinstance(v[1], int) else v[1]), data.items()))
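The suggestion above could look roughly like the following sketch. The build_fields helper is a hypothetical standalone function written for this example, not the code merged into ctfcli; it applies the same int-to-string coercion as the excerpt, but checks against collections.abc.Mapping so that read-only or custom mappings are handled as well.

```python
import io
from collections.abc import Mapping
from types import MappingProxyType


def build_fields(data=None, files=None):
    """Flatten form values and files into one list of (name, value) pairs."""
    fields = []
    if isinstance(data, Mapping):
        for key, value in data.items():
            # Stringify ints, mirroring the coercion in the excerpt above.
            # Note: bool is a subclass of int, so True/False are stringified too.
            fields.append((key, str(value) if isinstance(value, int) else value))
    if files is not None:
        if isinstance(files, Mapping):
            files = list(files.items())
        fields.extend(files)
    return fields


# A MappingProxyType is a Mapping but not a dict, so a dict-only check skips it.
read_only = MappingProxyType({"challenge_id": 7, "type": "challenge"})
fields = build_fields(read_only, {"file": ("file.ova", io.BytesIO(b"demo"))})
print(fields[:2])
```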

+            fields = list()
+            if isinstance(data, dict):
+                # int are not allowed as value in MultipartEncoder
+                fields = list(map(lambda v: (v[0], str(v[1]) if isinstance(v[1], int) else v[1]), data.items()))
+
+            if files is not None:
+                if isinstance(files, dict):
+                    files = list(files.items())
+                fields.extend(files)  # type: ignore
+
+            multipart = MultipartEncoder(fields)
+
+            return super(API, self).request(
+                method,
+                url,
+                data=multipart,
+                headers={"Content-Type": multipart.content_type},


maelle-le-bon added 3 commits November 10, 2024 21:03

Add MultipartEncoder to support request streaming

5d7d13a

The Multipart encoder helps requests to upload large files without the need to read the entire file in memory

Remove unused typing

f0d7a8c

Remove duplicate call

27e87e1

Merge branch 'CTFd:master' into master

fe447fa

Merge branch 'master' of github.com:CTFd/ctfcli

56fb7b9

ColdHeat requested a review from Copilot May 18, 2026 16:47

Copilot started reviewing on behalf of ColdHeat May 18, 2026 16:48

Copilot AI reviewed May 18, 2026

ColdHeat added 2 commits May 18, 2026 12:58

Update _create_solution to use new file upload strategy

0b10cf5

Only override content type instead of all headers

fa1dd47

ColdHeat merged commit 44da596 into CTFd:master May 18, 2026
6 checks passed

ColdHeat merged 7 commits into CTFd:master from maelle-le-bon:master

5 participants
