
Conversation

@frezbo (Contributor) commented Dec 22, 2025

When copying files to a FAT filesystem using `io.Copy` on the file returned by `filesystem.OpenFile`, writes were slow because the directory entries were flushed on every write, which is inefficient.
Only flush once on `Close()`.

Signed-off-by: Noel Georgi <git@frezbo.dev>
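
A minimal sketch of the flush-once idea described above (the `file` struct and `flushEntry` below are hypothetical stand-ins, not the PR's actual code): `Write` only marks the entry dirty, and the directory entry is written back a single time in `Close`.

```go
package main

import "fmt"

// file stands in for the fat32 file handle; all names here are hypothetical.
type file struct {
	dirty bool
	// ... cluster chain, offset, parent directory entry, etc.
}

// flushEntry stands in for writing the directory entry back to disk.
func (f *file) flushEntry() error { return nil }

// Write stores the data and only marks the entry as needing a flush.
func (f *file) Write(p []byte) (int, error) {
	// ... write p into the file's clusters ...
	f.dirty = true // defer the directory-entry flush instead of doing it per write
	return len(p), nil
}

// Close performs the single deferred flush.
func (f *file) Close() error {
	if f.dirty {
		if err := f.flushEntry(); err != nil {
			return err
		}
		f.dirty = false
	}
	return nil
}

func main() {
	f := &file{}
	_, _ = f.Write([]byte("hello")) // no flush here
	fmt.Println(f.Close())          // the single flush happens here: <nil>
}
```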
@frezbo (Contributor, Author) commented Dec 22, 2025

@deitch this is more of a draft and I'll leave it up to the maintainers to do any fixes; most of this was generated with Claude, so I don't feel fully confident in the implementation. I've tried to add tests.

The benchmark test shows how bad it is when using `io.Copy`.

Can be run as `go test -bench=BenchmarkFAT32Write -benchmem -benchtime=3x ./filesystem/fat32/ -run=NONE`

return totalRead, retErr
}

// ReadFrom reads data from r until EOF and writes it to the file.
@frezbo (Contributor, Author) commented:

from `go doc io.Copy`:
package io // import "io"

func Copy(dst Writer, src Reader) (written int64, err error)
    Copy copies from src to dst until either EOF is reached on src or an
    error occurs. It returns the number of bytes copied and the first error
    encountered while copying, if any.

    A successful Copy returns err == nil, not err == EOF. Because Copy is
    defined to read from src until EOF, it does not treat an EOF from Read as an
    error to be reported.

    If src implements WriterTo, the copy is implemented by calling
    src.WriteTo(dst). Otherwise, if dst implements ReaderFrom, the copy is
    implemented by calling dst.ReadFrom(src).

otherwise it falls back to calling Write with a default 32KB buffer, which is extremely slow here
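
For reference, a simplified sketch of `io.Copy`'s dispatch, matching the documented behavior quoted above (not the stdlib's exact code):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// copySketch mirrors io.Copy's documented dispatch: src.WriteTo first,
// then dst.ReadFrom, then a plain 32KB read/write loop.
func copySketch(dst io.Writer, src io.Reader) (int64, error) {
	if wt, ok := src.(io.WriterTo); ok {
		return wt.WriteTo(dst)
	}
	if rf, ok := dst.(io.ReaderFrom); ok {
		return rf.ReadFrom(src)
	}
	// Fallback: dst.Write is called once per 32KB chunk, which is what
	// triggered the per-write cluster work in the fat32 driver.
	buf := make([]byte, 32*1024)
	var written int64
	for {
		nr, rerr := src.Read(buf)
		if nr > 0 {
			nw, werr := dst.Write(buf[:nr])
			written += int64(nw)
			if werr != nil {
				return written, werr
			}
			if nw < nr {
				return written, io.ErrShortWrite
			}
		}
		if rerr == io.EOF {
			return written, nil // a successful copy ends with err == nil
		}
		if rerr != nil {
			return written, rerr
		}
	}
}

func main() {
	var dst bytes.Buffer
	n, err := copySketch(&dst, strings.NewReader("hello"))
	fmt.Println(n, err, dst.String()) // 5 <nil> hello
}
```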

// Try to determine source size for optimal allocation using Seeker interface
var knownSize int64 = -1

if seeker, ok := r.(io.Seeker); ok {
@frezbo (Contributor, Author) commented:

this helps pre-allocate by checking if the source implements `io.Seeker`; when reading from a source file opened with `os.Open`, this works great
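
A sketch of that size probe, using only the standard `io.Seeker` contract (the `sourceSize` helper name is illustrative, not the PR's):

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// sourceSize reports the bytes remaining in r when r implements io.Seeker,
// or -1 when the size cannot be determined. The Seek calls consume no data.
func sourceSize(r io.Reader) int64 {
	seeker, ok := r.(io.Seeker)
	if !ok {
		return -1
	}
	cur, err := seeker.Seek(0, io.SeekCurrent) // remember current position
	if err != nil {
		return -1
	}
	end, err := seeker.Seek(0, io.SeekEnd) // offset of EOF == total size
	if err != nil {
		return -1
	}
	if _, err := seeker.Seek(cur, io.SeekStart); err != nil { // restore position
		return -1
	}
	return end - cur
}

func main() {
	// strings.Reader implements io.Seeker, like *os.File does.
	fmt.Println(sourceSize(strings.NewReader("hello"))) // 5
}
```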

)

const (
benchmarkFileSize = (4 * 1024 * 1024 * 1024) - 1 // 4GiB - 1, the FAT32 maximum file size
@frezbo (Contributor, Author) commented:

tried to use the maximum FAT32 file size, since that would be the most extreme case

@deitch (Collaborator) commented Dec 24, 2025

Do you mind explaining the logic of the changes?

I really like that you did the two benchmark tests, which nicely pull the contrast together. As far as I can tell, they are identical except that one does:

		_, err = destFile.Write(data)

And the other does:

		_, err = io.Copy(destFile, srcFile)

If the problem is just that io.Copy() has a small buffer, I would think that would be a fix for whoever calls it using CopyBuffer(). Or are you saying that even with that, the issue is the implementation provided by fat32.File, which has this comment you quoted:

    If src implements WriterTo, the copy is implemented by calling
    src.WriteTo(dst). Otherwise, if dst implements ReaderFrom, the copy is
    implemented by calling dst.ReadFrom(src).

Which are you saying is, or is not implemented? Or is it ReadFrom() that was missing? I am somewhat confused.

@deitch (Collaborator) commented Dec 24, 2025

Also, this does need a rebase.

@frezbo (Contributor, Author) commented Dec 24, 2025

> If the problem is just that `io.Copy()` has a small buffer,

it's not about the buffer; I had a benchmark test that used a large buffer, with similarly bad performance

If `ReaderFrom` is not there, it uses `Write` with a default 32KB buffer, and each write re-calculated the clusters, if I understood correctly, causing massive allocations

@deitch (Collaborator) commented Dec 25, 2025

OK, so the problem is that Write(b []byte) treats its argument as complete, which is the purpose of write. As such, it thinks it knows the entire size to write (len(b)), allocates the clusters for that size, and is done. But if you actually have a total size of 1000*len(b), then the write itself is fine, but it will allocate the clusters 1000 times, which is slow and inefficient. If it only knew, "this byte slice I am getting is not the whole thing, but one of many," then it might be more efficient about it.

You are trying to fix this by having it implement ReadFrom(), using it as a way to pass it an io.Reader, hence a stream, and use that to say, "this might not be everything you are going to read, I gave you a stream instead of a slice, so you know that." And then inside ReadFrom(), you add logic to try and guess the total size if you can; if not, then use an allocation algorithm that tries to be efficient about it.
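
A rough sketch of that `ReadFrom` shape, with hypothetical internals (`extendClusters` and the field names are stand-ins, not the PR's actual code):

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// file stands in for fat32.File; fields and extendClusters are hypothetical.
type file struct {
	data      []byte // stands in for the on-disk cluster contents
	offset    int64
	allocated int64
}

// extendClusters grows the cluster chain in one large step.
func (f *file) extendClusters(n int64) error {
	f.data = append(f.data, make([]byte, n)...)
	f.allocated += n
	return nil
}

// ReadFrom drains r, growing clusters in big chunks rather than once per
// Write-sized piece.
func (f *file) ReadFrom(r io.Reader) (int64, error) {
	const chunk = 4 * 1024 * 1024 // hypothetical growth step: 4 MiB
	buf := make([]byte, 1024*1024)
	var total int64
	for {
		n, err := r.Read(buf)
		if n > 0 {
			for f.offset+int64(n) > f.allocated {
				if aerr := f.extendClusters(chunk); aerr != nil {
					return total, aerr
				}
			}
			copy(f.data[f.offset:], buf[:n])
			f.offset += int64(n)
			total += int64(n)
		}
		if err == io.EOF {
			return total, nil // EOF terminates a successful copy
		}
		if err != nil {
			return total, err
		}
	}
}

func main() {
	f := &file{}
	n, err := f.ReadFrom(strings.NewReader("hello"))
	fmt.Println(n, err) // 5 <nil>
}
```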

Now that I get it (although I haven't dug into your allocation algorithm), I have a few questions.

First, if src implements WriterTo, then all of this will not work, since io.Copy(dst, src) calls src.WriteTo() before even trying dst.ReadFrom(). So even if this logic works, it needs to be implemented equally for Write(). Which does not obviate the need for ReadFrom() - it is a good thing to do - but it does mean the heuristic needs to be used for both Write() and ReadFrom(). I get that it is easier with ReadFrom() - single call, you know when you are done, can preallocate at the beginning and cleanup at the end - but we would need to account for both cases.

Second, do we know the algorithm the fat32 driver uses, maybe in Linux? It must have a similar issue when someone calls write(2), which eventually sends data to a file descriptor, which is held open by the kernel for a particular filesystem driver, i.e. fat32.

A naive approach might be similar to what you did: always allocate a large number of clusters, keep them cached, use them from the cache, and when the cache runs dry, allocate another large chunk. I don't recall offhand how the fat32 filesystem handles clusters that are allocated but not yet assigned to a particular file, if that is even doable.
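
A tiny sketch of that naive cache idea (the `fs` type and `allocateClusters` are hypothetical stand-ins):

```go
package main

import "fmt"

// fs stands in for the fat32 filesystem; allocateClusters is hypothetical.
type fs struct{ next uint32 }

// allocateClusters grabs n free clusters from the FAT in one pass.
func (f *fs) allocateClusters(n int) []uint32 {
	out := make([]uint32, n)
	for i := range out {
		f.next++
		out[i] = f.next
	}
	return out
}

// clusterCache hands out pre-allocated clusters one at a time and refills
// in bulk when it runs dry. Leftover clusters would need to be released on
// Close, which is the open question above.
type clusterCache struct{ free []uint32 }

func (c *clusterCache) next(f *fs) uint32 {
	if len(c.free) == 0 {
		c.free = f.allocateClusters(1024) // refill in one large chunk
	}
	n := c.free[0]
	c.free = c.free[1:]
	return n
}

func main() {
	f, cache := &fs{}, &clusterCache{}
	fmt.Println(cache.next(f), cache.next(f)) // 1 2
}
```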
