
Conversation

@frezbo (Contributor) commented Dec 22, 2025

When copying files to a FAT filesystem using `io.Copy` on the file returned by `filesystem.OpenFile`, writes were slow because the directory entries were flushed on every write, which is inefficient.
Only flush once on `Close()`.

Signed-off-by: Noel Georgi <git@frezbo.dev>
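
A minimal sketch of the flush-once idea described above (the `file` struct and `flushEntry` below are hypothetical stand-ins, not the PR's actual code): `Write` only marks the entry dirty, and the directory entry is written back a single time in `Close`.

```go
package main

import "fmt"

// file stands in for the fat32 file handle; all names here are hypothetical.
type file struct {
	dirty bool
	// ... cluster chain, offset, parent directory entry, etc.
}

// flushEntry stands in for writing the directory entry back to disk.
func (f *file) flushEntry() error { return nil }

// Write stores the data and only marks the entry as needing a flush.
func (f *file) Write(p []byte) (int, error) {
	// ... write p into the file's clusters ...
	f.dirty = true // defer the directory-entry flush instead of doing it per write
	return len(p), nil
}

// Close performs the single deferred flush.
func (f *file) Close() error {
	if f.dirty {
		if err := f.flushEntry(); err != nil {
			return err
		}
		f.dirty = false
	}
	return nil
}

func main() {
	f := &file{}
	_, _ = f.Write([]byte("hello")) // no flush here
	fmt.Println(f.Close())          // the single flush happens here: <nil>
}
```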
@frezbo (Contributor, Author) commented Dec 22, 2025

@deitch this is more of a draft and I'll leave it up to the maintainers to do any fixes; most of this was generated with Claude, so I don't feel fully confident in the implementation. I've tried to add tests.

The benchmark test shows how bad it is when using `io.Copy`.

Can be run as `go test -bench=BenchmarkFAT32Write -benchmem -benchtime=3x ./filesystem/fat32/ -run=NONE`

return totalRead, retErr
}

// ReadFrom reads data from r until EOF and writes it to the file.
@frezbo (Contributor, Author) commented:

from `go doc io.Copy`:
package io // import "io"

func Copy(dst Writer, src Reader) (written int64, err error)
    Copy copies from src to dst until either EOF is reached on src or an
    error occurs. It returns the number of bytes copied and the first error
    encountered while copying, if any.

    A successful Copy returns err == nil, not err == EOF. Because Copy is
    defined to read from src until EOF, it does not treat an EOF from Read as an
    error to be reported.

    If src implements WriterTo, the copy is implemented by calling
    src.WriteTo(dst). Otherwise, if dst implements ReaderFrom, the copy is
    implemented by calling dst.ReadFrom(src).

otherwise it falls back to calling Write with a default 32KB buffer, which is extremely slow here
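
For reference, a simplified sketch of `io.Copy`'s dispatch, matching the documented behavior quoted above (not the stdlib's exact code):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// copySketch mirrors io.Copy's documented dispatch: src.WriteTo first,
// then dst.ReadFrom, then a plain 32KB read/write loop.
func copySketch(dst io.Writer, src io.Reader) (int64, error) {
	if wt, ok := src.(io.WriterTo); ok {
		return wt.WriteTo(dst)
	}
	if rf, ok := dst.(io.ReaderFrom); ok {
		return rf.ReadFrom(src)
	}
	// Fallback: dst.Write is called once per 32KB chunk, which is what
	// triggered the per-write cluster work in the fat32 driver.
	buf := make([]byte, 32*1024)
	var written int64
	for {
		nr, rerr := src.Read(buf)
		if nr > 0 {
			nw, werr := dst.Write(buf[:nr])
			written += int64(nw)
			if werr != nil {
				return written, werr
			}
			if nw < nr {
				return written, io.ErrShortWrite
			}
		}
		if rerr == io.EOF {
			return written, nil // a successful copy ends with err == nil
		}
		if rerr != nil {
			return written, rerr
		}
	}
}

func main() {
	var dst bytes.Buffer
	n, err := copySketch(&dst, strings.NewReader("hello"))
	fmt.Println(n, err, dst.String()) // 5 <nil> hello
}
```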

// Try to determine source size for optimal allocation using Seeker interface
var knownSize int64 = -1

if seeker, ok := r.(io.Seeker); ok {
@frezbo (Contributor, Author) commented:

this helps pre-allocate by checking if the source implements `io.Seeker`; when reading from a source file opened with `os.Open`, this works great
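
A sketch of that size probe, using only the standard `io.Seeker` contract (the `sourceSize` helper name is illustrative, not the PR's):

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// sourceSize reports the bytes remaining in r when r implements io.Seeker,
// or -1 when the size cannot be determined. The Seek calls consume no data.
func sourceSize(r io.Reader) int64 {
	seeker, ok := r.(io.Seeker)
	if !ok {
		return -1
	}
	cur, err := seeker.Seek(0, io.SeekCurrent) // remember current position
	if err != nil {
		return -1
	}
	end, err := seeker.Seek(0, io.SeekEnd) // offset of EOF == total size
	if err != nil {
		return -1
	}
	if _, err := seeker.Seek(cur, io.SeekStart); err != nil { // restore position
		return -1
	}
	return end - cur
}

func main() {
	// strings.Reader implements io.Seeker, like *os.File does.
	fmt.Println(sourceSize(strings.NewReader("hello"))) // 5
}
```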

)

const (
benchmarkFileSize = (4 * 1024 * 1024 * 1024) - 1 // 4GiB - 1, the FAT32 maximum file size
@frezbo (Contributor, Author) commented:

tried to use the maximum FAT32 file size, since that would be the most extreme case

@deitch (Collaborator) commented Dec 24, 2025

Do you mind explaining the logic of the changes?

I really like that you did the two benchmark tests, which nicely pull the contrast together. As far as I can tell, they are identical except that one does:

		_, err = destFile.Write(data)

And the other does:

		_, err = io.Copy(destFile, srcFile)

If the problem is just that io.Copy() has a small buffer, I would think that would be a fix for whoever calls it using CopyBuffer(). Or are you saying that even with that, the issue is the implementation provided by fat32.File, which has this comment you quoted:

    If src implements WriterTo, the copy is implemented by calling
    src.WriteTo(dst). Otherwise, if dst implements ReaderFrom, the copy is
    implemented by calling dst.ReadFrom(src).

Which are you saying is, or is not implemented? Or is it ReadFrom() that was missing? I am somewhat confused.

@deitch (Collaborator) commented Dec 24, 2025

Also, this does need a rebase.

@frezbo (Contributor, Author) commented Dec 24, 2025

> If the problem is just that `io.Copy()` has a small buffer,

it's not about the buffer; I had a benchmark test that used a large buffer, with similarly bad performance

If `ReaderFrom` is not there, it uses `Write` with a default 32KB buffer, and each write re-calculated the clusters, if I understood correctly, causing massive allocations

@deitch (Collaborator) commented Dec 25, 2025

OK, so the problem is that Write(b []byte) treats its argument as complete, which is the purpose of write. As such, it thinks it knows the entire size to write (len(b)), allocates the clusters for that size, and is done. But if you actually have a total size of 1000*len(b), then the write itself is fine, but it will allocate the clusters 1000 times, which is slow and inefficient. If it only knew, "this byte slice I am getting is not the whole thing, but one of many," then it might be more efficient about it.

You are trying to fix this by having it implement ReadFrom(), using it as a way to pass it an io.Reader, hence a stream, and use that to say, "this might not be everything you are going to read, I gave you a stream instead of a slice, so you know that." And then inside ReadFrom(), you add logic to try and guess the total size if you can; if not, then use an allocation algorithm that tries to be efficient about it.
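
A rough sketch of that `ReadFrom` shape, with hypothetical internals (`extendClusters` and the field names are stand-ins, not the PR's actual code):

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// file stands in for fat32.File; fields and extendClusters are hypothetical.
type file struct {
	data      []byte // stands in for the on-disk cluster contents
	offset    int64
	allocated int64
}

// extendClusters grows the cluster chain in one large step.
func (f *file) extendClusters(n int64) error {
	f.data = append(f.data, make([]byte, n)...)
	f.allocated += n
	return nil
}

// ReadFrom drains r, growing clusters in big chunks rather than once per
// Write-sized piece.
func (f *file) ReadFrom(r io.Reader) (int64, error) {
	const chunk = 4 * 1024 * 1024 // hypothetical growth step: 4 MiB
	buf := make([]byte, 1024*1024)
	var total int64
	for {
		n, err := r.Read(buf)
		if n > 0 {
			for f.offset+int64(n) > f.allocated {
				if aerr := f.extendClusters(chunk); aerr != nil {
					return total, aerr
				}
			}
			copy(f.data[f.offset:], buf[:n])
			f.offset += int64(n)
			total += int64(n)
		}
		if err == io.EOF {
			return total, nil // EOF terminates a successful copy
		}
		if err != nil {
			return total, err
		}
	}
}

func main() {
	f := &file{}
	n, err := f.ReadFrom(strings.NewReader("hello"))
	fmt.Println(n, err) // 5 <nil>
}
```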

Now that I get it (although I haven't dug into your allocation algorithm), I have a few questions.

First, if src implements WriterTo, then all of this will not work, since io.Copy(dst, src) calls src.WriteTo() before even trying dst.ReadFrom(). So even if this logic works, it needs to be implemented equally for Write(). Which does not obviate the need for ReadFrom() - it is a good thing to do - but it does mean the heuristic needs to be used for both Write() and ReadFrom(). I get that it is easier with ReadFrom() - single call, you know when you are done, can preallocate at the beginning and cleanup at the end - but we would need to account for both cases.

Second, do we know the algorithm the fat32 driver uses, maybe in Linux? It must have a similar issue when someone calls write(2), which eventually sends data to a file descriptor, which is held open by the kernel for a particular filesystem driver, i.e. fat32.

A naive approach might be similar to what you did: always allocate a large number of clusters, keep them cached, use them from the cache, and when the cache runs dry, allocate another large chunk. I don't recall offhand how the fat32 filesystem handles clusters that are allocated but not yet assigned to a particular file, if that is even doable.
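
A tiny sketch of that naive cache idea (the `fs` type and `allocateClusters` are hypothetical stand-ins):

```go
package main

import "fmt"

// fs stands in for the fat32 filesystem; allocateClusters is hypothetical.
type fs struct{ next uint32 }

// allocateClusters grabs n free clusters from the FAT in one pass.
func (f *fs) allocateClusters(n int) []uint32 {
	out := make([]uint32, n)
	for i := range out {
		f.next++
		out[i] = f.next
	}
	return out
}

// clusterCache hands out pre-allocated clusters one at a time and refills
// in bulk when it runs dry. Leftover clusters would need to be released on
// Close, which is the open question above.
type clusterCache struct{ free []uint32 }

func (c *clusterCache) next(f *fs) uint32 {
	if len(c.free) == 0 {
		c.free = f.allocateClusters(1024) // refill in one large chunk
	}
	n := c.free[0]
	c.free = c.free[1:]
	return n
}

func main() {
	f, cache := &fs{}, &clusterCache{}
	fmt.Println(cache.next(f), cache.next(f)) // 1 2
}
```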
