Fuzzy time comparisons for s3 sync #10017

@RyanFitzSimmonsAK

Description

Describe the feature

From #599 (comment),

I think a simple solution would be to introduce a fuzz factor. If it normally wouldn't take more than 5 minutes for the local -> S3 copy, then use a 10 minute fuzz factor on subsequent time comparisons, and treat relative times within 10 minutes as equal. If the S3 time is more than 10 minutes newer, then sync from S3 -> local. Perhaps add --fuzz=10m as an option.
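To make the proposal concrete, here is a minimal sketch of a fuzzy timestamp comparison. It is not the AWS CLI's code: the function name `compare_with_fuzz`, its return values, and the default of 10 minutes are all hypothetical, chosen only to mirror the quoted comment.

```python
from datetime import datetime, timedelta, timezone


def compare_with_fuzz(local_mtime, s3_last_modified, fuzz=timedelta(minutes=10)):
    """Compare two timestamps, treating any gap within `fuzz` as equal.

    Hypothetical helper for illustration; returns "equal", "s3_newer",
    or "local_newer".
    """
    delta = s3_last_modified - local_mtime
    if abs(delta) <= fuzz:
        # Within the fuzz window: assume the gap is just upload latency.
        return "equal"
    return "s3_newer" if delta > timedelta(0) else "local_newer"


base = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(compare_with_fuzz(base, base + timedelta(minutes=5)))   # inside the window
print(compare_with_fuzz(base, base + timedelta(minutes=11)))  # outside the window
```

Under this sketch, only the "s3_newer" result would trigger an S3 -> local sync; a real `--fuzz` option would also need to parse a duration such as `10m`, which is left out here.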

Use Case

After the first sync command (local -> S3), the local files will have an mtime of 0, and the contents in S3 will have a LastModified time of 10 (using relative offsets). When we run the second aws s3 sync command, which syncs from S3 to local, we first do the file size check. In this case the file sizes are the same, so we look at the last modified times. These are different (local == 0, S3 == 10). If we were doing a strict equality comparison, then because the last modified times are different, we would unnecessarily sync the files from S3 to local. So we can say that if the file sizes are the same and the last modified time in S3 is greater (newer) than that of the local file, then we don't sync. This is the current behavior.

However, this creates a problem if the remote file is updated out of band (via the console or some other SDK) and the size remains the same. If we run aws s3 sync s3://bucket local/, we will not sync the remote file even though we're supposed to.
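The scenario above can be modelled in a few lines. This is a simplified sketch, not the CLI's implementation: both function names are hypothetical, the relative offsets are assumed to be in seconds, and the handling of equal timestamps or a local file newer than S3 is an assumption, since the description does not cover those cases.

```python
def current_should_download(local_size, local_mtime, s3_size, s3_mtime):
    """Simplified model of the S3 -> local rule described above."""
    if local_size != s3_size:
        return True
    # Same size and S3 is newer: treated as the delay of our own upload,
    # so the file is skipped.
    # Assumption: equal times are skipped; a newer local file is downloaded.
    return local_mtime > s3_mtime


def fuzzy_should_download(local_size, local_mtime, s3_size, s3_mtime, fuzz):
    """Same check, but timestamps within `fuzz` seconds count as equal."""
    if local_size != s3_size:
        return True
    return abs(s3_mtime - local_mtime) > fuzz


FUZZ = 600  # 10 minutes, in seconds (assumed unit for the relative offsets)

# Right after the first upload: local mtime 0, S3 LastModified 10.
print(current_should_download(100, 0, 100, 10))      # False: no needless re-sync
print(fuzzy_should_download(100, 0, 100, 10, FUZZ))  # False: within the window

# Out-of-band update an hour later, same size.
print(current_should_download(100, 0, 100, 3600))      # False: update is missed
print(fuzzy_should_download(100, 0, 100, 3600, FUZZ))  # True: update is picked up
```

The fuzz window keeps the first case quiet while letting the second through. An out-of-band update that lands inside the window with an unchanged size would still be missed.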

Metadata

Labels

feature-request: A feature should be added or improved.
needs-review: This issue or pull request needs review from a core team member.
p2: This is a standard priority issue
s3

