Importing processes too many files #292

@ricardog

Describe the bug

Yesterday I ran garmindb_cli.py --all --download --import --latest. When I ran it again today (one day later), I got the following output:

Processing sleep data
100%|██████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 22.78files/s]
Processing rhr data
100%|██████████████████████████████████████████████████████████████████████████████| 1634/1634 [00:00<00:00, 2082.99files/s]
Processing activities tcx data
100%|██████████████████████████████████████████████████████████████████████████████████| 443/443 [00:44<00:00,  9.95files/s]
Processing latest activities summary data from /Users/me/HealthData/FitFiles/Activities
100%|███████████████████████████████████████████████████████████████████████████████| 1767/1767 [00:02<00:00, 691.66files/s]
Processing activities detail data
100%|███████████████████████████████████████████████████████████████████████████████| 1767/1767 [00:02<00:00, 691.14files/s]
Processing [<FileType.activity: 4>] FIT data from /Users/me/HealthData/FitFiles/Activities
100%|████████████████████████████████████████████████████████████████████████████████| 1324/1324 [12:17<00:00,  1.80files/s]

I wouldn't expect the script to reprocess so many activity files. The FIT step alone took over 12 minutes for 1324 files.

To Reproduce
Run garmindb_cli.py --all --download --import --latest on consecutive days (or even just back-to-back).

Expected behavior
I would expect only the "latest" files to be processed.

Additional context
I think this is due to the "latest" logic in idbutils.file_processor.py:

    @classmethod
    def dir_to_files(cls, input_dir, file_regex, latest=False, recursive=False):
        """Search a directory, possibly recursively, and return a list of all files matching a regex."""
        file_names = []
        latest_threshold = datetime.datetime.now() - datetime.timedelta(days=1)

This looks for files created in the last day. So if I did the first import yesterday, the script keeps treating all of those files as "latest" and reprocesses them on every run until 24 hours have passed.
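To illustrate the effect of that one-day threshold, here is a minimal sketch. It is not the idbutils implementation: the helper files_in_latest_window, the file names, and the timestamps are all hypothetical, and it assumes the check is a simple "timestamp newer than now minus one day" comparison as the snippet above suggests.

```python
import datetime


def files_in_latest_window(file_times, now, window=datetime.timedelta(days=1)):
    """Hypothetical helper: return names whose timestamp is newer than now - window."""
    latest_threshold = now - window
    return [name for name, ts in file_times.items() if ts > latest_threshold]


# Hypothetical timestamps: a bulk first import yesterday at 20:00,
# plus one genuinely new file downloaded today.
first_run = datetime.datetime(2024, 1, 1, 20, 0)
file_times = {f"activity_{i}.fit": first_run for i in range(3)}
file_times["activity_new.fit"] = datetime.datetime(2024, 1, 2, 9, 0)

# A run 14 hours after the first import still selects every file.
second_run = datetime.datetime(2024, 1, 2, 10, 0)
selected_second = files_in_latest_window(file_times, second_run)

# Only once more than 24 hours have passed does the window shrink
# to the file that is actually new.
third_run = datetime.datetime(2024, 1, 2, 21, 0)
selected_third = files_in_latest_window(file_times, third_run)

print(len(selected_second))
print(selected_third)
```

In this sketch the second run picks up all four files, while the third run picks up only activity_new.fit, which matches the behavior described above: the window is relative to "now", not to the previous successful import.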
