16 changes: 15 additions & 1 deletion src/OSmOSE/utils/core_utils.py
@@ -794,7 +794,21 @@ def add_entry_for_APLOSE(path: str, file: str, info: pd.DataFrame):

     if dataset_csv.exists():
         meta = pd.read_csv(dataset_csv)
-        meta = pd.concat([meta, info], ignore_index=True).sort_values(
+        info.spectro_duration = info.spectro_duration.map(int)
+        info.dataset_sr = info.dataset_sr.map(int)
+        info.path = info.path.map(str)
+        meta = pd.concat(
+            (
+                meta[
+                    (meta.path != str(info.iloc[0].path))
Member

I'm wondering whether the purpose here is rather to handle multiple spectro_duration × dataset_sr configurations for the same dataset (same name AND path).

Contributor Author

@Gautzilla Gautzilla Apr 15, 2025

Yup, that's the point, but that's already what the changes do (unless I misunderstood your statement! 🥸):

pd.concat(
    (
        meta[  # Keep every existing row that differs in any of:
            (meta.path != str(info.iloc[0].path))  # path (includes name)
            | (meta.spectro_duration != info.iloc[0].spectro_duration)  # OR duration
            | (meta.dataset_sr != info.iloc[0].dataset_sr)  # OR sample rate
        ],
        info,  # Add the current dataset
    )
)

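This way, if one creates a new dataset that differs only in sample rate, it is added alongside the previous one: (meta.dataset_sr != info.iloc[0].dataset_sr) evaluates to True for the existing row, so the filter keeps it. Only a row that matches on path, duration and sample rate is dropped and replaced by info.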
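
For anyone who wants to check the behaviour outside the PR, here is a minimal standalone sketch of the same filter-then-concat logic. The helper name `upsert_entry` and the sample values (`/data/site_a`, 48000, 96000) are made up for illustration and are not part of OSmOSE; only the three-key filter mirrors the diff.

```python
import pandas as pd


def upsert_entry(meta: pd.DataFrame, info: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop the row of `meta` that matches `info`
    on path, spectro_duration and dataset_sr, then append `info`."""
    new = info.iloc[0]
    # True for every existing row that differs from `info` on at least one key.
    differs = (
        (meta["path"] != str(new["path"]))
        | (meta["spectro_duration"] != new["spectro_duration"])
        | (meta["dataset_sr"] != new["dataset_sr"])
    )
    merged = pd.concat((meta[differs], info), ignore_index=True)
    return merged.sort_values(by=["path", "dataset"], ascending=True)


# Illustrative metadata with a single existing entry.
meta = pd.DataFrame(
    {
        "path": ["/data/site_a"],
        "dataset": ["site_a"],
        "spectro_duration": [10],
        "dataset_sr": [48000],
    }
)

same_config = meta.copy()                        # identical on all three keys
new_sample_rate = meta.assign(dataset_sr=96000)  # differs only in sample rate

replaced = upsert_entry(meta, same_config)
appended = upsert_entry(meta, new_sample_rate)
```

With these inputs, `replaced` should hold a single row (the old entry is swapped for the new one) and `appended` should hold two rows, one per sample rate.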

+                    | (meta.spectro_duration != info.iloc[0].spectro_duration)
+                    | (meta.dataset_sr != info.iloc[0].dataset_sr)
+                ],
+                info,
+            ),
+            ignore_index=True,
+        )
+        meta = meta.sort_values(
             by=["path", "dataset"],
             ascending=True,
         )