feat(dataset-versioning): support running versioned experiments #1517

marliessophie · 2026-02-09T14:29:30Z

Important

This PR adds support for running experiments on versioned datasets by propagating a version timestamp through get_dataset() into DatasetClient and using it during experiment execution, verified by a new integration test.

Behavior:
- Adds optional version timestamp to get_dataset() in client.py, propagating it to DatasetClient.
- Uses datasetVersion in dataset_run_items.create(...) during experiment execution in client.py.
- Ensures only historical items are processed in versioned experiments.
Classes/Functions:
- Updates DatasetClient in datasets.py to store and use version timestamp.
- Modifies run_experiment() in client.py to accept _dataset_version and pass it to _run_experiment_async().
Tests:
- Adds test_run_experiment_with_versioned_dataset() in test_datasets.py to verify versioned dataset behavior.
- Fixes import placement and redundancy in test_datasets.py.
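
The propagation listed above can be sketched with stand-in classes. This is a minimal illustration, not the SDK's code: `FakeLangfuse`, `FakeDatasetClient`, and the recorded payload dict are hypothetical. Only the names `get_dataset`, `run_experiment`, `_dataset_version`, and `datasetVersion` come from this PR.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


@dataclass
class FakeLangfuse:
    """Hypothetical stand-in for the main client; records run-item payloads."""

    created_run_items: List[Dict[str, Any]] = field(default_factory=list)

    def get_dataset(
        self, name: str, version: Optional[datetime] = None
    ) -> "FakeDatasetClient":
        # A real client would fetch items from the API; fixed IDs are used here.
        return FakeDatasetClient(
            client=self, name=name, item_ids=["item-1", "item-2"], version=version
        )

    def run_experiment(
        self,
        *,
        run_name: str,
        item_ids: List[str],
        _dataset_version: Optional[datetime] = None,
    ) -> None:
        for item_id in item_ids:
            payload: Dict[str, Any] = {"runName": run_name, "datasetItemId": item_id}
            # Attach the version only when the dataset was fetched with one.
            if _dataset_version is not None:
                payload["datasetVersion"] = _dataset_version.isoformat()
            self.created_run_items.append(payload)


@dataclass
class FakeDatasetClient:
    """Hypothetical dataset wrapper that stores the version it was fetched with."""

    client: FakeLangfuse
    name: str
    item_ids: List[str]
    version: Optional[datetime] = None

    def run_experiment(self, *, run_name: str) -> None:
        # Forward the stored version through the private parameter.
        self.client.run_experiment(
            run_name=run_name, item_ids=self.item_ids, _dataset_version=self.version
        )


ts = datetime(2026, 2, 1, tzinfo=timezone.utc)
lf = FakeLangfuse()
lf.get_dataset("demo", version=ts).run_experiment(run_name="run-a")
```

In this sketch the caller never passes the version to `run_experiment` directly; the dataset wrapper carries it, which matches the private `_dataset_version` parameter described above.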

This description was created by ellipsis-dev bot for 7de4f28. It will automatically update as commits are pushed.

Disclaimer: Experimental PR review

Greptile Overview

Greptile Summary

This PR threads an optional dataset “version” timestamp through get_dataset() into DatasetClient, and then forwards that timestamp into experiment execution so that dataset_run_items.create(...) is called with datasetVersion when running experiments.

The change mainly touches the client-facing dataset wrapper (langfuse/_client/datasets.py) and the experiment runner in the main client (langfuse/_client/client.py), plus adds an integration test to verify that a versioned dataset only runs its historical items.
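
To make "only runs its historical items" concrete, here is a small sketch of the selection a version timestamp implies. It assumes an item belongs to a version if it was created at or before the timestamp; the `Item` type, its `created_at` field, and `items_at_version` are hypothetical, and the actual filtering happens server-side and may follow different rules.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class Item:
    """Hypothetical dataset item with a creation timestamp."""

    id: str
    created_at: datetime


def items_at_version(items: List[Item], version: datetime) -> List[Item]:
    """Return the items that already existed at `version` (assumed semantics)."""
    return [item for item in items if item.created_at <= version]


t0 = datetime(2026, 2, 9, 12, 0, tzinfo=timezone.utc)
items = [
    Item("old-1", t0 - timedelta(minutes=5)),
    Item("old-2", t0 - timedelta(minutes=1)),
    Item("new-1", t0 + timedelta(minutes=1)),  # added after the version cutoff
]
historical = items_at_version(items, t0)
print([item.id for item in historical])
```

An experiment run against version `t0` would then process `old-1` and `old-2` and skip `new-1`.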

Confidence Score: 4/5

This PR looks safe to merge after addressing a small test hygiene issue.
Core logic is a straightforward propagation of a version timestamp into dataset run item creation, with minimal surface area and no obvious runtime-breaking changes in the SDK paths reviewed. The main issue found is a repo-rule violation (imports inside a test function) plus redundant imports, which should be fixed for consistency.
tests/test_datasets.py

Important Files Changed

Filename	Overview
langfuse/_client/client.py	Propagates an optional dataset version through get_dataset() and experiment run creation by adding DatasetClient.version and passing datasetVersion when creating dataset_run_items.
langfuse/_client/datasets.py	Extends DatasetClient to store an optional version timestamp and forwards it to Langfuse.run_experiment via private _dataset_version parameter.
tests/test_datasets.py	Adds a new integration test for running experiments on a versioned dataset, but introduces in-function imports that violate the repo import rule and duplicates existing imports.

Sequence Diagram

sequenceDiagram
    participant U as User code
    participant DC as DatasetClient
    participant LC as Langfuse client
    participant API as Langfuse API

    U->>LC: get_dataset(name, version=ts)
    LC->>API: datasets.get(dataset_name)
    loop paginate items
        LC->>API: dataset_items.list(dataset_name, page, limit, version=ts)
        API-->>LC: page of items
    end
    LC-->>U: DatasetClient(items, version=ts)

    U->>DC: run_experiment(...)
    DC->>LC: run_experiment(..., _dataset_version=DC.version)
    LC->>LC: _run_experiment_async(..., dataset_version=_dataset_version)
    loop each dataset item
        LC->>API: dataset_run_items.create(runName, datasetItemId, traceId, observationId, datasetVersion=ts)
        API-->>LC: dataset_run_item (dataset_run_id)
    end
    LC-->>U: ExperimentResult(item_results, dataset_run_id/url)
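
The pagination loop in the first half of the diagram can be sketched as follows. This is an illustration under stated assumptions: `fetch_all_items`, `fake_list_page`, and the `items`/`total_pages` response keys are hypothetical stand-ins, not the SDK's API. The point shown is that the same `version` value is sent with every page request.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional


def fetch_all_items(
    list_page: Callable[..., Dict[str, Any]],
    dataset_name: str,
    version: Optional[datetime] = None,
    limit: int = 2,
) -> List[str]:
    """Collect items across pages, forwarding `version` on each request."""
    items: List[str] = []
    page = 1
    while True:
        response = list_page(
            dataset_name=dataset_name, page=page, limit=limit, version=version
        )
        items.extend(response["items"])
        if page >= response["total_pages"]:
            break
        page += 1
    return items


seen_versions: List[Optional[datetime]] = []


def fake_list_page(*, dataset_name: str, page: int, limit: int, version: Optional[datetime]) -> Dict[str, Any]:
    """Hypothetical API stub that serves three item IDs in pages."""
    seen_versions.append(version)
    all_ids = ["a", "b", "c"]
    start = (page - 1) * limit
    total_pages = max(1, -(-len(all_ids) // limit))  # ceiling division
    return {"items": all_ids[start : start + limit], "total_pages": total_pages}


ts = datetime(2026, 2, 1, tzinfo=timezone.utc)
all_items = fetch_all_items(fake_list_page, "demo", version=ts)
```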


Context used:

Rule from dashboard - Move imports to the top of the module instead of placing them within functions or methods. (source)
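
As a short illustration of the rule, the sketch below keeps its imports at module level. The helper `version_cutoff` is hypothetical and is not taken from the PR's test file.

```python
# Imports live at the top of the module, as the rule requires.
from datetime import datetime, timezone


def version_cutoff() -> datetime:
    """Return a timezone-aware timestamp usable as a dataset version."""
    # Avoided pattern: placing `from datetime import datetime` inside this
    # function, which hides the dependency and repeats the import per test.
    return datetime.now(timezone.utc)


cutoff = version_cutoff()
```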

greptile-apps

3 files reviewed, no comments


feat(dataset-versioning): support running versioned experiments

96ec626

greptile-apps bot reviewed Feb 9, 2026


marliessophie added 2 commits February 9, 2026 15:35

test: add dataset item creation in versioned experiment test

88bd0c4

test: fix

7de4f28

marliessophie merged commit 317e705 into main Feb 9, 2026
7 of 12 checks passed

marliessophie deleted the marlies/lfe-experiment-versioning branch February 9, 2026 15:27
