Relax strict 1:1 verification in DatasetBatchManager.update_records (Support 1:N generation) #265

@JaoMarcos

Description

Priority Level

Medium (Nice to have)

Is your feature request related to a problem? Please describe.

I'm working on a plugin and ran into an issue with the strict verification in the DatasetBatchManager.update_records function. Currently, it enforces that the number of incoming records matches the current buffer size.

The use case: I need to support cases where a single input record produces multiple output records (1:N), essentially "exploding" the dataframe.

The main driver for this is cost and efficiency with LLMs. For complex prompts with large input contexts, if I need multiple variations (e.g., "Generate 5 variations of X"), it is significantly cheaper and faster to ask the model to generate all 5 in a single API call rather than making 5 separate calls with the same large input.

Generating them in a single pass also often improves quality/variance, as the model has "in-context" awareness of the other variations it is generating, preventing duplicates.
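To illustrate the 1:N "explosion" described above, here is a minimal sketch using pandas. The column names and the parsed variations are hypothetical stand-ins for what a single LLM call might return:

```python
import pandas as pd

# One input record whose single LLM reply was parsed into five variations
# (column names "topic" / "variation" are illustrative, not from the library).
df = pd.DataFrame({"topic": ["photosynthesis"]})
df["variation"] = [["v1", "v2", "v3", "v4", "v5"]]

# Explode the list column: 1 input row becomes 5 output rows.
exploded = df.explode("variation", ignore_index=True)
print(len(exploded))  # 5
```

It is this 1-row-in, 5-rows-out result that the current strict length check in update_records rejects.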

Describe the solution you'd like

Could a flag (e.g., strict_mapping=False) be added to update_records that relaxes the 1:1 length check, so the number of incoming records may differ from the current buffer size?
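A minimal sketch of what the proposed flag could look like. This is not the real DatasetBatchManager (its actual internals and signatures are assumptions here); it only shows the intended behavior of a strict_mapping parameter:

```python
from typing import Any


class DatasetBatchManager:
    """Hypothetical sketch; the real class has more state and methods."""

    def __init__(self) -> None:
        self._buffer: list[dict[str, Any]] = []

    def load(self, records: list[dict[str, Any]]) -> None:
        self._buffer = list(records)

    def update_records(
        self,
        records: list[dict[str, Any]],
        strict_mapping: bool = True,
    ) -> None:
        # Current behavior: enforce a 1:1 mapping between the incoming
        # records and the buffered records.
        if strict_mapping and len(records) != len(self._buffer):
            raise ValueError(
                f"Expected {len(self._buffer)} records, got {len(records)}"
            )
        # With strict_mapping=False, a 1:N explosion (or N:1 reduction)
        # is accepted and simply replaces the buffer.
        self._buffer = list(records)


mgr = DatasetBatchManager()
mgr.load([{"prompt": "Generate 5 variations of X"}])
variations = [{"variation": f"v{i}"} for i in range(1, 6)]
mgr.update_records(variations, strict_mapping=False)  # buffer grows 1 -> 5
```

Defaulting strict_mapping to True keeps the existing verification for all current callers, so the change would be backward compatible.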

Describe alternatives you've considered

No response

Additional context

No response

Metadata


Labels

enhancement (New feature or request)
