
Vllm example #453 (Draft)

ZhenXu587 wants to merge 4 commits into NVIDIA-NeMo:main from ZhenXu587:vllm_example

Conversation

@ZhenXu587

No description provided.

@ZhenXu587 ZhenXu587 requested a review from a team as a code owner March 23, 2026 18:11
@github-actions
Contributor

Thank you for your submission! We ask that you sign our Developer Certificate of Origin before we can accept your contribution. You can sign the DCO by adding a comment below using this text:


I have read the DCO document and I hereby sign the DCO.


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the DCO Assistant Lite bot.

@greptile-apps
Contributor

greptile-apps bot commented Mar 23, 2026

Greptile Summary

This PR introduces Elorian, a new multimodal VLM evaluation pipeline built on top of DataDesigner. It adds a full elorian/ package that orchestrates image loading, parallel model inference (Claude + GPT-4o by default), and LLM-as-a-judge scoring, all wired into DataDesigner's DAG execution engine.

Key issues found:

  • judges.py — build_prompt will KeyError on custom templates (str.format() fails if the template contains any {...} other than {responses_block}; e.g., JSON in the prompt). Should use .replace() instead.
  • judges.py — docstring example breaks both alias and model_id rules (alias="my-judge" uses a hyphen and model_id="anthropic/claude-sonnet-4-20250514" includes the provider prefix, both of which cause runtime failures).
  • run_example.py — commented-out judge block has the same two bugs (alias="judge-gpt4o" and model_id="openai/gpt-4o"); a user uncommenting this as-is will hit broken column names and a likely doubly-prefixed LiteLLM routing error.
  • pipeline.py — global side effects: the ImageContext monkey-patch is applied at import time (affecting all users of ImageContext in the process), and os.environ["DATA_DESIGNER_MODEL_BACKEND"] is permanently mutated without cleanup.
  • image_utils.py docstring advertises URL-loading support that isn't implemented.
  • Minor: describe() unconditionally appends ... even for prompts shorter than 80 characters; unused dd import in run_example.py.
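To make the double-prefix failure mode concrete, here is a hypothetical stand-in for how a LiteLLM-style bridge joins the provider and model id; `litellm_model_string` is illustrative only, not the actual bridge code:

```python
def litellm_model_string(provider: str, model_id: str) -> str:
    """Join provider and model id the way a LiteLLM-style bridge might."""
    return f"{provider}/{model_id}"

# A bare model_id routes correctly:
print(litellm_model_string("anthropic", "claude-sonnet-4-20250514"))
# anthropic/claude-sonnet-4-20250514

# A provider-prefixed model_id gets prefixed twice and breaks routing:
print(litellm_model_string("anthropic", "anthropic/claude-sonnet-4-20250514"))
# anthropic/anthropic/claude-sonnet-4-20250514
```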

Confidence Score: 3/5

  • The default happy-path works, but two P1 logic bugs (broken docstring examples and a str.format() hazard in build_prompt) will catch users who follow the examples or use custom judge templates.
  • The default example runs end-to-end without triggering any of the bugs. However, the build_prompt str.format() issue is a latent runtime failure for any user who passes a custom prompt_template with curly braces, and the misleading docstring/example aliases with hyphens and provider-prefixed model IDs will cause API errors for users who follow them. These should be fixed before the PR is merged, since the example code is the primary onboarding surface.
  • elorian/judges.py (docstring example and build_prompt logic) and elorian/run_example.py (commented judge block) need the most attention before merge.

Important Files Changed

| Filename | Overview |
| --- | --- |
| `elorian/judges.py` | LLM judge registry; contains two P1 issues: the docstring example uses a hyphenated alias and a provider-prefixed `model_id`, and `build_prompt` uses `str.format()`, which raises `KeyError` on custom templates with curly braces |
| `elorian/pipeline.py` | Core pipeline orchestrator; a global monkey-patch on `ImageContext` applied at import time and an unconditional `os.environ` mutation in `_get_data_designer()` are process-wide side effects; also a minor `describe()` truncation bug |
| `elorian/run_example.py` | Working example script; the commented-out judge block contains a hyphenated alias and a doubly-prefixed `model_id` that would fail at runtime if uncommented; also has an unused import |
| `elorian/models.py` | VLM model registry and provider management; clean implementation with clear separation between `ModelSpec`, `ModelRegistry`, and the `PROVIDERS` dict |
| `elorian/image_utils.py` | Image loading utilities; the module docstring claims URL-loading support but no such function is implemented; otherwise straightforward PIL/HuggingFace loaders |
| `elorian/__init__.py` | Package init with clean `__all__` exports; no issues |
| `elorian/README.md` | Comprehensive documentation for the Elorian pipeline; well structured with good examples; consistently enforces the underscore alias rule |

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Pipeline as MultimodalEvalPipeline
    participant ImgUtils as image_utils
    participant ModelReg as ModelRegistry
    participant JudgeReg as JudgeRegistry
    participant DD as DataDesigner

    User->>ImgUtils: load_images_from_hf_dataset()
    ImgUtils-->>User: seed_df (uuid, base64_image, label)

    User->>Pipeline: __init__(seed_df, model_registry, judge_registry)
    User->>Pipeline: preview(num_records)

    Pipeline->>ModelReg: to_model_configs()
    ModelReg-->>Pipeline: [ModelConfig(claude_vision), ModelConfig(gpt4o_vision)]

    Pipeline->>JudgeReg: specs
    JudgeReg-->>Pipeline: [JudgeSpec(judge_claude)]

    Pipeline->>Pipeline: _build_config()
    Note over Pipeline: Adds seed dataset<br/>Adds response_{alias} columns (LLMTextColumnConfig)<br/>Adds eval_{alias} columns (LLMJudgeColumnConfig)

    Pipeline->>Pipeline: _get_data_designer()
    Note over Pipeline: Sets DATA_DESIGNER_MODEL_BACKEND=litellm_bridge

    Pipeline->>DD: validate(builder)
    Pipeline->>DD: preview(builder, num_records)
    DD-->>User: PreviewResults (response_claude_vision, response_gpt4o_vision, eval_judge_claude)
```
Reviews (1): Last reviewed commit: "Merge branch 'main' into vllm_example"

Comment on lines +9 to +13
```python
registry.register(JudgeSpec(
    alias="my-judge",
    model_id="anthropic/claude-sonnet-4-20250514",
    scores=[...],
    prompt_template="...",
```

**P1: Docstring example contradicts usage rules on two counts**

The docstring example at the top of the file violates the rules established elsewhere:

1. `alias="my-judge"` uses a hyphen. The README and inline comments throughout the codebase explicitly state that aliases **must use underscores**, because hyphens break Jinja2 template rendering (e.g., `{{ response_my-judge }}` is parsed as `{{ response_my }} - judge`). This example will silently produce broken column references at runtime.

2. `model_id="anthropic/claude-sonnet-4-20250514"` includes the provider prefix. However, `JudgeSpec.to_model_config()` already passes `provider=self.provider` to `dd.ModelConfig`, so including the prefix in `model_id` would result in a doubly-prefixed LiteLLM call (`anthropic/anthropic/claude-sonnet-4-20250514`), causing an API routing failure. The actual registered default (`judge_claude`) correctly uses `model_id="claude-sonnet-4-20250514"` without a prefix.

Suggested change
```suggestion
    registry = JudgeRegistry()
    registry.register(JudgeSpec(
        alias="my_judge",
        model_id="claude-sonnet-4-20250514",
        scores=[...],
        prompt_template="...",
    ))
```
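The underscore rule can also be enforced programmatically: a valid alias is exactly a valid Python identifier, which hyphens are not. A hypothetical guard (`validate_alias` is not part of the PR):

```python
def validate_alias(alias: str) -> str:
    """Reject aliases that would break Jinja2 column references like {{ response_<alias> }}."""
    if not alias.isidentifier():
        raise ValueError(
            f"alias {alias!r} must be a valid identifier (use underscores, not hyphens)"
        )
    return alias

print(validate_alias("my_judge"))  # my_judge
try:
    validate_alias("my-judge")
except ValueError as e:
    print(e)
```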

Comment on lines +117 to +132
```python
def build_prompt(self, model_aliases: list[str]) -> str:
    """Build the judge prompt with response column references.

    Args:
        model_aliases: List of model aliases whose response columns
            will be referenced in the prompt via Jinja2 {{ }} syntax.

    Returns:
        Prompt string with Jinja2 column references.
    """
    lines = []
    for alias in model_aliases:
        col_name = f"response_{alias}"
        lines.append(f"--- Response from {alias} ---\n{{{{ {col_name} }}}}")
    responses_block = "\n\n".join(lines)
    return self.prompt_template.format(responses_block=responses_block)
```

**P1: `str.format()` raises `KeyError` for custom templates containing curly braces**

`self.prompt_template.format(responses_block=responses_block)` will raise a `KeyError` for any custom `prompt_template` that contains `{...}` placeholders other than `{responses_block}` — for example a template that includes a JSON snippet like `{"key": "value"}` or a named slot like `{task_description}`. This is a common pattern in judge prompts.

The fix is to use a single `.replace()` call, which is immune to stray braces:

```python
def build_prompt(self, model_aliases: list[str]) -> str:
    lines = []
    for alias in model_aliases:
        col_name = f"response_{alias}"
        lines.append(f"--- Response from {alias} ---\n{{{{ {col_name} }}}}")
    responses_block = "\n\n".join(lines)
    return self.prompt_template.replace("{responses_block}", responses_block)
```
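The hazard is easy to reproduce in isolation. The template below is illustrative, not taken from the PR:

```python
# A judge template containing JSON braces alongside the placeholder.
template = 'Return JSON like {"score": 1}.\n\n{responses_block}'

# str.format() tries to interpret {"score": 1} as a replacement field and fails:
try:
    template.format(responses_block="--- Response from claude_vision ---")
except KeyError as e:
    print("str.format() raised KeyError on field:", e)

# .replace() substitutes only the literal placeholder and leaves other braces alone:
result = template.replace("{responses_block}", "--- Response from claude_vision ---")
print(result)
```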

Comment on lines +51 to +52
```python
# Apply the patch once at import time.
ImageContext._format_base64_context = _patched_format_base64_context  # type: ignore[assignment]
```

**P2: Global monkey-patch applied at module import time**

`ImageContext._format_base64_context` is patched unconditionally the moment `elorian.pipeline` is imported. Any other code in the same process that imports `data_designer.config.models.ImageContext` — including upstream DataDesigner code and other tests — will silently receive the patched behavior. If the upstream bug is ever fixed, this patch will silently override the fix.

Consider making the patch opt-in (e.g., apply it only inside `_get_data_designer()`) or, better, contributing the fix upstream and removing the patch entirely once it lands.


Comment on lines +143 to +151
```python
def _get_data_designer(self) -> DataDesigner:
    """Create a DataDesigner interface instance with the required providers."""
    # Use LiteLLM bridge so provider_type is used as the routing prefix.
    os.environ["DATA_DESIGNER_MODEL_BACKEND"] = "litellm_bridge"

    return DataDesigner(
        artifact_path=self.artifact_path,
        model_providers=self._collect_providers(),
    )
```

**P2: `os.environ` mutation is a permanent, process-wide side effect**

Setting `os.environ["DATA_DESIGNER_MODEL_BACKEND"]` inside `_get_data_designer()` permanently overrides the environment for the entire process — including any other code that creates a `DataDesigner` instance later, and any subprocess spawned after the call. This can cause hard-to-debug behaviour when the pipeline is used as a library.

Consider passing the backend configuration through the `DataDesigner` constructor instead, or at least restoring the original value:

```python
_prev = os.environ.get("DATA_DESIGNER_MODEL_BACKEND")
os.environ["DATA_DESIGNER_MODEL_BACKEND"] = "litellm_bridge"
try:
    return DataDesigner(...)
finally:
    if _prev is None:
        os.environ.pop("DATA_DESIGNER_MODEL_BACKEND", None)
    else:
        os.environ["DATA_DESIGNER_MODEL_BACKEND"] = _prev
```
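The save-and-restore logic generalizes to a small context manager; `temp_env` below is a sketch, not part of the PR:

```python
import os
from contextlib import contextmanager

@contextmanager
def temp_env(key: str, value: str):
    """Set an environment variable for the duration of the block, then restore it."""
    prev = os.environ.get(key)
    os.environ[key] = value
    try:
        yield
    finally:
        if prev is None:
            os.environ.pop(key, None)
        else:
            os.environ[key] = prev

with temp_env("DATA_DESIGNER_MODEL_BACKEND", "litellm_bridge"):
    print(os.environ["DATA_DESIGNER_MODEL_BACKEND"])  # litellm_bridge
# On exit, the variable is restored to whatever it was before (or unset again).
```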

Comment on lines +192 to +194
```python
        lines = ["Multimodal Eval Pipeline Configuration", "=" * 40]
        lines.append(f"\nImage column: {self.image_column}")
        lines.append(f"Prompt: {self.prompt[:80]}...")
```

**P2: `...` is always appended even when the prompt is shorter than 80 characters**

`self.prompt[:80] + "..."` unconditionally appends the ellipsis. For a short custom prompt (e.g., `"Describe this image."`), the output reads `Prompt: Describe this image....`, which is misleading.

Suggested change
```suggestion
        lines.append(f"Prompt: {self.prompt[:80]}{'...' if len(self.prompt) > 80 else ''}")
```
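The conditional ellipsis can live in a tiny helper; `truncate` is hypothetical, not a function in the PR:

```python
def truncate(text: str, limit: int = 80) -> str:
    """Truncate with an ellipsis only when the text actually exceeds the limit."""
    return text[:limit] + ("..." if len(text) > limit else "")

print(truncate("Describe this image."))  # Describe this image.
print(truncate("x" * 100))               # first 80 chars followed by ...
```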

Comment on lines +1 to +8
```python
"""Utilities for loading and processing images into DataDesigner seed datasets.

Supports loading from:
- A local directory of image files
- A HuggingFace dataset with an image column
- A list of PIL images
- A list of image URLs
"""
```

**P2: Module docstring lists URL loading as supported, but no such function exists**

The module docstring lists `- A list of image URLs` as a supported source, but there is no `load_images_from_urls()` function in the file. Either remove the bullet point from the docstring, or add a note that URL loading is not yet implemented, to avoid confusion for users reading the docs.

Suggested change
```suggestion
"""Utilities for loading and processing images into DataDesigner seed datasets.

Supports loading from:
- A local directory of image files
- A HuggingFace dataset with an image column
- A list of PIL images
"""
```

Comment on lines +60 to +66
```python
    # judge_registry.register(
    #     JudgeSpec(
    #         alias="judge-gpt4o",
    #         model_id="openai/gpt-4o",
    #         description="GPT-4o as LLM judge",
    #     )
    # )
```

**P1: Commented example has a hyphenated alias and a doubly-prefixed `model_id`**

If a user uncomments this block as-is, two bugs will surface:

1. `alias="judge-gpt4o"` — the hyphen violates the underscore-only rule and will produce a broken Jinja2 column reference (`{{ eval_judge-gpt4o }}`).
2. `model_id="openai/gpt-4o"` — this includes the provider prefix. Because `JudgeSpec.to_model_config()` also sets `provider="openai"` (the default), LiteLLM will receive `openai/gpt-4o` with `provider=openai`, which may be interpreted as `openai/openai/gpt-4o` depending on routing, causing an API error.
Suggested change
```suggestion
    # judge_registry.register(
    #     JudgeSpec(
    #         alias="judge_gpt4o",
    #         model_id="gpt-4o",
    #         provider="openai",
    #         description="GPT-4o as LLM judge",
    #     )
    # )
```


```python
from __future__ import annotations

import data_designer.config as dd
```

**P2: Unused import**

`import data_designer.config as dd` is only referenced inside the commented-out block (for `dd.ModelProvider`). As long as that block remains commented out, `dd` is never used and this import is dead code.

Suggested change: delete the `import data_designer.config as dd` line, restoring it only if the commented judge block that uses `dd.ModelProvider` is re-enabled.

@ZhenXu587 ZhenXu587 marked this pull request as draft March 23, 2026 18:55
