Skip to content

Conversation

@JayKuo1
Copy link

@JayKuo1 JayKuo1 commented Jan 26, 2026

See title.

@gemini-code-assist
Copy link

Summary of Changes

Hello @JayKuo1, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a foundational architectural improvement by implementing lazy loading for agents and datasets, which enhances the system's flexibility and performance. A major new component, the EarthLink agent, is integrated to streamline geoscience experiment planning. This agent intelligently assesses data availability, formulates detailed experimental plans, and refines them based on data checks, utilizing a suite of specialized tools for climate data interaction and web research.

Highlights

  • Lazy Loading Implementation: Introduced lazy loading for both agents and datasets, significantly improving modularity and reducing initial import overhead by only loading components when they are first accessed. This is achieved through dynamic __getattr__ and __dir__ methods in the scieval.agents and scieval.dataset packages.
  • New EarthLink Agent: Added a new 'EarthLink' agent designed for geoscience experiment planning. This agent includes sophisticated functionalities for data availability checks, experimental plan generation, and iterative plan refinement, leveraging specialized tools for CMIP6 and observational climate data, as well as web search capabilities.
  • Agent and Dataset Refactoring: Refactored agent_runner.py to use a generic create_agent factory function, enabling dynamic agent instantiation and robust error handling for missing dependencies. The dataset module (scieval.dataset/__init__.py) was also refactored to support lazy loading of various dataset classes.
  • New Utility Modules: Added several new utility modules under scieval/agents/EarthLink/utils for common functionalities like colored text logging, base64 encoding, caching (hashable_cache), code block extraction, string truncation, and custom JSON encoding. A new embedding utility was also introduced for variable description searches.
  • Data Management for EarthLink: Integrated comprehensive data management for the EarthLink agent, including tools to list and verify CMIP6 and observational data files, models, variables, and experiments. This involves downloading and loading various JSON data files related to climate science datasets.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces lazy loading for datasets and agents, which is a significant improvement for managing dependencies and reducing initial load times. It also adds a new EarthLink agent and its associated tools, configuration, and data handling. The lazy loading mechanism is well-implemented using __getattr__ and a registry. Several new files are added for the EarthLink agent, including data information, tools for CMIP6 and observational data, and utility functions. The .gitignore file is updated to include variables_embedding.jsonl for the new agent. Overall, the changes enhance modularity and extensibility.

Comment on lines +139 to +140
tiktoken_enabled=False,
tiktoken_model_name="Qwen/Qwen3-Embedding-8B",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The tiktoken_model_name is hardcoded to "Qwen/Qwen3-Embedding-8B" when tiktoken_enabled is false. This overrides the CFG.EMBEDDING_MODEL if it's different and tiktoken_enabled is false. This could lead to unexpected behavior if the user intends to use a different HuggingFace model for embeddings while tiktoken_enabled is false. It might be better to use CFG.EMBEDDING_MODEL directly here or clarify the interaction.

Suggested change
tiktoken_enabled=False,
tiktoken_model_name="Qwen/Qwen3-Embedding-8B",
tiktoken_model_name=CFG.EMBEDDING_MODEL,

Comment on lines +166 to +167
elif self.mode == "full":
raise NotImplementedError("Full mode is not implemented yet.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The NotImplementedError for full mode is good for clarity. However, if full mode is not intended to be implemented in the near future, consider removing the full option from the Literal type hint and the mode parameter's default value to prevent users from attempting to use it.

if len(msg) > 0:
return "Error in parameters:\n" + "\n".join(msg) + "\nPlease check the parameters and try again." # You can use the file search tool or web search tool to find the related information."

# assert activity != "DCPP"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The commented-out assertion # assert activity != "DCPP" indicates a potential constraint or a known issue. If DCPP activity is truly unsupported or problematic, it should be handled explicitly with an error message rather than a commented-out assertion, or the reason for commenting it out should be documented.

Comment on lines +68 to +69
if ("pass" not in output):
raise ValueError("Output JSON must contain 'pass' and 'reason' keys.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The ValueError message Output JSON must contain 'pass' and 'reason' keys. is slightly misleading. The reason key is only required if pass is false. The current check if ("pass" not in output) covers the first part, but the second if (not output['pass']) and ("reason" not in output) is more specific. Consider rephrasing the first error message to reflect that pass is always required, and reason is conditional.

Suggested change
if ("pass" not in output):
raise ValueError("Output JSON must contain 'pass' and 'reason' keys.")
if "pass" not in output:
raise ValueError("Output JSON must contain 'pass' key.")

vlmeval/dataset/SciCode/eval/data/*.h5

# EarthLink file
variables_embedding.jsonl No newline at end of file

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The .gitignore entry variables_embedding.jsonl is added without a preceding newline, which can sometimes lead to issues with the last line not being properly ignored by some git clients or tools. It's a good practice to ensure a newline at the end of the file.

variables_embedding.jsonl

Comment on lines +109 to +110
_cache = LRUCache(maxsize=512)
_lock = threading.Lock()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The _cache and _lock are defined outside the hashable_cache decorator, making them global to the module. This means all functions decorated with hashable_cache will share the same cache and lock. If different decorated functions need independent caches, these should be defined within the decorator closure for each function.


from scieval.agents import create_agent, get_available_agents
from scieval.agents.records import EvalRecord, TrajectoryStore
from scieval.agents.smolagents import SmolAgentsAgent

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The import from scieval.agents.smolagents import SmolAgentsAgent is removed, but SmolAgentsAgent is still referenced as a default class name in _build_agent_from_config. While the new create_agent factory handles this, it might be cleaner to explicitly import SmolAgentsAgent if it's intended to be a default, or ensure the default is a string that create_agent can resolve without needing a direct import here.

Comment on lines 406 to 407
assert np.all([x == TYPES[0] for x in TYPES]), (dataset_list, TYPES)
return TYPES[0]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The assert np.all([x == TYPES[0] for x in TYPES]) check relies on numpy which is imported but not explicitly used elsewhere in this function. If numpy is only used for this assertion, consider replacing it with a pure Python check (e.g., len(set(TYPES)) == 1) to reduce dependencies or clarify its usage.

Comment on lines 433 to 434
MODALITIES = [DATASET_MODALITY(dname) for dname in dataset_list]
assert np.all([x == MODALITIES[0] for x in MODALITIES]), (dataset_list, MODALITIES)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The assert np.all([x == MODALITIES[0] for x in MODALITIES]) check relies on numpy which is imported but not explicitly used elsewhere in this function. If numpy is only used for this assertion, consider replacing it with a pure Python check (e.g., len(set(MODALITIES)) == 1) to reduce dependencies or clarify its usage.

Comment on lines 509 to 513
if 'A' in data and 'B' in data:
if 'image' in data or 'image_path' in data:
warnings.warn(f'Will assume unsupported dataset {dataset_name} as a Custom MCQ dataset. ')
CustomMCQDataset = __getattr__('CustomMCQDataset')
return CustomMCQDataset(dataset=dataset_name, **kwargs)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The lazy loading for CustomMCQDataset, CustomTextMCQDataset, and CustomVQADataset uses __getattr__ directly. While this works, it's generally better to import these classes at the top of the file if they are frequently used or if the lazy loading mechanism is primarily for external modules. For these internal custom datasets, a direct import might be clearer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant