@nirav0999

  • Added XSCode generation

Copilot AI review requested due to automatic review settings August 7, 2025 22:46

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @nirav0999, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've introduced a new feature for generating and evaluating XSCode, a benchmark for secure code generation. This involves a multi-step pipeline to create high-quality, de-duplicated, and filtered prompts based on Common Weakness Enumerations (CWEs) and CodeGuru rules, ensuring they are suitable for testing overrefusal in CodeLLMs.

Highlights

  • XSCode Generation Pipeline: I've implemented a comprehensive pipeline to generate overrefusal prompts for secure code generation, leveraging Common Weakness Enumeration (CWE) and CodeGuru vulnerability information.
  • Automated Prompt Filtering: I've introduced pre- and post-filtering mechanisms to validate generated prompts against criteria such as asking for code, being benign, and having a secure solution, ensuring the quality of the dataset.
  • Data Deduplication: I've integrated a deduplication step using MinHash and LSH to remove redundant or highly similar prompts, enhancing the diversity and efficiency of the evaluation dataset.
  • Modular Architecture: I've structured the generation process into distinct, manageable Python modules (cwe2ovrf.py, dedup.py, pre_filter.py, post_filter.py) orchestrated by a central main.py script.
  • Project Restructuring: I've renamed and reorganized existing annotation utility scripts into a dedicated annotate_utils subdirectory, along with updating copyright information.
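
As a rough illustration of the MinHash and LSH deduplication step mentioned in the highlights, here is a minimal standard-library sketch. It is not the code in dedup.py: the shingle size, signature length, band count, similarity threshold, and every function name below are assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict

NUM_PERM = 64     # signature length (assumed value)
BANDS = 16        # number of LSH bands; rows per band = NUM_PERM // BANDS
THRESHOLD = 0.7   # estimated-Jaccard cutoff for calling two prompts duplicates (assumed)


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased, whitespace-split prompt."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash(items):
    """MinHash signature: for each seed, the minimum salted hash over the set."""
    sig = []
    for seed in range(NUM_PERM):
        salt = seed.to_bytes(8, "little")
        sig.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest(),
                "little",
            )
            for s in items
        ))
    return tuple(sig)


def dedup(prompts):
    """Keep the first prompt of every near-duplicate group, preserving order."""
    rows = NUM_PERM // BANDS
    buckets = defaultdict(list)  # (band index, band slice) -> indices of kept prompts
    kept, kept_sigs = [], []
    for prompt in prompts:
        sig = minhash(shingles(prompt))
        keys = [(band, sig[band * rows:(band + 1) * rows]) for band in range(BANDS)]
        # LSH step: only prompts sharing at least one band are compared.
        candidates = {i for key in keys for i in buckets.get(key, [])}
        is_dup = any(
            sum(x == y for x, y in zip(sig, kept_sigs[i])) / NUM_PERM >= THRESHOLD
            for i in candidates
        )
        if is_dup:
            continue
        idx = len(kept)
        kept.append(prompt)
        kept_sigs.append(sig)
        for key in keys:
            buckets[key].append(idx)
    return kept
```

Calling `dedup(prompts)` on a list of strings returns the surviving prompts in their original order. A production pipeline would more likely rely on an existing MinHash LSH library than on a hand-rolled version like this one.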

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e., a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

Copilot AI left a comment

Pull Request Overview

This PR introduces XSCode, a new evaluation benchmark for detecting overrefusal in secure code generation. The system generates evaluation prompts by creating benign code requests that may trigger unnecessary refusals from safety-aligned models.

  • Implements a complete pipeline for generating, filtering, and validating prompts based on CWE vulnerabilities
  • Adds pre- and post-filtering mechanisms using LLM judges to ensure prompt quality
  • Provides deduplication functionality to remove similar prompts from the dataset
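
The judge-based filtering described above can be sketched as follows. The three criteria come from the PR summary (asks for code, is benign, has a secure solution); the question wording, the `Verdict` class, `filter_prompts`, and the offline `keyword_judge` stand-in are all hypothetical and do not reflect the actual pre_filter.py or post_filter.py. In a real run, the judge callable would wrap an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Criteria named in the PR summary; the question wording is illustrative only.
CRITERIA: Dict[str, str] = {
    "asks_for_code": "Does the prompt ask for code?",
    "is_benign": "Is the request benign?",
    "has_secure_solution": "Can the request be fulfilled with secure code?",
}

# A judge maps (criterion name, prompt) to a yes/no verdict.
Judge = Callable[[str, str], bool]


@dataclass
class Verdict:
    prompt: str
    answers: Dict[str, bool]

    @property
    def passed(self) -> bool:
        # A prompt survives only if every criterion is satisfied.
        return all(self.answers.values())


def filter_prompts(prompts: Iterable[str], judge: Judge) -> List[Verdict]:
    """Ask the judge about every criterion and keep prompts that pass all of them."""
    verdicts = [
        Verdict(p, {name: judge(name, p) for name in CRITERIA})
        for p in prompts
    ]
    return [v for v in verdicts if v.passed]


def keyword_judge(criterion: str, prompt: str) -> bool:
    """Offline stand-in for an LLM judge so that the sketch runs without a model."""
    text = prompt.lower()
    if criterion == "asks_for_code":
        return any(w in text for w in ("write", "implement", "function", "script"))
    if criterion == "is_benign":
        return "steal" not in text
    return True  # the stand-in assumes a secure solution always exists
```

Because the judge is passed in as a plain callable, the same function could serve both the pre-annotation and post-annotation stages with different judges or criteria.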

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| eval/compile_xscode/main.py | Main orchestration script coordinating the entire XSCode generation pipeline |
| eval/compile_xscode/cwe2ovrf.py | Core generation logic that transforms CWE vulnerabilities into overrefusal test prompts |
| eval/compile_xscode/pre_filter.py | Pre-filtering validation using LLM judges to assess prompt quality before annotation |
| eval/compile_xscode/post_filter.py | Post-filtering validation applied after manual annotation to ensure final quality |
| eval/compile_xscode/dedup.py | Deduplication system using MinHash LSH to remove similar prompts |
| eval/compile_xscode/annotate_utils/*.py | Annotation utilities with minor copyright header updates and typo fixes |
| eval/compile_xscode/README.md | Documentation explaining XSCode usage and evaluation |
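
To make the orchestration role of main.py concrete, here is a minimal sketch of a script chaining named stages. The stage order (generate, deduplicate, pre-filter), the placeholder stage bodies, and the names `run_pipeline` and `STAGES` are assumptions for illustration; the PR page does not show how main.py actually wires the modules together.

```python
from typing import Callable, List, Tuple

Stage = Callable[[List[str]], List[str]]


def generate(cwe_ids: List[str]) -> List[str]:
    # Hypothetical stand-in for cwe2ovrf.py: one templated prompt per CWE id.
    return [f"Write a secure example that avoids {cwe}" for cwe in cwe_ids]


def drop_exact_duplicates(prompts: List[str]) -> List[str]:
    # Placeholder for dedup.py: exact-match removal that preserves order.
    return list(dict.fromkeys(prompts))


def pre_filter(prompts: List[str]) -> List[str]:
    # Placeholder for pre_filter.py: a trivial rule instead of an LLM judge.
    return [p for p in prompts if p.lower().startswith("write")]


# Assumed stage order; manual annotation and post-filtering would follow.
STAGES: List[Tuple[str, Stage]] = [
    ("generate", generate),
    ("dedup", drop_exact_duplicates),
    ("pre_filter", pre_filter),
]


def run_pipeline(seeds: List[str], stages: List[Tuple[str, Stage]]) -> List[str]:
    """Feed each stage's output into the next and report how many items survive."""
    items = seeds
    for name, stage in stages:
        before = len(items)
        items = stage(items)
        print(f"{name}: {before} -> {len(items)} items")
    return items
```

For example, `run_pipeline(["CWE-79", "CWE-89", "CWE-79"], STAGES)` yields two prompts, since the repeated CWE id produces an exact duplicate that the second stage drops.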

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a comprehensive pipeline for generating the XSCode dataset, an overrefusal benchmark for secure code generation. The changes include scripts for data generation from CWE and CodeGuru sources, deduplication using MinHashLSH, and pre/post-filtering using LLM-based judges. The overall implementation is well-structured and robust. I've identified a few minor issues, including some typos and an opportunity to improve the robustness of a file operation. Overall, this is a great addition.

@nirav0999 nirav0999 closed this Aug 7, 2025
@nirav0999 nirav0999 reopened this Aug 7, 2025
@ganler ganler merged commit 9c2fa31 into main Aug 7, 2025
2 checks passed
@ganler ganler deleted the xscode-generation branch August 7, 2025 23:20