SMAT-Lab/HapRepair

ArkTS Code Defect Repair System

This is the official repository of "HapRepair: Learn to Repair OpenHarmony Apps".

HapRepair is an automated ArkTS code defect repair system based on large language models. It combines Retrieval-Augmented Generation (RAG) with multiple large language models to detect and fix performance defects in ArkTS code.

The overall framework of the system:

[Figure: system framework]

Key Features

  • Code defect detection
  • RAG-based code repair suggestion generation
  • Multi-round code repair
  • Code functionality verification
  • Multi-model support (GPT, Deepseek, Qwen, etc.)

System Architecture

The system consists of the following main modules:

  • Code Repair Module (fix.py, fix_projects.py)
  • Vector Retrieval Module (save_defects_to_database.py)
  • Prompt Generation Module (get_prompt.py)
  • Output Processing Module (output_handler.py)
  • Context Extraction Module (get_surrounding_context.py)
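
To make the interplay between the Vector Retrieval and Prompt Generation modules more concrete, here is a minimal in-memory sketch of top-k exemplar retrieval followed by prompt assembly. It is not the repository's implementation: the real system indexes exemplars in Pinecone, whereas this sketch swaps in a toy bag-of-words embedding with cosine similarity. The function names (`embed`, `top_k_exemplars`, `build_prompt`), the exemplar fields (`defect`, `fix`), and the prompt wording are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (assumption; a real system uses a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_exemplars(
    query: str, exemplars: List[Dict[str, str]], k: int = 3
) -> List[Dict[str, str]]:
    """Return the k exemplars whose 'defect' text is most similar to the query."""
    q = embed(query)
    ranked = sorted(
        exemplars, key=lambda e: cosine(q, embed(e["defect"])), reverse=True
    )
    return ranked[:k]


def build_prompt(defect_code: str, exemplars: List[Dict[str, str]]) -> str:
    """Assemble a repair prompt from retrieved exemplars (hypothetical wording)."""
    parts = ["Fix the defect in the following ArkTS code."]
    for i, e in enumerate(exemplars, 1):
        parts.append(
            f"Example {i} (before):\n{e['defect']}\nExample {i} (after):\n{e['fix']}"
        )
    parts.append("Code to repair:\n" + defect_code)
    return "\n\n".join(parts)


# Illustrative, made-up exemplars; not taken from the repository.
EXEMPLARS = [
    {"defect": "alpha beta gamma", "fix": "alpha beta delta"},
    {"defect": "timer callback leak", "fix": "timer callback cleared"},
]

hits = top_k_exemplars("beta gamma in alpha", EXEMPLARS, k=1)
prompt = build_prompt("beta gamma in alpha", hits)
```

Passing `k=0` returns no exemplars, so the prompt degenerates to the plain instruction plus the code to repair.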

CFG

The control flow graph is generated by ArkAnalyzer.

You can get the CFG of the projects by running the script arkanalyzer/tests/CFGTest.ts.

Supported Models

  • OpenAI GPT Series
  • Deepseek Chat
  • Qwen
  • Ollama
  • LLaMA
  • GPTGod

Usage

  1. Configure environment variables:
OPENAI_API_KEY=<your_key>
OPENAI_API_BASE=<api_base>
DEEPSEEK_API_KEY=<your_key>
DEEPSEEK_API_BASE=<api_base>
PINECONE_API_KEY=<your_key>
  2. Install dependencies:
pip install -r requirements.txt
  3. Run defect detection:
python RQ1.py
  4. Run code repair:
python fix.py

You can also use CodeLinter in Huawei DevEco Studio to detect code defects.

ArkAnalyzer can then be used to obtain the CFG of the code in order to check its functionality.
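
Before running the scripts, it can help to confirm that the environment variables from step 1 are actually set. The following is a small optional sketch, not part of the repository. The helper name `missing_env_vars` is hypothetical, and it assumes all five variables are needed; in practice, which ones are required likely depends on the model you select.

```python
import os

# Variable names are taken from the configuration step in the Usage section.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "OPENAI_API_BASE",
    "DEEPSEEK_API_KEY",
    "DEEPSEEK_API_BASE",
    "PINECONE_API_KEY",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; a plain dict can be
    passed instead, which makes the check easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```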

Experimental Setup

We evaluate HapRepair on a curated benchmark of real-world OpenHarmony apps:

  • Benchmark: 8,664 performance and security defects across 35 OpenHarmony projects, detected by HomeCheck (an ArkAnalyzer-based static checker whose rules are ported from Huawei CodeLinter) and tracked in revision/target_projects_haprepair.json.
  • Rule taxonomy: 37 performance rules triggered across the benchmark; 15 (40.5%) are context-dependent and 22 (59.5%) are local/template-sufficient (see summary/gpt-5.1_rq2_rule_type_context_ratio.md).
  • Pipeline: (1) HomeCheck scans each project and emits performance/security findings, (2) get_surrounding_context.py extracts the surrounding context for every finding, (3) save_defects_to_database.py indexes curated fix exemplars into a Pinecone vector store, (4) get_prompt.py retrieves the top-k nearest exemplars via RAG and assembles a repair prompt, and (5) fix.py / fix_projects.py drive an iterative multi-round repair loop with output_handler.py validating each patch.
  • Models evaluated: gpt-5.1 (main), gpt-5-mini, deepseek-chat, qwen3-coder-plus, qwen3-30b-a3b.
  • Ablation axes: RAG top-k ∈ {0, 1, 3, 5}, diff strategy ∈ {difflib, gpt-diff, no-diff}, context scope ∈ {surrounding, full-file}.
  • Protocol: Up to 6 repair rounds per project; a defect is counted as fixed only when HomeCheck no longer reports it on the rewritten code.
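
The protocol above (bounded repair rounds, with a defect counted as fixed only once the checker stops reporting it) can be sketched as a simple loop. This is an illustration under stated assumptions, not the code in fix.py or fix_projects.py: `scan` stands in for a HomeCheck run, `repair` for an LLM-driven rewrite, and the `Finding` tuple shape is hypothetical. The toy scanner and repairer exist only to make the sketch runnable.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical finding shape: (rule id, file path, line number).
Finding = Tuple[str, str, int]


def repair_until_clean(
    code: str,
    scan: Callable[[str], Set[Finding]],
    repair: Callable[[str, Set[Finding]], str],
    max_rounds: int = 6,
) -> Tuple[str, List[int]]:
    """Alternate repair and re-scan; record the defect count after every scan."""
    findings = scan(code)
    history: List[int] = [len(findings)]
    for _ in range(max_rounds):
        if not findings:
            break  # the checker reports nothing, so everything counts as fixed
        code = repair(code, findings)
        findings = scan(code)  # a fix only counts if the checker agrees
        history.append(len(findings))
    return code, history


def toy_scan(code: str) -> Set[Finding]:
    """Toy checker: flags every line containing the marker 'BAD'."""
    return {
        ("toy-rule", "demo.ets", i)
        for i, line in enumerate(code.splitlines(), 1)
        if "BAD" in line
    }


def toy_repair(code: str, findings: Set[Finding]) -> str:
    """Toy repairer: fixes only the first reported line per round."""
    first = min(f[2] for f in findings)
    lines = code.splitlines()
    lines[first - 1] = lines[first - 1].replace("BAD", "OK")
    return "\n".join(lines)


fixed, history = repair_until_clean("BAD\nfine\nBAD", toy_scan, toy_repair)
print(history)  # [2, 1, 0]
```

The `history` list corresponds to the "remaining defects per round" view used in the results: one count for the initial scan, then one per completed round.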

Reproduce the main table with:

python3 revision/code/delta_check_summarize.py --allow-missing-final-logs

Results

Main result across LLMs. All five models converge within five repair iterations. GPT-5.1, GPT-5-mini, DeepSeek-Chat, and Qwen3-Coder-Plus drop from 8,664 initial defects to 236, 166, 247, and 353 respectively (96–98% resolution); Qwen3-30B-A3B plateaus higher at 1,348 (84%), underscoring that model capacity still matters for hard, context-heavy rules.
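
The resolution percentages follow directly from the initial and remaining defect counts quoted above. A short worked computation is shown here; the function name `resolution_rate` is our own, and only the counts come from the text.

```python
INITIAL_DEFECTS = 8664

# Remaining defects after repair, as quoted in the paragraph above.
REMAINING = {
    "GPT-5.1": 236,
    "GPT-5-mini": 166,
    "DeepSeek-Chat": 247,
    "Qwen3-Coder-Plus": 353,
    "Qwen3-30B-A3B": 1348,
}


def resolution_rate(initial: int, remaining: int) -> float:
    """Percentage of the initial defects that are no longer reported."""
    return 100.0 * (initial - remaining) / initial


for model, left in REMAINING.items():
    print(f"{model}: {resolution_rate(INITIAL_DEFECTS, left):.1f}% resolved")
```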

[Figure: Remaining defects across repair rounds]

Category-level resolution (gpt-5.1). Performance defects drop from 8,150 → 234 (97%) and security defects from 514 → 2 (99.6%) after five iterations.

[Figure: Category resolution across iterations]

Per-project progression (sampled). HapRepair wipes out all 36 defects in PullLinking on round 1, takes flutter_embedding from 123 → 1, and drives the overall 35-project benchmark from 8,664 → 236 (97%).

[Figure: Per-project iterations]

Delta-check against "fix-by-deletion". A conservative audit that discards every resolved finding plausibly attributable to large-scale code deletion still leaves a net fix rate of 96.11% (8,327/8,664), confirming that the gains come from real repairs rather than code removal (see summary/gpt-5.1_delta_check.md).

Ablation study (gpt-5.1, round 1). RAG is the dominant factor; Top-3 retrieval is the sweet spot; surrounding context beats full-file; and providing a structural diff is essential.

[Figure: Ablation study]

Full breakdown: summary/ablation/ablation_summary_gpt-5.1_round1.md. The auto-generated per-project bars and top-10 rule charts (scripts/plot_readme_figures.py) provide an additional view of the same data.
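
For readers who want to script their own ablation runs, the three axes listed under Experimental Setup can be enumerated as a configuration grid. This sketch assumes a full factorial design, which the source does not confirm (the study may instead vary one axis at a time around a default); the dictionary keys are hypothetical names, not options of any script in the repository.

```python
from itertools import product

# Axis values as listed under "Ablation axes" in the Experimental Setup.
TOP_K = [0, 1, 3, 5]
DIFF_STRATEGY = ["difflib", "gpt-diff", "no-diff"]
CONTEXT_SCOPE = ["surrounding", "full-file"]


def ablation_grid():
    """Enumerate every combination of the three axes (full-factorial assumption)."""
    return [
        {"top_k": k, "diff": d, "context": c}
        for k, d, c in product(TOP_K, DIFF_STRATEGY, CONTEXT_SCOPE)
    ]


configs = ablation_grid()
print(len(configs))  # 24 = 4 * 3 * 2
```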

Important Notes

  • API keys for the models you use must be configured
  • Recommended Python version: 3.10+
  • Running large language models locally (e.g., via Ollama) requires sufficient GPU memory
  • Back up your code before running repair

Contributing

Issues and Pull Requests are welcome to help improve the project.

License

This project is licensed under the MIT License - see the LICENSE file for details.
