Widget2Code is a baseline framework that strengthens both perceptual understanding and system-level generation for transforming visual widgets into UI code. It leverages advanced vision-language models to automatically generate production-ready WidgetDSL from screenshots, featuring icon retrieval across 57,000+ icons, layout analysis, and component recognition and generation. This repository provides the implementation and tools needed to generate high-fidelity widget code.
- Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs
- Dec 22, 2025: Benchmark dataset uploaded to Hugging Face
- Dec 22, 2025: Paper uploaded to arXiv
- Dec 16, 2025: We release the complete Widget2Code framework, including inference code, interactive playground, batch processing scripts, and evaluation tools.
Demo video: playground.mp4
Widget2Code employs a sophisticated multi-stage generation pipeline:
- Image Preprocessing: Resolution normalization, format conversion, and quality analysis
- Layout Detection: Multi-stage layout analysis with intelligent retry mechanism for robust component positioning
- Icon Retrieval: FAISS-based similarity search across 57,000+ icons with dual-encoder (text + image) matching
- Chart Recognition: Specialized detection and classification for 8 chart types using vision models
- Color Extraction: Advanced palette and gradient analysis with perceptual color matching
- DSL Generation: LLM-based structured output generation with domain-specific prompts
- Validation: Schema validation, constraint checking, and error correction
- Compilation: DSL to React JSX/HTML transformation with optimization
- Rendering: Converting generated code to PNG screenshots in a headless browser
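The stages above can be sketched as a simple sequential pipeline. Everything below (the stage names, the `with_retry` helper, `run_pipeline`) is a hypothetical illustration of the flow, not the repository's actual API; real stages would call vision models, FAISS indexes, and an LLM rather than stubs.

```python
from typing import Callable, List

def with_retry(stage: Callable[[dict], dict], attempts: int = 3) -> Callable[[dict], dict]:
    """Wrap a stage so transient failures trigger a bounded retry,
    mirroring the retry mechanism described for layout detection."""
    def wrapped(ctx: dict) -> dict:
        last_err = None
        for _ in range(attempts):
            try:
                return stage(ctx)
            except RuntimeError as err:
                last_err = err
        raise last_err
    return wrapped

def run_pipeline(image_path: str, stages: List[Callable[[dict], dict]]) -> dict:
    """Thread a shared context dict through each stage in order."""
    ctx = {"image": image_path, "trace": []}
    for stage in stages:
        ctx = stage(ctx)
    return ctx

def make_stage(name: str) -> Callable[[dict], dict]:
    """Stub stage that only records its name; a placeholder for the real work."""
    def stage(ctx: dict) -> dict:
        ctx["trace"].append(name)
        return ctx
    return stage

stages = [
    make_stage("preprocess"),
    with_retry(make_stage("layout_detection")),  # layout detection gets the retry wrapper
    make_stage("icon_retrieval"),
    make_stage("chart_recognition"),
    make_stage("color_extraction"),
    make_stage("dsl_generation"),
    make_stage("validation"),
    make_stage("compilation"),
    make_stage("rendering"),
]

result = run_pipeline("widget.png", stages)
print(result["trace"])
```

The key design point this sketch illustrates is that each stage reads from and writes to one shared context, so stages stay independently testable and reorderable.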
- GPU: NVIDIA GPU with CUDA support (recommended for icon retrieval acceleration)
- Memory: Minimum 8GB RAM, 16GB+ recommended for batch processing
- Operating System: Linux, macOS, or Windows (WSL2)
- Node.js: 18.x or higher
- Python: 3.10 or higher
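A quick preflight check for the software requirements above might look like the following; the thresholds mirror the list, but the script itself is illustrative and not part of the repository.

```python
import shutil
import sys

def check_python(min_version=(3, 10)) -> bool:
    """Return True if the running interpreter meets the minimum version (3.10 by default)."""
    return sys.version_info[:2] >= min_version

def check_tool(name: str) -> bool:
    """Return True if a CLI tool (e.g. node) is on PATH."""
    return shutil.which(name) is not None

print(f"Python >= 3.10: {check_python()}")
print(f"node on PATH:   {check_tool('node')}")
```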
One-Command Setup:
./scripts/setup/install.sh

This installs all dependencies, including Node.js packages and an isolated Python environment.
Create .env file with API credentials and ground truth directory:
cp .env.example .env
# Edit .env and configure:
# - API credentials
# - GT_DIR: Path to ground truth directory for evaluation (e.g., ./data/widget2code-benchmark/test)

# Start API backend (required for batch processing)
npm run api

# Batch generation with 5 concurrent workers
./scripts/generation/generate-batch.sh ./mockups ./output 5
# Force regenerate all images
./scripts/generation/generate-batch.sh ./mockups ./output 5 --force

# Batch rendering with 5 concurrent workers
./scripts/rendering/render-batch.sh ./output 5
# Force rerender all widgets
./scripts/rendering/render-batch.sh ./output 5 --force

# Evaluate generated widgets against ground truth
# If GT_DIR is set in .env, -g flag is optional
./scripts/evaluation/run_evaluation.sh ./output
# Or specify ground truth directory explicitly
./scripts/evaluation/run_evaluation.sh ./output -g ./data/widget2code-benchmark/test
# Use GPU and more workers for faster evaluation
./scripts/evaluation/run_evaluation.sh ./output -g ./data/widget2code-benchmark/test --cuda -w 16

# Start interactive playground
npm run playground

Widget2Code achieves state-of-the-art performance across multiple quality dimensions, including layout accuracy, legibility, style preservation, perceptual similarity, and geometric precision.
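As a toy illustration of one such dimension, a naive pixel-level similarity between a rendered widget and its ground-truth screenshot could be computed as below. This is a deliberately simplified stand-in for intuition only, not the paper's actual perceptual-similarity metric.

```python
from typing import List

def pixel_similarity(a: List[List[float]], b: List[List[float]]) -> float:
    """Mean of 1 - |difference| over two equally sized grayscale images
    with pixel values in [0, 1]; 1.0 means identical, 0.0 maximally different."""
    assert len(a) == len(b) and len(a[0]) == len(b[0]), "images must match in size"
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1.0 - abs(pa - pb)
            count += 1
    return total / count

# Two tiny 2x2 "screenshots" (grayscale intensities).
render = [[0.0, 1.0], [0.5, 0.25]]
truth  = [[0.0, 1.0], [0.5, 0.25]]
print(pixel_similarity(render, truth))  # identical images -> 1.0
```

Real perceptual metrics additionally account for structure and human contrast sensitivity rather than raw per-pixel differences.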
The benchmark evaluation covers the following 13 methods:
- Seed1.6-Thinking
- Gemini2.5-Pro
- GPT-4o
- Qwen3-VL
- Qwen3-VL-235b
- Design2Code
- DCGen
- LatCoder
- UICopilot
- WebSight-VLM-8B
- ScreenCoder
- UI-UG
- Widget2Code
Download the Widget2Code Benchmark Dataset to the ./data/ folder.
After downloading, set GT_DIR=./data/widget2code-benchmark/test in your .env file, or use the -g flag when running evaluation scripts. The test split (./data/widget2code-benchmark/test) should be used as ground truth for evaluation.
Benchmark Results: All Methods Results (465MB) - Download evaluation results across all 13 methods from Google Drive.
To use the benchmark results:
# Install gdown (if not already installed)
pip install gdown
# Download using gdown (465MB)
gdown --fuzzy "https://drive.google.com/file/d/1LAYReu4fUES1IE0qM7h-zNGvyUgYnqwz/view?usp=sharing"
# If download fails, manually download from the link above
# Extract to project root directory
unzip benchmarks_backup_20251216.zip
# Run evaluation on all benchmarks (using test split as ground truth)
./scripts/evaluation/run_all_benchmarks.sh -g ./data/widget2code-benchmark/test --cuda -w 16

If you find Widget2Code useful for your research or projects, please cite our work:
@article{widget2code2025,
  title={Widget2Code: From Visual Widgets to UI Code via Multimodal LLMs},
  author={Houston H. Zhang and Tao Zhang and Baoze Lin and Yuanqi Xue and Yincheng Zhu and Huan Liu and Li Gu and Linfeng Ye and Ziqiang Wang and Xinxin Zuo and Yang Wang and Yuanhao Yu and Zhixiang Chi},
  journal={arXiv preprint},
  year={2025}
}

