Rename to OpenClaw and update image tags#5
Open
TimPietruskyRunPod wants to merge 118 commits into main from
Conversation
Refresh images, docs, and scripts to use Moltbot naming and env vars. Update Docker build workflow to tag images with branch names.
Clarify that branch builds publish tags using the branch name with slashes normalized.
Push images on branch and PR builds using the source branch name and allow all branches/tags to trigger builds.
Fail fast when moltbot is missing so the rename does not silently fall back.
Trigger image builds on pull requests (branch tag) and release tags only, with documentation to match.
Trigger builds on main pushes so :latest is published while keeping PR builds for branches.
Pin to the beta tag so the image gets the moltbot binary.
Use the supported clawdbot package and provide a moltbot symlink.
Ensure clawdbot reads the intended state directory in the gguf entrypoint.
Create required state directories and lock down permissions after doctor.
Rewrite the root README to focus on Moltbot images, context sizes, and status summary.
Align images, configs, and entrypoints with OpenClaw branding and paths. Update docs and templates to drop Moltbot/Clawdbot references.
Centralize web UI and SSH log output across entrypoints. Adjust build contexts to include shared scripts and document builds.
Document the tokenized Web UI URL and device pairing approval commands.
Add an OpenClaw skill and CLI wrapper for FLUX.2 SDNQ image generation. Wire skills loading and install dependencies in images.
- PyTorch cu128 required for Blackwell sm_120 GPU support
- Diffusers from git required for Flux2KleinPipeline (not in stable 0.36.0)
- New root AGENTS.md with architecture, model variants, skills, quick commands
- CLAUDE.md now references AGENTS.md for agents/devs
Focus on build/test commands, code style, testing instructions
- Codebase structure with purpose of each folder
- Key architectural decisions (llama.cpp for 5090, cu128, etc.)
- Where to make changes table
- Build, test, and debugging commands
Add speech-to-text and text-to-speech capabilities using LiquidAI's LFM2.5-Audio-1.5B model with GPU acceleration on RTX 5090.

Changes:
- Build audio runners from llama.cpp PR #18641 with CUDA SM120 support
- Add openclaw-tts script with voice selection (US/UK male/female)
- Add openclaw-stt script for audio transcription
- Add skills/tts and skills/stt for OpenClaw integration
- ~80x speedup vs CPU-only prebuilt runners (2s vs 15s)

Performance on RTX 5090:
- TTS: ~965 tokens/sec, ~2.3s for short sentences
- STT: ~688 tokens/sec, ~2.0s for short clips
- Audio decode: 4ms (vs 1296ms on CPU)

Model files downloaded at runtime to /workspace/models/LFM2.5-Audio-GGUF/
Replace per-request model loading with persistent audio server:
- Scripts now use streaming API to audio server on port 8001
- TTS: 0.8s vs 2.5s (3x faster)
- STT: 0.3s vs 2.0s (7x faster)
- Model stays loaded in VRAM (~845 MiB)

Changes:
- Rewrite openclaw-tts/stt as Python scripts using server API
- Add -ngl 99 to entrypoint for GPU-accelerated audio server
- Server auto-starts with container on port 8001
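The persistent-server approach above can be sketched as a thin client that posts to the audio server on port 8001 instead of loading the model per request. This is a hypothetical sketch: the endpoint path, payload fields, and default voice name are assumptions, not the project's actual API.

```python
import json
import urllib.request

# Assumed endpoint on the persistent audio server (port 8001 per the commit;
# the "/tts" path and JSON schema are illustrative guesses).
TTS_URL = "http://127.0.0.1:8001/tts"

def build_tts_payload(text: str, voice: str = "us-female") -> dict:
    """Build the JSON body sent to the audio server (fields are assumptions)."""
    return {"text": text, "voice": voice, "stream": True}

def speak(text: str, voice: str = "us-female") -> bytes:
    """POST the request and return the raw audio bytes from the server."""
    body = json.dumps(build_tts_payload(text, voice)).encode()
    req = urllib.request.Request(
        TTS_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

Because the model stays resident in VRAM, each call pays only request overhead, which is where the quoted 3x/7x latency wins come from.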
Add persistent FLUX.2 Klein image generation server on port 8002 for instant inference with pre-loaded model in VRAM:
- Add openclaw-image-server HTTP server that loads the model at startup
- Refactor openclaw-image-gen to use the server API instead of loading per request
- Reduce LLM context from 200k to 100k tokens to free VRAM for the image server
- Update entrypoint to start the image server alongside the LLM and audio servers
- Update openclaw config contextTokens to match reduced context
- Add image server to cleanup function and startup messages
Fix image server to work alongside LLM and audio servers by optimizing VRAM usage and fixing SDNQ quantizer registration:
- Register SDNQ quantizer with diffusers to fix model loading errors
- Disable torch compile/inductor to reduce VRAM pressure
- Enable attention/VAE slicing and tiling for lower memory usage
- Restore LLM context to 200k (was reduced to 100k)
- Add llama_parallel=1 config for single slot (no concurrency)
- Add llama_gpu_layers=44 config to free VRAM for the image server
- Update AGENTS.md with VRAM usage table and binary separation docs
- Document critical requirement: LLM and audio binaries must be separate
Copy openclaw-image-server to the Docker image and expose port 8002 for the persistent image generation server.
Set speed-first defaults and align openclaw context limits. Ensure the audio server loads its shared libs via LD_LIBRARY_PATH.
Persist generated images and expose /latest and /images endpoints. Ensure media output dirs exist and surface public/proxy URLs.
Include flux2-klein-1024 and test-robot examples.
Add a lightweight media proxy + static UI on port 8080. Bundle a tool_result hook to render image URLs inline.
Enable tool_result hook in entrypoint so chat surfaces audio links. Add proxy audio endpoints and UI controls for TTS and STT.
Models with the same base name (e.g., GLM-5, MiniMax-M2.5) are now ordered by quantization level: 1-bit first, then 2-bit, 3-bit, etc.
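The quantization ordering described above (1-bit first, then 2-bit, 3-bit, ...) can be sketched as a sort key that pulls the bit level out of llama.cpp-style quant names. This is an illustrative sketch, not the site's actual parser; the regex and fallback value are assumptions.

```python
import re

def quant_bits(name: str) -> int:
    """Extract the quantization bit level from a quant name.

    Handles common llama.cpp-style strings such as 'TQ1_0', 'IQ2_KS',
    'Q4_K_M'. Names with no recognizable quant sort last (sketch only).
    """
    m = re.search(r"\b[TI]?Q(\d+)", name, re.IGNORECASE)
    return int(m.group(1)) if m else 99

models = ["GLM-5 Q4_K_M", "GLM-5 IQ2_KS", "GLM-5 TQ1_0", "GLM-5 IQ3_KS"]
ordered = sorted(models, key=quant_bits)
# 1-bit first, then 2-bit, 3-bit, 4-bit
```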
…ovements
- Add qwen3.5-397b-a17b model (2-bit, 137GB, 1M context on B200)
- Use --override-kv to bypass llama-server slot capping at n_ctx_train
- Format context display as "1m" for million-scale tokens
- Migrate registry URL and site base path to openclaw2go.io
- Rewrite catalog MLX model separation using flatMap for correct OS tabs
- Add variant switch handler for cross-platform model selection
- Auto-detect mmproj for any LLM with vision projection file
- Add vision type and mmproj field to model schema
- Update flux2-klein-mlx repo and VRAM
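The "1m" context display mentioned above can be sketched as a small formatter; this is a minimal sketch of the described behavior, not the site's actual code.

```python
def format_context(tokens: int) -> str:
    """Render a context size compactly: 1_000_000 -> '1m', 200_000 -> '200k'.

    Sketch of the display rule described in the commit; exact rounding of
    non-integer millions is an assumption.
    """
    if tokens >= 1_000_000:
        return f"{tokens / 1_000_000:g}m"
    if tokens >= 1_000:
        return f"{tokens // 1000}k"
    return str(tokens)
```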
- Fix duplicate model count caused by GGUF+MLX entries sharing same id
- Make platform tabs local state so switching doesn't affect global OS
- Deduplicate variant groups by OS to prevent duplicate macOS tabs
- Remove unused onVariantSwitch prop chain
- Move accent border from left to top on selected model cards
- Remove VRAM label from memory presets
- Add VRAM presets: 141 (H200), 256 (M4 Ultra), 288 (B300)
- Add M4 Ultra 256GB to Mac GPU list
- Add tts type, vision badges, model filters, hasVision field
- Fix qwen3.5 overhead and step3.5 context length
ik_llama.cpp only sets CMAKE_RUNTIME_OUTPUT_DIRECTORY (executables) but not CMAKE_LIBRARY_OUTPUT_DIRECTORY, so .so files end up in build/src/ instead of build/bin/ where the COPY glob expects them. Our openclaw2go-llamacpp fork sets both to build/bin/. This caused /opt/engines/ik-llamacpp/lib/ to be empty, breaking minimax m2.5 iq2_ks and iq3_ks models.
Add tps column to model picker list with column header, display tps as first info block in selected model cards (reordered to tps → quant → engine), and plumb tps data from registry model JSON through catalog and group-models to all UI components. Add per-GPU tps benchmarks to 13 model configs.
- step-3.5-flash: 69 t/s on A100
- qwen3-coder-next: 118 t/s on L40
- kimi-k2.5-tq1: 9 t/s on B200 (with offload)
- Fix glm-4.7-full file paths: repo reorganized to Q4_K_M/ subdir with 5 splits instead of 4
Served via GitHub Pages:
curl -fsSL https://openclaw2go.io/install.sh | sh
irm https://openclaw2go.io/install.ps1 | iex
Arrow up/down moves focus between model rows across sections. Auto-swaps selection only within a type that already has an active model, preserving cross-type independence.
…ne layer placement
Revert "… determine layer placement". This reverts commit c8a786d.
GLM-4.7 Full Q4_K_M needs ~228GB VRAM which exceeds even B200 (182GB). GLM-4.7 Flash (17GB, 179 t/s) and GLM-5 TQ1 (176GB, 27 t/s on B200) cover the same use cases better.
Add qwen3-tts as the default TTS engine, replacing OuteTTS as default. Includes openclaw-tts-server (Python, OpenAI /v1/audio/speech endpoint), model configs for 0.6B and 1.7B CustomVoice variants, shared PyTorch venv (image-gen + qwen3-tts), and site catalog restructuring that merges TTS into the audio section with capability badges. Verified on RTX 4090 (RTF 0.87-0.98x, 2.75GB VRAM) and Apple Silicon MLX (33 tps, RTF 0.49x, 4.2GB memory).
Add measured tps for minimax-m25 iq2_ks (97 tps on H100) and iq3_ks (93 tps on B200). Add verified config entries for both.
…tdated video script
- Set up @changesets/cli for changelog generation with root package.json
- Add CONTRIBUTING.md with bug report, feature request, model contribution, and PR guidelines
- Add GitHub issue templates for bug reports and feature requests
- Add issue template config disabling blank issues
- Update .dockerignore to exclude changeset and package files from Docker context
- Remove outdated docs/video-script.md
- Add GitHub Actions release workflow using changesets/action
- On push to main, creates a "Version Packages" PR when changesets exist
- Add initial release changeset (minor bump 0.0.0 → 0.1.0)
- Set package.json version to 0.0.0 so first release is 0.1.0
- Add MLX variants for GPT-OSS-20B, OLMo-3.1-Think, GLM-4.7-Claude-Distill, Step-3.5-Flash, MiniMax-M2.5 (3 quants), Kimi-K2.5 (2 quants), GLM-5 (2 quants), and Qwen3.5-397B so they appear in Mac platform view
- Change Qwen3-TTS 0.6B and 1.7B type from "tts" to "audio" with capabilities: ["tts"] so they render in the Audio section
- Remove OuteTTS 0.2-500M (obsolete)
Add bits field to schema and all 34 model JSONs so quantization info is structured data instead of being regex-parsed from model names. Delete parse-quant.ts and rewrite group-models.ts to use the explicit bits field directly. MLX variants use mlx.bits when it differs from the GGUF quant (e.g. GLM Claude distill: GGUF 4-bit, MLX 8-bit).
Add primaryBits field that always holds the top-level GGUF bits from the model JSON. Use it for group keys so GGUF and MLX entries from the same model stay in one group even when their quant levels differ (e.g. GLM Claude Distill: GGUF 4-bit, MLX 8-bit). The detail card still shows the actual per-platform bits when switching tabs. Also default the model picker to Linux filtering when no OS is selected, hiding Mac-only standalone entries on the default view. Fix gpt-oss-20b-mlx.json bits to 8 to match its model name.
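The grouping rule described above — key on the top-level GGUF bits so GGUF and MLX builds of the same model stay in one group even when their quant levels differ — can be sketched as follows. Field names (`base_id`, `primaryBits`, `os`, `bits`) are illustrative, not the site's exact schema.

```python
from collections import defaultdict

def group_models(entries: list[dict]) -> dict:
    """Group catalog entries by (base model id, primaryBits).

    Because primaryBits always carries the GGUF quant level, an MLX entry
    with different per-platform bits still lands in the same group as its
    GGUF sibling (sketch of the described behavior).
    """
    groups = defaultdict(list)
    for entry in entries:
        key = (entry["base_id"], entry["primaryBits"])
        groups[key].append(entry)
    return dict(groups)

# GLM Claude Distill example from the commit: GGUF is 4-bit, MLX is 8-bit,
# but both share primaryBits=4 and therefore one group.
entries = [
    {"base_id": "glm-claude-distill", "primaryBits": 4, "os": "linux", "bits": 4},
    {"base_id": "glm-claude-distill", "primaryBits": 4, "os": "mac", "bits": 8},
]
```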
Verified on actual RTX 5090 hardware with full context:
- Qwen3.5-35B-A3B: 770K context (was 1M), 105 t/s (was 108)
- Qwen3.5-27B: 409K context (was 1M), 47 t/s (was 50)

Context sizes auto-computed by resolve-profile.py based on updated overhead values that account for llama.cpp compute graph buffers. Also adds new model entries to AGENTS.md.
…ariants per tab
Remove deprecated cli-local and cli-cloud tabs and all related code. Fix docker command to include auth env vars and correct volume name. Fix mlx command with venv setup, correct server invocations, and one-shot image generation. Each deploy tab now resolves the correct model variant from allModels regardless of OS selection. Empty selection shows placeholder instead of bogus command.
…by id
Models like lfm2.5-audio have different IDs for their GGUF and MLX versions (liquidai/lfm25-audio vs mlx/lfm25-audio). The variant resolvers now fall back to the model group system to find the correct variant when IDs differ, so the mlx tab correctly finds mlx siblings and the docker tab finds gguf siblings.
Replace duplicated variant resolution logic in DeployOutput with the existing getVariantForOs from group-models. Each tab just calls getVariantForOs(group, 'linux') or getVariantForOs(group, 'mac'). Remove allModels prop that is no longer needed.
…ut, and embedded copy button
Deploy section no longer jumps in height when switching tabs or models. Requirements shown as a labeled bullet list beside a capped-width code block with the copy button pinned inside it.
… agents.md
- Change engine "mlx" → "mlx-lm" in 3 MLX model configs (glm47-flash, gpt-oss-20b, nemotron3-nano) so deploy output generates correct pip install and startup commands
- Set lfm25-audio mlx.repo to mlx-community/LFM2.5-Audio-1.5B-4bit (was empty string)
- Rewrite agents.md: remove everything obvious from code (architecture, file trees, CLI docs, build commands), keep only non-obvious gotchas and error-prone decisions (410 → 68 lines)
Read root package.json version via vite define, replace hardcoded v0.1 in ConfigPanel, and add package.json to build-catalog workflow paths so the site rebuilds when changesets bumps the version.
…model IDs
Replace boolean `true` with explicit model repo IDs in config examples and deploy command generation. Rename env var across all scripts, docs, templates, and site components.
…wn engine
Remove stale engine names (llamacpp, llamacpp-audio, etc.) from ALLOWED_ENGINES so external registry models with old names get skipped, preserving baked-in models with correct engine. Fail fast in resolve-profile.py when engine is not found instead of silently using empty dict (which caused libmtmd.so.0 crash). Rename remaining OPENCLAW_CONFIG references to OPENCLAW2GO_CONFIG.
Summary
Test plan