Draft
23 changes: 23 additions & 0 deletions koboldcpp/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
6 changes: 6 additions & 0 deletions koboldcpp/Chart.yaml
@@ -0,0 +1,6 @@
apiVersion: v2
appVersion: "1.110"
description: Run AI Models Locally, Free & Open-Source
name: koboldcpp
type: application
version: 1.0.2
123 changes: 123 additions & 0 deletions koboldcpp/OlaresManifest.yaml
@@ -0,0 +1,123 @@
---
olaresManifest.version: '0.11.0'
olaresManifest.type: app
apiVersion: 'v2'
metadata:
name: koboldcpp
description: Run AI Models Locally, Free & Open-Source.
icon: https://app.cdn.olares.com/appstore/koboldcpp/icon.png
appid: koboldcpp
version: '1.0.2'
title: KoboldCpp
categories:
- Utilities_v112
- Productivity
permission:
appData: true
appCache: true
userData:
- Home
spec:
versionName: '1.110'
promoteImage:
- https://app.cdn.olares.com/appstore/koboldcpp/1.webp
- https://app.cdn.olares.com/appstore/koboldcpp/2.webp
- https://app.cdn.olares.com/appstore/koboldcpp/3.webp
fullDescription: |
*KoboldCpp is a powerful C++-based backend for running large language models locally using GGUF, the same format supported by llama.cpp. Originally created to power storytelling and role-playing platforms like KoboldAI, it has grown into a complete local LLM engine that handles a wide variety of modern models, including:*
- LLaMA, LLaMA 2, and LLaMA 3
- Mistral and Mixtral
- Phi and Gemma
- Qwen and Yi
- Many other models converted to GGUF

*Features*
- Single file executable, with no installation required and no external dependencies
- Runs on CPU or GPU, with full or partial offloading
- LLM text generation (supports all GGML and GGUF models, with backward compatibility for all past models)
- Image Generation and Image Editing (Stable Diffusion 1.5, SDXL, SD3, Flux, Qwen Image, Z-Image, Klein)
- Video Generation (WAN 2.2)
- Speech-To-Text (Voice Recognition) via Whisper
- Text-To-Speech (Voice Generation) via Qwen3TTS, Kokoro, OuteTTS, Parler and Dia
- Music Generation (Ace Step 1.5)
- Image Recognition (Multimodal Vision)
- MCP Server support and tool calling
- Provides API endpoints compatible with many popular web services (KoboldCppApi, OpenAiApi, OllamaApi, A1111ForgeApi, ComfyUiApi, WhisperTranscribeApi, XttsApi, OpenAiSpeechApi)
- Bundled KoboldAI Lite UI with editing tools, save formats, memory, world info, author's note, characters, scenarios.
- Includes multiple modes (chat, adventure, instruct, storywriter) and UI themes (aesthetic roleplay, classic writer, corporate assistant, messenger)
- Supports loading Tavern Character Cards, importing many different data formats from various sites, reading or exporting JSON savefiles and persistent stories.
- Many other features including new samplers, regex support, websearch, RAG via TextDB, image recognition/vision and more.
- Ready-to-use binaries for Windows, macOS, and Linux. Runs directly with Colab or Docker, and supports other platforms if self-compiled (such as Android via Termux and Raspberry Pi).
developer: KoboldAI
website: https://github.com/LostRuins/koboldcpp
submitter: Olares
locale:
- en-US
- zh-CN
doc: https://github.com/LostRuins/koboldcpp/wiki
{{- if and .Values.admin .Values.bfl.username (eq .Values.admin .Values.bfl.username) }}
requiredMemory: 4Gi
limitedMemory: 8Gi
requiredDisk: 5Gi
limitedDisk: 50Gi
requiredCpu: 2.1
limitedCpu: 6
requiredGpu: 10Gi
limitedGpu: 24Gi
{{- else }}
requiredMemory: 64Mi
limitedMemory: 500Mi
requiredDisk: 50Mi
limitedDisk: 200Mi
requiredCpu: 10m
limitedCpu: 500m
{{- end }}
supportArch:
- amd64
subCharts:
- name: koboldcppserver
shared: true
- name: koboldcpp
options:
apiTimeout: 0
dependencies:
- name: olares
type: system
version: '>=1.12.3-0'
{{- if and .Values.admin .Values.bfl.username (eq .Values.admin .Values.bfl.username) }}
{{- else }}
- name: koboldcpp
type: application
version: '>=1.0.0'
mandatory: true
{{- end }}
appScope:
{{- if and .Values.admin .Values.bfl.username (eq .Values.admin .Values.bfl.username) }}
clusterScoped: true
appRef:
- koboldcpp
{{- else }}
clusterScoped: false
{{- end }}
sharedEntrances:
- name: koboldcpp
host: sharedentrances-koboldcpp
port: 0
title: KoboldCpp API
icon: https://app.cdn.olares.com/appstore/koboldcpp/icon.png
invisible: true
authLevel: internal
entrances:
- authLevel: internal
host: koboldcpp-web-svc
icon: https://app.cdn.olares.com/appstore/koboldcpp/icon.png
name: koboldcpp
openMethod: window
port: 8080
title: KoboldCpp
envs:
- envName: OLARES_USER_HUGGINGFACE_SERVICE
required: true
applyOnChange: true
valueFrom:
envName: OLARES_USER_HUGGINGFACE_SERVICE
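Review note: the manifest above switches resource limits, dependencies, and `appScope` on the same condition, `and .Values.admin .Values.bfl.username (eq .Values.admin .Values.bfl.username)`. The sketch below restates that condition in Python to make the two install profiles easier to compare. It is only an approximation of Go template truthiness, and the names `is_admin_install`, `ADMIN_PROFILE`, `USER_PROFILE`, and `select_profile` are hypothetical helpers, not part of Olares or Helm. The profile dictionaries copy a subset of the values from the manifest.

```python
from typing import Any, Dict

# Subset of the values rendered in each branch of the manifest (copied from the diff).
ADMIN_PROFILE: Dict[str, Any] = {
    "requiredMemory": "4Gi",
    "limitedMemory": "8Gi",
    "requiredGpu": "10Gi",
    "clusterScoped": True,
}
USER_PROFILE: Dict[str, Any] = {
    "requiredMemory": "64Mi",
    "limitedMemory": "500Mi",
    "requiredGpu": None,  # the non-admin branch declares no GPU requirement
    "clusterScoped": False,
}


def is_admin_install(values: Dict[str, Any]) -> bool:
    """Approximate the template condition: both values set and equal."""
    admin = values.get("admin")
    username = (values.get("bfl") or {}).get("username")
    return bool(admin) and bool(username) and admin == username


def select_profile(values: Dict[str, Any]) -> Dict[str, Any]:
    """Return the resource profile the manifest would render for these values."""
    return ADMIN_PROFILE if is_admin_install(values) else USER_PROFILE
```

Read this way, only the installing admin gets the cluster-scoped, GPU-backed server, while every other user gets the lightweight client that depends on the admin's `koboldcpp` install.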
29 changes: 29 additions & 0 deletions koboldcpp/i18n/en-US/OlaresManifest.yaml
@@ -0,0 +1,29 @@
metadata:
title: KoboldCpp
description: Run AI Models Locally, Free & Open-Source.
spec:
fullDescription: |
*KoboldCpp is a powerful C++-based backend for running large language models locally using GGUF, the same format supported by llama.cpp. Originally created to power storytelling and role-playing platforms like KoboldAI, it has grown into a complete local LLM engine that handles a wide variety of modern models, including:*
- LLaMA, LLaMA 2, and LLaMA 3
- Mistral and Mixtral
- Phi and Gemma
- Qwen and Yi
- Many other models converted to GGUF

*Features*
- Single file executable, with no installation required and no external dependencies
- Runs on CPU or GPU, with full or partial offloading
- LLM text generation (supports all GGML and GGUF models, with backward compatibility for all past models)
- Image Generation and Image Editing (Stable Diffusion 1.5, SDXL, SD3, Flux, Qwen Image, Z-Image, Klein)
- Video Generation (WAN 2.2)
- Speech-To-Text (Voice Recognition) via Whisper
- Text-To-Speech (Voice Generation) via Qwen3TTS, Kokoro, OuteTTS, Parler and Dia
- Music Generation (Ace Step 1.5)
- Image Recognition (Multimodal Vision)
- MCP Server support and tool calling
- Provides API endpoints compatible with many popular web services (KoboldCppApi, OpenAiApi, OllamaApi, A1111ForgeApi, ComfyUiApi, WhisperTranscribeApi, XttsApi, OpenAiSpeechApi)
- Bundled KoboldAI Lite UI with editing tools, save formats, memory, world info, author's note, characters, scenarios.
- Includes multiple modes (chat, adventure, instruct, storywriter) and UI themes (aesthetic roleplay, classic writer, corporate assistant, messenger)
- Supports loading Tavern Character Cards, importing many different data formats from various sites, reading or exporting JSON savefiles and persistent stories.
- Many other features including new samplers, regex support, websearch, RAG via TextDB, image recognition/vision and more.
- Ready-to-use binaries for Windows, macOS, and Linux. Runs directly with Colab or Docker, and supports other platforms if self-compiled (such as Android via Termux and Raspberry Pi).
29 changes: 29 additions & 0 deletions koboldcpp/i18n/zh-CN/OlaresManifest.yaml
@@ -0,0 +1,29 @@
metadata:
title: KoboldCpp
description: 本地运行 AI 模型,免费且开源。
spec:
fullDescription: |
*KoboldCPP 是一个强大的基于 C++ 的后端,专为本地运行大型语言模型而设计,采用与 llama.cpp 相同支持的 GGUF 格式。最初为像 KoboldAI 这样的故事创作和角色扮演平台打造,如今已发展为一款功能全面的本地大模型引擎,能够支持多种现代模型,包括:*
- LLaMA、LLaMA 2 和 LLaMA 3
- Mistral 和 Mixtral
- Phi 和 Gemma
- Qwen 和 Yi
- 以及许多其他已转换为 GGUF 格式的模型

*功能特色*
- 单文件可执行程序,无需安装,无需任何外部依赖
- 支持 CPU 或 GPU 运行,可全量或部分转存
- LLM 文本生成(支持所有 GGML 和 GGUF 模型,同时向下兼容所有旧模型)
- 图像生成与编辑(支持 Stable Diffusion 1.5、SDXL、SD3、Flux、Qwen Image、Z-Image、Klein)
- 视频生成(WAN 2.2)
- 语音转文字(Whisper 语音识别)
- 文字转语音(通过 Qwen3TTS、Kokoro、OuteTTS、Parler 和 Dia 语音生成)
- 音乐生成(Ace Step 1.5)
- 图像识别(多模态视觉)
- 支持 MCP 服务器及工具调用
- 提供众多流行 Web 服务兼容的 API 端点(KoboldCppApi、OpenAiApi、OllamaApi、A1111ForgeApi、ComfyUiApi、WhisperTranscribeApi、XttsApi、OpenAiSpeechApi)
- 捆绑 KoboldAI Lite UI,内置编辑工具、存档格式、记忆功能、世界信息、作者笔记、角色、场景管理等
- 包括多种模式(聊天、冒险、指令、故事创作)和多种 UI 主题(美学角色扮演、经典作家、商务助手、信息使者)
- 支持加载 Tavern 角色卡,导入各类站点数据格式,读取/导出 JSON 存档与持久化故事
- 还包含新采样器、正则表达式支持、网页搜索、基于 TextDB 的 RAG、图像识别/视觉等众多功能
- 提供适用于 Windows、MacOS、Linux 的即用型二进制文件;支持 Colab、Docker 直接运行,自行编译还支持其他平台(如 Android(Termux)和树莓派)
6 changes: 6 additions & 0 deletions koboldcpp/koboldcpp/Chart.yaml
@@ -0,0 +1,6 @@
apiVersion: v2
appVersion: "latest"
description: description
name: koboldcpp
type: application
version: 1.0.0
120 changes: 120 additions & 0 deletions koboldcpp/koboldcpp/templates/clientproxy.yaml
@@ -0,0 +1,120 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: nginx-config
namespace: {{ .Release.Namespace }}
data:
nginx.conf: |
upstream app_backend {
server koboldcpp-svc.koboldcppserver-shared:5001 max_fails=1 fail_timeout=2s;
server download-svc.koboldcppserver-shared:8090 backup;
}

server {
listen 8080;
access_log /opt/bitnami/openresty/nginx/logs/access.log;
error_log /opt/bitnami/openresty/nginx/logs/error.log;

client_max_body_size 200m;

location / {
proxy_pass http://app_backend;
proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
proxy_next_upstream_tries 2;
proxy_connect_timeout 2s;
proxy_read_timeout 600s;
proxy_send_timeout 600s;

proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";

client_max_body_size 200m;
}
}

---
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ .Release.Name }}
namespace: {{ .Release.Namespace }}
labels:
io.kompose.service: koboldcppweb
spec:
replicas: 1
selector:
matchLabels:
io.kompose.service: koboldcppweb
strategy:
type: Recreate
template:
metadata:
labels:
io.kompose.service: koboldcppweb
spec:
volumes:
- name: nginx-config
configMap:
name: nginx-config
defaultMode: 438
items:
- key: nginx.conf
path: nginx.conf
containers:
- name: nginx
image: docker.io/beclab/aboveos-bitnami-openresty:1.25.3-2
ports:
- containerPort: 8080
protocol: TCP
env:
- name: OPENRESTY_CONF_FILE
value: /etc/nginx/nginx.conf
readinessProbe:
exec:
command:
- /bin/sh
- -c
- |
http_code=$(curl -s -o /dev/null -w "%{http_code}" http://localhost:8080)
[ "$http_code" -ge 200 ] && [ "$http_code" -lt 500 ]
initialDelaySeconds: 2
timeoutSeconds: 3
periodSeconds: 3
successThreshold: 1
failureThreshold: 60
resources:
limits:
cpu: 100m
memory: 256Mi
requests:
cpu: 10m
memory: 64Mi
volumeMounts:
- name: nginx-config
mountPath: /etc/nginx/nginx.conf
subPath: nginx.conf
- name: nginx-config
mountPath: /opt/bitnami/openresty/nginx/conf/server_blocks/nginx.conf
subPath: nginx.conf

---
apiVersion: v1
kind: Service
metadata:
name: koboldcpp-web-svc
namespace: {{ .Release.Namespace }}
spec:
type: ClusterIP
selector:
io.kompose.service: koboldcppweb
ports:
- name: http
protocol: TCP
port: 8080
targetPort: 8080
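Review note: the readiness probe in the Deployment above treats any HTTP status from 200 up to, but not including, 500 as ready. The sketch below restates that check in Python so the boundary cases are explicit. `probe_passes` is a hypothetical name used only for illustration, and the input is assumed to be the string curl prints for `%{http_code}`, which is `000` when no HTTP response was received.

```python
def probe_passes(http_code: str) -> bool:
    """Mirror the shell test: [ code -ge 200 ] && [ code -lt 500 ]."""
    code = int(http_code)  # "000" (no response) becomes 0 and fails the check
    return 200 <= code < 500
```

One consequence worth noting: a 404 or 401 from the proxied backend still marks the pod ready, while a 502 from nginx (upstream unreachable) does not.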
Empty file added koboldcpp/koboldcpp/values.yaml
Empty file.
6 changes: 6 additions & 0 deletions koboldcpp/koboldcppserver/Chart.yaml
@@ -0,0 +1,6 @@
apiVersion: v2
appVersion: "latest"
description: description
name: koboldcppserver
type: application
version: 1.0.0
13 changes: 13 additions & 0 deletions koboldcpp/koboldcppserver/templates/configmap.yaml
@@ -0,0 +1,13 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: koboldcpp-env
namespace: {{ .Release.Namespace }}
data:
NVIDIA_VISIBLE_DEVICES: "all"
NVIDIA_DRIVER_CAPABILITIES: "compute,utility"
KCPP_MODEL: ""
KCPP_ARGS: "--model /models/Qwen3.5-4B-UD-Q4_K_XL.gguf --contextsize 4096 --usecuda --gpulayers 99 --mmproj /models/mmproj-F32.gguf --sdmodel /models/picX_real.safetensors --sdquant 1 --whispermodel /models/ggml-large-v3-turbo.bin --ttsmodel /models/Qwen3-TTS-12Hz-1.7B-Base-q8_0.gguf --ttswavtokenizer /models/qwen3-tts-tokenizer-q8_0.gguf --ttsgpu --admin --admindir /models/admindir"
KCPP_DONT_UPDATE: "true"
KCPP_DONT_TUNNEL: "true"
KCPP_DONT_REMOVE_MODELS: "true"
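Review note: `KCPP_ARGS` packs every launch option into one string, which makes it hard to see at a glance which flags take values and whether every file path sits under `/models`. The sketch below splits the string from the ConfigMap into a flag map and lists any absolute path outside a given root. `parse_flags` and `paths_outside` are hypothetical helpers written for this note; they assume every option starts with `--` and takes at most one value, which holds for the string in this diff but is not a statement about how KoboldCpp itself parses arguments.

```python
import shlex
from typing import Dict, List, Union

# Copied verbatim from the ConfigMap in this diff.
KCPP_ARGS = (
    "--model /models/Qwen3.5-4B-UD-Q4_K_XL.gguf --contextsize 4096 --usecuda "
    "--gpulayers 99 --mmproj /models/mmproj-F32.gguf "
    "--sdmodel /models/picX_real.safetensors --sdquant 1 "
    "--whispermodel /models/ggml-large-v3-turbo.bin "
    "--ttsmodel /models/Qwen3-TTS-12Hz-1.7B-Base-q8_0.gguf "
    "--ttswavtokenizer /models/qwen3-tts-tokenizer-q8_0.gguf "
    "--ttsgpu --admin --admindir /models/admindir"
)

FlagMap = Dict[str, Union[str, bool]]


def parse_flags(arg_string: str) -> FlagMap:
    """Split an argument string into {flag: value}, or {flag: True} for bare flags."""
    tokens = shlex.split(arg_string)
    flags: FlagMap = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("--"):
            raise ValueError(f"unexpected positional token: {token!r}")
        # A flag owns the next token only if that token is not itself a flag.
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
        flags[token] = tokens[i + 1] if has_value else True
        i += 2 if has_value else 1
    return flags


def paths_outside(flags: FlagMap, root: str = "/models/") -> List[str]:
    """List absolute-path values that do not live under the given root."""
    return sorted(
        value
        for value in flags.values()
        if isinstance(value, str) and value.startswith("/") and not value.startswith(root)
    )
```

Calling `paths_outside(parse_flags(KCPP_ARGS))` is a quick way to confirm that every referenced model file is expected under the `/models` directory before the chart is deployed.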