Gguf chunking #397
Open
Labels: enhancement
Describe the feature
After a laborious journey through Apple notarization, I discovered that the only way I could package and ship macOS builds that include larger LLMs is GGUF chunking, i.e. splitting the model file into several smaller GGUF files.
Notarization fails for files larger than roughly 4 GB.
Loading chunked GGUF files already works in LLMUnity/llama.cpp.
If you could issue a warning or document this, it might save other macOS developers a lot of headaches.
Ideally, chunking via llama-gguf-split (part of llama.cpp tools) would be integrated into LLMUnity and offered via the LLM/Build Manager.
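To illustrate what such an integration might do, here is a rough Python sketch of a pre-build step that splits oversized models by shelling out to `llama-gguf-split`. It assumes the tool is on `PATH`; the 2G chunk size, the 4 GiB threshold, and the helper names (`needs_split`, `split_command`, `split_if_needed`) are my own placeholders, not anything LLMUnity provides.

```python
import shutil
import subprocess
from pathlib import Path

# Assumption: keep each chunk well below the ~4 GB limit reported above.
MAX_CHUNK = "2G"
# Assumption: treat anything at or above 4 GiB as too large to notarize.
LIMIT_BYTES = 4 * 1024**3
# Assumption: the llama.cpp tool is installed under this name and on PATH.
TOOL = "llama-gguf-split"


def needs_split(model: Path, limit_bytes: int = LIMIT_BYTES) -> bool:
    """Return True if the model file is at or above the size limit."""
    return model.stat().st_size >= limit_bytes


def split_command(model: Path, max_chunk: str = MAX_CHUNK) -> list:
    """Build the llama-gguf-split command line for one model file."""
    # The tool takes an output prefix and appends the shard suffix itself,
    # so "big.gguf" becomes the prefix "big".
    out_prefix = model.with_suffix("")
    return [TOOL, "--split", "--split-max-size", max_chunk,
            str(model), str(out_prefix)]


def split_if_needed(model: Path) -> bool:
    """Split the model if it is too large; return True if a split ran."""
    if not needs_split(model):
        return False
    if shutil.which(TOOL) is None:
        raise FileNotFoundError(f"{TOOL} not found on PATH")
    subprocess.run(split_command(model), check=True)
    return True
```

As far as I can tell, the shards are written next to the original with names like `big-00001-of-0000N.gguf`, and pointing the loader at the first shard is enough for llama.cpp to pick up the rest. The original unsplit file would still have to be excluded from the build.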