A native Apple-platform utility that tells you whether a GGUF model will fit on your Mac, iPhone, iPad, or Vision Pro, and how much memory is left for KV-cache at a selected context window.
Built with SwiftUI for iOS, macOS, and visionOS 26.1+.
- Browses GGUF models on Hugging Face by popularity and search, with file sizes resolved per quant variant.
- Computes exact KV-cache memory by parsing the GGUF header (`block_count`, `head_count`, `head_count_kv`, `embedding_length`) for models you've downloaded, and falls back to a calibrated estimate for browse-only models.
- Reports compatibility against the device you're running on, using `ProcessInfo.physicalMemory` and the OS-reported device class. No chip lookup tables, no hardcoded device catalog.
- Ranks model files by fit, so you can quickly find the largest compatible quantization for the current device.
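As a rough illustration of the device side, the snippet below reads the OS-reported physical memory through `ProcessInfo`. The `gibibytes` helper is a hypothetical name used only for this sketch and is not taken from the app's source.

```swift
import Foundation

// Physical memory as reported by the OS, in bytes (UInt64).
let physicalBytes = ProcessInfo.processInfo.physicalMemory

// Hypothetical helper: convert a byte count to GiB for display.
func gibibytes(_ bytes: UInt64) -> Double {
    Double(bytes) / 1_073_741_824.0
}

print(String(format: "Physical memory: %.1f GiB", gibibytes(physicalBytes)))
```

No chip name or model identifier is involved; the only input is the byte count the OS hands back.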
Clone, then create your local signing config:
cp Config/Signing.xcconfig.template Config/Signing.xcconfig
# edit Config/Signing.xcconfig and fill in your Apple Developer Team ID

Open SelfHostLLM Calculator.xcodeproj in Xcode 26+ and build, or:
xcodebuild -project "SelfHostLLM Calculator.xcodeproj" \
-scheme "SelfHostLLM Calculator" \
-destination 'platform=macOS' build

To run the unit tests:
xcodebuild -project "SelfHostLLM Calculator.xcodeproj" \
-scheme "SelfHostLLM Calculator" \
-destination 'platform=macOS' test

The repo is self-contained: no Swift Package Manager dependencies, no CocoaPods, no Carthage. Just SwiftUI and Foundation.
The app is intentionally small and direct:
- `Core/Services/CalculatorEngine.swift` is the pure-struct calculator.
- `Core/Services/GGUFHeaderReader.swift` parses GGUF v2/v3 headers without loading model weights.
- `Data/Remote/HFAPIClient.swift` talks to the Hugging Face model API.
- `Core/Services/ModelRepository.swift` owns the offline-first repo cache and downloaded model merge.
- `Features/Dashboard/` is the unified single-screen experience.
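To give a feel for what reading a header without loading weights involves, here is a minimal sketch of parsing the fixed 24-byte prefix of a GGUF v2/v3 file. It is not the code in `GGUFHeaderReader.swift`: the type and function names are hypothetical, it assumes a little-endian file, and it stops before the metadata key/value section where fields such as `block_count` live.

```swift
import Foundation

/// Fixed-size prefix of a GGUF v2/v3 file (hypothetical type for this sketch).
struct GGUFPrefix: Equatable {
    let version: UInt32
    let tensorCount: UInt64
    let metadataKVCount: UInt64
}

enum GGUFPrefixError: Error {
    case tooShort
    case badMagic
    case unsupportedVersion(UInt32)
}

/// Assembles a little-endian unsigned integer byte by byte, so alignment does not matter.
func readLE<T: FixedWidthInteger & UnsignedInteger>(_ bytes: [UInt8], at offset: Int, as type: T.Type) -> T {
    var value: T = 0
    for i in 0..<MemoryLayout<T>.size {
        value |= T(bytes[offset + i]) << (8 * i)
    }
    return value
}

/// Parses magic, version, tensor count, and metadata count from the first 24 bytes.
func parseGGUFPrefix(_ data: Data) throws -> GGUFPrefix {
    let bytes = [UInt8](data.prefix(24))
    guard bytes.count == 24 else { throw GGUFPrefixError.tooShort }
    // The magic is the four ASCII bytes "GGUF".
    guard Array(bytes[0..<4]) == Array("GGUF".utf8) else { throw GGUFPrefixError.badMagic }
    let version = readLE(bytes, at: 4, as: UInt32.self)
    // v2 and v3 store both counts as 64-bit integers.
    guard version == 2 || version == 3 else { throw GGUFPrefixError.unsupportedVersion(version) }
    return GGUFPrefix(
        version: version,
        tensorCount: readLE(bytes, at: 8, as: UInt64.self),
        metadataKVCount: readLE(bytes, at: 16, as: UInt64.self)
    )
}
```

Because only the first few bytes are needed for this step, a multi-gigabyte file never has to be read in full; the bytes could come from a `FileHandle` read of 24 bytes rather than from loading the whole file.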
See CLAUDE.md for the full project map and implementation notes.
LLM Calculator treats the GGUF file size as model memory. That is the memory needed to load the weights.
KV-cache is handled in two ways:
- Downloaded GGUF files: exact KV-cache is calculated from GGUF header metadata.
- Browse-only Hugging Face results: KV-cache is estimated as a conservative fraction of file size and scaled linearly by context window.
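The two paths can be sketched as follows. The exact path uses the common formula for standard attention (keys plus values, per layer, per KV head) and assumes a 16-bit cache and a head dimension of `embedding_length / head_count`; the app's actual engine may account for more than this. The estimate path only shows the linear scaling; `fractionAtReference` and `referenceContext` are hypothetical parameters, not the app's calibrated values.

```swift
/// Attention shape, with fields named after the GGUF metadata keys (hypothetical type).
struct AttentionShape {
    let blockCount: UInt64
    let headCount: UInt64
    let headCountKV: UInt64
    let embeddingLength: UInt64
}

/// KV-cache bytes at a given context length, assuming standard attention.
/// bytesPerElement defaults to 2, i.e. an assumed 16-bit cache.
func kvCacheBytes(shape: AttentionShape, contextLength: UInt64, bytesPerElement: UInt64 = 2) -> UInt64 {
    guard shape.headCount > 0 else { return 0 }
    let headDim = shape.embeddingLength / shape.headCount
    // Leading 2 covers keys and values.
    return 2 * shape.blockCount * contextLength * shape.headCountKV * headDim * bytesPerElement
}

/// Linear estimate for browse-only models: a fraction of file size at a
/// reference context, scaled by the ratio of contexts.
func estimatedKVCacheBytes(fileSize: UInt64,
                           contextLength: UInt64,
                           fractionAtReference: Double,
                           referenceContext: UInt64) -> UInt64 {
    guard referenceContext > 0 else { return 0 }
    let scale = Double(contextLength) / Double(referenceContext)
    return UInt64(Double(fileSize) * fractionAtReference * scale)
}
```

For a hypothetical shape with 32 blocks, 32 heads, 8 KV heads, and an embedding length of 4096, the head dimension is 128, and an 8192-token context works out to 2 × 32 × 8192 × 8 × 128 × 2 bytes = 1 GiB.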
Compatibility uses the device's OS-reported physical memory, a platform reserve, and a small framework overhead. Results are intentionally simple:
- Green: fits comfortably
- Orange: tight, but should fit
- Red: exceeds the device budget
- Gray: file size is unknown
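One way the budget check and the four outcomes could fit together is sketched below. The type names, the `comfortFraction` threshold of 0.85, and any reserve or overhead figures passed in are assumptions made for illustration; the app's real values are not stated here.

```swift
/// The four outcomes: green, orange, red, gray.
enum Fit: Equatable { case comfortable, tight, exceeds, unknown }

/// Memory available to a model after subtracting reserves (hypothetical type).
struct MemoryBudget {
    let physicalMemory: UInt64
    let platformReserve: UInt64
    let frameworkOverhead: UInt64

    var usable: UInt64 {
        let reserved = platformReserve + frameworkOverhead
        return physicalMemory > reserved ? physicalMemory - reserved : 0
    }
}

/// Classifies a model file against the budget. comfortFraction is an assumed threshold.
func classify(fileSize: UInt64?, kvCache: UInt64, budget: MemoryBudget,
              comfortFraction: Double = 0.85) -> Fit {
    guard let fileSize = fileSize else { return .unknown }
    let required = fileSize + kvCache
    if required > budget.usable { return .exceeds }
    let comfortable = UInt64(Double(budget.usable) * comfortFraction)
    return required <= comfortable ? .comfortable : .tight
}

/// Picks the largest file size that does not exceed the budget.
func largestCompatible(fileSizes: [UInt64], kvCache: UInt64, budget: MemoryBudget) -> UInt64? {
    fileSizes
        .filter { classify(fileSize: $0, kvCache: kvCache, budget: budget) != .exceeds }
        .max()
}
```

Keeping the classifier a pure function of its inputs makes it easy to unit test without a device, which matches the pure-struct style the project describes for its calculator.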
LLM Calculator does not require an account and does not include analytics or tracking code.
The app makes network requests to Hugging Face to load GGUF repository metadata, search results, file lists, and model downloads. Search queries are sent to Hugging Face when you use search. Downloaded models, Hugging Face cache data, and the downloaded-model registry are stored locally in the app's Documents directory.
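For a sense of what such a request carries, the sketch below builds a model-listing URL against the public Hugging Face Hub API. The function name and the choice of query parameters are assumptions for illustration; the requests `HFAPIClient.swift` actually sends may differ.

```swift
import Foundation

/// Builds a Hub API URL listing GGUF-tagged models by download count.
/// Hypothetical helper; the search text is included only when the user typed one.
func ggufSearchURL(query: String?, limit: Int = 20) -> URL? {
    var components = URLComponents(string: "https://huggingface.co/api/models")
    var items = [
        URLQueryItem(name: "filter", value: "gguf"),
        URLQueryItem(name: "sort", value: "downloads"),
        URLQueryItem(name: "direction", value: "-1"),
        URLQueryItem(name: "limit", value: String(limit)),
    ]
    if let query = query, !query.isEmpty {
        items.append(URLQueryItem(name: "search", value: query))
    }
    components?.queryItems = items
    return components?.url
}
```

In this sketch the URL contains nothing beyond the filter, the sort order, the page size, and the optional search text.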
- Browse-only compatibility is an estimate until the GGUF header is available from a downloaded file.
- Device detection uses OS-reported device class and physical memory only. It does not identify chip names or rely on hardware lookup tables.
- The app checks memory fit, not runtime speed, prompt-processing throughput, or model quality.
I wanted a tool I'd actually trust before pulling a 30 GB GGUF onto a laptop, and the answer "it depends on KV-cache" deserves to be a number, not a vibe. Putting it on GitHub because the audience is devs, the math is worth peer-reviewing, and the SwiftUI is worth borrowing.
MIT — see LICENSE.