URGENT · RUNTIME UPDATE · 2026-06-16

llama.cpp ships b9668

▼ WHAT HAPPENED

llama.cpp cut release b9668 on 2026-06-16. Release notes excerpt:

"vulkan: prefer host-visible memory buffers on UMA devices (#22930)
* implement UMA host-visible memory
* update based on 0cc4m's suggestion

**macOS/iOS:**
- [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9668/llama-b9668-bin-macos-arm64.tar.gz)
- macOS Apple Silicon (arm64, KleidiAI enabled) [DISABLED](https://github.c..."

▼ OPERATOR ANGLE

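Read the release notes and decide whether operators need to act. The headline change affects how the Vulkan backend chooses memory buffers on UMA (unified memory architecture) devices, so operators running Vulkan on shared-memory hardware have the most reason to test throughput and memory fit before pinning the new version. Publish if this changes model compatibility, GPU backend behavior, memory use, quantization paths, security posture, migration requirements, or production serving reliability.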
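The "test throughput and memory fit before pinning" step can be sketched as a small gate that compares measurements from the currently pinned build against b9668. This is a minimal sketch, not part of llama.cpp: `BuildResult`, `safe_to_pin`, the 5% slowdown tolerance, and the memory budget are all hypothetical names and values chosen for illustration, and the operator is assumed to collect the numbers with their own benchmark on a fixed model and prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    """Operator-collected measurements for one build (hypothetical structure)."""
    tag: str               # build label, e.g. "b9668"
    tokens_per_sec: float  # generation throughput on a fixed model and prompt
    peak_mem_mib: float    # peak memory observed during the same run


def safe_to_pin(baseline: BuildResult, candidate: BuildResult,
                mem_budget_mib: float,
                max_slowdown: float = 0.05) -> tuple[bool, list[str]]:
    """Return (ok, reasons). The thresholds are illustrative assumptions."""
    reasons = []

    # Memory fit: the candidate must stay inside the device budget.
    if candidate.peak_mem_mib > mem_budget_mib:
        reasons.append(
            f"{candidate.tag}: peak memory {candidate.peak_mem_mib:.0f} MiB "
            f"exceeds budget {mem_budget_mib:.0f} MiB"
        )

    # Throughput: tolerate at most `max_slowdown` relative regression.
    floor = baseline.tokens_per_sec * (1.0 - max_slowdown)
    if candidate.tokens_per_sec < floor:
        reasons.append(
            f"{candidate.tag}: {candidate.tokens_per_sec:.1f} tok/s is below "
            f"the allowed floor of {floor:.1f} tok/s"
        )

    return (not reasons, reasons)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real measurements.
    pinned = BuildResult("pinned-build", tokens_per_sec=40.0, peak_mem_mib=7000.0)
    new = BuildResult("b9668", tokens_per_sec=39.0, peak_mem_mib=7100.0)
    ok, reasons = safe_to_pin(pinned, new, mem_budget_mib=8192.0)
    print("pin" if ok else "hold", reasons)
```

Returning the reasons alongside the verdict keeps the decision auditable: a "hold" result says whether memory fit or throughput blocked the upgrade, which matters here because the Vulkan UMA change could plausibly move either number.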
[pulse item] · runlocalai.co/pulse/gh-ggml-org-llama-cpp-b9668