llama.cpp ships b9689

▼ WHAT HAPPENED

llama.cpp cut release b9689 on 2026-06-17. Release notes excerpt:

"metal : add f16 and bf16 support for concat operator (#24724)

* metal : add f16 and bf16 support for concat operator

Extend the Metal backend concat operator to support f16 and bf16 tensor types in addition to the existing f32 and i32 support.

- Template kernel_concat on type T with specializations for float, half, bfloat, and int
- Add type-specific pipelin..."
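To make "template kernel_concat on type T" concrete, here is a minimal CPU-side sketch in C++ of a concat routine templated on its element type. It is not the Metal kernel from the release: the names Tensor2D and concat are hypothetical, the tensors are limited to two dimensions, and std::uint16_t stands in for 16-bit storage such as f16 or bf16 because standard C++ has no portable half or bfloat type. The point it illustrates is that concat only copies elements, so one templated body can serve every supported type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal tensor: a row-major 2D buffer of element type T.
template <typename T>
struct Tensor2D {
    std::size_t rows;
    std::size_t cols;
    std::vector<T> data;  // holds rows * cols elements
};

// Concatenate a and b along dim (0 = stack rows, 1 = append columns).
// The body never does arithmetic on T, only copies, which is why a single
// template covers float, int, and 16-bit storage types alike.
template <typename T>
Tensor2D<T> concat(const Tensor2D<T>& a, const Tensor2D<T>& b, int dim) {
    assert(dim == 0 || dim == 1);
    assert(a.data.size() == a.rows * a.cols);
    assert(b.data.size() == b.rows * b.cols);
    if (dim == 0) {
        assert(a.cols == b.cols);  // row stacking needs equal widths
    } else {
        assert(a.rows == b.rows);  // column appending needs equal heights
    }

    const std::size_t rows = (dim == 0) ? a.rows + b.rows : a.rows;
    const std::size_t cols = (dim == 1) ? a.cols + b.cols : a.cols;
    Tensor2D<T> out{rows, cols, std::vector<T>(rows * cols)};

    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Decide which input owns this output element.
            const bool from_a = (dim == 0) ? (r < a.rows) : (c < a.cols);
            if (from_a) {
                out.data[r * cols + c] = a.data[r * a.cols + c];
            } else {
                const std::size_t br = (dim == 0) ? r - a.rows : r;
                const std::size_t bc = (dim == 1) ? c - a.cols : c;
                out.data[r * cols + c] = b.data[br * b.cols + bc];
            }
        }
    }
    return out;
}
```

Instantiating concat with float, std::int32_t, or std::uint16_t reuses the same body, which mirrors the idea of one kernel with per-type specializations described in the excerpt.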

▼ OPERATOR ANGLE

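Read the release notes and decide whether you need to act. The excerpt describes a Metal backend change (f16 and bf16 support for the concat operator), so it matters most to deployments that run on Apple GPUs through Metal. Test throughput and memory fit before pinning the new version. Publish if this changes model compatibility, GPU backend behavior, memory use, quantization paths, security posture, migration requirements, or production serving reliability.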

SOURCE: https://github.com/ggml-org/llama.cpp/releases/tag/b9689 [GITHUB-RELEASE]

[pulse item] · runlocalai.co/pulse/gh-ggml-org-llama-cpp-b9689