llama.cpp ships b9673

▼ WHAT HAPPENED

llama.cpp cut release b9673 on 2026-06-17. Release notes excerpt: "sycl: Add optional USM system allocations (#22526) This introduces an optional feature to allocate large GPU buffers (≥ 1GB) using USM system allocations if supported by the device. It allows using buffers from the system allocator then letting the system manage memory migrations between host and device as necessary. This feature is disabled by default and r..."

▼ OPERATOR ANGLE

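Read the full release notes and decide whether you need to act. The excerpted change is specific to the SYCL backend and is described as optional and disabled by default, so it should only affect SYCL deployments that opt in. Test throughput and memory fit before pinning the new version. The release warrants attention if it changes model compatibility, GPU backend behavior, memory use, quantization paths, security posture, migration requirements, or production serving reliability.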
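The "test before pinning" step can be sketched as a small gate over numbers you have already measured. This is a minimal illustration, not part of llama.cpp: the function name, the 5% regression tolerance, and the memory budget are all hypothetical, and the tokens-per-second samples are assumed to come from your own benchmark runs (for example with llama-bench) on the currently pinned build and on b9673.

```python
from statistics import median


def compare_builds(baseline_tps, candidate_tps, peak_mem_gib,
                   mem_budget_gib, max_regression=0.05):
    """Decide whether a candidate build is safe to pin.

    baseline_tps / candidate_tps: tokens-per-second samples measured by the
    operator on the pinned and candidate builds (hypothetical inputs).
    peak_mem_gib: peak memory observed with the candidate build.
    mem_budget_gib: memory available on the target machine.
    max_regression: tolerated relative throughput drop (assumed 5%).
    """
    base = median(baseline_tps)
    cand = median(candidate_tps)
    # Negative value means the candidate is slower than the baseline.
    relative_change = (cand - base) / base
    fits_memory = peak_mem_gib <= mem_budget_gib
    ok = fits_memory and relative_change >= -max_regression
    return ok, {
        "baseline_tps": base,
        "candidate_tps": cand,
        "relative_change": relative_change,
        "fits_memory": fits_memory,
    }


if __name__ == "__main__":
    # Made-up sample values purely to show the call shape.
    ok, details = compare_builds([100, 102, 98], [99, 101, 97], 20.0, 24.0)
    print("pin candidate" if ok else "keep current pin", details)
```

Using the median rather than the mean keeps one noisy run from deciding the outcome; adjust the tolerance to whatever your serving targets allow.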

SOURCE: https://github.com/ggml-org/llama.cpp/releases/tag/b9673 [GITHUB-RELEASE]

[pulse item] · runlocalai.co/pulse/gh-ggml-org-llama-cpp-b9673