WARNING · RUNTIME UPDATE · 2026-06-19
llama.cpp ships b9715
▼ WHAT HAPPENED
llama.cpp cut release b9715 on 2026-06-19. Release notes excerpt:
"Ggml/cuda col2im 1d (#24417)
* cuda: add GGML_OP_COL2IM_1D, follow-up to the CPU op
* cuda: col2im_1d use fast_div_modulo for the index decomposition
* cuda: col2im_1d tighten supports_op, type match and contiguous dst
**macOS/iOS:**
- [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9715/llama-b9715-bin-macos-arm64.tar...."
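For readers unfamiliar with the op named in the excerpt, here is a minimal pure-Python sketch of what a generic 1D col2im computes, written gather-style (one output element per flat index, the way a GPU kernel is usually organized). This is not the ggml or CUDA code from the release: the function name `col2im_1d_reference`, the row layout (`channels * kernel` rows of length `l_in`), and the parameters are assumptions chosen for illustration, and the real tensor layout may differ. The `divmod` call stands in for the flat-index decomposition the notes mention.

```python
def col2im_1d_reference(col, channels, kernel, stride=1, pad=0):
    """Generic 1D col2im (hypothetical layout, not ggml's).

    col: list of channels*kernel rows, each of length l_in.
    Returns a list of `channels` rows, each of length l_out.
    """
    l_in = len(col[0])
    # Output length of a 1D transposed convolution with dilation 1.
    l_out = (l_in - 1) * stride + kernel - 2 * pad
    out = [[0.0] * l_out for _ in range(channels)]

    # One iteration per output element, mimicking one GPU thread each.
    for i in range(channels * l_out):
        # Index decomposition: flat index -> (channel, output position).
        c, t = divmod(i, l_out)
        acc = 0.0
        for k in range(kernel):
            # Column l contributes to position t when l*stride + k - pad == t.
            num = t + pad - k
            if num < 0 or num % stride != 0:
                continue
            l = num // stride
            if l < l_in:
                acc += col[c * kernel + k][l]
        out[c][t] = acc
    return out


if __name__ == "__main__":
    # One channel, kernel size 2, stride 1: overlapping taps are summed.
    print(col2im_1d_reference([[1, 2, 3], [10, 20, 30]], channels=1, kernel=2))
```

Gathering per output element avoids the atomic adds a scatter formulation would need on a GPU, which is one plausible reason a kernel would be organized around this decomposition.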
▼ OPERATOR ANGLE
Read the release notes and decide whether operators need to act. Test throughput and memory fit before pinning the new version. Publish if this changes model compatibility, GPU backend behavior, memory use, quantization paths, security posture, migration requirements, or production serving reliability.
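The "test before pinning" step can be reduced to a small gate once you have your own measurements. The sketch below is a hypothetical helper, not part of llama.cpp: the names `relative_change` and `safe_to_pin`, the 5% regression tolerance, and the idea of a single memory budget are all assumptions. It takes numbers you measured yourself (tokens per second on the current and candidate builds, peak memory on the candidate) and does not run any benchmark.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Fractional change from baseline to candidate (negative = slower)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline


def safe_to_pin(
    baseline_tps: float,
    candidate_tps: float,
    peak_mem_bytes: int,
    mem_budget_bytes: int,
    max_regression: float = 0.05,  # assumed tolerance; tune per deployment
) -> bool:
    """True if throughput did not regress beyond tolerance and memory fits."""
    throughput_ok = relative_change(baseline_tps, candidate_tps) >= -max_regression
    memory_ok = peak_mem_bytes <= mem_budget_bytes
    return throughput_ok and memory_ok


if __name__ == "__main__":
    gib = 1024 ** 3
    # Example numbers are made up for illustration only.
    print(safe_to_pin(100.0, 97.0, peak_mem_bytes=10 * gib, mem_budget_bytes=12 * gib))
```

Keeping the gate separate from the measurement makes it easy to feed in results from whatever benchmark harness you already trust.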
[pulse item] · runlocalai.co/pulse/gh-ggml-org-llama-cpp-b9715