llama.cpp releases
468 articles archived · Visit source ↗ · RSS
-
llama.cpp releases dev-tools 16d ago
b9655
chat: fix an "oldie but goodie" grammar generator bug that surfaced during last changes ( #24653 ) chat: fix an "oldie but goodie" grammar generator bug that surfaced during last changes update erroneous case in PEG parser test macOS/iOS: macOS Apple Silicon (arm64) macOS Apple…
38 -
llama.cpp releases dev-tools 16d ago
b9654
mtmd : add post-decode callback ( #24645 ) Assisted-by: pi:llama.cpp/Qwen3.6-27B macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu…
14 -
llama.cpp releases dev-tools 16d ago
b9653
vulkan: support more CONCAT types ( #24579 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…
35 -
llama.cpp releases dev-tools 16d ago
b9652
wasm : fix fallback symbol collision ( #24639 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…
4 -
llama.cpp releases dev-tools 16d ago
b9651
SYCL: use native subgroup size for K-quant DMMV ( #21700 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu…
15 -
llama.cpp releases dev-tools 16d ago
b9650
sycl: fix soft_max_f32 max reduction ( #24451 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…
18 -
llama.cpp releases dev-tools 16d ago
b9649
sycl : fix reorder function; add fp32/fp16 in build script ( #24578 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan)…
31 -
llama.cpp releases dev-tools 16d ago
b9647
[SYCL] add to support pool_1d, move pool_1d/2d code to pool.cpp/hpp ( #24584 ) add to support pool_1d, move pool_1d/2d code to pool.cpp/hpp update ops.md macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS…
25 -
llama.cpp releases dev-tools 16d ago
b9646
[SYCL]: Remove per-allocation Level Zero runtime checks ( #23399 ) [SYCL] Centralize Level Zero detection in ggml_sycl_init use the same wording get back the warning [SYCL] Remove per-allocation getenv() for GGML_SYCL_ENABLE_LEVEL_ZERO bring back the comment move it up to make…
10 -
llama.cpp releases dev-tools 16d ago
b9645
metal : add repeat bf16 ( #24638 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan) Ubuntu x64…
36 -
llama.cpp releases dev-tools 17d ago
b9644
chat: fix whitespace problems once and for all ( #24624 ) chat: fix whitespace problems once and for all Purge trailing spaces from grammar generation Revert "Purge trailing spaces from grammar generation" This reverts commit b0827ec . macOS/iOS: macOS Apple Silicon (arm64)…
30 -
llama.cpp releases dev-tools 17d ago
b9642
CUDA: only support F32/F16 for GGML_OP_REPEAT ( #24533 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…
33 -
llama.cpp releases dev-tools 17d ago
b9641
ggml-webgpu: improve i-quants mul_mat performance and speed up prefil…
23 -
llama.cpp releases dev-tools 17d ago
b9637
chat: add dedicated Cohere2MoE (North Code) parser ( #24615 ) chat: add dedicated Cohere2MoE (North Code) parser Some renames to make @CISC happy :> macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework…
25 -
llama.cpp releases dev-tools 17d ago
b9632
jinja : add count/d/e filter aliases ( #24606 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan)…
9 -
llama.cpp releases dev-tools 18d ago
b9631
cli : fix not copying preserved tokens ( #24258 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…
6 -
llama.cpp releases dev-tools 18d ago
b9630
Add cohere2moe to llama-vocab for TINY_AYA ( #24601 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…
16 -
llama.cpp releases dev-tools 18d ago
b9628
add sycl to check-release ( #24583 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan) Ubuntu x64…
21 -
llama.cpp releases dev-tools 18d ago
b9627
ui : fix llama-ui-embed crash when no asset dir is given ( #24597 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan)…
25 -
llama.cpp releases dev-tools 18d ago
b9626
Add arch support for cohere2-MoE ( #24260 ) Add arch support for cohere2-MoE Removed redundant gating_func checks Changed ffn lookup to prefer prefix_dense_intermediate_size Renamed arch to cohere2moe Removed redundant lmhead check and chat template changes Removed…
36 -
llama.cpp releases dev-tools 18d ago
b9625
jinja : fix negative step slice with start/stop values ( #24580 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan)…
27 -
llama.cpp releases dev-tools 18d ago
b9624
ui: build-time gzip compression ( #24571 ) ui: keep original file name and path fix nocache ui: build-time gzip compression macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU)…
19 -
llama.cpp releases dev-tools 18d ago
b9623
jinja : fix split and replace with empty first arg ( #24574 ) fix split and replace with empty first arg fix reserve size macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU)…
13 -
llama.cpp releases dev-tools 19d ago
b9622
vulkan: support non-contig unary/glu ops ( #24215 ) vulkan: support non-contig unary/glu ops Change unary/glu ops to pass in all strides and use fastdiv for the index calculation. Put all unary ops in one file, similar to glu, to share the code. codex went ahead and added expm1…
15 -
llama.cpp releases dev-tools 19d ago
b9621
ui: keep original file name and path ( #24568 ) ui: keep original file name and path fix nocache macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu…
35 -
llama.cpp releases dev-tools 19d ago
b9620
server: clean up static assets handling ( #24550 ) server: clean up static assets handling nits simplify file name handling, use static file name everywhere cmake/ui : bundle UI assets in an archive ui : run prettier on post-build.js Co-authored-by: Alde Rojas hello@alde.dev…
12 -
llama.cpp releases dev-tools 19d ago
b9616
ci : unbreak release harder ( #24545 ) unbreak release harder missed one remove missing test for now macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu…
29 -
llama.cpp releases dev-tools 20d ago
b9611
fit : avoid including llama-ext.h in fit.h ( #24506 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…
28 -
llama.cpp releases dev-tools 20d ago
b9610
sync : ggml macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan) Ubuntu x64 (ROCm 7.2) Ubuntu x64…
22 -
llama.cpp releases dev-tools 20d ago
b9608
vendor : update cpp-httplib to 0.47.0 ( #24395 ) Signed-off-by: Adrien Gallouët angt@huggingface.co macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu…
13 -
llama.cpp releases dev-tools 20d ago
b9606
spec: add EAGLE3 speculative decoding support ( #18039 ) llama : enable layer input extraction spec: support eagle3 eagle3: fix params bug eagle3: support Gemma4 eagle3 from RedHatAI eagle3: set sync when get features from target Co-authored-by: tnhnyzc…
24 -
llama.cpp releases dev-tools 20d ago
b9605
ggml: support concat for scalar types at cuda backend ( #24011 ) cuda: support concat for scalar types Update concat.cu fix metal ci issue macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux:…
20 -
llama.cpp releases dev-tools 20d ago
b9604
[SYCL] Fix CI build & release for SYCL backend ( #24387 ) restore SYCL build and release, remove github cache modify for test only verify the ccache is used remove debug code change rm duplicate action, update key in ccache add action ccache-clear after building in both ubuntu…
21 -
llama.cpp releases dev-tools 20d ago
b9603
opencl: add q5_0/q5_1 gemm and gemv kernels for Adreno ( #24319 ) opencl: add q5_0 adreno support opencl: add q5_1 adreno support opencl: cosmetic fix Co-authored-by: Li He lih@qti.qualcomm.com macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled)…
13 -
llama.cpp releases dev-tools 20d ago
b9601
vulkan: ifdef eMesaHoneykrisp (build fix) ( #24479 ) Fixes build/CI after #24306 . macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu…
13 -
llama.cpp releases dev-tools 21d ago
b9596
server: skip unused log lines on router mode ( #24463 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…
32 -
llama.cpp releases dev-tools 21d ago
b9594
vocab : refactor normalizer flags into options struct, add strip_accents ( #24371 ) vocab : refactor normalizer flags into options struct, add strip_accents Update src/llama-vocab.h Co-authored-by: Sigbjørn Skjæret sigbjorn.skjaeret@scala.com Update src/llama-vocab.cpp…
27 -
llama.cpp releases dev-tools 21d ago
b9592
vendor : update LibreSSL to 4.3.2 ( #24397 ) Signed-off-by: Adrien Gallouët angt@huggingface.co macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x…
35 -
llama.cpp releases dev-tools 21d ago
b9591
Remove padding and multiple D2D copies for MTP ( #24086 ) Make ggml_gated_delta_net take only the initial recurrent state (D, 1, n_seqs) and passes the snapshot count K as an op parameter instead of inferring it from state->ne[1]. Remove the padding hack and copy all emitted…
8 -
llama.cpp releases dev-tools 21d ago
b9590
chat: fix LFM2/LFM2.5 ignoring json_schema ( #24377 ) The LFM2 specialized template handler only built a grammar for tool-calling, silently ignoring json_schema from response_format. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED…
6 -
llama.cpp releases dev-tools 22d ago
b9587
speculative : fix "ngram-map-k4v" name in logging ( #24253 ) This is a non-functional change. When using --spec-type ngram-map-k4v , the log messages at startup and runtime say ngram-map-k . Added logic in the in the constructor of common_speculative_impl_ngram_map_k to pass the…
16 -
llama.cpp releases dev-tools 22d ago
b9586: webui: implement pinned conversations support (#21387)
webui: implement pinned conversations support webui: linter/prettier pass Fix the unused handleMobileSidebarItemClick from the component. the search should find pinned conversations as well Co-authored-by: Pascal admin@serveurperso.com Co-authored-by: Pascal…
24 -
llama.cpp releases dev-tools 22d ago
b9585
graph: Fix granite speech model inference by applying embedding scale when deepstack is not used ( #24357 ) llama-graph : apply embedding scale when deepstack is not used nits: remove non-existant hunyuan-vl from the tests apply suggestion from @gabe-l-hart Co-authored-by: Xuan…
25 -
llama.cpp releases dev-tools 22d ago
b9584
ci : fix windows release ( #24369 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64 (Vulkan) Ubuntu x64…
22 -
llama.cpp releases dev-tools 23d ago
b9581
vulkan: reduce iq1 shared memory usage for mul_mm ( #24287 ) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu…
21 -
llama.cpp releases dev-tools 23d ago
b9580
vulkan: add v_dot2_f32_f16 support in matrix-matrix multiplication and Flash Attention ( #24123 ) vulkan: add support for valve fp16 dot2 extension use macro for dot2 path choice properly check for the feature add dot_product abstraction to reduce preprocessor branching…
10 -
llama.cpp releases dev-tools 23d ago
b9578
mtmd: refactor video subproc handling ( #24316 ) mtmd: refactor video subproc handling Update tools/mtmd/mtmd-helper.cpp Co-authored-by: Mikko Juola mikjuo@gmail.com Co-authored-by: Mikko Juola mikjuo@gmail.com macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64,…
11 -
llama.cpp releases dev-tools 23d ago
b9577
server: log prompts to directory ( #22031 ) server: log prompts to directory Add --log-prompts-dir to write each prompt to a separate text file in the specified directory. Apply suggestion from @ngxson Co-authored-by: Xuan-Son Nguyen thichthat@gmail.com macOS/iOS: macOS Apple…
35