llama.cpp releases · · 1 min read

b9864

Mirrored from llama.cpp releases for archival readability. Support the source by reading on the original site.

server + ui: ping silent SSE streams every 1s and kick only after 3s so slow prefill never drops healthy connections (#25241)

  • server + ui: ping silent SSE streams every 1s and kick only after 3s so slow prefill never drops healthy connections

  • server + ui: sse_ping_interval becomes a per-request body field

Address review from ngxson: the global default returns to 30 so API
clients see no behavior change, and the WebUI sends sse_ping_interval: 1
in the request body since it owns the 3s visibility-kick contract and
declares the cadence it needs. Positive values keep the existing > 0
gate, -1 keeps its disabled semantics.

  • server: move sse_ping_interval into the request schema

Address review from ngxson: the field is now a typed field_num with
hard limits (-1, INT32_MAX) bound to task_params, seeded from the CLI
default alongside the other inherited parameters. The raw json_value
read and its redundant comment are gone, and schema evaluation brings
type and range validation for free.

macOS/iOS:

Linux:

Android:

Windows:

openEuler:

  • DISABLED
  • openEuler x86 (310p)
  • openEuler x86 (910b, ACL Graph)
  • openEuler aarch64 (310p)
  • openEuler aarch64 (910b, ACL Graph)

UI:

Discussion (0)

Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.

Sign in →

No comments yet. Sign in and be the first to say something.

More from llama.cpp releases