Updates
llama.cpp Axes Recurrent-State Bug That Choked Inference Servers
The b8940 release of llama.cpp finally fixes partial tensor reads in recurrent-state handling, a bug that quietly broke inference servers during streaming. The fix changes how state is persisted, so teams running long-lived or partial-write deployments should review their setups before upgrading.
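The general failure mode behind partial reads is a classic short-read bug: a single read call may return fewer bytes than the full state blob, and code that assumes one call fills the buffer silently truncates the restored state. The llama.cpp patch itself is not reproduced here; the sketch below is a generic C++ illustration of the pattern, with `ChunkedSource`, `read_once`, and `read_full` as hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical chunked source (think socket, pipe, or buffered file):
// each read may return fewer bytes than requested.
struct ChunkedSource {
    std::vector<unsigned char> data;
    std::size_t pos = 0;
    std::size_t max_chunk;  // simulates short reads

    // Copies up to len bytes into dst; returns bytes actually copied.
    std::size_t read(unsigned char* dst, std::size_t len) {
        std::size_t n = std::min({len, max_chunk, data.size() - pos});
        std::memcpy(dst, data.data() + pos, n);
        pos += n;
        return n;
    }
};

// Buggy pattern: assume one read() fills the buffer. If the source
// returns a short read, the restored state is silently truncated.
std::size_t read_once(ChunkedSource& src, unsigned char* dst, std::size_t len) {
    return src.read(dst, len);
}

// Fixed pattern: loop until the full state blob has been read
// or the source is exhausted.
std::size_t read_full(ChunkedSource& src, unsigned char* dst, std::size_t len) {
    std::size_t total = 0;
    while (total < len) {
        std::size_t n = src.read(dst + total, len - total);
        if (n == 0) break;  // end of stream
        total += n;
    }
    return total;
}
```

With a 1024-byte state blob and a source that delivers at most 100 bytes per call, `read_once` hands back only the first chunk, while `read_full` recovers the whole blob; the single-call pattern is the kind of bug that only surfaces under streaming, where short reads become common.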