Ollama – v0.13.5-rc0: GGML update to ec98e2002 (#13451)

Ollama v0.13.5-rc0 just dropped – and it’s all about speed under the hood! 🚀

The GGML inference engine got a major upgrade to commit `ec98e2002`, with smarter, leaner internals:

  • ✅ MaskBatchPadding removed – Less padding = less overhead. KQ masking is now cleaner and faster.
  • 🚫 NVIDIA Nemotron 3 Nano support paused – Temporarily pulled for stability. Coming back stronger soon!
  • 🔧 Solar Pro tweaks – Under-the-hood adjustments, still being verified. If you’re using Solar, test your models!

No flashy UI – just a lighter, faster engine for local LLM inference. Think of it like swapping your car’s engine for a turbocharged version that runs cooler.

Pro tip: Custom models? Run sanity checks – GGML changes can ripple through quantization and attention layers.
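One simple way to sanity-check: replay a few fixed prompts against the local Ollama HTTP API (POST /api/generate on http://localhost:11434) and compare the answers with what you got before the upgrade. The sketch below is just that idea in plain Python standard library code; the model tag `my-custom-model` and the prompts are placeholders, so swap in whatever you actually run:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "my-custom-model"  # placeholder: replace with your own model tag

# Prompts whose answers are easy to eyeball after an engine update.
PROMPTS = [
    "Reply with exactly one word: ping",
    "What is 12 * 12? Answer with the number only.",
]

def generate(prompt: str) -> str:
    """Send a non-streaming generate request to the local Ollama server."""
    payload = json.dumps({
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
    }).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    for prompt in PROMPTS:
        print(f"PROMPT:   {prompt}")
        print(f"RESPONSE: {generate(prompt).strip()}\n")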

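```

If the responses suddenly look off (or a model refuses to load), that’s your cue to re-check quantization settings and re-pull or rebuild the model before blaming the new engine.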
Stay sharp, tinkerers. The local LLM revolution keeps accelerating. 🛠️

🔗 View Release