Ollama – v0.13.2-rc2: ggml: handle all streams (#13350)
🚀 Ollama v0.13.2-rc2 just dropped — and it’s a quiet win for stability!
The big fix? ggml now handles all GPU/CPU streams properly, which should mean fewer leaked buffers and less memory weirdness. Think of it as finally tidying up your AI workshop so every tensor has its place.
✨ Why you’ll care:
- Smoother inference on multi-GPU setups
- Fewer crashes during heavy async loads
- Better memory cleanup = longer, happier sessions
If you’ve been battling weird memory hiccups with Llama 3 or DeepSeek-R1 on Linux/macOS/Windows — this is your upgrade. Quiet change, huge impact. 💨
Upgrade now and run like a champ.
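Not sure if you're already on the latest build? Here's a small sketch of a version check before upgrading. The `needs_upgrade` helper is hypothetical (not part of Ollama); `ollama -v` and the official install script are the documented ways to check and upgrade on Linux.

```shell
# Hypothetical helper: decide whether an installed version is older than a
# target version, using sort -V for proper version ordering.
needs_upgrade() {
  current="$1"   # e.g. parsed from `ollama -v`
  target="$2"    # e.g. "0.13.2"
  # If the two differ and `current` sorts first, an upgrade is available.
  [ "$current" != "$target" ] && \
    [ "$(printf '%s\n%s\n' "$current" "$target" | sort -V | head -n1)" = "$current" ]
}

if needs_upgrade "0.13.1" "0.13.2"; then
  echo "upgrade available"
  # On Linux, re-running the official install script upgrades in place:
  #   curl -fsSL https://ollama.com/install.sh | sh
  # On macOS/Windows, grab the new build from the releases page instead.
fi
```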
