Ollama – v0.15.1-rc1
Ollama v0.15.1-rc1 just dropped, and it's a quiet powerhouse!
GLM4-MoE-Lite now quantizes more tensors to Q8_0: smaller footprint, faster inference, same brainpower. Perfect for laptops, Raspberry Pis, or any edge device running low on RAM.
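If you're curious what Q8_0 buys you: each block of 32 weights is stored as 32 signed 8-bit integers plus one shared scale, so a weight costs roughly 8.5 bits instead of 16 or 32. Here is a minimal NumPy sketch of that idea (illustrative only, not the actual ggml implementation Ollama uses):

```python
import numpy as np

def quantize_q8_0(x, block_size=32):
    # Split into blocks; each block keeps one scale plus int8 values,
    # mirroring the Q8_0 idea (illustrative, not the real ggml layout).
    x = x.reshape(-1, block_size)
    scale = np.abs(x).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid divide-by-zero on all-zero blocks
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def dequantize_q8_0(q, scale):
    # Recover approximate float weights from int8 values and per-block scales.
    return q.astype(np.float32) * scale

weights = np.linspace(-1.0, 1.0, 64, dtype=np.float32)
q, s = quantize_q8_0(weights)
restored = dequantize_q8_0(q, s).reshape(-1)
print(np.max(np.abs(weights - restored)))  # small round-off error
```

The round-off error per weight is at most half a quantization step, which is why Q8_0 is generally considered near-lossless in practice.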
And goodbye, weird double BOS tokens! No more repetitive beginnings: your outputs are now cleaner and smoother.
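The gist of the double-BOS problem: if the prompt template already supplies a beginning-of-sequence token, the tokenizer shouldn't prepend a second one. A minimal sketch of deduplicating leading BOS tokens (the token id and helper name are hypothetical, not Ollama's actual code):

```python
BOS_ID = 1  # hypothetical BOS token id; the real value depends on the tokenizer

def strip_duplicate_bos(tokens, bos_id=BOS_ID):
    # Collapse a run of leading BOS tokens down to a single one.
    i = 0
    while i + 1 < len(tokens) and tokens[i] == bos_id and tokens[i + 1] == bos_id:
        i += 1
    return tokens[i:]

print(strip_duplicate_bos([1, 1, 42, 7]))  # → [1, 42, 7]
```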
This is a release candidate, so it's stable but still being polished. If you're running GLM4-MoE-Lite or just want leaner, faster models, update now and feel the difference.
Pro tip: Q8_0 = less memory, same genius.
