Ollama – v0.15.1-rc1

🚀 Ollama v0.15.1-rc1 just dropped, and it's a quiet powerhouse!

GLM4-MoE-Lite now quantizes more of its tensors to Q8_0 → smaller memory footprint, faster inference, same brainpower. Perfect for laptops, Raspberry Pis, or any other edge device running low on RAM.
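Curious how much that buys you? Here's a minimal back-of-the-envelope sketch (not Ollama code) comparing FP16 with Q8_0, assuming the GGUF-style Q8_0 layout of 32 int8 weights plus one FP16 scale per block (~8.5 bits per weight); the 7B parameter count is just an illustrative assumption.

```python
# Rough memory estimate: FP16 vs Q8_0 (GGUF-style blocks of 32 int8 weights
# plus one 2-byte FP16 scale per block). Illustrative only.

def fp16_bytes(n_params: int) -> int:
    """FP16 stores 2 bytes per weight."""
    return n_params * 2

def q8_0_bytes(n_params: int) -> int:
    """Each Q8_0 block holds 32 int8 weights (32 bytes) + one FP16 scale (2 bytes)."""
    blocks = (n_params + 31) // 32
    return blocks * (32 + 2)

if __name__ == "__main__":
    n = 7_000_000_000  # hypothetical 7B-parameter model, not a real GLM4-MoE-Lite size
    fp16 = fp16_bytes(n)
    q8 = q8_0_bytes(n)
    print(f"FP16 : {fp16 / 1e9:.1f} GB")
    print(f"Q8_0 : {q8 / 1e9:.1f} GB  (~{100 * (1 - q8 / fp16):.0f}% smaller)")
```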

And goodbye, double BOS tokens! 🎉 Prompts no longer get an extra beginning-of-sequence token prepended, so your outputs come out cleaner and smoother.

This is a release candidate, so it's stable but still being polished. If you're running GLM4-MoE-Lite or just want leaner, faster models, update now and feel the difference.
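Once you've updated, a quick sanity check is to hit the local Ollama REST API. This is a minimal sketch, assuming the server is running on the default port 11434; the model tag glm4-moe-lite is a placeholder and may not match the actual name in the registry.

```python
# Minimal smoke test against a local Ollama server via its REST API (stdlib only).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "glm4-moe-lite"  # hypothetical tag; substitute whatever model you actually pulled

payload = json.dumps({
    "model": MODEL,
    "prompt": "In one sentence, what does Q8_0 quantization trade off?",
    "stream": False,  # ask for a single JSON object instead of a token stream
}).encode("utf-8")

req = urllib.request.Request(
    OLLAMA_URL,
    data=payload,
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["response"])
```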

🧠 Pro tip: Q8_0 = less memory, same genius.

🔗 View Release