Ollama – v0.15.0-rc3: Revert “model: add MLA absorption for glm4moelite (#13810)” (#13869)

🚨 Ollama v0.15.0-rc3 just dropped — and it’s a revert!

The team pulled back the MLA (Multi-head Latent Attention) absorption patch for GLM4-MoE-Lite (#13810) in #13869.

Why? Stability. Compatibility. No coffee spills today. ☕🚫

This isn’t a feature removal — it’s a strategic pause. If you’re running GLM4-MoE-Lite, stick with v0.14.x for now. The MLA integration is still being reworked — expect a more stable version in a future release.

Ollama’s still your go-to for local LLMs: Llama 3, Mistral, Phi-4, Gemma — all running smoothly. Just hold off on GLM4-MoE-Lite’s latest “enhancement” until the next release.

Keep tinkering — good things come to those who wait (and test). 🚀

🔗 View Release