Ollama – v0.13.1: llm: Don’t always evict models on CPU-only systems
Big win for CPU folks! 🎉 Ollama v0.13.1 just dropped and fixes a major pain point: models no longer get constantly evicted from memory on CPU-only systems. 🐢💻
Before: Ollama thought “no VRAM = always evict,” causing annoying reloads even when RAM was plentiful.
Now: It only evicts when actually needed—like when you’re juggling multiple huge models and RAM is tight.
Result? Fewer reloads and snappier responses on laptops, older machines, and GPU-less cloud instances. Load your Llama 3 or Phi-4 once and let it stay loaded.
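
Want to check it on your own box? Here's a minimal sketch (Python, standard library only) that assumes a local Ollama server on the default port 11434 and a model you've already pulled, llama3 in this example. It sends two prompts back to back and queries /api/ps in between; on a CPU-only machine running v0.13.1 or later, the model should still show up as loaded, and the second call shouldn't pay the reload cost.

```python
import json
import time
import urllib.request

BASE = "http://localhost:11434"   # default Ollama address (assumes a local install)
MODEL = "llama3"                  # any model you've already pulled works here

def generate(prompt: str) -> float:
    """Send one non-streaming /api/generate request and return wall-clock seconds."""
    body = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        f"{BASE}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    start = time.time()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.time() - start

def loaded_models() -> list[str]:
    """List models currently resident in memory via /api/ps (same data as `ollama ps`)."""
    with urllib.request.urlopen(f"{BASE}/api/ps") as resp:
        return [m["name"] for m in json.load(resp).get("models", [])]

first = generate("Say hi.")                 # cold call: includes the initial model load
print(f"first call:  {first:.1f}s, loaded: {loaded_models()}")

second = generate("Say hi again.")          # should reuse the already-loaded model
print(f"second call: {second:.1f}s, loaded: {loaded_models()}")
# With the v0.13.1 fix, the model stays in the /api/ps list between calls on a
# CPU-only system, and the second request no longer pays the reload penalty.
```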
Fixes #13227. CPU users, rejoice! 🙌
