Ollama – v0.13.2-rc0: ggml update to b7108 (#12992)

Ollama v0.13.2-rc0 just dropped, and it’s a speed demon 🚀

The big win? ggml updated to b7108, powering faster, leaner LLM inference across the board.

Here’s what’s new:

  • ✅ TopK sampling optimized: faster token selection, especially on large-vocabulary models.
  • ✅ Metal argsort fixed: M-series chips now run smoother than ever 🍏
  • ✅ Bakllava image-to-text regression patched: multimodal models are back in business.
  • 🚨 Projector metadata warning: if you’re using multimodal GGUF files, double-check your metadata.
  • ⚠️ Vulkan fixes temporarily reverted: stability first, speed later.
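
New to the term? Top-k sampling keeps only the k highest-scoring candidate tokens and then samples among them. Here is a minimal Python sketch of the general idea. It is not ggml's implementation, and `top_k_sample` and its parameters are illustrative names made up for this example.

```python
import heapq
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample a token id from the k highest-scoring logits (illustrative only)."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Partial selection: O(n log k) rather than sorting the whole vocabulary.
    top = heapq.nlargest(k, enumerate(logits), key=lambda pair: pair[1])
    # Softmax over the survivors only, shifted by the max for numerical stability.
    max_logit = top[0][1]
    weights = [math.exp(score - max_logit) for _, score in top]
    token_ids = [token_id for token_id, _ in top]
    return rng.choices(token_ids, weights=weights, k=1)[0]


# Toy "vocabulary" of five tokens; with k=2 only ids 3 and 1 can be drawn.
logits = [0.1, 2.5, -1.0, 3.2, 0.7]
sampled = top_k_sample(logits, k=2)
```

Picking the top k without a full sort is the part that gets expensive as vocabularies grow, which is presumably why this step is worth optimizing.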

This is a release candidate: stable enough for daily use, fresh enough to feel the gains. If you’re on Apple Silicon, this is your upgrade.

Update now and keep those models rolling. 🤖💻
