Ollama – v0.22.0

🚀 Ollama Update Alert! 🚀

If you're running your local LLMs on Apple Silicon, listen up! The latest release (v0.22.0) is officially here, and it brings some major performance optimizations via an MLX update. This is a big deal for anyone trying to squeeze every last bit of performance out of their Mac hardware.

Here's the breakdown of what's new:

  • Batch Processing Power: The `mlxrunner` now supports batching the sampler across multiple sequences. If you're working with large datasets or need to generate multiple outputs at once, this is a massive efficiency win! 📈
  • NVIDIA & MLX Bridge: In a super cool move for cross-platform workflows, MLX now supports importing models optimized via NVIDIA TensorRT. This makes it way easier to move your heavy-duty workflows between NVIDIA and Apple hardware without the headache.
  • Precision Tokenization: A bug fix for multi-regex BPE offset handling is included, ensuring token offsets stay accurate during complex text processing tasks.
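To picture what "batching the sampler" buys you: instead of looping over sequences and sampling each next token separately, one vectorized call samples a token for every sequence at once. Here's a minimal NumPy sketch of that idea; the names and shapes are illustrative assumptions, not the actual `mlxrunner` code (which operates on MLX arrays).

```python
import numpy as np

def sample_batch(logits: np.ndarray, temperature: float = 1.0, seed: int = 0) -> np.ndarray:
    """Sample one next-token id per sequence from a [batch, vocab] logits array.

    Illustrative only: the point is that a single vectorized sampling step
    serves every sequence, replacing a per-sequence Python loop.
    """
    rng = np.random.default_rng(seed)
    scaled = logits / max(temperature, 1e-6)
    # Numerically stable softmax over the vocab axis, for all rows at once.
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    probs = np.exp(scaled)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Gumbel-max trick: one argmax per row stands in for a per-sequence choice().
    gumbel = -np.log(-np.log(rng.random(probs.shape) + 1e-12))
    return np.argmax(np.log(probs) + gumbel, axis=-1)

# Three sequences, toy vocabulary of five tokens, sampled in a single call.
logits = np.array([[2.0, 0.1, 0.1, 0.1, 0.1],
                   [0.1, 3.0, 0.1, 0.1, 0.1],
                   [0.1, 0.1, 0.1, 4.0, 0.1]])
next_tokens = sample_batch(logits, temperature=0.5)
print(next_tokens.shape)  # one token id per sequence: (3,)
```

The win is the same one the release notes describe: the sampling work amortizes across the whole batch in one array operation instead of scaling linearly with the number of sequences.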

Time to pull that update and start benchmarking! 🛠️

🔗 View Release