Ollama – v0.22.0
Ollama Update Alert!
If you're running your local LLMs on Apple Silicon, listen up! The latest release (v0.22.0-rc1) is officially here, and it's bringing some massive performance optimizations via an MLX update. This is a huge deal for anyone trying to squeeze every bit of juice out of their Mac hardware.
Here's the breakdown of what's new:
- Batch Processing Power: The `mlxrunner` now supports batching the sampler across multiple sequences. If you're working with large datasets or need to generate multiple outputs at once, this is a massive efficiency win!
- NVIDIA & MLX Bridge: In a super cool move for cross-platform workflows, MLX now supports importing models optimized via NVIDIA TensorRT. This makes it way easier to move your heavy-duty workflows between NVIDIA and Apple hardware without the headache.
- Precision Tokenization: A bug fix for multi-regex BPE offset handling is included, ensuring your tokenization stays precise and error-free during complex text processing tasks.
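Since the sampler batching happens server-side, the client-side win is simply fanning out concurrent requests. Here's a minimal sketch assuming Ollama's standard `/api/generate` HTTP endpoint and its documented payload fields (`model`, `prompt`, `stream`); the model name is a placeholder, and `send()` is stubbed so the sketch runs standalone:

```python
import json
from concurrent.futures import ThreadPoolExecutor

# The sampler batching lives in the server; clients just issue requests
# concurrently. In a real client each payload would be POSTed to
# OLLAMA_URL (e.g. with urllib or requests); send() is stubbed here so
# the sketch is self-contained.
OLLAMA_URL = "http://localhost:11434/api/generate"

def make_payload(prompt, model="llama3"):  # model name is a placeholder
    return json.dumps({"model": model, "prompt": prompt, "stream": False})

def send(payload):
    # Stub: pretend the server answered; a real client would POST
    # `payload` to OLLAMA_URL and decode the JSON response.
    return {"done": True, "request": json.loads(payload)}

prompts = ["Summarize MLX.", "Summarize batching.", "Summarize BPE."]
with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
    results = list(pool.map(send, map(make_payload, prompts)))

for r in results:
    print(r["request"]["prompt"], "->", r["done"])
```

The heavy lifting (batching the sampler across sequences) is invisible from this side; the point is that multiple in-flight generations are now where the efficiency shows up.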
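To see why multi-regex offset handling is easy to get wrong, here's an illustrative sketch (not Ollama's actual code): when a second regex pass runs over a slice produced by the first, its match offsets are relative to the slice and must be shifted back into original-string coordinates, or downstream token offsets silently drift:

```python
import re

# Illustrative only: BPE-style pre-tokenization often runs more than
# one regex pass. The inner pattern's offsets are relative to the
# chunk it matched in, so they must be shifted by the chunk's start
# to index into the original string.
WORDS = re.compile(r"\S+")
DIGITS = re.compile(r"\d+")

def digit_spans(text):
    spans = []
    for w in WORDS.finditer(text):
        for d in DIGITS.finditer(w.group()):
            # shift slice-relative offsets back by the chunk's start
            spans.append((w.start() + d.start(), w.start() + d.end()))
    return spans

text = "mlx 0.22 beats 0.21"
print(digit_spans(text))  # each span indexes into `text` itself
```

Forgetting the `w.start()` shift is exactly the class of bug this kind of fix addresses: the tokens look right, but their reported positions point at the wrong characters.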
Time to pull that update and start benchmarking!
