MLX-LM – v0.28.3

πŸ”₯ MLX LM v0.28.3 is LIVE! πŸ”₯

Heads up, Apple silicon LLM tinkerers – a massive update for MLX LM just dropped! This release is packed with refinements and new features to help you build, train & serve even better models.

Here’s the breakdown:

  • Memory Efficiency: State Space Models (SSMs) are leaner now. πŸ™Œ
  • MoE Magic: Lots of improvements to Mixture of Experts – LoRA fixes, gating logic, and a new LFM2 option!
  • Qwen3-VL Support: Vision-language model support added with Qwen3-VL (plus a dense version!). πŸ–ΌοΈ
  • Faster GPT-2: Batch processing for GPT-2 just got quicker.
  • DWQ Tweaks: Distilled weight quantization (DWQ) refined with temperature adjustments.
  • Python 3.9 Love: Qwen3 support now extends to Python 3.9 users!
  • Plus: Cleaned up params, simplified I/O, CUDA install fixes, batched SSM masking, gradient accumulation, data parallel eval, Jamba support & LLM Benchmarks! πŸ“Š
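
Want to take the new release for a spin? Here's a minimal sketch using the mlx_lm Python API – the model repo below is just an example placeholder (not something named in the release notes), so swap in whatever MLX-format model you want to run:

```python
# Minimal sketch: load an MLX-format model and generate with mlx_lm.
# Upgrade first with: pip install -U mlx-lm
from mlx_lm import load, generate

# Example placeholder repo – any MLX-converted model from the Hub works.
model, tokenizer = load("mlx-community/Qwen3-4B-4bit")

messages = [{"role": "user", "content": "Explain Mixture of Experts in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Streams tokens to stdout when verbose=True and returns the full completion.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
```

The same model path works from the command line too, e.g. `mlx_lm.generate --model mlx-community/Qwen3-4B-4bit --prompt "hello"`.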

Dig into the full changelog – there’s a ton here to play with! πŸŽ‰

πŸ”— View Release