MLX-LM – v0.28.3
🔥 MLX LM v0.28.3 is LIVE! 🔥
Heads up, Apple silicon LLM tinkerers: MLX LM just dropped a massive update! This release is packed with refinements and new features to help you build, train & serve even better models.
Here's the breakdown:
- Memory Efficiency: State Space Models (SSMs) are leaner now.
- MoE Magic: Lots of improvements to Mixture of Experts, including LoRA fixes, bailing logic, and a new LFM2 option (see the fine-tuning sketch after this list)!
- Qwen3-VL Support: Visual language model support added with Qwen3-VL, plus a dense version (quick-start sketch below)! 🖼️
- Faster GPT-2: Batch processing for GPT-2 just got quicker.
- DWQ Tweaks: Distilled Weight Quantization refined with temperature adjustments (quantization sketch below).
- Python 3.9 Love: Qwen3 support now extends to Python 3.9 users!
- Plus: Cleaned-up params, simplified I/O, CUDA install fixes, batched SSM masking, gradient accumulation, data-parallel eval, Jamba support & LLM benchmarks!
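
If the MoE LoRA fixes are what you're after, fine-tuning still goes through the standard `mlx_lm.lora` command. A minimal sketch, assuming an MLX-converted MoE checkpoint and a local dataset directory; both paths are placeholders, and the hyperparameters are just illustrative defaults:

```shell
# Fine-tune with LoRA; the model repo and data path are placeholders.
mlx_lm.lora \
  --model mlx-community/your-moe-model-4bit \
  --train \
  --data ./data \
  --iters 600 \
  --batch-size 4
```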
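
Kicking the tires on a newly supported architecture is the usual few lines of Python; the checkpoint name below is a placeholder, so swap in whichever MLX-converted model you want to test:

```python
# Minimal generation sketch with mlx-lm; the model repo is a placeholder.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/some-new-model-4bit")

# Build the prompt with the model's chat template.
messages = [{"role": "user", "content": "Write a haiku about Apple silicon."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```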
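
On the quantization front, DWQ has a dedicated flow of its own (check the repo docs for the recipe), but if you just want a smaller checkpoint, the standard quantized-convert path is a one-liner. This sketch uses the plain conversion path, not DWQ itself, and the source repo name is illustrative:

```python
# Standard quantized conversion (not the DWQ recipe); repo is a placeholder.
from mlx_lm import convert

convert(
    "some-org/some-model",       # source Hugging Face repo (placeholder)
    mlx_path="mlx_model_4bit",   # where the converted weights land
    quantize=True,               # defaults to 4-bit quantization
)
```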
Dig into the full changelog: there's a ton here to play with!
