MLX-LM – v0.28.3

πŸ”₯ MLX LM v0.28.3 is LIVE! πŸ”₯

Heads up, Apple silicon LLM tinkerers – a massive update for MLX LM just dropped! This release is packed with refinements and new features to help you build, train & serve even better models.

Here’s the breakdown:

  • Memory Efficiency: State Space Models (SSMs) are leaner now. πŸ™Œ
  • MoE Magic: Lots of improvements to Mixture of Experts – LoRA fixes, gating logic, and a new LFM2 option!
  • Qwen3-VL Support: Vision-language model support added with Qwen3-VL (plus a dense version!). πŸ–ΌοΈ
  • Faster GPT-2: Batch processing for GPT-2 just got quicker.
  • DWQ Tweaks: Distilled weight quantization (DWQ) refined with temperature adjustments.
  • Python 3.9 Love: Qwen3 support now extends to Python 3.9 users!
  • Plus: Cleaned up params, simplified I/O, CUDA install fixes, batched SSM masking, gradient accumulation, data parallel eval, Jamba support & LLM Benchmarks! πŸ“Š
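
Want to take the new release for a spin? Here's a minimal sketch using the mlx_lm Python API – the model repo below is just an example placeholder (not something named in the release notes), so swap in whatever MLX-format model you want to run:

```python
# Minimal sketch: load an MLX-format model and generate with mlx_lm.
# Upgrade first with: pip install -U mlx-lm
from mlx_lm import load, generate

# Example placeholder repo – any MLX-converted model from the Hub works.
model, tokenizer = load("mlx-community/Qwen3-4B-4bit")

messages = [{"role": "user", "content": "Explain Mixture of Experts in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Streams tokens to stdout when verbose=True and returns the full completion.
text = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(text)
```

The same model path works from the command line too, e.g. `mlx_lm.generate --model mlx-community/Qwen3-4B-4bit --prompt "hello"`.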

Dig into the full changelog – there’s a ton here to play with! πŸŽ‰

πŸ”— View Release