MLX-LM – v0.29.0

🚀 MLX LM v0.29.0 is live, and it’s a beast!

  • Batch generation just got 2x faster thanks to `wired_limit` fixes. Your server will thank you.
  • RoPE & SuScaledRoPE fixed for `rnj-1` and others: smoother attention, less drift.
  • Dequantize bug squashed ✅ Now using the right function: cleaner outputs, better precision.
  • Repetition penalty defaults to 0.0: less annoying repetition from day one. 🎯
  • DSV32 & Gemma3: bugs gone, stable and ready to deploy.
  • SSM batching fixed: state-space models now behave on the server. 💡
  • Nemotron 3 added! 🎉 Go ahead, test it.
  • Devstral-2 now works properly: no more surprises. 👍

Big shoutout to first-time contributors: @otarkhan, @devnamrits, @DePasqualeOrg, and @inferencers. Welcome to the crew! 🙌

Update now: your LLMs are ready for a speed run. 🛠️
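
Not sure which version you are on? The usual upgrade route is `pip install -U mlx-lm`. After that, a quick check like the one below can confirm you landed on v0.29.0 or newer. This is a minimal sketch that reads the version from the installed package metadata. The helper names (`version_tuple`, `is_at_least`, `check_mlx_lm`) are made up for this post and are not part of MLX LM, and the parsing only handles plain numeric versions.

```python
from importlib import metadata

REQUIRED = "0.29.0"  # the release announced in this post


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '0.29.0' into (0, 29, 0).

    Parsing stops at the first non-numeric piece (e.g. 'dev1'),
    and the result is padded with zeros to three components.
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_at_least(installed: str, required: str = REQUIRED) -> bool:
    """Return True if `installed` is the same as or newer than `required`."""
    return version_tuple(installed) >= version_tuple(required)


def check_mlx_lm() -> str:
    """Report on the installed mlx-lm package (hypothetical helper)."""
    try:
        installed = metadata.version("mlx-lm")
    except metadata.PackageNotFoundError:
        return "mlx-lm is not installed"
    if is_at_least(installed):
        return f"mlx-lm {installed} is up to date"
    return f"mlx-lm {installed} is older than {REQUIRED}"


if __name__ == "__main__":
    print(check_mlx_lm())
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.9.0" would sort after "0.29.0".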

Full changelog: [v0.28.4…v0.29.0](link)
