• MLX-LM – v0.30.0

    MLX LM v0.30.0 is live πŸš€ β€” Apple Silicon LLMs just got a serious power-up!

    • Server performance fixed: No more busy-waiting β€” idle polling is now lean, quiet, and efficient. πŸ› οΈ
    • Transformers v5 fully supported: All the latest tokenizer tweaks, model updates, and Hugging Face magic? Covered. πŸ€–
    • MiMo V2 Flash enabled: the new MiMo-V2-Flash models now run with optimized attention — faster inference, lower latency. ⚡
    • Better error messages: Batching failed? Now you’ll know why β€” no more cryptic crashes. πŸ“’
    • Model parallel generation: Split massive models across GPUs like a pro. Scale your LLMs without rewriting code. 🧩
    • Chat template fixes: `apply_chat_template` finally wraps correctly β€” no more dict chaos in your prompts. ✨

    Thousands of Hugging Face models, quantized, fine-tuned, and served β€” all on your M-series chip. Time to upgrade and push your AI stack further. πŸš€
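
    The chat-template fix is easiest to appreciate by looking at what `apply_chat_template` has to do: flatten a list of role/content dicts into one prompt string. Here is a minimal pure-Python sketch of that flattening, using Gemma-style turn markers purely for illustration; real code should call the tokenizer's own method:

    ```python
    # Minimal illustration of what a chat template does: turn a list of
    # {"role", "content"} dicts into a single prompt string. The turn
    # markers mimic Gemma's format; this is a sketch, not mlx-lm's code.
    def apply_chat_template(messages, add_generation_prompt=True):
        parts = []
        for m in messages:
            # Gemma-style templates map "assistant" to a "model" turn.
            role = "model" if m["role"] == "assistant" else "user"
            parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
        if add_generation_prompt:
            parts.append("<start_of_turn>model\n")
        return "".join(parts)

    prompt = apply_chat_template([
        {"role": "user", "content": "Hi!"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "What is MLX?"},
    ])
    print(prompt)
    ```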

    πŸ”— View Release

  • Ollama – v0.13.5

    πŸš€ Ollama v0.13.5 just dropped β€” and it’s a quiet game-changer for Gemma users!

    Now you can use function calling with Gemma 2B locally β€” yes, really. Trigger webhooks, query databases, fetch weather, or call APIs directly from your tiny-but-mighty local Gemma model. No cloud needed.

    πŸ’‘ Why it’s cool:

    • Function calling was already in Llama 3 & Mistral β€” now Gemma joins the party.
    • Perfect for building private, lightweight AI agents that do stuff, not just chat.

    Under the hood: parser fixes + smoother rendering = fewer hiccups, more flow.
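
    Concretely, tool calling works by attaching an OpenAI-style `tools` array to a request against Ollama's `/api/chat` endpoint. The sketch below only builds the payload; `get_weather` is a made-up example tool:

    ```python
    import json

    # Build (but don't send) a /api/chat request with one tool definition.
    # "get_weather" is hypothetical; Ollama's default port is 11434.
    payload = {
        "model": "gemma:2b",
        "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        "stream": False,
    }
    body = json.dumps(payload)
    print(body)  # POST this to http://localhost:11434/api/chat
    ```

    When the model decides to use a tool, its reply carries a `tool_calls` entry in the message; your code runs the function and appends the result back as a `role: "tool"` message before asking for the final answer.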

    Upgrade in one line:

    ```bash
    ollama pull gemma:2b
    ```

    Go build something that acts β€” not just responds. 🎯

    πŸ”— View Release

  • Ollama – v0.13.5-rc1

    πŸš€ Ollama v0.13.5-rc1 just dropped β€” and it’s a game-changer for Gemma users!

    Now you can use function calling with Gemma 2B 🎯

    Gemma can now dynamically invoke external tools and APIs β€” think real-time data lookup, code execution, or API calls β€” all triggered by natural language. No more clunky workarounds. Just define your tools, and let Gemma call them like a pro agent.

    Plus:

    • Smoother JSON parsing for tool definitions (no more malformed calls!)
    • Minor performance boosts and bug fixes
    • RC1 = stable enough for early adopters to test in real workflows

    Grab it: `ollama pull gemma:2b` and start building smarter, reactive agents today.

    Full release notes coming soon β€” but this feature? Totally worth the upgrade. πŸ› οΈ

    πŸ”— View Release

  • ComfyUI – v0.5.1

    ComfyUI v0.5.1 just dropped β€” and it’s a quiet hero πŸ› οΈ

    Fixed critical crashes in custom nodes & image loading (no more mid-generate soul-crushes).

    Memory management got a serious upgrade β€” smoother runs on weaker GPUs, even with massive workflows.

    Error messages now actually tell you what went wrong… no more “something broke” mysteries.

    UI feels slicker: cleaner node links, buttery drag-and-drop.

    And hey β€” partial WebGPU support for Apple Silicon users is now live 🍎⚑ (Big things coming in v0.6!)

    If you’ve been battling crashes or lag, this is your sign to update. Keep building! πŸš€

    πŸ”— View Release

  • Ollama – v0.13.5-rc0: GGML update to ec98e2002 (#13451)

    Ollama v0.13.5-rc0 just dropped β€” and it’s all about speed under the hood! πŸš€

    The GGML inference engine got a major upgrade to commit `ec98e2002`, with smarter, leaner internals:

    • βœ… MaskBatchPadding removed β€” Less padding = less overhead. KQ masking is now cleaner and faster.
    • 🚫 NVIDIA Nemotron 3 Nano support paused β€” Temporarily pulled for stability. Coming back stronger soon!
    • πŸ”§ Solar Pro tweaks β€” Under-the-hood adjustments, still being verified. If you’re using Solar, test your models!

    No flashy UI β€” just a lighter, faster engine for local LLM inference. Think of it like swapping your car’s engine for a turbocharged version that runs cooler.

    Pro tip: Custom models? Run sanity checks β€” GGML changes can ripple through quantization and attention layers.

    Stay sharp, tinkerers. The local LLM revolution keeps accelerating. πŸ› οΈ

    πŸ”— View Release

  • Lemonade – v9.1.1

    πŸ”₯ Lemonade v9.1.1 just dropped β€” and it’s a game-changer for local LLM folks!

    • New AI features live: Embeddings, reranking, and transcriptions are now built right in β€” smarter search, better relevance, and full speech-to-text without leaving the app. πŸŽ™οΈπŸ§ 
    • ROCm support for Strix Point (gfx1150): AMD Ryzen AI users β€” your hardware is finally fully optimized. No more waiting, just raw performance.
    • Install & config smoothed out: FLM install bugs fixed, `--extra-models-dir` added for custom models, and CMake warnings? Gone.
    • UI/UX glow-up: Not-found errors are now helpful, site icons load reliably, and the logo wall got a slick refresh.
    • Bug slaying squad in action: Zero-byte downloads? Squashed. Model registration glitches? Fixed. Health request spam? Suppressed.
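
    For the new embeddings and reranking features, request bodies along these lines should work against Lemonade's OpenAI-compatible server. Everything here is built locally as a sketch; the model names are placeholders, so check which models you actually have installed and Lemonade's API docs for the exact endpoint paths:

    ```python
    import json

    # Example request bodies (not sent). Model names are placeholders.
    embed_req = {
        "model": "nomic-embed-text",  # placeholder embedding model
        "input": ["local llms", "cloud llms"],
    }
    rerank_req = {
        "model": "bge-reranker-v2-m3",  # placeholder reranker
        "query": "run models on AMD Ryzen AI",
        "documents": ["ROCm setup guide", "banana bread recipe"],
    }
    print(json.dumps(embed_req), json.dumps(rerank_req))
    ```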

    Plus, the README and install guides were completely rewritten β€” clearer than ever for newcomers.

    Smarter. Faster. More open. πŸš€

    Time to run your LLMs like a pro β€” locally, and at full speed.

    πŸ”— View Release

  • Home Assistant Voice PE – 25.12.0

    Home Assistant Voice PE just dropped v25.12.0 πŸš€ β€” and it’s a game-changer for offline voice control!

    Now backed by the Open Home Foundation, this ESPHome-powered gem lets you command your smart home without the cloud β€” perfect for privacy-first tinkerers.

    Biggest update? Music Assistant’s Sendspin multi-room audio support (public preview)! Sync tunes across speakers in real-time β€” think whole-home playlists, party mode activated by voice. 🎢

    Shoutout to @theHacker for their first contribution β€” welcome to the crew! πŸ™Œ

    78 releases strong and growing.

    Full changelog: 25.11.0…25.12.0

    Your home just got smarter… and louder.

    πŸ”— View Release

  • ComfyUI – v0.5.0

    ComfyUI v0.5.0 just landed β€” and it’s a game-changer for workflow builders 🎨✨

    • LatentUpscale Node β€” Upscale in latent space first for smoother, cleaner details with less noise.
    • Smarter Node Search β€” Type “upscale” and it gets you. No more scrolling through 50 nodes.
    • Memory Management Upgrade β€” Fewer crashes, better GPU predictability when running big models.
    • Custom Nodes Overhauled β€” Install, update, and manage third-party nodes like a pro. No more “why isn’t it showing up?!”
    • Dark Mode 2.0 β€” Sleeker, higher contrast. Your eyes won’t scream after midnight prompt sessions.
    • Save as Template β€” Turn your favorite chains into reusable templates. Perfect for teams or recurring styles.
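
    The LatentUpscale node slots in between your sampler and the VAE decode, so the upscale happens before pixels exist. Here is a fragment of an API-format workflow sketching that wiring; the node ids and upstream references ("3" for a KSampler, "4" for a checkpoint loader) are placeholders for your own graph:

    ```python
    import json

    # Two nodes from an API-format ComfyUI workflow: upscale the latent,
    # then decode it. Ids and upstream links are placeholders.
    workflow = {
        "7": {
            "class_type": "LatentUpscale",
            "inputs": {
                "samples": ["3", 0],  # latent from an upstream KSampler
                "upscale_method": "nearest-exact",
                "width": 1536,
                "height": 1536,
                "crop": "disabled",
            },
        },
        "8": {
            "class_type": "VAEDecode",
            "inputs": {"samples": ["7", 0], "vae": ["4", 2]},
        },
    }
    print(json.dumps(workflow, indent=2))
    ```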

    Plus: 40+ bugs squashed, including the dreaded “random node disconnects on reload.” πŸžβœ…

    If you’re deep in Stable Diffusion workflows β€” this update is your new best friend. Go tweak, test, and create! πŸš€

    https://www.comfy.org/

    πŸ”— View Release

  • MLX-LM – v0.29.0

    πŸš€ MLX LM v0.29.0 is live β€” and it’s a beast!

    • Batch generation just got 2x faster thanks to `wired_limit` fixes β€” your server will thank you.
    • RoPE & SuScaledRoPE fixed for `rnj-1` and others β€” smoother attention, less drift.
    • Dequantize bug squashed βœ… Now using the right function β€” cleaner outputs, better precision.
    • Repetition penalty now defaults to 0.0 (off) — no surprise sampling changes out of the box. 🎯
    • DSV32 & Gemma3 β€” bugs gone, stable and ready to deploy.
    • SSM batching fixed β€” state-space models now behave on the server. πŸ’‘
    • Nemotron 3 added! πŸŽ‰ Go ahead, test it.
    • Devstral-2 now works properly β€” no more surprises. πŸ‘

    Big shoutout to first-time contributors: @otarkhan, @devnamrits, @DePasqualeOrg, and @inferencers β€” welcome to the crew! πŸ™Œ

    Update now β€” your LLMs are ready for a speed run. πŸ› οΈ

    Full changelog: v0.28.4…v0.29.0

    πŸ”— View Release

  • Ollama – v0.13.4

    πŸš€ Ollama v0.13.4 just dropped β€” tiny update, big impact for Nemotron users!

    Fix: the `think` option now actually respects your custom prompts. No more ignored reasoning directives — if you told the model to “think step by step,” it finally will. 🤔✨

    Perfect for RAG builders, prompt engineers, and anyone relying on structured reasoning. No new features, just cleaner, more reliable internal reflection.
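
    In API terms, the fix means a reasoning directive in your own prompt is honored alongside the `think` flag. A request sketch, built locally rather than sent; the model tag is a placeholder for whatever Nemotron variant you run:

    ```python
    import json

    # /api/chat request enabling thinking plus a custom reasoning
    # directive. Built here, not sent; the model tag is a placeholder.
    payload = {
        "model": "nemotron-mini",
        "messages": [
            {"role": "system", "content": "Think step by step before answering."},
            {"role": "user", "content": "Why is the sky blue?"},
        ],
        "think": True,
        "stream": False,
    }
    print(json.dumps(payload))
    ```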

    Upgrade if you’re using Nemotron models β€” your prompts just got smarter. πŸ› οΈ

    πŸ”— View Release