• Lemonade – v9.1.1

    🔥 Lemonade v9.1.1 just dropped, and it’s a game-changer for local LLM folks!

    • New AI features live: Embeddings, reranking, and transcriptions are now built right in. Smarter search, better relevance, and full speech-to-text without leaving the app. 🎙️🧠
    • ROCm support for Strix Point (gfx1150): AMD Ryzen AI users, your hardware is finally fully optimized. No more waiting, just raw performance.
    • Install & config smoothed out: FLM install bugs fixed, `--extra-models-dir` added for custom models, and CMake warnings? Gone.
    • UI/UX glow-up: Not-found errors are now helpful, site icons load reliably, and the logo wall got a slick refresh.
    • Bug slaying squad in action: Zero-byte downloads? Squashed. Model registration glitches? Fixed. Health request spam? Suppressed.
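
    Since Lemonade serves an OpenAI-style API, the new embeddings feature can be hit with a plain HTTP request. A minimal sketch; the port, the `/api/v1` path, and the model name below are assumptions for illustration, so check Lemonade's docs for the exact values. The request is only built here, not sent.

```python
import json
import urllib.request

def build_embeddings_request(base_url: str, model: str, texts: list[str]):
    """Build an OpenAI-style embeddings request for a local server.

    Endpoint path and model name are illustrative assumptions,
    not confirmed Lemonade specifics.
    """
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_embeddings_request(
    "http://localhost:8000", "nomic-embed-text", ["local LLMs", "smarter search"]
)
print(req.full_url)
# response = urllib.request.urlopen(req)  # uncomment with Lemonade running
```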

    Plus, the README and install guides were completely rewritten: clearer than ever for newcomers.

    Smarter. Faster. More open. 🚀

    Time to run your LLMs like a pro: locally, and at full speed.

    🔗 View Release

  • Home Assistant Voice Pe – 25.12.0

    Home Assistant Voice PE just dropped v25.12.0 🚀 and it’s a big win for offline voice control!

    Now backed by the Open Home Foundation, this ESPHome-powered gem lets you command your smart home without the cloud. Perfect for privacy-first tinkerers.

    Biggest update? Music Assistant’s Sendspin multi-room audio support (public preview)! Sync tunes across speakers in real time. Think whole-home playlists and party mode activated by voice. 🎶

    Shoutout to @theHacker for their first contribution. Welcome to the crew! 🙌

    78 releases strong and growing.

    Full changelog: 25.11.0…25.12.0

    Your home just got smarter… and louder.

    🔗 View Release

  • ComfyUI – v0.5.0

    ComfyUI v0.5.0 just landed, and it’s a major upgrade for workflow builders 🎨✨

    • LatentUpscale Node: Upscale in latent space first for smoother, cleaner details with less noise.
    • Smarter Node Search: Type “upscale” and it gets you. No more scrolling through 50 nodes.
    • Memory Management Upgrade: Fewer crashes, better GPU predictability when running big models.
    • Custom Nodes Overhauled: Install, update, and manage third-party nodes like a pro. No more “why isn’t it showing up?!”
    • Dark Mode 2.0: Sleeker, higher contrast. Your eyes won’t scream after midnight prompt sessions.
    • Save as Template: Turn your favorite chains into reusable templates. Perfect for teams or recurring styles.
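
    For context, upscaling in latent space just means enlarging the compact latent tensor before the VAE decode, so the decoder fills in detail rather than a pixel resize smearing it. A toy nearest-neighbor sketch of the idea on a 2-D grid; this illustrates the concept only and is not ComfyUI’s actual implementation.

```python
def upscale_latent(latent, factor):
    """Nearest-neighbor upscale of a 2-D 'latent' grid.

    Toy illustration of latent-space upscaling: each latent value is
    repeated into a factor x factor block. Not ComfyUI's actual code.
    """
    return [
        [row[col // factor] for col in range(len(row) * factor)]
        for row in latent
        for _ in range(factor)
    ]

latent = [[0.1, 0.2],
          [0.3, 0.4]]
up = upscale_latent(latent, 2)
print(up)  # 4x4 grid: each latent value repeated in a 2x2 block
```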

    Plus: 40+ bugs squashed, including the dreaded “random node disconnects on reload.” 🐞✅

    If you’re deep in Stable Diffusion workflows, this update is your new best friend. Go tweak, test, and create! 🚀

    https://www.comfy.org/

    🔗 View Release

  • MLX-LM – v0.29.0

    🚀 MLX LM v0.29.0 is live, and it’s a beast!

    • Batch generation just got 2x faster thanks to `wired_limit` fixes. Your server will thank you.
    • RoPE & SuScaledRoPE fixed for `rnj-1` and others: smoother attention, less drift.
    • Dequantize bug squashed ✅ Now using the right function: cleaner outputs, better precision.
    • Repetition penalty defaults to 0.0: less annoying repetition from day one. 🎯
    • DSV32 & Gemma3: bugs gone, stable and ready to deploy.
    • SSM batching fixed: state-space models now behave on the server. 💡
    • Nemotron 3 added! 🎉 Go ahead, test it.
    • Devstral-2 now works properly. No more surprises. 👍

    Big shoutout to first-time contributors: @otarkhan, @devnamrits, @DePasqualeOrg, and @inferencers. Welcome to the crew! 🙌

    Update now. Your LLMs are ready for a speed run. 🛠️

    Full changelog: [v0.28.4…v0.29.0](link)

    🔗 View Release

  • Ollama – v0.13.4

    🚀 Ollama v0.13.4 just dropped: tiny update, big impact for Nemotron users!

    Fix: The `think` handling now actually respects your custom prompts. No more ignored reasoning directives. If you told the model to “think step by step,” it finally will. 🤔✨

    Perfect for RAG builders, prompt engineers, and anyone relying on structured reasoning. No new features, just cleaner, more reliable internal reflection.
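
    As a rough sketch, a thinking-enabled chat call against Ollama’s local REST API with a custom reasoning directive might look like the following. The model name is an example, and the request is only built here, not sent.

```python
import json
import urllib.request

# Ollama's /api/chat endpoint accepts a "think" field alongside messages;
# the system prompt carries the custom reasoning directive.
payload = {
    "model": "nemotron",  # example model name
    "messages": [
        {"role": "system", "content": "Think step by step before answering."},
        {"role": "user", "content": "Why is the sky blue?"},
    ],
    "think": True,
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
print(req.full_url)
# response = urllib.request.urlopen(req)  # uncomment with Ollama running
```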

    Upgrade if you’re using Nemotron models. Your prompts just got smarter. 🛠️

    🔗 View Release

  • Ollama – v0.13.4-rc2

    🚀 Ollama v0.13.4-rc2 just dropped, and it’s all about speed and stability!

    If you’ve ever sat there watching “Loading model…” like it’s a slow-loading game, this update is your cheat code.

    ✨ What’s new:

    • ⚡ Faster model init: Memory mapping + CUDA context tweaks slash startup time.
    • 🤖 Multi-GPU love: Better resource splitting so your 2x or 4x GPUs actually work together, not fight.
    • 🛠️ Smarter cache: Fewer crashes from corrupted downloads or interrupted pulls.

    💡 Pro tip: Running Llama 3 or Mistral on a multi-GPU rig? You could see 20%+ faster load times.

    Still a release candidate, but if you’re ready to cut the wait time and boost reliability, upgrade now. Your GPU’s doing backflips in gratitude. 🤖💻

    🔗 View Release

  • Deep-Live-Cam – 2.4

    Deep-Live-Cam 2.4 just dropped, and it’s a big leap for real-time face swaps 🎭💻

    ✅ Dropdowns fixed: No more menu stutters. Smooth sailing through all your deepfake edits.

    ⚡ Forced GPU mode on laptops: Finally, your iGPU won’t hold you back. Full CUDA/CoreML/DirectML power unlocked.

    🎨 Poisson Blending upgraded: Translucent artifacts? Vanished. Messy ear edges? Gone. Swaps now look planted, not pasted.

    👄 Mouthmask optimized for Inswapper: Lips sync perfectly with speech. No more robot smiles. Realistic expressions, zero effort.

    Plus tiny but massive polish tweaks, because your deepfake shouldn’t have glitches.

    All updates live only on the official site. Grab it before your GPU starts sending you love letters 💬🔥

    🔗 View Release

  • Text Generation Webui – v3.21

    🚀 Text Generation WebUI v3.21 just dropped, and it’s lighter, faster, smarter!

    The portable builds are now leaner: no more bloated llama.cpp symlinks (Python .whl quirks, we see you 😅). They auto-recreate on first launch. Clean, efficient, zero hassle.

    🔥 Backend upgrades galore:

    • llama.cpp updated to latest ggml-org commit (5c8a717): smoother inference, fewer crashes
    • ExLlamaV3 v0.0.18: better quantization + smarter memory use
    • safetensors v0.7: faster load times, tighter security
    • triton-windows 3.5.1.post22: CUDA ops on Windows? Smoother than ever

    📦 Portable builds now come in 4 flavors:

    • 🖥️ `cuda12.4` (NVIDIA)
    • 💻 `vulkan` (AMD/Intel GPUs)
    • 🧠 `cpu` (no GPU? no problem)
    • 🍏 `macos-arm64` (Apple Silicon optimized)

    🔄 Update? Just unzip the new build, then carry over only your `user_data/` folder. All your models, settings, themes: untouched. No reconfiguring. No stress.

    Perfect for tinkerers who want power without the install drama. Grab it, unzip, and start generating 🚀
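
    That carry-over step can be sketched in a few lines. The folder names below are hypothetical, and the temporary directories only simulate an old install and a freshly unzipped new build so the snippet runs anywhere.

```python
import shutil
import tempfile
from pathlib import Path

# Simulate an old install with user data and a freshly unzipped new build.
root = Path(tempfile.mkdtemp())
old = root / "text-generation-webui-3.20"   # hypothetical folder names
new = root / "text-generation-webui-3.21"
(old / "user_data").mkdir(parents=True)
(old / "user_data" / "settings.yaml").write_text("theme: dark\n")
new.mkdir()

# The actual upgrade step: copy your user_data/ into the new build.
shutil.copytree(old / "user_data", new / "user_data")
print((new / "user_data" / "settings.yaml").read_text())  # prints "theme: dark"
```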

    🔗 View Release

  • Ollama – v0.13.4-rc1

    🚀 Ollama v0.13.4-rc1 just dropped, and Gemma 3 just got a serious upgrade!

    Gemma 3’s RoPE scaling is now set to 1.0, meaning:

    ✅ Better long-context handling

    ✅ Fewer hallucinations

    ✅ More stable reasoning on laptops and low-resource rigs
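
    For context, RoPE scaling divides each token’s position index by the scale factor before computing rotation angles, so a factor of 1.0 leaves positions untouched. A toy sketch of the mechanics; this is illustrative only, not Ollama’s implementation.

```python
def rope_angles(position, head_dim, scale=1.0, base=10000.0):
    """Rotation angles for one position in rotary position embeddings.

    Each position index is divided by `scale`; scale=1.0 means positions
    are used as-is. Toy illustration, not Ollama's actual code.
    """
    return [
        (position / scale) * base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]

unscaled = rope_angles(100, head_dim=8)             # scale defaults to 1.0
stretched = rope_angles(100, head_dim=8, scale=2.0)
print(unscaled[0], stretched[0])  # 100.0 50.0
```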

    Perfect for RAG apps, chatbots, or code assistants that need to stay sharp on long prompts.

    Under the hood:

    🔧 Smoother model loading

    🧠 Minor memory optimizations

    ⚡ Improved compatibility with newer GPU drivers

    This is a release candidate: stable, tested, and ready for early adopters who want to ride the wave before it hits mainline.

    Grab it, tweak your prompts, and start gemma-ing harder 🤖✨

    🔗 View Release

  • Ollama – v0.13.4-rc0

    🚀 Ollama v0.13.4-rc0 is live, and it’s great news for Gemma 3 users!

    Fixed: Global RoPE scale values are now properly applied, meaning longer prompts get way more accurate reasoning and smoother context handling. No more weird token scaling glitches. Your Gemma 3 finally behaves like it should.

    ✅ Verified release (GitHub-signed for trust & security)

    ✅ Smoother, more consistent inference across macOS, Windows, and Linux

    If you’re running Gemma 3 locally? Update now. This isn’t just a patch; it’s your model finally unlocking its full potential. 🤖✨

    Grab it before the final drops, and keep those LLMs running clean!

    🔗 View Release