• Text Generation Webui – v4.6

    🚀 Major Update Alert: text-generation-webui v4.6 is here!

    If you’re looking for the “AUTOMATIC1111” of LLMs, this is your playground. The latest update brings massive improvements for anyone building agentic workflows or running local models with precision.

    🛠️ Precision Tool Calling

    Tired of agents running commands without permission? You can now enable Tool call confirmation in the Chat tab. It adds inline approve/reject buttons, giving you total oversight over every command execution.

    🔌 Expanded MCP Ecosystem

    The Model Context Protocol (MCP) just got a huge boost!

    • You can now configure local subprocess-based MCP servers via `user_data/mcp.json`.
    • If you’re already using the configuration format from Claude Desktop or Cursor, it’s a seamless transition.
    • Tool discovery is now cached to keep your interface snappy.
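
    If you're coming from Claude Desktop or Cursor, `user_data/mcp.json` uses the same `mcpServers` shape. A minimal sketch (the server name, command, and path below are placeholder examples, not defaults):

    ```json
    {
      "mcpServers": {
        "filesystem": {
          "command": "npx",
          "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"]
        }
      }
    }
    ```

    Each entry spawns one local subprocess server; the key is the display name, and `command`/`args` are whatever launches that server on your machine.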

    🧠 Advanced “Chain of Thought” Management

    New controls for reasoning models are officially live:

    • Use the new UI checkbox or `--preserve-thinking` CLI flag to decide if thinking blocks from previous turns stay in your context.
    • The UI is now smarter—it only displays “Reasoning effort” and “Enable thinking” controls when you’re actually using a model that supports them.

    🖥️ Smoother UI/UX

    • Independent Sidebars: Sidebars now toggle independently and remember their state even after a page refresh.
    • Visual Polish: Improved light mode borders, fixed code block copy buttons, and smoother scrolling during model loading.

    ⚙️ Under the Hood & Security

    • Performance: `llama.cpp` now defaults to `--draft-min 48` for better speculative decoding performance.
    • Security: Critical SSRF vulnerability fixes for URL fetching are included to keep your local environment safe.
    • Bug Fixes: Resolved issues with Gemma 4 thinking tags and UI token leaks during tool calls.

    Time to fire up those local models and test out these new agentic features! 🛠️✨

    🔗 View Release

  • Ollama – v0.21.2-rc0

    Ollama just dropped a fresh release candidate, v0.21.2-rc0, and it’s a massive step toward turning your local models into true digital agents! 🚀

    If you haven’t been playing with Ollama yet, it’s the ultimate toolkit for running powerful LLMs like Llama 3, DeepSeek-R1, and Mistral directly on your own hardware. It handles all the heavy lifting of downloading and configuring models so you can focus on building.

    Here is the lowdown on this new update:

    • Bundled OpenClaw Web Search: This is the star of the show! 🌐 Ollama now includes integrated web search capabilities via OpenClaw.
    • Real-Time Knowledge: Instead of being trapped inside a static training dataset, your local models can now pull in fresh, real-time information from the internet.
    • Path to Autonomy: This integration moves Ollama much closer to functioning as a fully autonomous local agent that can browse and verify facts on the fly.

    If you’ve been looking for a way to augment your local workflows with live data without relying on cloud APIs, this RC is definitely worth a spin! 🛠️

    🔗 View Release

  • MLX-LM – v0.31.3

    MLX LM, the toolkit for running LLMs with MLX, just dropped v0.31.3! 🚀

    If you’re obsessed with running local LLMs on Apple Silicon, this patch release is a massive win for your inference workflows. The star of the show is a new thread-local generation stream, which works alongside MLX v0.31.2 to make streaming responses way more reliable when you’re working in multi-threaded environments.

    Here’s the breakdown of what’s new:

    • Streaming & Concurrency: New thread-local support means much smoother, more stable streaming even when handling multiple tasks at once.
    • Tool Calling Overhaul: A huge cleanup for function calling! This includes better parallel tool call handling in the server, specific patches for MiniMax M2, and improved parser support (like handling hyphenated names and braces) for Gemma 4.
    • Stability Fixes: Squashed those pesky bugs related to batch dimension mismatches in `BatchKVCache`, `BatchRotatingKVCache`, and `ArraysCache`. No more unexpected dimension errors!
    • Model-Specific Tweaks: Fixed issues with Gemma 4 KV-shared layers and resolved embedding issues for `Apertus`.
    • General Polish: Better handling of “think” tokens in tokenizer wrappers, improved `safetensors` directory checks, and fixed missing imports in cache modules.
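
    The thread-local idea behind the new generation stream can be illustrated with plain Python: each thread lazily gets its own stream object instead of all threads sharing one global. (This is a minimal stdlib sketch of the pattern, not the actual mlx-lm API.)

    ```python
    import threading

    # One private slot per thread; attributes set here are invisible
    # to other threads.
    _local = threading.local()

    def get_stream():
        # Lazily create one "stream" object the first time this
        # thread asks for it, then reuse it on later calls.
        if not hasattr(_local, "stream"):
            _local.stream = object()
        return _local.stream

    streams = {}
    lock = threading.Lock()

    def worker(name):
        s = get_stream()
        with lock:
            streams[name] = s

    threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Three threads produce three distinct stream objects.
    print(len({id(s) for s in streams.values()}))
    ```

    Because no stream is ever shared across threads, concurrent generations can't stomp on each other's state, which is exactly why multi-threaded streaming gets more reliable.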

    This is a super practical update if you’ve been running into dimension errors or tricky tool-calling behavior with the latest models. Time to update and get those local models humming! 🛠️

    🔗 View Release

  • Ollama – v0.21.1

    Ollama v0.21.1 🦬

    If you’re running local LLMs, you know Ollama is the go-to for getting models up and running with zero friction on macOS, Windows, or Linux. It’s the ultimate toolkit for anyone looking to experiment with Llama 3, DeepSeek-R1, or Mistral without needing a massive cloud budget.

    This quick patch release focuses on fine-tuning model recommendations to ensure you’re getting the best performance out of your local setup.

    What’s new:

    • Model Optimization: The update swaps out `kimi-k2.5` for `k2.6` as the top recommended model in the launch configuration. 🚀

    It’s a small but important tweak to make sure your default experience points you toward the most capable version of the Kimi model available! Perfect for those of us always hunting for that extra bit of reasoning power.

    🔗 View Release

  • ComfyUI – v0.19.4

    ComfyUI v0.19.4 is officially live! 🚀

    If you’re deep in the weeds of node-based workflows, you know ComfyUI is the ultimate playground for Stable Diffusion. It lets you stitch together complex pipelines—from SDXL to ControlNet—using a modular, visual interface that gives you total granular control over every step of your generation process.

    This latest release (v0.19.4) is a targeted patch focused on keeping your workspace running smoothly. While it doesn’t introduce massive new nodes, it’s all about that essential maintenance:

    • Bug Squashing: Fixes for those annoying workflow errors that can derail a complex render mid-way through.
    • Stability Boosts: Improved reliability when loading heavy models and managing custom node dependencies.
    • Performance Refinements: Backend tweaks to ensure your node graphs stay efficient as they grow in complexity.

    If you’ve noticed any unexpected crashes or weird behavior during your recent generations, now is the perfect time to hit that update button and get back to creating! 🛠️

    🔗 View Release

  • Ollama – v0.21.1-rc1

    Ollama v0.21.1-rc1 🛠️

    If you love running heavy-hitting LLMs like Llama 3 or DeepSeek-R1 locally without relying on expensive cloud APIs, listen up! A new release candidate for Ollama just dropped, and it’s all about fine-tuning your local experience.

    What’s new:

    • Model Optimization: The team has updated the top recommended model selection, swapping out `kimi-k2.5` for the newer, more optimized `k2.6`. 🚀

    This is a quick but smart tweak designed to ensure that when you pull down models, you’re getting the best possible performance and compatibility for your specific hardware setup. Keep an eye on this RC as it heads toward a full stable release!

    🔗 View Release

  • Mantella – v0.14

    Mantella v0.14 is officially live! 🛠️

    If you haven’t checked this out yet, Mantella is an incredible mod for Skyrim and Fallout 4 that lets you actually talk to NPCs using your voice. It pulls together Whisper for speech-to-text, LLMs for the brains, and tools like Piper or XTTS for lifelike speech, turning static dialogue trees into real, unscripted conversations.

    The latest update is packed with quality-of-life tweaks to make managing your AI companions much smoother:

    • UI Control Upgrades: You can now toggle actions directly through the Mantella UI—no more digging through deep menus just to manage an interaction!
    • New Default Model: The default has been switched to Gemma 4 free, giving your characters a fresh, updated logic baseline.
    • Improved Multi-NPC Logic: Managing a crowded tavern? There’s a new helper message for keyword-based actions specifically designed to keep multi-NPC conversations from getting chaotic.
    • Prompt Refinements: The devs have polished prompt definitions and “radiant prompts” to ensure character behavior stays consistent and on-track.
    • Data Refresh: A fresh CSV update is included to keep your underlying data structures current.

    If you’re tinkering with complex dialogue trees or trying to build more immersive worlds, these logic tweaks should make a massive difference in how smoothly your NPCs react! 🚀

    🔗 View Release

  • Ollama – v0.21.1-rc0

    Ollama just dropped a new release candidate, v0.21.1-rc0, bringing some much-needed precision to your local LLM workflows! 🛠️

    If you’re running models locally, you already know Ollama is the ultimate toolkit for managing and running heavyweights like Llama 3, DeepSeek-R1, and Mistral without the cloud headache. This latest update focuses on fine-tuning how specific models behave during inference.

    What’s new in this release:

    • Gemma 4 Formatting Fix: A targeted fix for the server to ensure proper formatting is applied when `think=false` is set specifically for Gemma 4 models.
    • Improved Output Consistency: This tweak helps prevent messy or broken responses, ensuring that when you disable “thinking” mode, the model’s output remains clean and structured.
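
    Based on Ollama's documented `think` option, a chat request with thinking disabled might look like this (the model tag and prompt here are just examples):

    ```json
    {
      "model": "gemma",
      "messages": [
        {"role": "user", "content": "Summarize this changelog."}
      ],
      "think": false,
      "stream": false
    }
    ```

    POST this to `/api/chat`; with `think` set to false, the fix ensures the response comes back as clean, properly formatted text with no stray reasoning block to trip up your parser.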

    If you’ve been experimenting with the latest Google models via Ollama, definitely grab this RC to keep your parsing logic from breaking! 🚀

    🔗 View Release

  • Tater – Tater v73

    🥔 Tater v73 — “Clean Signals” is here! 🎤✨

    If you’ve been looking for a way to run a fully private, local-native AI assistant without sending your data to the cloud, Tater is a must-watch. Powered by the Hydra planning engine, it uses a structured four-headed loop (Perception, Action, Judgment, and Speech) to execute complex tasks via local LLMs like Ollama or LM Studio. It’s a powerhouse for anyone building a modular, privacy-first AI stack that integrates with everything from Home Assistant to macOS and even OG Xbox!

    The latest update focuses on polishing the feedback loop between Tater and your hardware satellites to make interactions feel much more intuitive.

    What’s New in v73:

    • Smart Tool Signaling: The updated firmware (VoicePE & Satellite1) now explicitly notifies connected devices the moment a tool is being called. 🛠️
    • New “Tool Mode” Animation: We’ve added a specific animation state for satellites during tool calls. This creates a much clearer visual flow: Listen → Think → Tool Call (Animation) → Reply.
    • Zero Breaking Changes: Everything remains backward compatible! If you aren’t using the new Tater tool markers, standard ESPHome voice behavior stays exactly as it was. 🔄
    • Seamless Home Assistant Integration: No extra configuration required. Home Assistant will continue to communicate normally, and devices will simply revert to standard behavior if a tool marker isn’t detected.

    This update is all about making the interaction between your AI brain and your physical hardware feel more responsive and transparent! 🚀

    🔗 View Release

  • Tater – Tater v72

    🥔 Tater v72 — “Flash & Wake” is here! 🚀

    If you’ve been dreaming of building your own local voice-activated hardware, Tater just leveled up from an AI assistant to a full-blown development hub. This update focuses on making ESPHome-based voice satellites easier to deploy than ever before.

    What’s new in this release:

    • Integrated ESPHome Flashing: No more switching between desktop tools or manual YAML editing! A brand new ESPHome Firmware tab lets you pick templates (like VoicePE), edit YAML substitutions directly in the Tater UI, and flash firmware straight to your devices. 🛠️
    • Micro Wake Word Management: Managing wake words is now seamless. The new Micro Wake Word section gives you access to over 400 prebuilt wake words, plus the ability to paste URLs for your own custom-trained models. 🎙️

    This update effectively turns Tater into an all-in-one factory for voice hardware. Whether you’re tweaking a config or deploying a brand new satellite, you can now manage the entire lifecycle from one single interface! 🥔✨

    🔗 View Release