  • Ollama – v0.21.3-rc0

    Ollama just dropped a new release candidate, v0.21.3-rc0, and it’s bringing some serious brainpower to your local LLM workflows! 🛠️

    If you aren’t using Ollama yet, it is the ultimate toolkit for running powerful models like Llama 3, DeepSeek-R1, and Mistral directly on your own hardware. It handles all the heavy lifting of downloading and configuring models so you can focus on building.

    Here’s what’s new in this RC update:

    • Reasoning Effort Support: This is a game-changer for anyone playing with chain-of-thought models! The update maps “reasoning effort” to the “think” parameter, giving you much finer control over how much computational “thinking” time a model spends on a prompt. 🧠
    • OpenAI Compatibility Tweaks: The release includes specific updates to better handle OpenAI-style map responses, making it even smoother to swap between APIs without breaking your integration.
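
The release notes don't show the exact wire format, but a small helper illustrates the idea: an OpenAI-style `reasoning_effort` value gets carried over into Ollama's `think` parameter. The mapping below is an assumption for illustration, not the official v0.21.3-rc0 behavior; check the Ollama docs for your model.

```python
# Sketch: map an OpenAI-style "reasoning_effort" string onto Ollama's
# "think" parameter when building a /api/chat request body.
# The exact mapping used by v0.21.3-rc0 is assumed here.

def build_chat_payload(model, messages, reasoning_effort=None):
    """Build a request body for Ollama's /api/chat endpoint."""
    payload = {"model": model, "messages": messages, "stream": False}
    if reasoning_effort in ("low", "medium", "high"):
        # Thinking-capable models can accept an effort level.
        payload["think"] = reasoning_effort
    elif reasoning_effort is not None:
        # Unknown effort values fall back to simply enabling thinking.
        payload["think"] = True
    return payload

body = build_chat_payload(
    "deepseek-r1",
    [{"role": "user", "content": "Plan a 3-step proof."}],
    reasoning_effort="high",
)
print(body["think"])  # high
```

POST-ing that payload to `http://localhost:11434/api/chat` is then a one-liner with any HTTP client.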

    If you’re tinkering with reasoning-heavy models and want to ensure your API calls are handling thought tokens perfectly, grab this RC and give it a spin! ✨

    🔗 View Release

  • Tater – Tater v74

    🥔 Tater v74 — “Who Goes There?” is here! 📡🗣️

    Get ready to take your local AI stack completely off-grid! Tater just leveled up from a standard local assistant into a decentralized, identity-aware powerhouse. If you love privacy and autonomy, this update is a massive win for your local setup.

    Meshtastic Portal — Off-Grid Communication

    Tater can now whisper over radio waves using Meshtastic! You can now send and receive messages across a mesh network without needing any internet or cell towers.

    • Fully Local & Encrypted: Your data stays strictly within your nodes.
    • Remote Alerts: Send notifications across your mesh network even in total dead zones.
    • Decentralized Power: Deploy tiny, independent Tater networks that operate entirely via radio waves.
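
Tater's portal internals aren't shown in the release notes, but the official `meshtastic` Python library exposes the documented `SerialInterface`/`sendText` API that off-grid alerts like these are typically built on. A minimal sketch (the alert format and 200-character limit are illustrative choices, not Tater's):

```python
# Sketch: send a short tagged alert over a Meshtastic mesh network.
# Requires a Meshtastic node attached over USB serial to actually send.

def format_alert(source, text, limit=200):
    """Meshtastic payloads are small; keep alerts short and tagged."""
    msg = f"[{source}] {text}"
    return msg[:limit]

def send_alert(text):
    import meshtastic.serial_interface
    iface = meshtastic.serial_interface.SerialInterface()
    iface.sendText(format_alert("tater", text))
    iface.close()

print(format_alert("tater", "motion detected in garage"))
```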

    Speaker ID — Voiceprint Recognition

    Tater just got a lot more observant! We’ve added local voiceprint enrollment so the system can identify specific users directly on your hardware.

    • Identity-Aware Control: Tater recognizes you specifically before any other processes even kick in.

    • Privacy-First: Everything happens locally on your machine—zero cloud processing involved.
    • Seamless Interaction: If an unrecognized voice speaks, Tater simply continues its routine without interruption.
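
Tater's actual model and matching logic aren't documented here, but speaker ID systems generally enroll a "voiceprint" embedding per user and match new audio by cosine similarity against the enrolled set. A toy sketch of that pattern (the embeddings and 0.8 threshold are made up for illustration):

```python
import math

# Toy sketch: match a voice embedding against enrolled voiceprints by
# cosine similarity; return None for unrecognized voices.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def identify(embedding, enrolled, threshold=0.8):
    """Return the best-matching enrolled user, or None if no match."""
    best_user, best_score = None, threshold
    for user, ref in enrolled.items():
        score = cosine(embedding, ref)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user

enrolled = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 0.2, 0.9]}
print(identify([0.88, 0.15, 0.02], enrolled))  # alice
print(identify([0.3, 0.3, 0.3], enrolled))     # None (unrecognized)
```

The `None` branch is what lets Tater "simply continue its routine" when an unrecognized voice speaks.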

    This release moves Tater beyond simple automation and into the realm of identity-aware, off-grid communication. Whether you’re building a private mesh or just want your AI to know exactly who is talking to it, v74 has you covered! 🧠✨

    🔗 View Release

  • Ollama – v0.21.2

    Ollama v0.21.2 is officially live! 🚀

    If you’re looking to run heavy-hitting LLMs like Llama 3, DeepSeek-R1, or Mistral directly on your hardware without relying on the cloud, Ollama is your best friend. It turns the complex process of managing local models into a seamless, one-command experience across macOS, Windows, and Linux.

    This latest patch focuses on polishing the user experience and tightening up the engine:

    • Smoother Onboarding: The OpenClaw onboarding flow has been hardened, making that first-time setup much more robust and less prone to hiccups. 🛠️
    • Enhanced Stability: This update includes critical refinements to the underlying launch processes, ensuring your local instances spin up reliably every single time.

    Perfect for those of us building local RAG pipelines or just experimenting with privacy-first AI! 🥔✨

    🔗 View Release

  • Ollama – v0.21.2-rc1

    Ollama just dropped a new release candidate, v0.21.2-rc1, and it’s all about smoothing out that initial setup! 🛠️

    If you’re looking to run heavyweights like Llama 3, DeepSeek-R1, or Mistral locally without the headache of manual configuration, this is your go-to tool. It handles all the heavy lifting for model weights and parameters so you can jump straight into prompting.

    What’s new in this release:

    • Hardened OpenClaw Onboarding: The big win here is a much more robust and “hardened” onboarding flow for OpenClaw. 🚀

    The team is clearly focusing on polishing the first-run experience, making sure that even complex local setups are reliable and hiccup-free from the very first click. Perfect for anyone looking to experiment with local LLMs without fighting the installation process!

    🔗 View Release

  • ComfyUI – v0.19.5

    ComfyUI v0.19.5 is officially live! 🚀

    If you’re deep in the world of node-based workflows, you know ComfyUI is the ultimate playground for building complex Stable Diffusion pipelines. Whether you’re upscaling, inpainting, or experimenting with SDXL, this modular engine gives you total control over every step of your generative process.

    This latest release (v0.19.5) is a focused maintenance update designed to keep your creative momentum going without the hiccups. Here’s what’s happening under the hood:

    • Bug Squashing: Fixes for those annoying little glitches during node execution.
    • Enhanced Stability: Improvements to ensure your heavy, multi-node workflows stay rock solid during long renders.
    • Backend Optimization: Fine-tuned performance to help with smoother memory management and efficiency.

    Pro-tip for the tinkerers: Whenever a new version drops, don’t forget to fire up your ComfyUI Manager and run an update on your custom nodes! Keeping those extensions in sync is the best way to prevent workflow breakage. 🛠️

    🔗 View Release

  • Text Generation Webui – v4.6.2

    text-generation-webui v4.6.2 is officially live, and it’s bringing some massive quality-of-life upgrades for your local LLM playground! 🚀 If you’ve been looking for more control over agentic workflows or better context management, this is the update you’ve been waiting for.

    Tool Call Control & MCP Support

    • Manual Approval: No more rogue tool calls! You can now toggle “Confirm tool calls” in the Chat tab to manually approve or reject actions with inline buttons. 🛡️
    • Stdio MCP Servers: Huge win for interoperability! You can now configure local subprocess-based MCP servers via `mcp.json`, making it much easier to sync your setup with Claude Desktop or Cursor.
    • Performance Boost: Tool discovery is now cached, so you won’t be re-querying servers every single time a generation runs.
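
Since the release notes say the `mcp.json` format matches Claude Desktop and Cursor, a config for one stdio server would look roughly like the structure below. The server name, command, and path are example values only:

```python
import json

# Illustrative mcp.json in the Claude Desktop / Cursor style that the
# release notes reference. "filesystem", "npx", and "/tmp" are examples.
config = {
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
        }
    }
}

print(json.dumps(config, indent=2))
```

The v4.6 notes place this file at `user_data/mcp.json`.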

    Enhanced Reasoning & Context Management

    • Preserve Thinking: A new `--preserve-thinking` flag (and UI checkbox) lets you decide if thinking blocks from previous turns stay in your context window. 🧠
    • Smart UI: The “Reasoning effort” and “Enable thinking” controls now only appear for models that actually support them, keeping your interface clutter-free.

    UI & UX Overhaul

    • Persistent Sidebars: Sidebars now toggle independently and remember their state even after a page refresh.
    • Visual Polishing: Improved light mode borders, fixed code block copy buttons, and better spacing in the past chats menu.

    Under the Hood & Security

    • Security Patch: Fixed SSRF vulnerabilities in URL fetching to keep your local environment safer during web-based tasks.
    • llama.cpp Updates: Includes new defaults for speculative decoding (`--draft-min 48`) and updated dependencies for `ik_llama.cpp` and `ExLlamaV3`.
    • New Portable Builds: Self-contained packages are available for Windows, Linux, and macOS (Apple Silicon/Intel), covering everything from CUDA 13.1 to CPU-only setups.
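
The patch details aren't shown, but SSRF fixes in URL fetchers typically mean resolving the hostname and refusing private, loopback, and link-local addresses before making the request. A sketch of that guard, not the project's actual code:

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Sketch of a typical SSRF guard: resolve the host and reject any
# address that points back into the local or private network.

def is_safe_url(url):
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        try:
            ip = ipaddress.ip_address(info[4][0])
        except ValueError:
            return False
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True

print(is_safe_url("http://127.0.0.1/admin"))  # False
```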

    Pro Tip: If you’re updating a portable install, just swap your `user_data` folder into the new version to keep all your models and settings exactly where they belong! 🛠️

    🔗 View Release

  • Text Generation Webui – v4.6.1

    🚀 Big Update Alert: text-generation-webui v4.6.1 is here!

    If you’re looking for the “AUTOMATIC1111” of local LLMs, this latest release for the Gradio-based web UI is a massive win for anyone running models locally. It’s packed with quality-of-life upgrades and expanded connectivity to make your local setup even more powerful.

    New MCP Power 🛠️

    The Model Context Protocol (MCP) support just got a serious boost! You can now configure local subprocess-based MCP servers via `mcp.json`—exactly like you would in Claude Desktop or Cursor. The UI now pre-loads these tools at startup and caches discovery to keep your generations snappy.

    Enhanced Control & Transparency 🔍

    • Tool Call Confirmation: No more “black box” executions! You can now enable a checkbox in the Chat tab to see inline approve/reject buttons before any tool call runs.
    • Thinking Process Management: A new `--preserve-thinking` flag and UI checkbox let you decide whether to keep thinking blocks from previous turns in your context—perfect for managing those precious tokens!
    • Smart UI: “Reasoning effort” controls now only appear when the specific model you’re using actually supports them.

    Under the Hood & Performance ⚙️

    • llama.cpp Upgrades: Includes updated dependencies and a default tweak (`--draft-min 48`) for smoother speculative decoding.
    • UI Overhaul: Sidebars are now independent and remember their state even after a page refresh.
    • Security Patch: Critical fixes for SSRF vulnerabilities in URL fetching to keep your local environment safe.

    Portable Builds Ready! 📦

    New self-contained packages are available for Windows, Linux, and macOS. Whether you’re rocking NVIDIA (CUDA 12.4 or 13.1), AMD (ROCm/Vulkan), or just a standard CPU, there’s a build ready to go. Updated builds for `ik_llama.cpp` are also included for those specialized quant types!

    Pro-Tip: Updating is a breeze—just extract the new version and swap your `user_data` folder. You can even keep multiple versions of the webui side-by-side sharing a single `user_data` directory! 🛠️✨

    🔗 View Release

  • Text Generation Webui – v4.6

    🚀 Major Update Alert: text-generation-webui v4.6 is here!

    If you’re looking for the “AUTOMATIC1111” of LLMs, this is your playground. The latest update brings massive improvements for anyone building agentic workflows or running local models with precision.

    🛠️ Precision Tool Calling

    Tired of agents running commands without permission? You can now enable Tool call confirmation in the Chat tab. It adds inline approve/reject buttons, giving you total oversight over every command execution.

    🔌 Expanded MCP Ecosystem

    The Model Context Protocol (MCP) just got a huge boost!

    • You can now configure local subprocess-based MCP servers via `user_data/mcp.json`.
    • If you’re already using the configuration format from Claude Desktop or Cursor, it’s a seamless transition.
    • Tool discovery is now cached to keep your interface snappy.

    🧠 Advanced “Chain of Thought” Management

    New controls for reasoning models are officially live:

    • Use the new UI checkbox or `--preserve-thinking` CLI flag to decide if thinking blocks from previous turns stay in your context.
    • The UI is now smarter—it only displays “Reasoning effort” and “Enable thinking” controls when you’re actually using a model that supports them.

    🖥️ Smoother UI/UX

    • Independent Sidebars: Sidebars now toggle independently and remember their state even after a page refresh.
    • Visual Polish: Improved light mode borders, fixed code block copy buttons, and smoother scrolling during model loading.

    ⚙️ Under the Hood & Security

    • Performance: `llama.cpp` now defaults to `--draft-min 48` for better speculative decoding performance.
    • Security: Critical SSRF vulnerability fixes for URL fetching are included to keep your local environment safe.
    • Bug Fixes: Resolved issues with Gemma 4 thinking tags and UI token leaks during tool calls.

    Time to fire up those local models and test out these new agentic features! 🛠️✨

    🔗 View Release

  • Ollama – v0.21.2-rc0

    Ollama just dropped a fresh release candidate, v0.21.2-rc0, and it’s a massive step toward turning your local models into true digital agents! 🚀

    If you haven’t been playing with Ollama yet, it’s the ultimate toolkit for running powerful LLMs like Llama 3, DeepSeek-R1, and Mistral directly on your own hardware. It handles all the heavy lifting of downloading and configuring models so you can focus on building.

    Here is the lowdown on this new update:

    • Bundled OpenClaw Web Search: This is the star of the show! 🌐 Ollama now includes integrated web search capabilities via OpenClaw.
    • Real-Time Knowledge: Instead of being trapped inside a static training dataset, your local models can now pull in fresh, real-time information from the internet.
    • Path to Autonomy: This integration moves Ollama much closer to functioning as a fully autonomous local agent that can browse and verify facts on the fly.

    If you’ve been looking for a way to augment your local workflows with live data without relying on cloud APIs, this RC is definitely worth a spin! 🛠️

    🔗 View Release

  • MLX-LM – v0.31.3

    MLX LM, the toolkit for running LLMs with MLX, just dropped v0.31.3! 🚀

    If you’re obsessed with running local LLMs on Apple Silicon, this patch release is a massive win for your inference workflows. The star of the show is a new thread-local generation stream, which works alongside MLX v0.31.2 to make streaming responses way more reliable when you’re working in multi-threaded environments.
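
MLX-LM's internals aren't reproduced here, but the general "thread-local stream" pattern the notes describe is simple: each thread lazily creates its own stream object instead of all threads sharing one. An illustrative sketch using Python's `threading.local` (the `make_stream` stand-in is not the real MLX call):

```python
import threading

# Illustrative thread-local stream pattern: each thread gets its own
# stream object, so concurrent generations don't stomp on each other.

_local = threading.local()

def make_stream():
    # Stand-in for creating a real MLX generation stream.
    return object()

def get_generation_stream():
    if not hasattr(_local, "stream"):
        _local.stream = make_stream()
    return _local.stream

streams = {}

def worker(name):
    streams[name] = get_generation_stream()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(streams[0] is streams[1])  # False: each thread got its own stream
```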

    Here’s the breakdown of what’s new:

    • Streaming & Concurrency: New thread-local support means much smoother, more stable streaming even when handling multiple tasks at once.
    • Tool Calling Overhaul: A huge cleanup for function calling! This includes better parallel tool call handling in the server, specific patches for MiniMax M2, and improved parser support (like handling hyphenated names and braces) for Gemma 4.
    • Stability Fixes: Squashed those pesky bugs related to batch dimension mismatches in `BatchKVCache`, `BatchRotatingKVCache`, and `ArraysCache`. No more unexpected dimension errors!
    • Model-Specific Tweaks: Fixed issues with Gemma 4 KV-shared layers and resolved embedding issues for `Apertus`.
    • General Polish: Better handling of “think” tokens in tokenizer wrappers, improved `safetensors` directory checks, and fixed missing imports in cache modules.

    This is a super practical update if you’ve been running into dimension errors or tricky tool-calling behavior with the latest models. Time to update and get those local models humming! 🛠️

    🔗 View Release