• Text Generation Webui – v4.3.2

    text-generation-webui v4.3.2 is officially live! 🚀 This Gradio-based powerhouse is the go-to interface for running LLMs locally, and this update brings serious performance boosts and expanded model support for all you tinkerers out there.

    Here is the breakdown of what’s new in this release:

    Core Model & Backend Upgrades

    • Gemma 4 Support: You can now run Gemma 4 with full tool-calling capabilities enabled in both the API and the UI. 🆕
    • New `ik_llama.cpp` Backend: A massive addition for performance enthusiasts! This backend offers superior KV cache quantization using Hadamard rotation, better optimizations for MoE models, and improved CPU inference.
    • Transformers Enhancements: The engine now auto-detects `torch_dtype` from model configs rather than forcing half-precision, making the model loading process much smarter.

    API & UI Improvements

    • Enhanced Completions API: The `/v1/completions` endpoint now supports `echo` and `logprobs`, allowing you to see token-level probabilities and IDs. 📊
    • Snappier Interface: A custom Gradio fork has been optimized to save up to 50ms per UI event, making button clicks and transitions feel much smoother.
    • Smarter Templates: Instruction templates are now detected via model metadata instead of relying on old filename patterns.

    Security & Stability Fixes

    • Hardened Security: Fixed an ACL bypass in the Gradio fork for Windows/macOS and added server-side validation for various input groups like Dropdowns and Radio buttons. 🛡️
    • SSRF Protection: Added URL validation to `superbooga` extensions to block requests to private or internal networks.
    • Bug Squashing: Resolved several critical issues, including crashes related to Gemma 4 templates in llama.cpp and loading failures for Qwen3.5 MoE models.

    Portable Builds & Updates

    New self-contained packages are available for Windows, Linux, Mac, and various GPU architectures (NVIDIA CUDA, AMD Vulkan/ROCm, and Intel). If you’re using the portable version, updating is easier than ever: you can now use a shared `user_data` folder across multiple installs! 📂

    🔗 View Release

  • ComfyUI – v0.18.5

    ComfyUI v0.18.5 is officially live! 🚀

    For those of you building complex, node-based generative AI pipelines, ComfyUI continues to be the powerhouse engine for granular control over Stable Diffusion and beyond. Whether you’re orchestrating intricate image upscaling or multi-step video generation, this tool remains the gold standard for modularity and efficiency.

    This latest minor version update focuses on keeping your creative workflows smooth and reliable. Here is what’s new in v0.18.5:

    • Enhanced Stability: This patch includes refinements to existing code, specifically aimed at ensuring smoother operation when executing heavy or complex node sequences.
    • Core Maintenance: As part of the ongoing development by Comfy-Org, this release ensures the core engine stays perfectly aligned with the rapidly evolving broader AI ecosystem.

    If you’ve been pushing your hardware through massive workflows and want to ensure peak performance and stability, now is a great time to pull this update! 🛠️

    🔗 View Release

  • Text Generation Webui – v4.3.1

    text-generation-webui v4.3.1 is officially live, and it’s a massive one for anyone looking to push the boundaries of local LLM inference! 🚀

    This Gradio-based web UI is essentially the “AUTOMATIC1111” equivalent for text generation, providing a comprehensive interface to run Large Language Models locally with support for multiple backends like llama.cpp, Transformers, and ExLlama.

    Here’s what’s new in this release:

    • Model & Inference Upgrades:
    • 🆕 Gemma 4 Support: Full integration including tool-calling capabilities in both the API and UI.
    • ik_llama.cpp Backend: New support via portable builds (or the `--ik` flag for full installs) offering specialized optimizations for MoE models, improved CPU inference, and highly accurate KV cache quantization.
    • Transformers Optimization: The UI now auto-detects `torch_dtype` from model configs instead of forcing bf16/f16.
    • ExLlamaV3 Fixes: Resolved issues with Qwen3.5 MoE loading and fixed `ban_eos_token` functionality.
    • API Enhancements:
    • The `/v1/completions` endpoint now supports `echo` and `logprobs` parameters, returning token-level probabilities and new `top_logprobs_ids`.
    • Performance & UI Tweaks:
    • Snappier Interface: A custom Gradio fork has been optimized to save up to 50ms per UI event (like button clicks).
    • Smarter Templates: Instruction templates now detect from model metadata rather than relying on filename patterns.
    • Security & Stability:
    • Fixed a critical ACL bypass in the Gradio fork for Windows/macOS.
    • Added server-side validation for input components (Dropdown, Radio, etc.).
    • Patched an SSRF vulnerability in superbooga extensions by validating fetched URLs against private networks.

    ๐Ÿ› ๏ธ Pro-tip for updating: If you’re using a portable install, just download the latest version and replace your `user_data` folder. Since version 4.0, you can actually keep `user_data` one level up (next to your install folder) to make future updates even smoother!

    🔗 View Release

  • Text Generation Webui – v4.3

    🚨 Text-Generation-WebUI v4.3 is live! 🚨

    Hey AI tinkerers & devs: a fresh update just dropped, packed with performance wins, new backends, and security upgrades. Here’s the lowdown:

    🔹 🔥 Brand-new backend: `ik_llama.cpp`

    A high-performance fork by the imatrix creator, now baked into text-generation-webui:

    • ✅ New quant formats (Q4_K_M, Q6_K, etc.)
    • 🧠 Hadamard-based KV cache quantization (way more accurate, on by default)
    • ⚡ Built for MoE models & CPU inference (yes, really fast)

    → Grab it via `textgen-portable-ik` or the `--ik` flag!

    🔹 🧠 API upgrades (OpenAI-compatible!)

    The `/v1/completions` endpoint now supports:

    • `echo`: Returns prompt + completion in one go
    • `logprobs`: Token-level log probabilities (prompt & generated)
    • `top_logprobs_ids`: Top token IDs per position, perfect for probing model confidence 🎯
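    In practice, these parameters just ride along in the normal request body. Here is a minimal stdlib-only Python sketch of what such a request could look like; the host, port, and field values (`max_tokens`, the top-5 default) are assumptions for a typical local OpenAI-compatible setup, not taken from the release notes:

    ```python
    import json
    from urllib import request

    # Hypothetical local endpoint: text-generation-webui exposes an
    # OpenAI-compatible API; adjust host/port to your own setup.
    API_URL = "http://127.0.0.1:5000/v1/completions"

    def build_completion_request(prompt: str, n_logprobs: int = 5) -> dict:
        """Build a payload asking the server to echo the prompt back
        and attach per-token log probabilities to the response."""
        return {
            "prompt": prompt,
            "max_tokens": 16,
            "echo": True,            # include the prompt tokens in the output
            "logprobs": n_logprobs,  # top-N log probabilities per position
        }

    def post_completion(payload: dict) -> dict:
        """POST the payload and return the parsed JSON response."""
        req = request.Request(
            API_URL,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with request.urlopen(req) as resp:
            return json.load(resp)

    payload = build_completion_request("The capital of France is")
    print(json.dumps(payload, indent=2))
    # Against a running server: post_completion(payload)
    ```

    With `echo` on, the returned `logprobs` cover the prompt tokens as well as the generated ones, which is what makes confidence probing over the full sequence possible.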

    🔹 🎨 Gradio UX + Security Boost

    • 🌐 Custom Gradio fork = ~50ms faster UI interactions
    • 🔒 Fixed ACL bypass (Windows/macOS path quirks)
    • ✅ Server-side validation for Dropdown/Radio/CheckboxGroup
    • 🛡️ SSRF fix in superbooga: blocks internal/private IPs

    🔹 🔧 Bug fixes & polish

    • `--idle-timeout` now works for encode/decode + parallel generations ✅
    • Stopping strings fixed (e.g., `<|return|>` vs `<|result|>`)
    • Qwen3.5 MoE loads cleanly via ExLlamaV3_HF
    • `ban_eos_token` finally works (EOS suppression at the logit level)

    🔹 📦 Dependency upgrades

    • 🦙 `llama.cpp` → latest (`a1cfb64`) + Gemma 4 support
    • 🔄 `ExLlamaV3` → v0.0.28
    • 📦 `transformers` → 5.5
    • ✨ Auto-detects `torch_dtype` from the model config (override with `--bf16`)
    • 🗑️ Removed the obsolete `models/config.yaml`; templates are pulled from model metadata now

    🔹 📌 Terminology update

    “Truncation length” is now “context length” in logs (more accurate, less confusing!)

    🔹 📦 Portable builds: GGUF-ready & zero-install

    | Platform | Build to Use |
    |----------|--------------|
    | NVIDIA (old driver) | `cuda12.4` |
    | NVIDIA (new driver, CUDA > 13) | `cuda13.1` |
    | AMD/Intel GPU | `vulkan` |
    | AMD (ROCm) | `rocm` |
    | CPU-only | `cpu` |
    | Apple Silicon | `macos-arm64` |
    | Intel Mac | `macos-x86_64` |

    ๐Ÿ” Updating? Just swap the folder โ€” keep `user_data/`, and now you can even move it one level up for shared use across versions ๐ŸŽ‰

    🔗 View Release

  • Ollama – v0.20.0: tokenizer: add byte fallback for SentencePiece BPE encoding (#15232)

    🚨 Ollama v0.20.0 is live! 🚨

    Big tokenizer upgrade incoming: this one’s a must-have for accuracy and reliability, especially when dealing with non-ASCII or rare characters. Here’s what’s new:

    🔹 Byte fallback for SentencePiece BPE

    → When a character can’t be tokenized via standard BPE merges, Ollama now falls back to encoding each UTF-8 byte as a `<0xHH>` token (e.g., `€` → `<0xE2><0x82><0xAC>`).

    → No more silent character loss! 🛡️

    🔹 Decoding updated too

    → The decoder now correctly reconstructs the original bytes from `<0xHH>` tokens, ensuring perfect round-trip fidelity. ✅
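    To illustrate the mechanism, here is a toy Python sketch of byte fallback. It works character by character against a plain vocabulary set rather than over SentencePiece merges, so it mirrors the concept only, not Ollama’s actual implementation:

    ```python
    def byte_fallback_encode(text: str, vocab: set) -> list:
        """Toy sketch: keep a piece if the vocab knows it, otherwise emit
        one <0xHH> token per UTF-8 byte of the character."""
        tokens = []
        for ch in text:
            if ch in vocab:
                tokens.append(ch)
            else:
                tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
        return tokens

    def byte_fallback_decode(tokens: list) -> str:
        """Reassemble text, turning <0xHH> tokens back into raw UTF-8 bytes."""
        out = bytearray()
        for tok in tokens:
            if tok.startswith("<0x") and tok.endswith(">"):
                out.append(int(tok[3:-1], 16))
            else:
                out.extend(tok.encode("utf-8"))
        return out.decode("utf-8")

    vocab = set("abcdefghijklmnopqrstuvwxyz ")
    tokens = byte_fallback_encode("pay €", vocab)
    print(tokens)  # ['p', 'a', 'y', ' ', '<0xE2>', '<0x82>', '<0xAC>']
    assert byte_fallback_decode(tokens) == "pay €"
    ```

    The key property is the final assertion: whatever falls outside the vocabulary still survives a full encode/decode round trip byte for byte.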

    🔧 Fixes:

    • #15229 (dropped chars on encode)
    • #15231 (decoder crashes with unknown tokens)

    This upgrade boosts robustness across multilingual, technical, and edge-case text, making local LLM inference even more reliable. 🧠⚡

    Upgrade now and keep your tokens tight! 📦✨

    🔗 View Release

  • Home Assistant Voice Pe – 26.4.0

    🚨 Home Assistant Voice PE 26.4.0 is live! 🚨

    Hey AI tinkerers & smart home builders: a big update just dropped for Home Assistant Voice PE! This open-source voice control gem just got even smoother, more robust, and community-powered 🙌

    🔥 What’s New in v26.4.0?

    ✅ Media Playback Stability Boost 🎧

    No more audio dropouts: TTS and media playback now run buttery-smooth, even under load.

    ✅ Multi-Sendspin Server Support 🌐

    Deploy multiple Sendspin instances for redundancy, load balancing, or global reach. Scaling just got way easier!

    ✅ TTS Timeout Bug Squashed ⏱️

    Finally: full TTS responses every time. No more mid-sentence cutoffs. 🎉

    ✅ New Contributor Welcome! 👏

    Shoutout to @akloeckner for their first PR (#558): the power of open source in action!

    💡 Bonus: The project is now officially sponsored by the Open Home Foundation, a huge win for privacy-first, offline-capable voice control!

    🔗 Dig into all the nitty-gritty: [Changelog (25.12.4 → 26.4.0)](link-to-full-changelog)

    Let’s make voice control truly accessible, local, and open. Drop a 🎤 if you’re upgrading!

    🔗 View Release

  • Ollama – v0.20.0-rc1: convert: support new Gemma4 audio_tower tensor naming (#15221)

    🚨 Ollama v0.20.0-rc1 is live! 🚨

    Hey AI tinkerers: big news for multimodal & audio model fans!

    🔹 New in this RC: The `convert` tool now supports the updated tensor naming convention for Gemma 4’s `audio_tower` (PR #15221).

    🎯 Why this rocks:

    • Fixes tensor-mismatch errors when converting or fine-tuning Gemma 4 audio models (think speech-to-text, multimodal ASR, etc.).
    • Keeps Ollama in lockstep with the latest Hugging Face model formats: no more manual tensor renaming hacks!
    • Makes it way smoother to run Gemma 4’s audio capabilities locally 🎙️➡️🧠

    📦 Note: This is a release candidate: stable enough to test, but expect polish & tweaks before the final v0.20.0 drops.

    Try it out and let us know how your audio conversions go! 🧪🎧

    #Ollama #Gemma4 #AudioAI #LLMLocal

    🔗 View Release

  • Ollama – v0.20.0-rc0: Merge pull request #42 from ollama/jmorganca/gemma4-ggml-improvements

    🚨 Ollama v0.20.0-rc0 is here!

    PR #42 just landed, bringing sweet, sweet Gemma 4 GGML love 🍬

    🔹 Gemma 4 GGML Fixes

    • Fixed a critical bug in the MoE fused gate/up projection split, boosting correctness & speed for CPU-based Gemma 4 runs.
    • Improved multiline tool-call argument parsing: complex JSON args (e.g., multi-line function/tool outputs) are now handled way more reliably.

    🎯 Focus: Gemma 4 + GGML optimization & rock-solid tool-use support.

    ⚠️ Reminder: This is a release candidate (rc0): great for testing, not quite prod-ready yet. Stay tuned for the final `v0.20.0`! 🚀

    🔗 View Release

  • Wyoming Openai – TTS voice name disambiguation (0.4.3)

    🚨 Wyoming OpenAI v0.4.3 is live, and it’s a quiet hero with big improvements! 🚨

    🔥 TTS Voice Name Disambiguation (#55)

    No more confusing duplicate voice names! If multiple TTS models (e.g., Kokoro, Piper, Edge) share a voice like `"en-us"` or `"alice"`, each now shows up uniquely per model (`Alice [Kokoro]`, `Alice [Edge]`), so Wyoming clients (like Home Assistant) can pick the exact one you want. 🎯
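    Conceptually, the disambiguation boils down to suffixing a model label onto any voice name that collides across backends. A toy Python sketch of that idea (not the project’s actual code; names and the suffix format are illustrative):

    ```python
    from collections import Counter

    def disambiguate_voices(voices: list) -> list:
        """voices is a list of (name, model) pairs. Names that appear under
        more than one model get a ' [Model]' suffix; unique names stay as-is."""
        counts = Counter(name for name, _ in voices)
        return [
            f"{name} [{model}]" if counts[name] > 1 else name
            for name, model in voices
        ]

    voices = [("alice", "Kokoro"), ("alice", "Edge"), ("bob", "Piper")]
    print(disambiguate_voices(voices))
    # ['alice [Kokoro]', 'alice [Edge]', 'bob']
    ```

    Leaving unique names untouched keeps existing client configurations working while still making colliding names selectable.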

    ๐Ÿ› ๏ธ Behind-the-scenes polish

    • โœ… `pyright` added for stricter type safety (fewer sneaky bugs!)
    • ๐Ÿ”„ CI/CD updated to latest GitHub Actions
    • ๐Ÿณ Docker builds are leaner with `.dockerignore` and optimized layers
    • ๐Ÿ“ฆ All deps bumpedโ€”clean, modern, secure

    ๐Ÿ“ฆ Still installable via `pip install wyoming-openai`, and Docker Compose setups for LocalAI, Edge TTS, Kokoro, Chatterbox & more.

    No API keys needed if youโ€™re using local or open-source modelsโ€”just drop in your preferred TTS/STT backend and go. ๐Ÿง โžก๏ธ๐Ÿ”Š

    ๐Ÿ‘‰ Full details: [v0.4.2…v0.4.3](link-to-changelog)

    Happy voice-hacking, folks! ๐ŸŽ™๏ธโœจ

    🔗 View Release

  • Tater – Tater v69

    🚨 Tater v69 “Eyes Everywhere” 📡👁️🥔 is live, and it’s a visual upgrade that’ll make your local AI see the world like never before!

    🔥 Awareness Core: Multi-Provider Vision

    ✅ UniFi Protect is now a first-class event provider: it joins Home Assistant in feeding Tater real-time camera/motion data. No hacks, no glue code, just plug-and-play awareness from either system (or both!).

    🔄 Hybrid Awareness Mode

    👁️ Use UniFi Protect for sensing (cameras, doorbells, motion events)

    🧠 Keep Home Assistant for acting (TTS, notifications, media playback)

    All stitched together via one unified Awareness layer: smooth, reliable, and yours.

    🧠 AI Task Core: Smarter & More Reliable

    ✔️ Refined task routing & destination handling (think: “remind me in the kitchen” → it actually works)

    ✔️ Fewer hiccups, more dependable automation flow

    🎵 ComfyUI Audio ACE: Fixed & Tuned

    🎶 Stability fixes + cleaner generation reliability → fewer dropped beats, more vibes 🎧

    โš™๏ธ Under-the-Hood Polish

    โ€ข Cleaner provider abstraction & event pipeline

    โ€ข General reliability boosts across all cores

    ๐Ÿฅ” Bottom line:

    > Tater v68 gave it eyes. Tater v69 lets it choose which ones to trust.

    Same brilliant Hydra brain 🧠, now with flexible perception 👁️👁️.

    Ready to upgrade? 🚀

    👉 Check the Tater Shop, update your Core, and let your AI see more. 🥔✨

    🔗 View Release