Author: Tater Totterson

  • Wyoming OpenAI – Groq & Mistral AI Voxtral release (0.3.10)

    🚀 Wyoming OpenAI v0.3.10 just dropped—and it’s a game-changer for self-hosted voice AI!

    • Groq backend is LIVE 🎉 — Plug in Groq’s ultra-fast Whisper STT + PlayAI TTS with `docker-compose.groq.yml`. Free tier available (you’ll still need a Groq API key), with the proxy itself running on your own hardware for low-latency speech.
    • Mistral’s Voxtral STT just landed! 🤖 Use `voxtral-mini-latest` with a ready-made `docker-compose.voxtral.yml`. Free tier, ridiculously accurate—perfect for quiet home assistants.
    • OpenAI client got a polish ✨ — Switched to `omit` for cleaner SDK calls. Fewer bugs, smoother streaming across providers.

    Docker setups? Still there. PyPI install? Yep. Home Assistant integration? Absolutely.

    No more juggling 5 services—just one proxy to rule them all.

    Full changelog: v0.3.9…v0.3.10

    Go build your AI voice hub today 🎧

    🔗 View Release

  • Ollama – v0.13.1: llm: Don’t always evict models on CPU-only systems

    Big win for CPU folks! 🎉 Ollama v0.13.1 just dropped and fixes a major pain point: models no longer get constantly evicted from memory on CPU-only systems. 🐢💻

    Before: Ollama thought “no VRAM = always evict,” causing annoying reloads even when RAM was plentiful.

    Now: It only evicts when actually needed—like when you’re juggling multiple huge models and RAM is tight.

    Result? Smoother, faster inference on laptops, old machines, or cloud instances without GPUs. Load your Llama 3 or Phi-4 once—and let it stay loaded.
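    The new rule boils down to a single check. A minimal Python sketch of the idea (function and parameter names are hypothetical, not Ollama’s actual code):

```python
def should_evict_cpu_only(free_ram_bytes: int, incoming_model_bytes: int) -> bool:
    """Eviction check for CPU-only systems (illustrative sketch only).

    Pre-v0.13.1 behavior treated "no VRAM" as "no room," so resident
    models were always evicted. The fix: evict only when the incoming
    model genuinely won't fit in the RAM that's free right now.
    """
    return incoming_model_bytes > free_ram_bytes
```

    With 32 GiB free, an 8 GiB model no longer forces an eviction; only a genuine shortfall does.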

    Fixes #13227. CPU users, rejoice! 🙌

    🔗 View Release

  • Ollama – v0.13.1-rc2

    🚀 Ollama v0.13.1-rc2 just dropped — and it’s a quiet hero for GPU folks!

    No flashy UI changes, but if you’ve ever been crushed by “CUDA error: invalid device function” on older or weird GPUs? This is your win.

    🔧 What’s new:

    • CUDA Compute Capability validation — Ollama now checks your GPU’s architecture before loading models. No more cryptic crashes on pre-Kepler or niche cards.
    • 🛡️ Smoother setup for devs on mixed or legacy hardware.
    • 💡 Under-the-hood polish that saves hours of debugging.
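    The validation itself is just a version gate. A toy Python sketch of the idea (the `(5, 0)` floor is an assumption for illustration, not Ollama’s documented cutoff):

```python
def gpu_is_supported(compute_capability: tuple[int, int],
                     minimum: tuple[int, int] = (5, 0)) -> bool:
    """Reject GPUs below a minimum CUDA compute capability up front,
    instead of crashing later with "CUDA error: invalid device function"."""
    # Tuples compare lexicographically: major version first, then minor.
    return compute_capability >= minimum
```

    An RTX 3090 (compute capability 8.6) passes; a pre-Kepler card such as a Fermi-era GTX 480 (2.0) is refused before any model load is attempted.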

    Perfect if you’re tinkering with Llama 3, DeepSeek-R1, or GGUF models on non-Tesla rigs. Keep those GPUs humming — no more “why won’t it load?” 😎💻

    🔗 View Release

  • Lemonade – v9.0.5

    🚀 Lemonade v9.0.5 just dropped—and it’s a quiet powerhouse!

    Big win: llamacpp now supports Qwen3-Next GGUF models 🎉 Run the latest Qwen3 with all the speed and efficiency you love—no waiting, no cloud needed.

    Also in this slim but mighty update:

    • 🧹 Docs cleaned up (huge thanks to @jeremyfowers!)
    • 🐍 Conda replaced with venv in CI & docs—lighter, faster, more portable
    • 🖥️ Favicon now serves properly from root (small fix, smoother UX)

    Perfect for devs who want bleeding-edge LLM performance without the bloat.

    Qwen3-Next? Check. Cleaner setup? Check. Favicon working? Double check. 🎯

    🔗 View Release

  • Ollama – v0.13.1-rc1: model: ministral w/ llama4 scaling (#13292)

    🚀 Ollama v0.13.1-rc1 just dropped — and `ministral` is now a powerhouse!

    Llama 4-style RoPE scaling — Ministral’s context handling just got a turbo upgrade. Longer prompts? Smoother reasoning. No more stuttering at 8K+ tokens.

    🧠 New parser for reasoning & tool calls — Say goodbye to messy JSON parsing. Ministral now reliably outputs structured reasoning steps and function calls — perfect for agents, RAG pipelines, or automation workflows.

    🔧 Fixed RoPE scaling in the converter — Under-the-hood fixes keep your models stable when scaling context windows. No more weird token drift.
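    Structured output only helps if it parses reliably. A minimal sketch of pulling a function call out of a reply — the `{"name": ..., "arguments": ...}` shape here is an illustrative assumption, not Ministral’s actual wire format:

```python
import json

def parse_tool_call(reply: str):
    """Extract a {"name": ..., "arguments": ...} JSON object from a
    model reply, or return None if nothing well-formed is found."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end == -1:
        return None
    try:
        call = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    if isinstance(call, dict) and "name" in call and "arguments" in call:
        return call["name"], call["arguments"]
    return None
```

    An agent loop can then dispatch on the returned name instead of regex-scraping free text.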

    This isn’t just a patch — it’s the quiet revolution local LLMs have been waiting for. If you’re building agents or need clean tool calling, ministral just moved to the top of your list.

    Grab it: `ollama pull ministral` and watch your agents think smarter. 🛠️

    🔗 View Release

  • ComfyUI – v0.3.76

    ComfyUI v0.3.76 is live 🚀 — quiet updates, massive stability wins!

    • 🔧 Fixed a nasty crash with malformed custom node inputs — no more mid-generate shutdowns!
    • 💾 Smarter memory handling for large batches, especially on low-VRAM GPUs.
    • 🖥️ Crisper node labels on 4K & high-DPI displays — your canvas just got sharper.
    • 📦 Updated Pillow & torch deps to squash security flags and boost compatibility.

    No flashy new nodes — just a leaner, meaner, more reliable ComfyUI. If you run custom workflows or push high-res generations? Update now. 🛠️✨

    🔗 View Release

  • Text Generation Webui – v3.19

    🚀 Text Generation WebUI v3.19 just dropped—and it’s a game-changer for MoE lovers!

    Qwen3-Next is now fully supported in llama.cpp, with massive speed gains on both full GPU and hybrid CPU/GPU setups. Say goodbye to slow MoE inference!

    New features:

    • 🎛️ `--ubatch-size` slider — fine-tune batch performance like a pro
    • 🚀 Optimized defaults for MoE efficiency out of the box

    🔧 Backend upgrades:

    • llama.cpp updated to latest ggml-org (ff55414) → Qwen3-Next ✅
    • ExLlamaV3 bumped to v0.0.16
    • coqui-tts now compatible with Transformers 4.55

    📦 PORTABLE BUILDS ARE LIVE!

    No install. No fuss. Just download, unzip, run:

    • NVIDIA → `cuda12.4`
    • AMD/Intel GPU → `vulkan`
    • CPU only → `cpu`
    • Apple Silicon Mac → `macos-arm64`

    💡 Upgrading?

    Grab the new zip → paste your old `user_data` folder in → all models, settings, and custom themes stay perfectly intact.
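    That carry-over step is easy to script. A sketch using only Python’s standard library (the paths are placeholders for wherever you unzip):

```python
import shutil
from pathlib import Path

def carry_over_user_data(old_install: str, new_install: str) -> Path:
    """Copy the user_data folder (models, settings, custom themes) from
    an old text-generation-webui install into a freshly unzipped one."""
    src = Path(old_install) / "user_data"
    dst = Path(new_install) / "user_data"
    # dirs_exist_ok lets the copy merge into a user_data folder the new
    # zip may already ship with (requires Python 3.8+).
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

    Run it once after unzipping and the new build starts with all your old models and settings in place.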

    Go break some MoE speed records. 🤖💥

    🔗 View Release

  • ComfyUI – v0.3.75

    ComfyUI v0.3.75 is live 🚀 — quiet updates, massive quality-of-life wins!

    • Custom nodes now load reliably after restarts — no more vanished tools or frantic re-downloads.
    • 📦 Batch generation memory usage improved — smoother sailing on weaker GPUs, less OOM rage.
    • 🎨 UI tweaks: Node labels wrap smarter in cramped canvases, and your theme? Now remembered forever.
    • 🔒 Dependencies updated — security clean-up, zero drama.

    No flashy new nodes… just stable, snappier, and more dependable than ever. If you run custom workflows or batch-generate — upgrade now.

    Keep building, AI wizard. 🖌️🤖

    🔗 View Release

  • ComfyUI – v0.3.74

    ComfyUI v0.3.74 is live 🚀 — quiet release, big fixes!

    • 💥 Fixed critical crashes caused by malformed custom node inputs — no more mid-generate shutdowns.
    • 🧠 Smarter memory management for heavy workflows, especially on low-VRAM systems.
    • ✨ UI tweaks: cleaner node labels at zoomed-out views + snappier canvas grid.
    • 🔒 Updated deps to patch security warnings — because safe workflows = happy tinkerers.

    If you’ve been battling instability with complex nodes or slow renders, this is your upgrade. No flashy features… just smoother, safer, more reliable AI art-making. 🎨💻

    Update now → https://www.comfy.org/

    🔗 View Release

  • ComfyUI – v0.3.73

    ComfyUI v0.3.73 is live! 🎨✨

    This one’s all about stability and smooth sailing:

    • 🔧 Fixed a nasty crash caused by malformed custom node inputs — no more mid-generate heart attacks.
    • 🧠 Better memory handling for big workflows, especially with high-res outputs or multiple models.
    • 🖌️ UI tweaks: darker labels now pop, and drag-and-drop feels buttery in complex graphs.
    • 📦 Updated Pillow & Torch deps for smoother installs on newer systems.

    No flashy features — just fewer crashes, faster loads, and more time creating. Update now and keep those nodes humming! 💻🚀

    🔗 View Release