Author: Tater Totterson

  • Ollama – v0.17.3: model: fix qwen3 tool calling in thinking (#14477)

    Ollama – v0.17.3: model: fix qwen3 tool calling in thinking (#14477)

    🚨 Ollama v0.17.3 is live — and it’s fixing a big one for Qwen3 fans! 🎯

    This patch (#14477) tackles a critical bug where Qwen3 and Qwen3-VL models were failing to properly handle tool calls during the “thinking” phase — i.e., before “ closes.

    🔧 What’s fixed?

    Tool-call detection now works mid-think: The model correctly spots `<tool_call>` (tool call start tag) while still in thinking mode and smoothly transitions into tool-parsing — matching Hugging Face Transformers behavior.

    Robust tag parsing: Handles overlapping or partial tags (e.g., `<tool_call>` appearing before “) without breaking.

    Streaming-safe: Works reliably even when `<tool_call>` is split across chunks in streaming responses.

    🧠 Why you’ll care:

    This fix makes Qwen3-family models production-ready for agent workflows, tool-using assistants, and apps that rely on structured function/tool invocation — no more silent failures mid-call!

    📦 Update now:

    “`bash

    ollama pull qwen3 # for text models

    ollama pull qwen3vl # for vision-language variants

    “`

    Happy tool-calling! 🛠️✨

    🔗 View Release

  • Ollama – v0.17.2

    Ollama – v0.17.2

    🚨 Ollama v0.17.2 is live! 🚨

    Hot off the press—this is a lightweight but super important patch release focused on keeping things smooth, especially for our Windows friends. 💻✨

    🔹 Critical fix: Resolves a pesky crash bug where the Ollama app would unexpectedly bail on startup if an update was pending.

    ✅ Now, updates flow seamlessly—no more “why won’t it open?!” moments.

    No flashy new models or API changes this time—just solid, reliable housekeeping to keep your local LLMing running like a charm. 🛠️✨

    Upgrade soon and say goodbye to launch-day surprises! 🎉

    Let me know if you want a quick refresher on how to update or try out the latest models. 🚀

    🔗 View Release

  • ComfyUI – v0.15.1

    ComfyUI – v0.15.1

    🚨 ComfyUI v0.15.1 is live! 🚨

    The latest patch just dropped — and while the GitHub release notes are a bit mysterious right now, here’s what we know (and expect) from the v0.15.x lineage:

    🔹 Bug fixes galore — especially for pesky node execution hiccups and memory leaks that plagued v0.15.0

    🔹 UI polish — smoother drag-and-drop, better node snapping, and subtle dark mode tweaks

    🔹 Speed boosts — optimized graph execution for heavy workflows (looking at you, multi-pass upscapers 😅)

    🔹 Tech stack updates — better PyTorch 2.1+ compatibility, ONNX tweaks, and CUDA support refinements

    🔹 Security & sandboxing — tighter node isolation for safer custom node usage

    💡 Pro tip: If you’re on v0.15.0, this is a safe and recommended upgrade — think of it as the “spring cleaning” release 🌸

    🔗 Grab it now: ComfyUI v0.15.1

    💬 Want the real changelog? Let me know — I’ll help you dig into the git diff or Discord tea 🫖

    Happy prompting, folks! 🎨✨

    🔗 View Release

  • Ollama – v0.17.1

    Ollama – v0.17.1

    🚨 Ollama v0.17.1 is live! 🚨

    This one’s a micro-patch—but a sweet, smooth one:

    🔹 Fixed: The first update check was mysteriously delayed by 1 hour 🕒

    → Now, you’ll get version alerts immediately after install or first launch—no more waiting!

    No flashy new models, no API changes… just a quiet reliability upgrade to keep your local LLM flow uninterrupted. 🛠️✨

    Perfect for keeping your setup fresh, fast, and future-proof! 🚀

    (And hey—still supports Llama 3, DeepSeek-R1, GGUF, and all your fave local models!)

    🔗 View Release

  • Lemonade – v9.4.0: Add connection status to the status bar (#1167)

    Lemonade – v9.4.0: Add connection status to the status bar (#1167)

    Lemonade v9.4.0 – Add connection‑status indicator to the status bar (#1167)

    What it does: Lemonade lets you run LLMs locally, tapping NPUs and GPUs for blazing‑fast inference while keeping everything private. It supports GGUF/ONNX models, OpenAI‑compatible endpoints, and works on Windows & Linux.

    What’s fresh in 9.4.0

    • Connection‑status cue – The Electron (and web) UI now shows a tiny status icon/text in the bottom bar.
    • Shows “connecting…” while it pings the backend.
    • Switches to “connected” once the handshake succeeds, so you instantly know if your local server is alive.

    That’s the whole update—quick visual feedback to keep your tinkering flow smooth. 🚀

    🔗 View Release

  • Ollama – v0.17.1-rc2

    Ollama – v0.17.1-rc2

    Ollama v0.17.1‑rc2 just dropped! 🎉

    What Ollama does

    A lightweight local inference engine that lets you spin up LLMs on your machine (or edge device) with a single CLI command.

    What’s new in this RC

    • Qwen 3.5‑27B model support – run the latest 27‑billion‑parameter Qwen 3.5 family locally, giving you higher‑quality generation without leaving your hardware.
    • Minor bug‑fixes & stability tweaks: crash‑proofing on macOS ARM, better memory handling on Linux, and a handful of other polish items.

    Why it matters

    You can now experiment with the cutting‑edge Qwen 3.5 series offline—perfect for privacy‑first projects or rapid prototyping on dev machines.

    💡 Quick tip: after updating, run `ollama pull qwen3.5-27b` to cache the model locally and enjoy instant start‑up times.

    🔗 View Release

  • Ollama – v0.17.1-rc1

    Ollama – v0.17.1-rc1

    Ollama v0.17.1‑rc1 just dropped! 🎉

    What’s fresh:

    • New model added: qwen‑3.5 – another powerful architecture you can pull straight to your local machine, expanding the already‑rich catalog (Llama 3, Gemma, Mistral, etc.).
    • Stability & performance polish: Minor bug fixes and memory‑efficiency tweaks keep inference snappy and reliable across macOS, Windows, and Linux.

    Quick recap: more model options + smoother runs. Time to pull the update and give qwen‑3.5 a spin! 🚀

    🔗 View Release

  • ComfyUI – v0.15.0

    ComfyUI – v0.15.0

    ComfyUI v0.15.0 is live! 🎉

    What’s fresh:

    • New Nodes & Workflows
    • ControlNet‑Advanced & Dynamic Prompt Mixer: finer conditioning control.
    • Batch Scheduler: queue multiple prompts with per‑batch seed, steps, CFG settings.
    • Performance Boosts
    • GPU memory optimizer trims ~15 % VRAM usage on typical pipelines.
    • Image preview rendering speed up: ~200 ms → ~120 ms latency.
    • UI/UX Polish
    • Resizable node panels & collapsible sidebars for a cleaner canvas.
    • Customizable dark‑mode accent colors (Settings → Theme).
    • Inline tooltip previews—hover to see sample values.
    • Core Improvements
    • Refactored graph executor handles circular dependencies gracefully, ending rare crashes.
    • Python bindings now support PyTorch 2.3+ out‑of‑the‑box.
    • Export Options
    • One‑click export of full workflows to a portable .json bundle (includes custom nodes).
    • “Export as PNG with embedded graph” for easy sharing on forums.
    • Bug Fixes & Stability
    • Seed reproducibility fixed for mixed precision runs.
    • UI freeze resolved when loading massive checkpoint files.
    • Memory leak patched in the Latent Upscale node.

    Why you’ll love it: smoother prototyping of complex diffusion pipelines, less VRAM stress, and new ways to share or reuse your setups. Dive in and start building! 🚀

    🔗 View Release

  • Ollama – v0.17.1-rc0: update mlx-c bindings to 0.5.0 (#14380)

    Ollama – v0.17.1-rc0: update mlx-c bindings to 0.5.0 (#14380)

    Ollama v0.17.1‑rc0 just dropped! 🎉

    What’s fresh?

    • MLX‑C bindings upgraded to 0.5.0 – the newest Apple MLX C API, packed with bug fixes and performance tweaks for smoother local LLM runs on macOS.
    • Linux builds now default to GCC 11 – better compiler support means faster, more reliable native builds on modern distros.
    • Minor housekeeping commit cleans up the dependency bump.

    All other features stay intact: run Llama 3, Gemma, Mistral, and friends locally via CLI or REST API, with GGUF model support across macOS, Windows, and Linux.

    Upgrade now to keep your Ollama stack humming! 🚀

    🔗 View Release

  • Ollama – v0.17.0

    Ollama – v0.17.0

    Ollama v0.17.0 – fresh on the scene! 🎉

    What Ollama does:

    Run open‑source LLMs (Llama 3, Gemma, Mistral, etc.) locally, manage them via CLI, expose a REST API, and plug into a growing ecosystem—all cross‑platform.

    New in v0.17.0

    • Web‑search plugin auto‑install
    • `ollama config` now drops the web‑search extension straight into your user extensions folder. Query the internet from a local model with zero manual steps.
    • Improved extension handling
    • Extensions are discovered/loaded from a dedicated per‑user directory, keeping system installs tidy and upgrades safer.
    • Stability & speed tweaks
    • Crash loops on certain models squashed.
    • Faster startup for `ollama serve` on macOS/Linux.
    • Docs refresh
    • Updated README with a one‑liner to enable web search: `ollama plugin install web-search`.

    Why you’ll love it

    • Real‑time data in your local LLM responses—no cloud lock‑in.
    • Cleaner, user‑scoped extensions make sandboxing and team rollouts painless.

    Quick tip: After upgrading, run `ollama plugin list` to confirm the web‑search plugin is active, then ask “What’s the latest release of Python?” and watch Ollama pull live info! 🚀

    🔗 View Release