• Ollama – v0.19.0-rc0: ci: harden cuda include path handling (#15093)

    🚨 Ollama v0.19.0-rc0 is here! 🚨

    This release is all about CI/CD hardening, especially for Windows + CUDA users. There are no flashy new features or model drops, but it’s a critical under-the-hood fix that will make your builds smoother, especially in automated environments.

    🔍 What’s new / fixed:

    • 🪟 Windows CUDA path fix: Ollama now correctly identifies and uses the real CUDA header directory—even when multiple `include` paths pop up (a common headache in Windows CI setups).
    • 🛠️ More reliable builds: Prevents failures where ambiguous or duplicated CUDA paths caused copy/compile errors.
    • 🧪 CI-friendly: Makes Ollama’s build system more resilient across platforms—great for maintainers and contributors.
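
    The fix boils down to path disambiguation. A minimal sketch of the idea in Python (our own illustration, not Ollama’s actual CI code; the function and its inputs are hypothetical):

```python
# Illustrative sketch: when a Windows CUDA toolkit lookup yields several
# candidate `include` directories, keep only the one that actually
# contains the CUDA headers, so later copy/compile steps get exactly one
# unambiguous path.
from typing import Sequence

def pick_cuda_include(candidates: Sequence[str],
                      header_listing: dict[str, list[str]]) -> str:
    """Return the first candidate whose listing contains cuda_runtime.h.
    header_listing maps path -> file names (stubbed so the sketch stays
    self-contained instead of touching the real filesystem)."""
    for path in candidates:
        if "cuda_runtime.h" in header_listing.get(path, []):
            return path
    raise FileNotFoundError("no candidate contains cuda_runtime.h")

candidates = [
    r"C:\cuda\v12.4\include\include",   # spurious duplicated segment
    r"C:\cuda\v12.4\include",           # the real header directory
]
listing = {r"C:\cuda\v12.4\include": ["cuda_runtime.h", "cuda.h"]}
print(pick_cuda_include(candidates, listing))  # the real directory wins
```

    The same filter-by-contents idea works for any toolchain whose installers leave duplicate or nested include paths behind.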

    💡 Why it matters: If you’ve ever seen cryptic CUDA include errors on Windows (or in GitHub Actions), this tweak is your new best friend.

    Full details: PR #15093

    Let’s get those local LLMs building flawlessly! 🛠️✨

    🔗 View Release

  • Ollama – v0.18.4-rc1

    🚀 Ollama v0.18.4-rc1 is here — and it’s packing a subtle but smart update!

    🔍 What’s new?

    Ollama now warns you if your server context length is below 64k tokens when running local models. Why? Because newer LLMs (like Llama 3.1, Mistral Large, DeepSeek-R1) are built for long contexts — and running them with too little context can lead to truncated outputs or weird behavior. This warning helps you avoid those gotchas before they bite! 💡

    🛠️ Bonus: While the full changelog is still loading on GitHub, this RC likely includes:

    • Stability tweaks for model loading
    • Improved error messages (especially around context handling)
    • Minor CLI/web UI polish

    📌 Pro tip: If you’re using large-context models (e.g., `llama3.1:8b-instruct-q4_K_M`), double-check your `OLLAMA_CONTEXT_LENGTH` (or per-request `num_ctx`) settings; this warning is here to help you optimize!

    🔗 Grab the RC: v0.18.4-rc1 on GitHub

    💬 Join the convo: Ollama Discord

    Let us know if you spot any quirks or love the warning — feedback helps shape the final release! 🙌

    🔗 View Release

  • Ollama – v0.18.4-rc0

    🚀 Ollama v0.18.4-rc0 is here!

    A new release candidate just dropped — and while the GitHub release notes are still a bit mysteriously missing (👀), we’ve got one confirmed tweak:

    `launch: hide vs code (#15076)`

    → Judging by the PR title, the launcher now hides VS Code when spinning up models or background tasks. Less UI clutter, more focus: perfect for devs who just want the model running, not an editor popping up uninvited. 😅

    🔍 What’s still unknown (yet!):

    • Full bug fixes & performance tweaks
    • New models or quantization support (e.g., GGUF updates?)
    • Platform-specific improvements (macOS, Windows, Linux)

    📌 Pro move: Grab the RC and give it a spin — but hold off on production upgrades until the full `v0.18.4` lands with the complete changelog.

    🔗 Ollama v0.18.4-rc0 on GitHub

    Let us know what you test — especially if you spot hidden gems! 🧪✨

    🔗 View Release

  • Ollama – v0.18.3: api/show: overwrite basename for copilot chat (#15062)

    🚀 Ollama v0.18.3 is live!

    This patch fixes a subtle but important quirk in how Ollama interacts with GitHub Copilot Chat — specifically around model naming.

    🔹 What’s fixed?

    The `/api/show` endpoint now reports `req.Model` (the name you actually requested, matching `/api/tags`) instead of the GGUF `general.basename` metadata value.

    Why you’ll love this:

    • No more confusing name collisions when multiple models share the same basename.
    • Copilot Chat now displays exactly the model name you expect — making selection cleaner and more intuitive.
    • Better alignment with how Ollama itself labels models elsewhere.
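
    The change is easiest to see as a before/after on the name-resolution logic. A mock sketch (our own Python, not Ollama’s Go source; the function and flag are hypothetical):

```python
# Sketch of the fix's effect: the displayed name now prefers the
# requested model tag over the GGUF general.basename metadata field.
def display_name(req_model: str, metadata: dict, patched: bool = True) -> str:
    if patched:
        # After #15062: the requested name wins, matching /api/tags.
        return req_model
    # Before #15062: the metadata basename leaked through, so distinct
    # tags could collapse into the same ambiguous label.
    return metadata.get("general.basename", req_model)

meta = {"general.basename": "llama"}
print(display_name("llama3:8b", meta))                 # -> llama3:8b
print(display_name("llama3:8b", meta, patched=False))  # -> llama
```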

    It’s a small but mighty polish update — perfect for Copilot users who want seamless, accurate model switching. 🛠️✨

    👉 Grab the update and keep local LLM-ing! 🎯

    🔗 View Release

  • Ollama – v0.18.3-rc2: api/show: overwrite basename for copilot chat (#15062)

    🚨 Ollama v0.18.3-rc2 is out! 🚨

    This update fixes a pesky bug in GitHub Copilot integration — specifically, how model names are displayed when using `/api/show`.

    🔹 What’s fixed?

    • Previously, Copilot would see the raw GGUF `general.basename` metadata value instead of the model name you actually requested, so different tags could collapse into one ambiguous label.
    • Now, `/api/show` returns the actual requested model name (`req.Model`) — matching what you see in `ollama list` or `/api/tags`.

    Why it’s a win:

    • Cleaner, more intuitive model names in Copilot.
    • No more duplicate or confusing basenames when switching between similar models (e.g., `llama3:8b` vs. `llama3:70b`).

    🧠 Bonus: This small tweak makes Copilot + Ollama feel way more polished — especially for those of us juggling multiple local models.

    📦 RC2 is out now; grab it and give Copilot a spin! 🛠️

    🔗 GitHub PR #15062

    Let us know how it feels! 🙌

    🔗 View Release

  • Ollama – v0.18.3-rc1

    🚨 Ollama v0.18.3-rc1 is out! 🚨

    A quick heads-up for our fellow AI tinkerers — the latest release candidate for Ollama is here, and while it’s a small RC, it packs a useful fix:

    🔧 What’s new?

    Windows CGO compiler error fixed (PR #15046) — this resolves a CI/CD hiccup that was causing build issues on Windows. If you’ve been hitting weird compilation errors or CI failures on Windows, this one’s for you!

    🔍 Note: Due to a temporary GitHub UI glitch, the full release notes didn’t load cleanly — so this RC currently only has one confirmed change. If you’re feeling adventurous and want to test the latest fixes (especially on Windows), go grab `v0.18.3-rc1` from the GitHub Releases page.

    Let us know if you run into anything — happy prompting! 🧠✨

    🔗 View Release

  • Ollama – v0.18.3-rc0: mlx: add mxfp4/mxfp8/nvfp4 importing (#15015)

    🚨 Ollama v0.18.3-rc0 is here — and it’s a quantization powerhouse! 🚨

    The latest release adds supercharged import support in Ollama’s MLX path (MLX is Apple’s machine-learning framework for Apple silicon) for new low-bit floating-point formats, including NVIDIA’s NVFP4. That means you can now run even more efficient, low-bit models locally. Here’s the breakdown:

    🔹 New Quantization Imports

    ✅ Import BF16 models → convert on-the-fly to:

    • `mxfp4` (OCP Microscaling 4-bit floating point)
    • `mxfp8` (OCP Microscaling 8-bit floating point)
    • `nvfp4` (NVIDIA’s 4-bit floating-point format)

    ✅ Import FP8 models → convert directly to `mxfp8`
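
    The supported conversions form a small lookup table. A sketch of that mapping (format names come from the release note; the dict and function are our illustration, not Ollama’s code):

```python
# Which source precisions can be imported into which target formats,
# per this RC's release note.
SUPPORTED_CONVERSIONS = {
    "bf16": {"mxfp4", "mxfp8", "nvfp4"},  # BF16 -> any of the new formats
    "fp8":  {"mxfp8"},                    # FP8 -> mxfp8 only
}

def can_convert(src: str, dst: str) -> bool:
    """True if an import from src precision to dst format is supported."""
    return dst in SUPPORTED_CONVERSIONS.get(src, set())

print(can_convert("bf16", "nvfp4"))  # True
print(can_convert("fp8", "mxfp4"))   # False: FP8 only goes to mxfp8
```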

    🎯 Why this rocks:

    • 🍏 Apple Silicon users (M1/M2/M3): Run ultra-efficient MLX-native models with minimal memory footprint.
    • 🎮 NVIDIA fans: Get early access to NVFP4 — a promising new format for faster, smaller inference.
    • ⚡ Smaller models + less VRAM = more models on your laptop, fewer cloud trips.

    This is a big leap toward truly portable, hardware-agnostic LLM inference — all from your desktop. 🧠💻

    Curious how `mxfp4` stacks up against `nvfp4`? Let us know — happy to deep dive! 🧵

    🔗 View Release

  • Home Assistant Voice Pe – 26.3.0

    🚨 Home Assistant Voice PE 26.3.0 is live! 🚨

    Big updates in this release—let’s break it down:

    🎧 Media Playback Stability

    No more audio/video hiccups! Expect smoother, rock-solid playback across devices—perfect for voice-triggered media or smart speakers.

    🌐 Multiple Sendspin Servers

    Now supports multiple Sendspin backends simultaneously. More redundancy, better failover, and improved scalability for larger setups.

    ⏱️ TTS Timeout Fix

    Say goodbye to cut-off voice responses! Text-to-speech now waits properly—your AI replies play fully before moving on.

    🌟 New Contributor Alert!

    Shoutout to @akloeckner for their first PR (#558)—welcome to the crew! 🙌

    ✨ Bonus: The project is now officially sponsored by the Open Home Foundation—a huge vote of confidence in open, private-by-design voice control!

    📦 Full changelog: 25.12.4 → 26.3.0

    Let’s make our homes smarter—privately and offline-capable. 🏠🗣️

    🔗 View Release

  • Ollama – v0.18.2: launch: fix openclaw not picking up newly selected model (#14943)

    🚨 Ollama v0.18.2 is live! 🚨

    This patch fixes a sneaky bug in Ollama’s launcher integration with openclaw, where switching models mid-session would not actually switch the active model. Yikes! 😅

    🔧 What’s fixed?

    • Previously: Changing the primary model in GUI/CLI wouldn’t update active sessions — you’d keep running the old model, even if it looked like you’d switched.
    • Now: Sessions properly refresh when the model changes — no more stale model confusion! ✅
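
    The bug-vs-fix pattern here is the classic stale cache. A minimal sketch with hypothetical class names (our own Python, not Ollama’s internals):

```python
# Sketch of the stale-model bug: a session that copies the selected
# model once at startup ignores later switches; re-reading the current
# selection on each request picks up changes immediately.
class Launcher:
    def __init__(self, model: str):
        self.selected = model

class Session:
    def __init__(self, launcher: Launcher):
        self.launcher = launcher
        self._cached = launcher.selected  # the buggy, stale copy

    def model_buggy(self) -> str:
        return self._cached               # ignores later switches

    def model_fixed(self) -> str:
        return self.launcher.selected     # always reads the live value

launcher = Launcher("llama3")
session = Session(launcher)
launcher.selected = "deepseek-r1"         # user switches mid-session
print(session.model_buggy())  # llama3 (stale)
print(session.model_fixed())  # deepseek-r1
```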

    🎯 Why it matters:

    • Perfect for devs testing multiple models in one session (e.g., comparing Llama 3 vs. DeepSeek).
    • Critical for demos or workflows where model switching is part of the flow — reliability restored!

    🔗 PR #14943

    📦 Verified by BruceMacD on Mar 18, 20:20 UTC

    Ready to upgrade? Grab the latest build and happy model-hopping! 🧠✨

    🔗 View Release

  • Ollama – v0.18.2-rc1: launch: fix openclaw not picking up newly selected model (#14943)

    🚨 Ollama v0.18.2-rc1 is live! 🚨

    Quick, focused fix in this release candidate — perfect for keeping your local LLM workflows smooth and reliable.

    🔍 What’s New?

    Fixed `openclaw` model-switching bug — previously, if you changed the active model mid-session without restarting, `openclaw` would keep using the old (stale) model. Now it correctly picks up the new selection instantly!

    💡 Why You’ll Care:

    • 🔄 Makes dynamic model switching (e.g., testing Llama 3 vs. DeepSeek-R1) way more reliable in dev/test loops
    • 🧪 Critical for tooling and integrations that rely on runtime model changes
    • ⚡ Clean, minimal change — no side effects or breaking changes

    👉 Grab the RC and give it a spin before final `v0.18.2` drops!

    Let the team know if you spot anything weird 🐞➡️✅

    #Ollama #LLM #LocalAI #DevTools

    🔗 View Release