• Ollama – v0.17.7

    🚨 Ollama v0.17.7 is out! 🚨

    This patch brings a subtle but important fix under the hood:

    🔹 Stale context window entries now get properly overridden, meaning outdated prompt/chat history data won't linger and mess with your inference accuracy. 💡

    🧠 Why you'll care:

    • Cleaner, more reliable multi-turn conversations
    • Better token efficiency (no hidden bloat from old context!)
    • Smoother long-context handling, especially helpful if you're pushing model limits
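
    For context: each turn of a multi-turn chat resends the running message history to Ollama's `/api/chat` endpoint, and the server tracks that context internally; stale entries there were what could skew replies. A minimal illustrative sketch of the request shape (the helper name is hypothetical, the payload shape follows Ollama's chat API):

```python
# Illustrative sketch: building a multi-turn /api/chat request body.
# The helper name is hypothetical; the payload shape follows Ollama's chat API.
def build_chat_payload(model, history, user_msg):
    """Append the new user turn and return an /api/chat request body."""
    messages = history + [{"role": "user", "content": user_msg}]
    return {"model": model, "messages": messages, "stream": False}

history = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
payload = build_chat_payload("llama3", history, "Summarize our chat so far.")
```

    With this patch, the server-side context backing a session like this should reflect only the turns you actually sent.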

    📦 No flashy new models or API changes this time, but it's a solid reliability bump for everyday use.

    🔗 Full details: v0.17.7 Release

    Happy local LLM tinkering! 🛠️🤖

    🔗 View Release

  • Ollama – v0.17.7-rc2

    πŸš€ Ollama v0.17.7-rc2 is out!

    This release candidate brings a handy fix for context window management: specifically, overriding stale entries in the context tracking logic. 🧠✨

    🔹 What's fixed?

    • Stale context data (e.g., outdated conversation history) no longer lingers and messes with model responses.
    • Improves reliability in multi-turn chats, especially for longer sessions or when switching between conversations.

    💡 Why it matters: cleaner context = more accurate, consistent responses, and fewer "wait, why did it say that?!" moments. 😅

    Since this is an rc2, it's a pre-release focused on polish and stability ahead of the final `v0.17.7`. No flashy new features yet, but solid under-the-hood improvements!

    👉 Grab it and test: v0.17.7-rc2 on GitHub

    Let us know how it behaves in the wild! 🛠️

    🔗 View Release

  • Ollama – v0.17.7-rc1

    🚨 Ollama v0.17.7-rc1 is out! 🚨

    This release is a tiny but tidy patch candidate; only one commit landed:

    🔧 `cmd/config: fix cloud model limit lookups in integrations (#14650)`

    ✅ What's fixed:

    • Resolves a bug where Ollama was misfetching or misapplying model usage limits when integrated with cloud services (e.g., Ollama Cloud or third-party APIs).
    • Ensures smoother, more accurate rate-limit handling in hybrid/local–cloud workflows.

    📌 Why it matters:

    • If you're using Ollama with cloud backends or integrations (like LangChain, LlamaIndex, or custom tooling), this fix helps avoid unexpected throttling or config mismatches.
    • No new features, no breaking changes, just more reliability 🛠️

    📅 Tagged: Mar 5, 2024

    🔗 Release on GitHub

    ⚠️ RC = Release Candidate: test it out, but maybe wait for the stable drop before pushing to prod.

    Let us know if you'd like a deeper dive into PR #14650 or how this affects your integrations! 🤖✨

    🔗 View Release

  • Ollama – v0.17.7-rc0

    🚨 Ollama v0.17.7-rc0 is here, and it's all about Qwen3.5 love! 🧠✨

    The latest release candidate is a focused update with one standout improvement:

    🔹 Context length configuration for Qwen3.5 models at launch: now you can tweak how much context the model uses right from the start, boosting compatibility and flexibility for longer prompts or multi-turn conversations.
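
    For reference, context length in Ollama is the standard `num_ctx` parameter, which you can pin in a Modelfile; the model tag and value below are illustrative, not shipped defaults:

```
FROM qwen3.5
PARAMETER num_ctx 32768
```

    Build and run it with `ollama create qwen35-32k -f Modelfile` and then `ollama run qwen35-32k`.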

    No flashy new features this time, just smart, targeted tuning to make Qwen3.5 models run smoother and more predictably on your machine 🛠️💻

    Perfect for anyone experimenting with Qwen3.5 locally or building apps around it!

    Curious how it behaves? Drop a test prompt and share your results 👇

    🔗 View Release

  • Ollama – v0.17.6

    🚨 Ollama v0.17.6 is out, and it's a quick but important patch! 🚨

    This release is light on features, heavy on precision:

    🔧 Bug fix: corrected how `glm-ocr` image tags are parsed in renderer prompts

    🔗 PR #14584 by @Victor-Quqi

    ✅ Why it matters:

    • If you're using GLM-OCR (especially for multimodal OCR tasks), image tags like `<image>` in your prompts will now render correctly instead of causing errors or misinterpretations.
    • Ensures smoother integration in custom renderer workflows, critical for anyone building multimodal apps or pipelines on top of Ollama.
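
    To picture what "parsing image tags in renderer prompts" involves, here's a self-contained sketch; this is illustrative code, not Ollama's actual renderer:

```python
import re

# Illustrative only (not Ollama's renderer): split a prompt on <image>
# placeholders so text segments and image slots can be handled separately.
def split_image_tags(prompt):
    """Return the prompt as ("text", segment) and ("image", index) parts."""
    parts, img_index = [], 0
    for piece in re.split(r"(<image>)", prompt):
        if piece == "<image>":
            parts.append(("image", img_index))
            img_index += 1
        elif piece:  # skip empty segments between adjacent tags
            parts.append(("text", piece))
    return parts

parts = split_image_tags("Describe <image> and compare it with <image>.")
```

    The real renderer is more involved, but this is the flavor of boundary bookkeeping the fix touches.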

    📦 No new models, no API changes, just a clean, targeted fix to keep your local LLM workflows humming.

    If you rely on GLM-OCR or custom multimodal prompts, update away! 🛠️

    Let us know if you'd like a breakdown of how Ollama renderers work or how to test this fix! 🤖✨

    🔗 View Release

  • Voxtral Wyoming – v1.0.0

    🚨 Voxtral Wyoming v1.0.0 is live, and it's production-ready! 🚀

    The wait is over: this release marks the stable, final v1.0.0 of Voxtral Wyoming, your go-to offline STT service powered by Mistral's Voxtral models, now fully integrated with Home Assistant Assist via the Wyoming protocol.

    ✨ What's new (and why it matters):

    ✅ Stable & battle-tested: all major bugs squashed, performance optimized for real-world use

    ✅ API finalized: no more breaking changes ahead; integrations are safe to lock in

    ✅ Full tooling in place: docs, tests, and CI/CD pipelines are now rock-solid

    ✅ Zero flash, all function: no flashy new features, just a polished, reliable upgrade ready for production 🛠️

    🎯 Whether you're running it on CPU, CUDA (NVIDIA), or MPS (Apple Silicon), and whether your audio comes in MP3, OGG, FLAC, or WAV, Voxtral Wyoming handles it all with automatic PCM16 conversion. Config via env vars? Yep: host, port, language, model ID… all covered.
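
    As a sketch of the env-var pattern (the `VOXTRAL_*` variable names and defaults here are assumptions for illustration, not the project's documented settings):

```python
import os

# Hypothetical sketch of env-var driven config with defaults; the VOXTRAL_*
# names and default values are assumptions, not documented settings.
def load_config(env=None):
    env = os.environ if env is None else env
    return {
        "host": env.get("VOXTRAL_HOST", "0.0.0.0"),
        "port": int(env.get("VOXTRAL_PORT", "10300")),
        "language": env.get("VOXTRAL_LANGUAGE", "en"),
        "model_id": env.get("VOXTRAL_MODEL_ID", "mistralai/Voxtral-Mini-3B-2507"),
    }

cfg = load_config({"VOXTRAL_PORT": "9300"})  # override just the port
```

    Everything unset falls back to a sane default, so a bare `docker run` still comes up listening.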

    📦 Dockerized. Deployed. Ready.

    🟢 Green light for production! Let's build smarter, offline-first voice assistants, together. 🎤💡

    🔗 View Release

  • Ollama – v0.17.5

    🚨 Ollama v0.17.5 is live! 🚨

    Hey AI tinkerers, fresh update alert! 🔥 Ollama just rolled out v0.17.5, and it's a quiet but mighty one, especially if you love playing with Qwen3 or importing GGUF models. Here's the lowdown:

    🔹 GGUF love, expanded! 🎁

    • Full support for importing and running Qwen3 models (like `Qwen3-0.6B`, `Qwen3-1.7B`), straight from Hugging Face or wherever you grab your GGUFs.
    • Smoother imports, fewer hiccups 🛠️
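
    The import path itself is Ollama's standard Modelfile flow; the GGUF file name below is an example, not a pinned artifact:

```
# Modelfile: point FROM at a local GGUF file
FROM ./Qwen3-0.6B-Q4_K_M.gguf
```

    Then `ollama create qwen3-0.6b -f Modelfile` followed by `ollama run qwen3-0.6b`.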

    🔹 Under-the-hood polish

    • Bug fixes and stability tweaks (you won't see them, but you'll feel the smoother run).

    💡 Why care?

    If you're experimenting with lightweight Qwen3 variants or love the flexibility of GGUF (quantized, portable, efficient 📦), this update makes your workflow just a little more magical. ✨

    Ready to upgrade? Grab the latest build from ollama.com, or on Linux just re-run the install script 🚀

    Let us know how it runs!

    🔗 View Release

  • Voxtral Wyoming – v0.5.0

    _New update detected._

    🔗 View Release

  • Voxtral Wyoming – v0.4.0

    _New update detected._

    🔗 View Release

  • Lemonade – v9.4.1

    _New update detected._

    🔗 View Release