• Ollama – v0.20.5-rc1

    New update alert for Ollama! 🚨

    If you’re running local LLMs like Llama 3, DeepSeek-R1, or Mistral, there’s a fresh release candidate out: v0.20.5-rc1. This update focuses on making your local development workflow much smoother by reducing guesswork when things go wrong.

    What’s new in this release:

    • Improved Error Feedback: The team has introduced a “re-run hint” specifically for dependency errors. 🛠️
    • Smarter Troubleshooting: If a model pull or setup fails because of a missing dependency, the error message will now explicitly nudge you to try running the command again.

    It’s a small but super practical tweak designed to keep your momentum going and help you bypass those annoying setup hiccups without having to dig through logs! 🚀

    🔗 View Release

  • Ollama – v0.20.5-rc0

    New update alert for Ollama! 🚨

    If you’re running local LLMs like Llama 3, DeepSeek-R1, or Mistral, a new release candidate (v0.20.5-rc0) just dropped to help make your debugging sessions much less headache-inducing.

    What’s new in this release:

    • Improved Error Messaging: This update specifically fixes how the system handles unknown input item types in responses.
    • Better Debugging: Instead of encountering vague errors that leave you guessing, you’ll now receive much clearer feedback when an unexpected input type is encountered.
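
    The improvement can be pictured as a small validation step. A hedged sketch of the idea, not Ollama's actual code: the item type names below are illustrative, and the point is simply that an unknown type now names itself in the error instead of failing vaguely.

```python
# Hedged sketch: explicit handling of input item types (illustrative names,
# not taken from Ollama's source).
KNOWN_ITEM_TYPES = {"message", "function_call", "function_call_output"}

def validate_input_item(item: dict) -> dict:
    """Return the item if its type is known, else raise a descriptive error."""
    item_type = item.get("type")
    if item_type not in KNOWN_ITEM_TYPES:
        raise ValueError(
            f"unknown input item type {item_type!r}; "
            f"expected one of {sorted(KNOWN_ITEM_TYPES)}"
        )
    return item
```

    With this shape, a typo like `{"type": "mesage"}` produces an error that states the offending type and the accepted ones, which is exactly the kind of feedback the release notes describe.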

    This is a great little tweak for anyone building custom pipelines or experimenting with complex prompts where input types might shift. It makes tracking down configuration errors way smoother! 🛠️

    Keep tinkering! 🚀

    🔗 View Release

  • Lemonade – v10.2.0

    Lemonade just dropped a massive v10.2.0 update! 🍋 If you’ve been looking for a high-performance way to run LLMs locally using your GPU or NPU (like AMD Ryzen AI), this toolkit is essential for maximizing your hardware.

    What’s New in v10.2.0:

    • Expanded Integrations: New dedicated user guides for Claude Code integration, plus support for OpenCode via Launchpad. 🚀
    • Developer-Friendly Artifacts: New embeddable release artifacts are now available for both Ubuntu and Windows—perfect if you’re bundling Lemonade into your own custom installers.
    • Enhanced Resource Management: The system tray now features a “Max Loaded Models” submenu, making it much easier to manage your VRAM and system resources on the fly.
    • Improved Model Handling & Vision Support:
      • Complete overhaul of the multi-checkpoint storage strategy.
      • Added support for Qwen Image with extra `sd-cpp` parameters.
      • Fixed device reporting bugs in `llamacpp` and `whispercpp` stats.
    • Smart Automation: Introduced a “Smart CLI pull” command to streamline fetching models, along with auto-detection for GPU backends during ESRGAN upscaling. 🛠️

    Platform Availability:

    This release is ready to go across almost every major environment:

    • Windows: `.msi` installers (Server + App or Server Only). Note: Windows installers are now officially signed! ✅
    • Linux: Ubuntu 24.04+ (via Launchpad PPA or AppImage), Fedora 43 (`.rpm`), and Docker/Snap/Arch options.
    • macOS: Beta support is available via `.pkg`.

    🔗 View Release

  • Ollama – v0.20.4: responses: add support for fn call output arrays (#15406)

    Ollama Update: v0.20.4 is here! 🛠️

    If you’re running LLMs locally, listen up! The latest release of Ollama focuses on making function calls much more reliable and expanding how the engine handles complex data structures. No more unexpected crashes when your model tries to return a list instead of a single string.

    What’s new in v0.20.4:

    • Enhanced Function Call Support: The engine now supports arrays for function call outputs. Previously, it only handled strings, which caused errors when the output contained multiple pieces of data.
    • Multi-Content Array Support: Beyond just text, the update adds support for arrays containing both image and text content. This makes response handling much more robust for complex, multi-modal tasks. 🖼️
    • Critical Bug Fix: Specifically resolves the `json: cannot unmarshal array into Go struct field` error that was breaking workflows whenever models returned structured list data.
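
    The change is easiest to see in the request shape involved. A minimal sketch, assuming an OpenAI-compatible responses payload; the field names and content-part types here are illustrative, not lifted from Ollama's source:

```python
import json

# Previously, "output" had to be a plain string; per the release notes,
# v0.20.4 also accepts an array of content parts (text and image).
function_call_output = {
    "type": "function_call_output",
    "call_id": "call_123",  # illustrative ID
    "output": [             # the array form that used to fail to unmarshal
        {"type": "input_text", "text": "Found 3 matching files."},
        {"type": "input_image", "image_url": "data:image/png;base64,..."},
    ],
}

# Round-trip through JSON, as the server would when decoding the request.
decoded = json.loads(json.dumps(function_call_output))
```

    Before this release, sending the array form would trip the unmarshal error quoted above; now both the string and array shapes decode cleanly.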

    A quick heads-up for the tinkerers: While this adds great support for text and image content, interleaving (mixing) images and text within these specific arrays is currently limited. Also, keep an eye out—file content support is slated for a future update! 🚀

    🔗 View Release

  • Ollama – v0.20.4-rc2: gemma4: Disable FA on older GPUs where it doesn’t work (#15403)

    Ollama – v0.20.4-rc2 🚀

    Ollama continues to be the essential toolkit for anyone looking to run large language models locally, providing a seamless way to experiment with privacy and speed on your own hardware.

    This release focuses on improving stability for users running the gemma4 model:

    • Flash Attention (FA) Compatibility Fix: To prevent crashes, Flash Attention is now automatically disabled on older GPU hardware.
    • Hardware Awareness: Specifically, if your GPU’s CUDA compute capability is below 7.5, the system will bypass FA, since that hardware lacks the support needed for the gemma4 model.
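
    The gate boils down to a simple capability comparison. A minimal sketch of the idea (not Ollama's Go implementation), assuming the threshold is expressed as a compute-capability (major, minor) pair:

```python
# Hedged sketch of the compatibility gate described above.
MIN_FA_CAPABILITY = (7, 5)  # threshold from the release notes

def flash_attention_supported(major: int, minor: int) -> bool:
    """Return True when the GPU's compute capability meets the FA minimum."""
    return (major, minor) >= MIN_FA_CAPABILITY

# A Pascal-era GPU (6.1) falls back to the non-FA path; Ampere (8.6) keeps FA.
print(flash_attention_supported(6, 1))  # → False
print(flash_attention_supported(8, 6))  # → True
```

    Tuple comparison handles the major/minor ordering for free, which is why capability checks are usually written this way.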

    This is a great win for those of us working with slightly older gear—you can now deploy these cutting-edge models without worrying about unexpected errors or stability issues! 🛠️

    🔗 View Release

  • MLX-LM – v0.31.2

    MLX LM, the package for running LLMs with MLX, just dropped an update! 🚀

    If you’re rocking Apple silicon, this package is your best friend for generating text and fine-tuning large language models using the high-performance MLX framework. It’s incredibly versatile, supporting quantization, distributed inference, and thousands of models directly from the Hugging Face Hub.

    What’s new in mlx-lm v0.31.2:

    This minor release is all about internal stability and refining how the engine handles data during inference:

    • Align batch logits processor token contract: This technical tweak ensures more consistent behavior within the batch logits processor, keeping your token handling predictable and smooth. 🛠️
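
    The “token contract” here is the processor signature: each processor receives the tokens generated so far plus the current logits, and must return logits of the same shape. A framework-free sketch of that contract (mlx-lm’s real processors operate on `mx.array` batches, and this penalty rule is illustrative, not the library’s):

```python
# Illustrative sketch of a logits-processor contract, not mlx-lm's code.
def repetition_penalty(tokens: list[int], logits: list[float],
                       penalty: float = 1.2) -> list[float]:
    """Dampen the logits of token IDs that already appeared in the context."""
    out = list(logits)
    for t in set(tokens):
        # Positive logits shrink, negative logits grow more negative.
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

# The contract: same-length logits in, same-length logits out.
adjusted = repetition_penalty([0, 2], [1.0, 0.5, -0.3])
```

    Keeping that in/out shape consistent across every processor in a batch is exactly the kind of alignment this release tightens up.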

    It might be a small adjustment under the hood, but it’s exactly the kind of refinement that keeps local LLM workflows stable for devs and tinkerers alike!

    🔗 View Release

  • Ollama – v0.20.4-rc1: gemma4: add missing file (#15394)

    Ollama v0.20.4-rc1 is here! 🚀

    If you’ve been trying to run Gemma 4 locally and hitting unexpected errors, this release candidate is exactly what you need to get back up and running smoothly. Ollama remains the premier tool for democratizing LLM access, allowing you to spin up models like Llama 3, DeepSeek-R1, and Mistral directly on your hardware without relying on the cloud.

    What’s new in this release:

    • Full Gemma 4 Support: This update resolves a critical issue by adding a missing file required for seamless Gemma 4 integration.
    • Essential Bug Fix: The patch corrects an accidental omission from a previous pull request (#15378), ensuring that the model files are correctly recognized by the framework.

    This is a quick but vital fix to ensure your local AI environment stays stable and capable of running the latest model architectures. Grab it and start tinkering! 🛠️

    🔗 View Release

  • Ollama – v0.20.4-rc0

    Ollama v0.20.4-rc0 is officially hitting the radar! 🚀

    If you’re looking to run powerful LLMs like Llama 3, DeepSeek-R1, or Mistral locally without the headache, Ollama is the ultimate toolkit. It handles everything from model downloading to providing a REST API for your own custom builds, making local AI experimentation incredibly smooth across macOS, Windows, and Linux.

    This latest Release Candidate (rc0) is all about tightening up the experience and ensuring stability before the full rollout. Here’s what’s under the hood:

    • Path Cleanup: Experimental paths have been scrubbed to provide a much more predictable environment for your local setups.
    • Enhanced Model Management: Fixed bugs within the “create from existing” functionality, making it easier to build and manage custom model variations.

    Since this is an rc0 release, it’s the perfect time for us tinkerers to jump in, test these refinements, and make sure everything plays nice with our local workflows! 🛠️

    🔗 View Release

  • Ollama – v0.20.3: model/parsers: add gemma4 tool call repair (#15374)

    Ollama v0.20.3 is officially live! 🚀

    If you’ve been running large language models locally on your machine, you know that Ollama is the gold standard for making LLMs like Llama 3, DeepSeek-R1, and Mistral accessible without needing a massive cloud setup. This latest update is a huge win for anyone building agentic workflows or using tool-calling capabilities.

    What’s new in this release:

    • Gemma 4 Tool Call Repair: We’ve all seen it—a model makes a tiny syntax mistake while trying to call a function, and the whole process grinds to a halt. This update introduces a specialized “repair” mechanism for Gemma 4. If the initial strict parse fails, Ollama will now attempt to fix common errors on the fly to keep your automation running smoothly.
    • Smart Error Correction: The new repair logic is specifically tuned to catch and fix:
      • Missing string delimiters.
      • Incorrectly used single-quoted values.
      • Raw terminal strings that need proper formatting according to the tool schema.
      • Missing object closing braces (applied after a successful concrete repair).
    • Improved Stability & Testing: To make sure these fixes don’t cause new headaches, this release includes expanded regression coverage and new unit tests specifically for malformed tool calls.
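
    The repair idea can be sketched as a fallback around strict parsing. This is an illustrative approximation, not Ollama’s parser: it handles just two of the cases above (single-quoted values and missing closing braces), and its quote rewrite is far cruder than a real repair pipeline would be.

```python
import json
import re

def parse_tool_call(raw: str) -> dict:
    """Strictly parse a tool call; on failure, attempt common repairs."""
    try:
        return json.loads(raw)  # strict parse first, as the release describes
    except json.JSONDecodeError:
        # Naive repair 1: single-quoted values -> double-quoted.
        repaired = re.sub(r"'([^']*)'", r'"\1"', raw)
        # Naive repair 2: balance any missing closing braces.
        repaired += "}" * (repaired.count("{") - repaired.count("}"))
        return json.loads(repaired)  # still raises if unrepairable

# A malformed call the strict parse would reject: single quotes, no closers.
call = parse_tool_call('{"name": "get_weather", "arguments": {"city": \'Oslo\'')
```

    Well-formed calls never touch the repair path, which mirrors the “strict parse first, repair only on failure” flow in the release notes.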

    This is a massive step forward for reliability when working with cutting-edge models locally. Get updated and keep those agents running! 🛠️

    🔗 View Release

  • Ollama – v0.20.3-rc0: model/parsers: add gemma4 tool call repair (#15374)

    Ollama v0.20.3-rc0 is officially live! 🚀

    If you are running local LLMs, you know that “agentic” workflows depend entirely on how well a model can call tools and functions. Even a tiny syntax error from the model can crash your entire pipeline. This release is a massive quality-of-life update specifically designed to bridge that gap.

    What’s new in this release:

    • Gemma 4 Tool Call Repair: Instead of letting a malformed tool call break your code, Ollama now features a “repair” layer. It uses a candidate pipeline to catch and fix syntax mistakes on the fly.
    • Smart Error Correction: The repair logic is fine-tuned to handle common model hiccups, such as:
      • Missing Gemma string delimiters.
      • Single-quoted string values or dangling delimiters.
      • Raw terminal strings that need proper formatting per the tool schema.
      • Missing object closing braces.
    • Enhanced Stability: This update includes new regression coverage and unit tests to ensure these repair helpers work reliably across various scenarios, preventing old bugs from resurfacing.

    This is a huge win for anyone building autonomous agents or using Gemma 4 for function calling—it makes your local development much more robust and less prone to frustrating crashes! 🛠️

    🔗 View Release