  • Ollama – v0.20.6-rc0: gemma4: update renderer to match new jinja template (#15490)

    New update for the Ollama crew! 🛠️

    If you’ve been running Google’s Gemma 4 models locally, there is a fresh release candidate (v0.20.6-rc0) ready for testing. This update focuses heavily on keeping Ollama in perfect sync with the latest changes from Google to ensure your local inference stays rock solid.

    What’s new in this release:

    • Gemma 4 Template Parity: The renderer has been updated to match Google’s new Jinja template. This ensures that how your prompts are structured matches exactly what the model expects, preventing weird formatting issues.
    • Parser Adjustments: Since the upstream parsing logic changed slightly, the Ollama parser has been tweaked to maintain compatibility and prevent broken inputs during model interaction.
    • Improved Type Handling for Tool-Calling:
    • Added special handling for simple `AnyOf` structures by treating them as type unions (see the sketch after this list).
    • Fixed edge cases specifically around type unions to make tool-calling much more robust and reliable.
    • Better Tool Result Logic:
    • The parser now prefers “empty” over “None” for certain results, which is crucial for handling legitimate empty tool calls correctly without breaking the downstream logic.
    • Added extra care when processing tool results that might have missing IDs to prevent errors in complex workflows.
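
    To make the `AnyOf` change a bit more concrete, here is a minimal Python sketch of the idea: collapsing a simple JSON Schema `anyOf` into a flat union of types when describing a tool parameter. This is an illustration only (Ollama’s actual parser is written in Go), and the schema below is made up for the example.

    ```python
    # Illustration only: Ollama's real parser is Go; this just shows the idea
    # of treating a simple `anyOf` as a union of primitive types.

    def resolve_property_types(prop: dict) -> list[str]:
        """Return the primitive types a tool parameter accepts."""
        any_of = prop.get("anyOf")
        if any_of and all(set(entry) == {"type"} for entry in any_of):
            # Every branch is just {"type": ...}, so collapse the anyOf
            # into a flat type union instead of a nested schema.
            return [entry["type"] for entry in any_of]
        return [prop.get("type", "string")]

    # Hypothetical tool schema where "location" may be a string or an integer.
    properties = {
        "location": {"anyOf": [{"type": "string"}, {"type": "integer"}]},
        "units": {"type": "string"},
    }

    for name, prop in properties.items():
        print(name, "->", resolve_property_types(prop))
    # location -> ['string', 'integer']
    # units -> ['string']
    ```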

    It’s a great update for anyone doing heavy lifting with tool-calling and complex prompt engineering on Gemma 4! 🚀

    🔗 View Release

  • Perplexica – v1.12.2

    🚀 Perplexica Update: v1.12.2 is Live!

    The open-source alternative to Perplexity AI just leveled up! If you’ve been looking for a way to run a powerful search engine with cited sources on top of local LLMs like Llama, Mistral, or DeepSeek, this update is a massive win for the dev community.

    What’s New in v1.12.2:

    • 🧠 Enhanced Deep Research Mode: The research pipeline has been transformed into an aggressive, iterative “Reason-Search-Scrape-Extract-Repeat” loop. It doesn’t just look at links; it actively hunts for top-tier content to ensure much deeper insights.
    • 📦 Dynamic Context Management: To prevent the system from choking on massive amounts of scraped data, information is now processed in optimized, dynamic chunks—keeping your context window clean and efficient (see the sketch after this list).
    • 🎯 Smart Result Filtering: New embedding integrations allow the engine to filter search results effectively, boosting relevance while preventing context overflow.
    • 🌐 Improved Web Scraping: A new Chromium-based scraper has been implemented, making it much more reliable when navigating modern, complex web pages.
    • Optimized Search Execution: The `executeSearch` function has been completely rebuilt for better performance and speed.
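
    For a feel of what “optimized, dynamic chunks” can mean in practice, here is a rough Python sketch of packing scraped text into chunks sized to a token budget. It illustrates the general technique only; Perplexica itself is a TypeScript project, and its real chunking logic will differ.

    ```python
    # Illustration of budget-based chunking; not Perplexica's actual code.

    def chunk_text(text: str, max_tokens: int = 512, chars_per_token: int = 4) -> list[str]:
        """Split text into chunks that roughly fit a token budget."""
        max_chars = max_tokens * chars_per_token  # crude token estimate
        chunks: list[str] = []
        current: list[str] = []
        length = 0
        for paragraph in text.split("\n\n"):
            if current and length + len(paragraph) > max_chars:
                chunks.append("\n\n".join(current))
                current, length = [], 0
            current.append(paragraph)
            length += len(paragraph)
        if current:
            chunks.append("\n\n".join(current))
        return chunks

    scraped = "Intro section of a scraped page.\n\nA much longer body section...\n\nClosing notes."
    for i, chunk in enumerate(chunk_text(scraped, max_tokens=64)):
        print(f"chunk {i}: {len(chunk)} chars")
    ```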

    Stability & Bug Fixes:

    • 🛡️ Pipeline Resilience: Individual errors in widgets no longer crash your entire research pipeline (see the sketch after this list).
    • ⏱️ Search Reliability: Added validation and timeouts to prevent hung search requests.
    • 📍 Backend Accuracy: Integrated `serverUtils` and updated the reverse geolocation API for much higher accuracy during location-based tasks.
    • 🛠️ Workflow Stability: Improved error handling for file uploads and resolved several build-time dependency errors.
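
    The resilience and timeout fixes come down to a familiar pattern: run each widget or search step in isolation, with its own timeout, so one failure or hang cannot take down the whole run. A generic Python sketch of that pattern (not Perplexica’s actual code):

    ```python
    import concurrent.futures

    def run_widget(name: str) -> str:
        """Stand-in for a single widget or search task."""
        if name == "weather":
            raise RuntimeError("upstream API down")
        return f"{name}: ok"

    def run_pipeline(widgets: list[str], timeout_s: float = 10.0) -> dict[str, str]:
        """Each widget runs independently; one failure or hang doesn't kill the rest."""
        results: dict[str, str] = {}
        with concurrent.futures.ThreadPoolExecutor() as pool:
            futures = {pool.submit(run_widget, w): w for w in widgets}
            for future, name in futures.items():
                try:
                    results[name] = future.result(timeout=timeout_s)
                except Exception as err:  # isolate the failure and keep going
                    results[name] = f"skipped ({err})"
        return results

    print(run_pipeline(["news", "weather", "stocks"]))
    ```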

    Time to pull that Docker image and test out the new Deep Research capabilities! 🛠️✨

    🔗 View Release

  • Ollama – v0.20.5

    Ollama just dropped a fresh update! 🚀

    If you haven’t been playing with it yet, Ollama is an incredible tool for running large language models (LLMs) locally on your machine. It simplifies the entire process of downloading, managing, and interacting with powerful models like Llama 3, DeepSeek-R1, or Mistral without needing a heavy cloud setup.

    What’s new in v0.20.5:

    • Channel Update: This release includes specific updates to the `openclaw` channel messaging system.
    • Refined Communication: The update focuses on improving how certain messages are handled within that specific channel, ensuring smoother interactions during model usage.

    It’s a small but precise tweak to keep things running smoothly for all you local-LLM enthusiasts! 🛠️

    🔗 View Release

  • Ollama – v0.20.5-rc2

    Ollama just dropped a new release candidate, v0.20.5-rc2! 🚀

    If you aren’t running Ollama yet, you are missing out on the gold standard for local LLM orchestration. It handles all the heavy lifting—downloading, configuring, and running models like Llama 3, DeepSeek-R1, and Mistral—so you can experiment with massive power without a cloud budget.

    What’s new in this release:

    • Openclaw Integration Update: This specific release candidate includes an update to the Openclaw channel message handling (#15465). 🛠️

    While this is a targeted fix, keeping an eye on these RC (Release Candidate) builds is essential for us tinkerers to catch any stability hiccups before the official rollout. Keep those local environments primed and ready! 🤖✨

    🔗 View Release

  • Ollama – v0.20.5-rc1

    New update alert for Ollama! 🚨

    If you’re running local LLMs like Llama 3, DeepSeek-R1, or Mistral, there’s a fresh release candidate out: v0.20.5-rc1. This update focuses on making your local development workflow much smoother by reducing guesswork when things go wrong.

    What’s new in this release:

    • Improved Error Feedback: The team has introduced a “re-run hint” specifically for dependency errors. 🛠️
    • Smarter Troubleshooting: If a model pull or setup fails because of a missing dependency, the error message will now explicitly nudge you to try running the command again.
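
    The pattern itself is simple: detect the dependency failure and attach a retry hint to the message the user actually sees. A tiny, hypothetical Python sketch of the idea (Ollama itself is written in Go, and the error text here is made up):

    ```python
    class DependencyError(RuntimeError):
        """Stand-in for a failure caused by a missing or still-installing dependency."""

    def pull_model(name: str) -> None:
        # Pretend a required runtime component isn't available yet.
        raise DependencyError(f"missing runtime dependency while pulling {name}")

    try:
        pull_model("gemma4")
    except DependencyError as err:
        # The fix boils down to surfacing a hint like this alongside the error.
        print(f"Error: {err}. If the dependency has just finished installing, try re-running the command.")
    ```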

    It’s a small but super practical tweak designed to keep your momentum going and help you bypass those annoying setup hiccups without having to dig through logs! 🚀

    🔗 View Release

  • Ollama – v0.20.5-rc0

    New update alert for Ollama! 🚨

    If you’re running local LLMs like Llama 3, DeepSeek-R1, or Mistral, a new release candidate (v0.20.5-rc0) just dropped to help make your debugging sessions much less headache-inducing.

    What’s new in this release:

    • Improved Error Messaging: This update specifically fixes how the system handles unknown input item types in responses.
    • Better Debugging: Instead of encountering vague errors that leave you guessing, you’ll now receive much clearer feedback when an unexpected input type is encountered.
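
    To picture what “clearer feedback” looks like, imagine a Responses-style request whose `input` list carries typed items. A hypothetical validator (not Ollama’s actual Go code, and the set of known types below is an assumption for this example) can name the offending index and type instead of failing vaguely:

    ```python
    # Hypothetical validator illustrating descriptive errors for unknown input
    # item types; KNOWN_ITEM_TYPES is an assumption made for this example.

    KNOWN_ITEM_TYPES = {"message", "function_call", "function_call_output"}

    def validate_input_items(items: list[dict]) -> None:
        for i, item in enumerate(items):
            item_type = item.get("type", "message")
            if item_type not in KNOWN_ITEM_TYPES:
                raise ValueError(
                    f"input[{i}]: unknown item type {item_type!r}; "
                    f"expected one of {sorted(KNOWN_ITEM_TYPES)}"
                )

    try:
        validate_input_items([
            {"type": "message", "role": "user", "content": "hi"},
            {"type": "reasoning_trace", "content": "..."},  # unknown type
        ])
    except ValueError as err:
        print(err)  # names the index and the unexpected type
    ```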

    This is a great little tweak for anyone building custom pipelines or experimenting with complex prompts where input types might shift. It makes tracking down configuration errors way smoother! 🛠️

    Keep tinkering! 🚀

    🔗 View Release

  • Lemonade – v10.2.0

    Lemonade just dropped a massive v10.2.0 update! 🍋 If you’ve been looking for a high-performance way to run LLMs locally using your GPU or NPU (like AMD Ryzen AI), this toolkit is essential for maximizing your hardware.

    What’s New in v10.2.0:

    • Expanded Integrations: New dedicated user guides for Claude Code integration, plus support for OpenCode via launchpad. 🚀
    • Developer-Friendly Artifacts: New embeddable release artifacts are now available for both Ubuntu and Windows—perfect if you’re bundling Lemonade into your own custom installers.
    • Enhanced Resource Management: The system tray now features a “Max Loaded Models” submenu, making it much easier to manage your VRAM and system resources on the fly.
    • Improved Model Handling & Vision Support:
    • Complete overhaul of the multi-checkpoint storage strategy.
    • Added support for Qwen Image with extra `sd-cpp` parameters.
    • Fixed device reporting bugs in `llamacpp` and `whispercpp` stats.
    • Smart Automation: Introduced a “Smart CLI pull” command to streamline fetching models, along with auto-detection for GPU backends during ESRGAN upscaling. 🛠️

    Platform Availability:

    This release is ready to go across almost every major environment:

    • Windows: `.msi` installers (Server + App or Server Only). Note: Windows installers are now officially signed! ✅
    • Linux: Ubuntu 24.04+ (via Launchpad PPA or AppImage), Fedora 43 (`.rpm`), and Docker/Snap/Arch options.
    • macOS: Beta support is available via `.pkg`.

    🔗 View Release

  • Ollama – v0.20.4: responses: add support for fn call output arrays (#15406)

    Ollama Update: v0.20.4 is here! 🛠️

    If you’re running LLMs locally, listen up! The latest release of Ollama focuses on making function calls much more reliable and expanding how the engine handles complex data structures. No more unexpected crashes when your model tries to return a list instead of a single string.

    What’s new in v0.20.4:

    • Enhanced Function Call Support: The engine now supports arrays for function call outputs. Previously, it only handled strings, which caused errors when the output contained multiple pieces of data.
    • Multi-Content Array Support: Beyond just text, the update adds support for arrays containing both image and text content, making response handling much more robust for complex, multi-modal tasks (see the sketch below). 🖼️
    • Critical Bug Fix: Specifically resolves the `json: cannot unmarshal array into Go struct field` error that was breaking workflows whenever models returned structured list data.

    A quick heads-up for the tinkerers: While this adds great support for text and image content, interleaving (mixing) images and text within these specific arrays is currently limited. Also, keep an eye out—file content support is slated for a future update! 🚀
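
    In practice, this means a function call output item can now carry a list of content parts instead of a single string. Here is a hedged sketch of what such a payload might look like (field names follow the OpenAI Responses-style convention and may not match Ollama’s exact schema):

    ```python
    import json

    # Sketch of a function-call output carrying an array of content parts
    # (text plus an image) rather than one plain string. Field names follow
    # the OpenAI Responses-style convention; check Ollama's docs for the
    # exact shape it expects.
    tool_output_item = {
        "type": "function_call_output",
        "call_id": "call_123",
        "output": [
            {"type": "output_text", "text": "Here is the chart you asked for."},
            {"type": "input_image", "image_url": "data:image/png;base64,<...>"},
        ],
    }

    print(json.dumps(tool_output_item, indent=2))
    ```

    Under the old behaviour, `output` had to be a plain string, which is exactly what triggered the unmarshal error mentioned above whenever structured list data came back.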

    🔗 View Release

  • Ollama – v0.20.4-rc2: gemma4: Disable FA on older GPUs where it doesn’t work (#15403)

    Ollama – v0.20.4-rc2 🚀

    Ollama continues to be the essential toolkit for anyone looking to run large language models locally, providing a seamless way to experiment with privacy and speed on your own hardware.

    This release focuses on improving stability for users running the gemma4 model:

    • Flash Attention (FA) Compatibility Fix: To prevent crashes, Flash Attention is now automatically disabled on older GPU hardware.
    • Hardware Awareness: Specifically, if your GPU’s CUDA compute capability is below 7.5, the system will bypass FA, since that hardware lacks the support the gemma4 model needs.
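
    Conceptually, the gate is just a compute-capability check before enabling Flash Attention. Here is an illustrative Python sketch of that decision (the real check lives in Ollama’s Go scheduler and queries the GPU directly):

    ```python
    # Illustrative only: Flash Attention here requires CUDA compute capability
    # 7.5 (Turing) or newer, so older GPUs fall back to regular attention.

    def flash_attention_supported(compute_capability: tuple[int, int]) -> bool:
        return compute_capability >= (7, 5)

    for cc in [(6, 1), (7, 0), (7, 5), (8, 6)]:
        mode = "FA enabled" if flash_attention_supported(cc) else "FA disabled (fallback attention)"
        print(f"compute capability {cc[0]}.{cc[1]}: {mode}")
    ```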

    This is a great win for those of us working with slightly older gear—you can now deploy these cutting-edge models without worrying about unexpected errors or stability issues! 🛠️

    🔗 View Release

  • MLX-LM – v0.31.2

    MLX LM: Run LLMs with MLX just dropped an update! 🚀

    If you’re rocking Apple silicon, this package is your best friend for generating text and fine-tuning large language models using the high-performance MLX framework. It’s incredibly versatile, supporting quantization, distributed inference, and thousands of models directly from the Hugging Face Hub.

    What’s new in mlx-lm v0.31.2:

    This minor release is all about internal stability and refining how the engine handles data during inference:

    • Align batch logits processor token contract: This technical tweak ensures more consistent behavior within the batch logits processor, keeping your token handling predictable and smooth. 🛠️
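
    If you write custom logits processors for `mlx-lm`, the contract in question is a callable that receives the tokens generated so far plus the current logits and returns adjusted logits. A hedged sketch of that shape (the model name is just an example, and the exact signature may vary between mlx-lm versions, so check the docs for yours):

    ```python
    import mlx.core as mx
    from mlx_lm import load, generate

    def ban_token(banned_id: int):
        """Build a logits processor that masks out a single token id."""
        def processor(tokens: mx.array, logits: mx.array) -> mx.array:
            # `tokens`: sequence generated so far; `logits`: scores for the next token.
            mask = mx.arange(logits.shape[-1]) == banned_id
            return mx.where(mask, float("-inf"), logits)
        return processor

    # Example model from the Hugging Face Hub; any mlx-community model works.
    model, tokenizer = load("mlx-community/Llama-3.2-1B-Instruct-4bit")
    text = generate(
        model,
        tokenizer,
        prompt="Write a haiku about the ocean.",
        max_tokens=64,
        logits_processors=[ban_token(tokenizer.eos_token_id)],
    )
    print(text)
    ```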

    It might be a small adjustment under the hood, but it’s exactly the kind of refinement that keeps local LLM workflows stable for devs and tinkerers alike!

    🔗 View Release