• Ollama – v0.22.1-rc0: New models (#15861)

    Ollama just dropped a fresh release candidate (v0.22.1-rc0), and it’s packed with some heavy-hitting model updates and precision improvements! If you’re running local LLMs, this one is definitely worth a look for better quantization and smarter logging. 🛠️

    Here’s the lowdown on what’s new:

    • New Model Support: The team has added support for the Laguna models (via both `mlx` and `ggml`) and implemented support for Nemotron 3 Nano Omni.
    • FP8 Precision Upgrades: A big win for efficiency! Ollama can now import FP8 safetensors. It intelligently handles decoding HF F8_E4M3 weights and uses source-precision metadata to decide the best quantization path (like defaulting FP8-sourced GGUFs to Q8_0). This means better quality when compressing models.
    • Improved Logprobs: The server now preserves `logprobs` during generation, even when using built-in parsers. Previously, logprob-only chunks could get dropped if the parser was buffering content; now, that data stays intact for much more accurate probability tracking. 📈
    • Poolside Integration: Added integration and updated documentation for Poolside, expanding your local ecosystem options.
    • Performance & Fixes: Includes various performance improvements made in response to review feedback, updates to the cache setup, and several bug fixes to keep things running smoothly.
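    To make the FP8 bullet concrete, here is a minimal sketch of how a single HF F8_E4M3 byte decodes to a float. This is not Ollama's actual importer code, just the standard OCP FP8 E4M3 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities, all-ones pattern reserved for NaN):

```python
import math

def decode_f8_e4m3(byte: int) -> float:
    """Decode one F8_E4M3 byte (1 sign, 4 exponent, 3 mantissa bits, bias 7).

    Follows the OCP FP8 E4M3 convention: there are no infinities, and the
    all-ones exponent+mantissa pattern (S.1111.111) encodes NaN.
    """
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F
    mant = byte & 0x07
    if exp == 0x0F and mant == 0x07:      # S.1111.111 -> NaN
        return math.nan
    if exp == 0:                          # subnormal: 2^-6 * (mant/8)
        return sign * (mant / 8.0) * 2.0 ** -6
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - 7)

# 0x38 = 0|0111|000 -> +2^0 * 1.0 = 1.0; 0x7E -> largest normal value, 448.0
```

    Because E4M3's dynamic range tops out at 448, weights sourced from FP8 carry very little extra precision, which is why defaulting FP8-sourced GGUFs to Q8_0 (8-bit) loses essentially nothing.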

    Time to pull that new image and test out those FP8 weights! 🚀

    🔗 View Release

  • Ollama – v0.22.0

    Ollama v0.22.0 is officially here! 🛠️

    If you’ve been looking for a way to run heavy-hitting LLMs like Llama 3, DeepSeek-R1, or Mistral directly on your own hardware without the cloud latency, Ollama remains the gold standard for local execution. It handles all the heavy lifting of model management and provides a slick REST API for your custom dev projects.

    What’s new in this update:

    • Enhanced Model Support: The library continues to expand, making it even easier to pull and run the latest open-source weights with zero configuration.
    • Performance Optimizations: This release includes under-the-hood tweaks to the inference engine to ensure smoother token generation on both macOS and Linux.
    • Improved CLI Workflow: Smoother management for downloading and switching between different model versions via the command line.

    Whether you’re building a local RAG pipeline or just want a private chatbot that works offline, this update keeps your local ecosystem running at peak performance! 🚀

    🔗 View Release

  • Ollama – v0.22.0-rc1

    🚀 Ollama Update Alert! 🚀

    If you’re running your local LLMs on Apple Silicon, listen up! The latest release (v0.22.0-rc1) is officially here, and it’s bringing some massive performance optimizations via an MLX update. This is a huge deal for anyone trying to squeeze every bit of juice out of their Mac hardware.

    Here’s the breakdown of what’s new:

    • Batch Processing Power: The `mlxrunner` now supports batching the sampler across multiple sequences. If you’re working with large datasets or need to generate multiple outputs at once, this is a massive efficiency win! 📈
    • NVIDIA & MLX Bridge: In a super cool move for cross-platform workflows, MLX now supports importing models optimized via NVIDIA TensorRT. This makes it way easier to move your heavy-duty workflows between NVIDIA and Apple hardware without the headache.
    • Precision Tokenization: A bug fix for multi-regex BPE offset handling is included, ensuring your tokenization stays precise and error-free during complex text processing tasks.
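    To illustrate what "batching the sampler across multiple sequences" buys you, here is a hedged, dependency-free sketch (not the actual `mlxrunner` code): a single call takes one row of logits per active sequence and returns one sampled token id per sequence, instead of invoking the sampler once per sequence:

```python
import math
import random

def sample_batch(logits_batch, temperature=1.0, rng=None):
    """Sample one next-token id per sequence from a batch of logit rows.

    One call covers every active sequence -- the essence of a batched
    sampler. temperature=0 means greedy (arg-max) decoding.
    """
    rng = rng or random.Random(0)
    out = []
    for row in logits_batch:
        if temperature == 0:
            # Greedy: pick the index of the largest logit.
            out.append(max(range(len(row)), key=row.__getitem__))
            continue
        # Softmax with temperature (max-subtracted for numerical stability).
        scaled = [l / temperature for l in row]
        m = max(scaled)
        probs = [math.exp(l - m) for l in scaled]
        total = sum(probs)
        # Inverse-CDF sampling over the unnormalized probabilities.
        r = rng.random() * total
        acc = 0.0
        for i, p in enumerate(probs):
            acc += p
            if acc >= r:
                out.append(i)
                break
    return out
```

    In a real runner the inner arithmetic would be one vectorized MLX operation over a `[batch, vocab]` array, which is where the efficiency win comes from.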

    Time to pull that update and start benchmarking! 🛠️

    🔗 View Release

  • Lemonade – v10.3.0: Refine collection image reply behavior (#1726)

    🍋 Lemonade SDK v10.3.0 is officially live!

    If you’ve been looking for a way to run high-performance LLMs locally without relying on the cloud, Lemonade is your new best friend. It’s a powerhouse toolkit designed to squeeze every bit of performance out of your hardware by leveraging NPUs (like AMD Ryzen AI) and GPUs via Vulkan support. Whether you’re using GGUF or ONNX models, it provides an OpenAI-compatible API endpoint so you can swap cloud services for local privacy in a snap.
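    Because Lemonade exposes an OpenAI-compatible endpoint, swapping a cloud call for a local one is mostly a matter of pointing the request at your own machine. A minimal stdlib-only sketch follows; the base URL/port and model name are assumptions here, so check your Lemonade server's configuration for the actual values (the `/v1/chat/completions` path is the standard OpenAI-compatible route):

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # assumed default; check your server config

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a standard OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(model: str, prompt: str) -> str:
    """POST to the OpenAI-compatible endpoint and return the reply text."""
    req = request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

    Any OpenAI client library works the same way: set its base URL to the local server and everything else stays unchanged.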

    The latest update focuses on polishing the collection management experience, specifically making image handling within replies much more predictable and stable. 🛠️

    What’s new in v10.3.0:

    • Refined Image Behavior: No more jumping layouts! Image replies are now properly anchored, and rendering is much more consistent across collections.
    • Smoother Navigation: The logic for scrolling and sizing collection images has been streamlined, making the UI feel much more fluid.
    • Single Source of Truth: To prevent “configuration drift,” the fixed image size (512×256) is now centralized in a single TypeScript constant (`collectionImageConfig.ts`). This keeps your CSS, runtime properties, and build-time tools perfectly synced.
    • Stability Boosts: Includes important bug fixes for the backend manager layout and general system stability improvements.

    Perfect for anyone building custom local AI interfaces or managing large model collections! 🚀

    🔗 View Release

  • ComfyUI – v0.20.1

    ComfyUI v0.20.1 🎨

    If you live for node-based generative workflows, you know ComfyUI is the gold standard for granular control over Stable Diffusion and other models. It’s the ultimate playground for anyone who loves connecting nodes to build complex, custom AI image pipelines!

    This specific update is a quick patch! The developer pushed v0.20.1 primarily to fix a release hiccup caused by GitHub technical issues. 🛠️

    • Stability Fix: This tiny increment ensures your installation stays stable and correctly tagged following recent GitHub technical glitches.
    • Reliable Deployment: While there aren’t massive new features in this specific patch, it’s a crucial “under the hood” fix to keep your workflows running smoothly without versioning headaches.

    🔗 View Release

  • ComfyUI – v0.20.0

    ComfyUI v0.20.0 is officially here! 🚀

    If you haven’t dived into ComfyUI yet, it is the ultimate node-based powerhouse for Stable Diffusion. It lets you build complex, professional-grade image and video generation pipelines by simply connecting nodes—no heavy coding required. Whether you’re upscaling, inpainting, or experimenting with ControlNet, this tool gives you total granular control over your creative workflow.

    This major version bump to v0.20.0 marks a massive milestone for the Comfy-Org ecosystem! While we’re keeping an eye on the fine print, a jump this significant typically brings:

    • Core Engine Optimizations: Refined data processing between nodes to squeeze out even more speed during the sampling process.
    • Enhanced Node Compatibility: Smoother integration for custom node suites and updated support for the latest heavy-hitters like SDXL and newer Flux architectures.
    • Workflow Stability: Critical fixes aimed at reducing memory leaks, making those massive, multi-step generation sessions much more reliable.

    Pro-tip for the tinkerers: Whenever you see a major version jump like this, it’s time to fire up your custom node managers! Run a quick update on all your extensions to ensure everything stays compatible with the new core engine. 🛠️

    🔗 View Release

  • Ollama – v0.21.3-rc0

    Ollama just dropped a new release candidate, v0.21.3-rc0, and it’s bringing some serious brainpower to your local LLM workflows! 🛠️

    If you aren’t using Ollama yet, it is the ultimate toolkit for running powerful models like Llama 3, DeepSeek-R1, and Mistral directly on your own hardware. It handles all the heavy lifting of downloading and configuring models so you can focus on building.

    Here’s what’s new in this RC update:

    • Reasoning Effort Support: This is a game-changer for anyone playing with chain-of-thought models! The update maps “reasoning effort” to the “think” parameter, giving you much finer control over how much computational “thinking” time a model spends on a prompt. 🧠
    • OpenAI Compatibility Tweaks: The release includes specific updates to better handle OpenAI-style map responses, making it even smoother to swap between APIs without breaking your integration.
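    The release notes don't spell out the exact translation, so treat this as a hypothetical sketch of what mapping "reasoning effort" onto a "think" parameter could look like on an OpenAI-style request. The accepted effort levels are an assumption; consult the docs for what your model actually supports:

```python
# Hypothetical mapping -- the exact levels a given model accepts
# are an assumption, not confirmed by the release notes.
_EFFORT_TO_THINK = {"low": "low", "medium": "medium", "high": "high"}

def map_reasoning_effort(openai_request: dict) -> dict:
    """Return a copy of the request with reasoning_effort mapped to think."""
    req = dict(openai_request)
    effort = req.pop("reasoning_effort", None)
    if effort is not None:
        if effort not in _EFFORT_TO_THINK:
            raise ValueError(f"unknown reasoning_effort: {effort!r}")
        req["think"] = _EFFORT_TO_THINK[effort]
    return req
```

    The point of such a shim is that existing OpenAI-style clients can request more or less chain-of-thought without knowing anything about the local server's native parameter names.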

    If you’re tinkering with reasoning-heavy models and want to ensure your API calls are handling thought tokens perfectly, grab this RC and give it a spin! ✨

    🔗 View Release

  • Tater – Tater v74

    🥔 Tater v74 — “Who Goes There?” is here! 📡🗣️

    Get ready to take your local AI stack completely off-grid! Tater just leveled up from a standard local assistant into a decentralized, identity-aware powerhouse. If you love privacy and autonomy, this update is a massive win for your local setup.

    Meshtastic Portal — Off-Grid Communication

    Tater can now whisper over radio waves using Meshtastic! You can now send and receive messages across a mesh network without needing any internet or cell towers.

    • Fully Local & Encrypted: Your data stays strictly within your nodes.
    • Remote Alerts: Send notifications across your mesh network even in total dead zones.
    • Decentralized Power: Deploy tiny, independent Tater networks that operate entirely via radio waves.

    Speaker ID — Voiceprint Recognition

    Tater just got a lot more observant! We’ve added local voiceprint enrollment so the system can identify specific users directly on your hardware.

    • Identity-Aware Control: Tater recognizes you specifically before any other processes even kick in.

    • Privacy-First: Everything happens locally on your machine—zero cloud processing involved.
    • Seamless Interaction: If an unrecognized voice speaks, Tater simply continues its routine without interruption.
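    Tater's actual pipeline isn't shown in the release notes, but local speaker ID typically boils down to comparing an utterance embedding against enrolled voiceprints by cosine similarity with a match threshold. A minimal sketch under that assumption (embedding extraction itself is out of scope here):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def identify_speaker(embedding, enrolled, threshold=0.75):
    """Return the best-matching enrolled name, or None if below threshold.

    `enrolled` maps speaker names to their stored voiceprint vectors.
    Returning None is the "unrecognized voice" case: the assistant just
    carries on without interruption.
    """
    best_name, best_score = None, threshold
    for name, ref in enrolled.items():
        score = cosine(embedding, ref)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

    The threshold value is the key privacy/usability knob: raise it to reduce false accepts, lower it to tolerate noisier microphones.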

    This release moves Tater beyond simple automation and into the realm of identity-aware, off-grid communication. Whether you’re building a private mesh or just want your AI to know exactly who is talking to it, v74 has you covered! 🧠✨

    🔗 View Release

  • Ollama – v0.21.2

    Ollama v0.21.2 is officially live! 🚀

    If you’re looking to run heavy-hitting LLMs like Llama 3, DeepSeek-R1, or Mistral directly on your hardware without relying on the cloud, Ollama is your best friend. It turns the complex process of managing local models into a seamless, one-command experience across macOS, Windows, and Linux.

    This latest patch focuses on polishing the user experience and tightening up the engine:

    • Smoother Onboarding: The OpenClaw onboarding flow has been hardened, making that first-time setup much more robust and less prone to hiccups. 🛠️
    • Enhanced Stability: This update includes critical refinements to the underlying launch processes, ensuring your local instances spin up reliably every single time.

    Perfect for those of us building local RAG pipelines or just experimenting with privacy-first AI! 🥔✨

    🔗 View Release

  • Ollama – v0.21.2-rc1

    Ollama just dropped a new release candidate, v0.21.2-rc1, and it’s all about smoothing out that initial setup! 🛠️

    If you’re looking to run heavyweights like Llama 3, DeepSeek-R1, or Mistral locally without the headache of manual configuration, this is your go-to tool. It handles all the heavy lifting for model weights and parameters so you can jump straight into prompting.

    What’s new in this release:

    • Hardened OpenClaw Onboarding: The big win here is a much more robust and “hardened” onboarding flow for OpenClaw. 🚀

    The team is clearly focusing on polishing the first-run experience, making sure that even complex local setups are reliable and hiccup-free from the very first click. Perfect for anyone looking to experiment with local LLMs without fighting the installation process!

    🔗 View Release