Category: AI

AI Releases

  • Ollama – v0.13.0-rc0

    🚀 Ollama v0.13.0-rc0 just dropped — and it’s packed with power!

    Say hello to DeepSeek-V3.1 (which runs on the Deepseek2 architecture) — one of the most capable open LLMs out there, now available with a simple `ollama pull deepseek-ai/deepseek-v3.1`.

    Why it’s awesome:

    • 🚀 MLA (Multi-head Latent Attention) is live — cuts KV-cache memory, speeds up inference, and keeps reasoning sharp.
    • 🛠️ New engine under the hood = smoother runs, fewer crashes, better future-proofing.
    • 💥 Run state-of-the-art reasoning on your laptop — no cloud needed.

    GGUF? Still supported. API? Still there. CLI? Even better.

    This isn’t just an update — it’s your ticket to running top-tier models locally, faster than ever.

    Go grab it:

    `ollama pull deepseek-ai/deepseek-v3.1`
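
    Prefer to call it from code? Here’s a minimal sketch using the official `ollama` Python client (assumes `pip install ollama`, a running Ollama instance, and the model tag from the pull command above; adjust the tag if your local one differs):

    ```python
    import ollama

    # Chat with the locally pulled DeepSeek-V3.1 model.
    # The tag mirrors the pull command above; change it if your local tag differs.
    response = ollama.chat(
        model="deepseek-ai/deepseek-v3.1",
        messages=[
            {"role": "user", "content": "Explain Multi-head Latent Attention in two sentences."},
        ],
    )
    print(response["message"]["content"])
    ```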

    #LocalAI #DeepSeek #Ollama #LLM

    🔗 View Release

  • Heretic – v1.0.1

    Heretic v1.0.1 is live 🎉 — the first public release of the fully automated LLM censorship remover is here, and it’s wilder than you thought.

    No more manual tuning. No labeled data. Just run `heretic Qwen/Qwen3-4B-Instruct-2507` and watch it surgically erase refusal behavior using directional ablation — the model keeps its brains, it just loses the reflex to say no.
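
    For the curious: directional ablation works by finding a “refusal direction” in the model’s activation space and projecting it out of the weights, so the model can no longer express it. Heretic automates the whole pipeline; the snippet below is only a rough, illustrative PyTorch sketch of the core projection step (not Heretic’s actual code; names and shapes are made up):

    ```python
    import torch

    def ablate_direction(weight: torch.Tensor, refusal_dir: torch.Tensor) -> torch.Tensor:
        """Remove a linear layer's output component along `refusal_dir`.

        weight:      (out_features, in_features) matrix of a linear layer
        refusal_dir: (out_features,) vector, e.g. the mean activation difference
                     between prompts the model refuses and prompts it answers
        """
        r = refusal_dir / refusal_dir.norm()        # unit-length refusal direction
        # W' = (I - r r^T) W, so every output of the layer is orthogonal to r
        return weight - torch.outer(r, r @ weight)

    # Toy usage on a single random "layer"
    W = torch.randn(4096, 4096)
    r = torch.randn(4096)
    W_ablated = ablate_direction(W, r)
    print(((r / r.norm()) @ W_ablated).abs().max())  # ~0: nothing left along the refusal direction
    ```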

    🔥 What’s new in v1.0.1?

    • First stable release: Beta’s over — this is the real deal.
    • 🚀 8B model decensoring in ~45 mins on RTX 3090 — fast, lean, and mean.
    • 🧪 Improved KL divergence control: More original intelligence preserved post-ablation.
    • 💾 Save or push to Hugging Face with one command — no PhD needed.
    • 🛠️ Better MoE support: Now handles Qwen-MoE and Llama-MoE with fewer hiccups.
    • 📊 Enhanced eval suite: Auto-benchmarks refusal rates + output quality in one shot.

    Built with PyTorch 2.2+, AGPL-3.0 licensed, and ready to break the safety chains.

    Go run it. Then ask: “Why did we ever accept this?” 💥

    🔗 View Release

  • Chatterbox – v0.1.2

    Chatterbox v0.1.2 just dropped—and it’s a game-changer for TTS tinkerers 🎙️

    M1/M2 Macs rejoice: Native support via MPS—no more Rosetta slowdowns.

    🔊 Safetensors everywhere: Faster, safer model loads + new WAV examples to play with.

    🛠️ CFG scaling optional: Dial realism or creativity like a knob—perfect for voice acting or AI bots.

    🐛 CUDA errors? Gone. GPU runs smoother than ever.

    🎮 Min_P sampler added for finer audio control—less robotic, more human.

    📚 Docs now crystal clear on OS/Python deps + watermarking (PerTh) best practices.

    📣 New Discord link fixed & live—join to share voice clones, memes, and cat meows 🐱🔊

    🌟 7 fresh contributors brought the heat—thank you!

    Install with `pip install chatterbox-tts` and start cloning voices (or your pet’s purr) in seconds.
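
    A minimal sketch of what that looks like in code, based on the `ChatterboxTTS` API from the project README (the `mps` device and the `cfg_weight` knob tie into the items above; double-check the README if the signature has shifted):

    ```python
    import torchaudio as ta
    from chatterbox.tts import ChatterboxTTS

    # "mps" on Apple Silicon, "cuda" on NVIDIA, "cpu" otherwise
    model = ChatterboxTTS.from_pretrained(device="mps")

    wav = model.generate(
        "Chatterbox v0.1.2 is out, and it sounds better than ever.",
        audio_prompt_path="reference_voice.wav",  # optional: clone a voice from a short sample
        cfg_weight=0.5,                           # the CFG knob mentioned above
    )
    ta.save("output.wav", wav, model.sr)
    ```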

    Full changelog: https://github.com/resemble-ai/chatterbox/commits/v0.1.2

    🔗 View Release

  • ComfyUI – v0.3.70

    ComfyUI v0.3.70 just landed — and it’s the quiet hero your workflows have been waiting for 🚀

    • Memory got smarter — Fewer crashes on big SDXL or 4K renders. Keep those long pipelines running without hitting OOM hell.
    • Nodes won’t kill your whole graph — A single failed node? No problem. The rest of your canvas keeps humming along.
    • UI tweaks that matter — Smoother panning, fixed tooltip glitches, cleaner labels. Tiny changes, big comfort.
    • PyTorch & CUDA updates — Linux users, rejoice: better compatibility under the hood.

    Pro tip: if you’ve been battling memory limits, drop your batch size by 1 — it can be the difference between an OOM crash halfway through and a finished render.

    No flashy new nodes… just a more stable, reliable engine. Sometimes the best upgrades are the ones you don’t notice — because they just work. 💪

    🔗 View Release

  • ComfyUI – v0.3.69

    ComfyUI v0.3.69 is live! 🎉

    • New `LatentUpscale` node – Upscale in latent space before decoding for sharper results + faster renders.
    • Smarter memory handling – Fewer crashes on big batches; VRAM spikes? Not today.
    • SDXL Refiner flow fixed – Seamless transitions between base and refiner—no more weird detail jumps.
    • Custom nodes reload properly – Finally! No more restarting ComfyUI after editing your favorite custom node.
    • WebAPI polish – Better compatibility with external tools and automation scripts (see the sketch below).
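
    For the automation crowd, that WebAPI boils down to POSTing a workflow (exported from the UI in API format, with dev mode enabled) to the `/prompt` endpoint. A rough sketch against a default local install on port 8188:

    ```python
    import json
    import urllib.request

    # Workflow exported from ComfyUI via "Save (API Format)"
    with open("workflow_api.json") as f:
        workflow = json.load(f)

    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",   # default ComfyUI address and port
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))            # includes a prompt_id you can poll for results
    ```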

    Perfect for high-res SDXL wizards and node-based tinkerers. Update now and keep those latents crisp! 💫

    🔗 View Release

  • Ollama – v0.12.11

    🚀 Ollama v0.12.11 just dropped — and it’s a quiet gem for the detail-oriented folks!

    The big win? `logprobs` output now includes byte-level data 🎯

    No more guessing which bytes map to your tokens. Whether you’re debugging multilingual text, tracking tokenization edge cases, or building precision prompt tools — you now see exactly what’s happening at the byte level.

    Perfect for:

    • Prompt engineers wrestling with weird token splits
    • Researchers analyzing model confidence down to the byte
    • Devs building LLM debuggers or token analyzers

    No UI fluff, no breaking changes — just pure, nerdy utility.

    Upgrade Ollama and start seeing the hidden detail beneath your prompts. 💡
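
    A quick way to poke at it: Ollama exposes an OpenAI-compatible endpoint, and the byte-level data rides along with each token’s logprob. A rough sketch (the model name is just an example; the field names follow the OpenAI logprobs format, so check the release notes if yours differ):

    ```python
    from openai import OpenAI

    # Ollama's OpenAI-compatible endpoint; the api_key is required by the client but ignored
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

    resp = client.chat.completions.create(
        model="llama3.2",  # any locally pulled model
        messages=[{"role": "user", "content": "Say hello in Japanese."}],
        logprobs=True,
        top_logprobs=3,
    )

    for entry in resp.choices[0].logprobs.content:
        # `bytes` shows exactly which UTF-8 bytes each token maps to
        print(entry.token, entry.logprob, entry.bytes)
    ```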

    🔗 View Release

  • Lemonade – 9.0.2: C++: General Availability (#549)

    🚨 Lemonade v9.0.2 just dropped — and C++ is officially GA! 🎉

    No more beta labels. The C++ server is now production-ready, faster, leaner, and built for real-world LLM serving. Here’s the breakdown:

    • C++ is now the future — Python NSIS installer and dev tools are gone. Focus fully on C++ for peak performance.
    • 🚫 Python server is deprecated — start migrating your workflows now.
    • 🔧 Fixed `make_http_request()` bugs + default host = localhost (no more weird network issues).
    • 📚 Docs overhauled — cleaner setup, less confusion.
    • 💥 Removed all `lemonade-server-dev` clutter — clean, production-ready codebase.
    • 📈 All version numbers bumped for clarity.

    If you’re running LLMs locally on Ryzen AI or Radeon GPUs — this is your moment. Drop the Python server, go C++, and unlock low-latency AI at scale. 🚀

    Time to sip something cold… and run LLMs faster than ever.
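
    Client code barely changes, since the C++ server keeps speaking an OpenAI-compatible API. A rough sketch (host, port, path, and model name below are assumptions; check the Lemonade docs for your install):

    ```python
    from openai import OpenAI

    # Lemonade Server's OpenAI-compatible endpoint; default host is now localhost,
    # but the port and path here are assumptions, so adjust for your setup.
    client = OpenAI(base_url="http://localhost:8000/api/v1", api_key="lemonade")

    resp = client.chat.completions.create(
        model="Llama-3.2-1B-Instruct-Hybrid",  # example model name, not a guarantee
        messages=[{"role": "user", "content": "Give me one reason to switch to the C++ server."}],
    )
    print(resp.choices[0].message.content)
    ```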

    🔗 View Release

  • Ollama – v0.12.11-rc1

    Ollama v0.12.11-rc1 is here — and Windows GPU fans, rejoice! 🎉

    Vulkan support is officially back on track. No more “Vulkan not found” errors — your RTX, RX, or Arc cards can now accelerate LLM inference without a hitch.

    This is a quiet patch with huge impact: if you’ve been stuck on CPU-only runs, it’s time to fire up your GPU again.

    ⚠️ Still a release candidate — but if you’re on Windows and craving faster generations, this is the one to try.

    Pro tip: Update your Vulkan drivers first! Ollama’s fixed its end — now let your hardware do the heavy lifting. 🚀

    🔗 View Release

  • Lemonade – v8.2.2

    Lemonade v8.2.2 just dropped—and it’s a game-changer for local LLM tinkerers! 🚀

    • Vision-Language Models are live 🖼️🧠: Run LLaMA-based VLMs locally—image + text reasoning, no cloud needed.
    • Precise device control: `--device` flag now actually works—tune GPU/CPU with zero guesswork.
    • Linux stability fixed: No more crashes or phantom DLL deps. CLI’s solid now.
    • HF_HUB_CACHE supported: Smarter offline caching for Hugging Face models—perfect if your internet’s spotty (quick snippet below).
    • Web UI glow-up: Cleaner layout + new enable_thinking toggle to make models pause & reason before replying.
    • Real-time stats endpoint: Monitor `prompt_tokens` live—ideal for optimizing prompts and performance.
    • FLM Chat Completions patched: No more broken mid-convo responses.

    All wrapped in faster inference and cleaner C++ code. If you’re running LLMs on Ryzen AI or Radeon GPUs—this is your must-update beta. 💪
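
    On the HF_HUB_CACHE point: it’s the standard Hugging Face environment variable, so pointing it at a local directory before launching Lemonade is all it takes. A quick illustrative snippet (paths are placeholders):

    ```python
    import os

    # Standard Hugging Face cache variables; set these (or export them in your shell)
    # before starting Lemonade so model files are read from a local directory.
    os.environ["HF_HUB_CACHE"] = "/data/hf-cache"  # placeholder path
    os.environ["HF_HUB_OFFLINE"] = "1"             # optional: never hit the network
    ```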

    🔗 View Release

  • Deep-Live-Cam – Version 2.3c is out now!

    Deep-Live-Cam v2.3c is live 🎭✨

    Fixed that annoying dropdown bug in model & camera selection—no more weird options or blank menus. Now it just works.

    Only available via QuickStart for now (Windows and Apple Silicon Mac users, this is your cue!).

    CUDA, CoreML, DirectML, OpenVINO—your GPU still gets the spotlight.

    Reload, swap faces in real-time, and keep breaking the internet (responsibly). 🚀

    🔗 View Release