Category: AI

AI Releases

  • Crankboy App – v1.1.1

    CrankBoy v1.1.1 just landed – and it's the quiet hero your Playdate's been waiting for 🎮💙

    Fixed a nasty startup crash on older Linux distros (Ubuntu 20.04, we see you). No more “why won't it launch?!” – just instant GB/GBC nostalgia.

    Under the hood:

    • Smoother button responses with subtle UI polish
    • Security deps updated (your ROMs stay safe, no funny business 😉)
    • Better error logs = faster fixes thanks to sharp-eyed community reports

    No flashy features – just a rock-solid, buttery-smooth emulator so you can focus on the real magic: pixel-perfect gameplay and chiptune battles.

    Grab it. Boot up your favorite game. And let the retro vibes roll. 🕹️

    🔗 View Release

  • Text Generation Webui – v3.20

    🎨 Image Generation is LIVE in Text-Generation-WebUI v3.20!

    Now generate images right inside your LLM UI with `diffusers` – Z-Image-Turbo supported, 4-bit/8-bit quantization, `torch.compile` optimization, and PNGs that auto-stash your generation params. Gallery? Check. Live progress bar? Yep. OpenAI-compatible image API? Absolutely 🤖✨
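
The "PNGs auto-stash your generation params" bit means the parameters travel inside the image file itself. A minimal sketch of the round trip, assuming the params live in a PNG text chunk (the chunk key `"parameters"` here is an illustrative guess, not confirmed from the release notes):

```python
# Sketch: stash generation params in a PNG text chunk, then read them back.
# The key "parameters" is an assumption for illustration.
from io import BytesIO

from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Simulate a generated image whose params were stashed at save time.
meta = PngInfo()
meta.add_text("parameters", "prompt: a red fox, steps: 8, seed: 42")

buf = BytesIO()
Image.new("RGB", (64, 64)).save(buf, format="PNG", pnginfo=meta)

# Later: recover the parameters from the file itself.
buf.seek(0)
reloaded = Image.open(buf)
print(reloaded.text["parameters"])  # prompt: a red fox, steps: 8, seed: 42
```

Because the metadata rides along in the file, any PNG pulled from the gallery can reproduce its own settings.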

    ⚡ Faster text gen too!

    `flash_attention_2` is now ON by default for Transformers models – smoother, quicker responses.

    📦 Smaller Linux CUDA builds – download faster, run just as hard.

    🔧 llama.cpp updated to latest (0a540f9) + ExLlamaV3 v0.0.17 for better inference stability and speed.

    🖼️ Prompt magic upgrade!

    Pass `bos_token` and `eos_token` directly into Jinja2 templates – perfect for Seed-OSS-36B-Instruct and similar models.
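
In practice that means a chat template can reference the tokens as plain variables. A simplified stand-in template (the `<seed:...>` token strings are illustrative, not the actual Seed-OSS-36B-Instruct template):

```python
# Sketch: a Jinja2 chat template that receives bos_token/eos_token directly.
# Template shape and token strings are illustrative assumptions.
from jinja2 import Template

chat_template = Template(
    "{{ bos_token }}{% for m in messages %}"
    "[{{ m.role }}] {{ m.content }}{{ eos_token }}"
    "{% endfor %}"
)

prompt = chat_template.render(
    bos_token="<seed:bos>",
    eos_token="<seed:eos>",
    messages=[{"role": "user", "content": "hello"}],
)
print(prompt)  # <seed:bos>[user] hello<seed:eos>
```

Models whose templates hard-reference these tokens no longer need them spliced in by hand.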

    🚀 Portable builds now include:

    • NVIDIA: `cuda12.4`
    • AMD/Intel: `vulkan`
    • CPU only: `cpu`
    • Mac (Apple Silicon): `macos-arm64`

    💾 Updating? Just replace the app – keep your `user_data/` folder and all your models, LoRAs, and settings intact.
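
A sketch of that update dance with stand-in directories (paths and file names here are invented for illustration; only the keep-`user_data/`, swap-the-rest pattern comes from the release notes):

```python
# Sketch: replace the app directory while preserving user_data/.
# Directory layout below is a simulation, not the real install tree.
import shutil
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
app = root / "text-generation-webui"               # existing install
new_build = root / "text-generation-webui-v3.20"   # freshly unpacked release

# Simulate the old install (with user data) and the new build.
(app / "user_data").mkdir(parents=True)
(app / "server.py").write_text("new app" [:0] + "old app")
(app / "user_data" / "settings.yaml").write_text("my settings")
new_build.mkdir()
(new_build / "server.py").write_text("new app")

# The update: set user_data/ aside, swap the app, put user_data/ back.
shutil.move(str(app / "user_data"), str(root / "user_data.bak"))
shutil.rmtree(app)
shutil.move(str(new_build), str(app))
shutil.move(str(root / "user_data.bak"), str(app / "user_data"))

print((app / "server.py").read_text())                    # new app
print((app / "user_data" / "settings.yaml").read_text())  # my settings
```

Same idea on a real install: move `user_data/` out, drop in the new build, move it back.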

    Go make art. Or let the AI do it for you. 😎🖼️

    🔗 View Release

  • Ollama – v0.13.2-rc2: ggml: handle all streams (#13350)

    🚀 Ollama v0.13.2-rc2 just dropped – and it's a quiet win for stability!

    The big fix? ggml now handles all GPU/CPU streams properly. No more leaked buffers or misaligned memory. Think of it as finally tidying up your AI workshop so every tensor has its place.

    ✨ Why you'll care:

    • Smoother inference on multi-GPU setups
    • Fewer crashes during heavy async loads
    • Better memory cleanup = longer, happier sessions

    If you've been battling weird memory hiccups with Llama 3 or DeepSeek-R1 on Linux/macOS/Windows – this is your upgrade. Quiet change, huge impact. 💨

    Upgrade now and run like a champ.

    🔗 View Release

  • Lemonade – v9.0.8

    🚀 Lemonade v9.0.8 just dropped – and it's a game-changer for local LLM folks!

    • FLM server hostname? Now configurable. No more fighting hardcoded defaults – deploy how you want. 🎯
    • Override the `llama-server` path via env vars – perfect for custom builds, containers, or weird dev setups. 🛠️
    • CPU backend is LIVE! Run LLMs on CPU without a GPU – ideal for dev, testing, or low-power machines. 🖥️
    • Debate Arena v2 is here! Smarter, smoother multi-model debates with better eval – test personalities like a pro. 💬🧠
    • Huge props to @bitgamm for their first contribution – welcome to the crew! 👏

    GGUF + ONNX? Check. OpenAI API compat? Check. Windows & Linux? Double check.

    Time to spin up your next local LLM experiment – faster, freer, and more flexible than ever. 🚀

    🔗 View Release

  • Ollama – v0.13.2

    🚀 Ollama v0.13.2 just dropped – and it's a quiet hero update!

    ✅ Multi-GPU CUDA setups? Finally detected properly. No more leaving GPUs on the bench.

    🧠 DeepSeek-V3.1's “thinking” mode? Fixed – it won't randomly activate when disabled (goodbye, phantom pondering).

    Huge props to our new contributors: 👏 @chengcheng84 & @nathan-hook – welcome to the crew! First PRs = nailed it.

    Smooth sailing ahead. Update now and run your models faster, cleaner, and with zero GPU drama.

    🔗 Full details: [v0.13.1…v0.13.2-rc0]

    🔗 View Release

  • Lemonade – v9.0.7

    🔥 Lemonade v9.0.7 just dropped – and it's chaos in the best way.

    Introducing Debate Arena: run 8 LLMs at once in your browser and watch them argue like AI philosophers on caffeine. Ministral-3 vs SmolLM3? Phi4 roasting LFM2? Pure digital TED Talk madness.

    โœจ Whatโ€™s new:

    • 🎤 `llm-debate.html` – drop it in your browser, hit play, and enjoy the AI showdown.
    • 🚀 Load up to 8 GGUF models simultaneously with `lemonade-server serve --max-loaded-models 8`.
    • 🛠️ Fixed web publishing, updated deps to GitHub's latest, and unveiled the Lemonade Manager (Phase 1) – sleeker, faster, smarter.

    💻 Grab the `.msi` (Windows) or `.deb` (Linux), fire it up, and let your GPU do the talking.

    No cloud. No limits. Just pure local LLM mayhem. 🤖💥

    Check it out: https://github.com/lemonade-sdk/lemonade/blob/main/examples/demos/llm-debate.html

    🔗 View Release

  • Ollama – v0.13.2-rc0: ggml update to b7108 (#12992)

    Ollama v0.13.2-rc0 just dropped – and it's a speed demon 🚀

    The big win? ggml updated to b7108, powering faster, leaner LLM inference across the board.

    Hereโ€™s whatโ€™s new:

    • ✅ Top-k sampling optimized – smarter token selection, especially on big-vocab models.
    • ✅ Metal argsort fixed – M-series chips now run smoother than ever 🍏
    • ✅ BakLLaVA image-to-text regression patched – multimodal models are back in business.
    • 🚨 Projector metadata warning – if you're using multimodal GGUF files, double-check your metadata.
    • ⚠️ Vulkan fixes temporarily reverted – stability first, speed later.

    This is a release candidate – stable enough for daily use, fresh enough to feel the gains. If you're on Apple Silicon, this is your upgrade.

    Update now and keep those models rolling. 🤖💻

    🔗 View Release

  • MLX-LM – v0.28.4

    🚀 mlx-lm v0.28.4 is live – and it's a beast!

    New models? Oh yeah:

    ✅ Minimax-M2, Kimi Linear, Trinity/AfMoE, Ministral3

    ✅ DeepSeek V3.2 – now in the fold

    ✅ Kimi K2 & OLMo3 fixed for seamless loading

    Performance got a turbo boost:

    🚀 Batching in server mode = faster multi-request handling

    💡 Multi-prompt cache now holds multiple prompts at once (chat apps, rejoice!)

    🧠 DWQ (Dynamic Weight Quantization) – run massive models with less memory, same punch

    Fixed the niggles:

    🔧 Adapter loading typo? Gone.

    🧩 `parallel_residual` now works on GPTNeoX

    📦 SentencePiece dependency added – no more tokenizer fails!

    Under the hood:

    🔄 Switched to GitHub Actions for smoother CI

    💬 Better type hints – mypy fans, you're welcome

    🧪 Flaky tests squashed + LoRA fusion now plays nice with non-affine quantization

    Big shoutout to new contributors: @jyork03, @spotbot2k, @sriting, @tnadav, @Deekshith-Dade – welcome to the crew! 🎉

    Upgrade. Tweak. Crush your next LLM project. 💪

    🔗 View Release

  • Lemonade – v9.0.6

    🚀 Lemonade v9.0.6 just dropped – and it's a game-changer for local LLM folks!

    Now you can load multiple models at once – LLMs, embeddings, and rerankers – all running in parallel. No more restarting to switch contexts. 🤖🧠

    ✨ New goodies:

    • Run concurrent requests across models → smoother, faster workflows
    • Linux logs? Less spam. More chill. 🐧
    • `run` command now works even if the server's already up – no more “port in use” headaches
    • Selective tray unloading keeps RAM sane (bye-bye, memory bloat!)
    • Better docs + venv testing + more robust system info

    Try the live demo: open `examples/demos/multi-model-tester.html` in your browser and juggle 3 models like a pro.

    Perfect for devs running RAG pipelines, local agents, or just tinkering with multiple models side-by-side.

    Full changelog: [v9.0.5…v9.0.6](link)

    🔗 View Release

  • ComfyUI – v0.3.77

    ComfyUI v0.3.77 is live – quiet release, huge quality-of-life wins! 🛠️

    • Fixed critical crashes when loading workflows with missing or corrupted custom nodes โ€” no more sudden dead ends.
    • Smarter memory management for big image batches, especially on low-VRAM setups – less OOM, more generating.
    • Crisp node labels on high-DPI displays (finally, no more blurry text!).
    • Updated deps to patch security gaps and keep the backend rock-solid.

    If custom nodes or memory hiccups have been ruining your flow – update now. No flashy features, just smoother, more stable AI tinkering. 💡

    Keep those workflows alive!

    🔗 View Release