Lemonade – v9.0.4
Lemonade v9.0.4 just dropped, and it's a game-changer for local LLM folks!
- Vulkan, ROCm & Metal backends are now fully updated to crush the latest llama.cpp models: faster inference, smoother performance, better hardware love.
- New SOTA models added: Qwen3-VL (yes, multimodal!), FLM2-MoE, and Granite 4.0 MoE, all ready to load in the model manager.
- Infinite inference timeouts? Done. Long prompts no longer get cut off by a timeout, so your GPU/NPU stays busy, not bored.
- Cleaner installs: zstd purged from .deb, CMakeLists reorganized for sanity (no more “why is this so messy?” moments).
- Health & models endpoints now quiet by default: less noise, more focus.
- FAQ added: Stuck on `HF_HOME`? We've got your back now.
- Fixed: RAI detection, startup glitches, test failures. And those outdated Open WebUI refs are finally removed.
- Default host address updated in README: less confusion on first launch.
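
If `HF_HOME` is what tripped you up, here is a minimal sketch of how the Hugging Face cache location is usually resolved. It mirrors the defaults documented for `huggingface_hub` (`HF_HUB_CACHE`, then `HF_HOME`, then `XDG_CACHE_HOME`, then `~/.cache/huggingface`). The helper names `resolve_hf_home` and `resolve_hub_cache` are made up for this example, and it is an assumption that Lemonade looks for downloaded models in that same hub cache; check the new FAQ for the authoritative answer.

```python
import os
from pathlib import Path


def resolve_hf_home(env=None):
    """Return the directory used as HF_HOME (hypothetical helper)."""
    env = os.environ if env is None else env
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser()
    # Documented fallback: $XDG_CACHE_HOME/huggingface, else ~/.cache/huggingface
    cache_root = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(cache_root).expanduser() / "huggingface"


def resolve_hub_cache(env=None):
    """Return the directory where hub downloads are cached (hypothetical helper)."""
    env = os.environ if env is None else env
    # HF_HUB_CACHE, when set, overrides the default "<HF_HOME>/hub" location.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    return resolve_hf_home(env) / "hub"


if __name__ == "__main__":
    print("HF_HOME resolves to:  ", resolve_hf_home())
    print("Hub cache resolves to:", resolve_hub_cache())
```

Run it in the same shell or service environment that launches the server. If the printed path is not where you expected your models to live, set `HF_HOME` there before starting.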
Plus: A shiny new project roadmap is live, and huge props to @VladimirVLF for their first contribution!
Upgrade. Load up those MoE models. Break some benchmarks.
