Lemonade – v9.0.4

🚀 Lemonade v9.0.4 just dropped, and it’s a game-changer for local LLM folks!

  • Vulkan, ROCm & Metal backends are now fully updated to run the latest llama.cpp models: faster inference, smoother performance, better hardware love.
  • New SOTA models added: Qwen3-VL (yes, multimodal!), FLM2-MoE, and Granite 4.0 MoE, all ready to load in the model manager.
  • Infinite inference timeouts? Done. Long prompts no longer run into a timeout, so your GPU/NPU stays busy, not bored.
  • Cleaner installs: zstd purged from .deb, CMakeLists reorganized for sanity (no more “why is this so messy?” moments).
  • Health & models endpoints now quiet by default: less noise, more focus.
  • FAQ added: Stuck on `HF_HOME`? We’ve got your back now.
  • Fixed: RAI detection, startup glitches, test failures. Outdated Open WebUI refs are finally removed, too.
  • Default host address updated in README: less confusion on first launch.

Plus: A shiny new project roadmap is live 📜, and huge props to @VladimirVLF for their first contribution!

Upgrade. Load up those MoE models. Break some benchmarks. 🤖💥
