Lemonade – v8.2.2

Lemonade v8.2.2 just dropped—and it’s a game-changer for local LLM tinkerers! 🚀

  • Vision-Language Models are live 🖼️🧠: Run LLaMA-based VLMs locally—image + text reasoning, no cloud needed.
  • Precise device control: the `--device` flag now actually works—pick your GPU or CPU target with zero guesswork.
  • Linux stability fixed: No more crashes or phantom DLL deps. CLI’s solid now.
  • HF_HUB_CACHE supported: Smarter offline caching for Hugging Face models—perfect if your internet’s spotty.
  • Web UI glow-up: Cleaner layout + new enable_thinking toggle to make models pause & reason before replying.
  • Real-time stats endpoint: Monitor `prompt_tokens` live—ideal for optimizing prompts and performance.
  • FLM Chat Completions patched: No more broken mid-convo responses.
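
The `HF_HUB_CACHE` support above means the server reuses Hugging Face model files you've already downloaded. A minimal sketch of how that works, assuming the standard `huggingface_hub` environment variable behavior (the cache path below is just an example):

```python
import os
from pathlib import Path

# HF_HUB_CACHE tells huggingface_hub (and anything built on it, like
# Lemonade) where to look for downloaded model files, so previously
# fetched models resolve offline. Example path; pick your own.
cache_dir = Path.home() / "models" / "hf-cache"
cache_dir.mkdir(parents=True, exist_ok=True)
os.environ["HF_HUB_CACHE"] = str(cache_dir)

# Launch Lemonade *after* setting the variable so it inherits the cache.
# (The exact launch command/flags depend on your install—check its docs.)
print(f"HF_HUB_CACHE -> {os.environ['HF_HUB_CACHE']}")
```

Set the variable in your shell profile instead if you want it to stick across sessions.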

All wrapped in faster inference and cleaner C++ code. If you’re running LLMs on Ryzen AI or Radeon GPUs, this beta is a must-update. 💪
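
The real-time stats bullet above pairs naturally with the OpenAI-compatible response shape most local servers return. A sketch of pulling `prompt_tokens` out of a chat-completion response—the payload below is illustrative, and the exact endpoint path on your Lemonade install is an assumption worth checking:

```python
import json

# Illustrative OpenAI-style chat completion response; a live call would
# hit the server's chat completions endpoint and get JSON like this back.
sample_response = json.loads("""
{
  "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}],
  "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15}
}
""")

# The usage block is where prompt_tokens lives in the OpenAI-compatible
# schema; watch it across requests to see how prompt edits change cost.
prompt_tokens = sample_response["usage"]["prompt_tokens"]
print(f"prompt_tokens: {prompt_tokens}")
```

Logging this per request makes it easy to spot which prompt tweaks actually shrink your token footprint.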

🔗 View Release