Lemonade – v8.2.2

Lemonade v8.2.2 just dropped—and it’s a game-changer for local LLM tinkerers! 🚀

  • Vision-Language Models are live 🖼️🧠: Run LLaMA-based VLMs locally—image + text reasoning, no cloud needed.
  • Precise device control: the `--device` flag now actually works—pick your GPU or CPU target with zero guesswork.
  • Linux stability fixed: No more crashes or phantom DLL deps. CLI’s solid now.
  • HF_HUB_CACHE supported: Smarter offline caching for Hugging Face models—perfect if your internet’s spotty.
  • Web UI glow-up: Cleaner layout + new enable_thinking toggle to make models pause & reason before replying.
  • Real-time stats endpoint: Monitor `prompt_tokens` live—ideal for optimizing prompts and performance.
  • FLM Chat Completions patched: No more broken mid-convo responses.
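
The `HF_HUB_CACHE` support above means the server reuses Hugging Face model files you've already downloaded. A minimal sketch of how that works, assuming the standard `huggingface_hub` environment variable behavior (the cache path below is just an example):

```python
import os
from pathlib import Path

# HF_HUB_CACHE tells huggingface_hub (and anything built on it, like
# Lemonade) where to look for downloaded model files, so previously
# fetched models resolve offline. Example path; pick your own.
cache_dir = Path.home() / "models" / "hf-cache"
cache_dir.mkdir(parents=True, exist_ok=True)
os.environ["HF_HUB_CACHE"] = str(cache_dir)

# Launch Lemonade *after* setting the variable so it inherits the cache.
# (The exact launch command/flags depend on your install—check its docs.)
print(f"HF_HUB_CACHE -> {os.environ['HF_HUB_CACHE']}")
```

Set the variable in your shell profile instead if you want it to stick across sessions.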

All wrapped in faster inference and cleaner C++ code. If you’re running LLMs on Ryzen AI or Radeon GPUs, this beta is a must-update. 💪
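
The real-time stats bullet above pairs naturally with the OpenAI-compatible response shape most local servers return. A sketch of pulling `prompt_tokens` out of a chat-completion response—the payload below is illustrative, and the exact endpoint path on your Lemonade install is an assumption worth checking:

```python
import json

# Illustrative OpenAI-style chat completion response; a live call would
# hit the server's chat completions endpoint and get JSON like this back.
sample_response = json.loads("""
{
  "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}],
  "usage": {"prompt_tokens": 12, "completion_tokens": 3, "total_tokens": 15}
}
""")

# The usage block is where prompt_tokens lives in the OpenAI-compatible
# schema; watch it across requests to see how prompt edits change cost.
prompt_tokens = sample_response["usage"]["prompt_tokens"]
print(f"prompt_tokens: {prompt_tokens}")
```

Logging this per request makes it easy to spot which prompt tweaks actually shrink your token footprint.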

🔗 View Release