Lemonade – v10.4.0: Fix ollama tool calling (#1780)
Lemonade SDK v10.4.0 is officially here!
If you've been looking for a way to squeeze every bit of performance out of your local hardware, especially NPUs and GPUs, Lemonade is the toolkit you need. It brings high-performance, private LLM serving right to your Windows or Linux machine with OpenAI API compatibility.
This latest update is a massive win for anyone building AI apps that rely on Ollama for local model execution. We’re seeing much better stability and much clearer communication between the SDK and your local engines.
What's new in v10.4.0:
- Enhanced Model Visibility: You can now pull specific model capabilities and context window sizes directly via the Ollama API. No more “context window guesswork” when building your prompts!
- Fixed Tool Calling Logic: We’ve squashed a major bug where tool calls were being sent in streaming mode, which was causing malformed responses on the client side. Your agents should behave much better now.
- Protocol Alignment: Fixed a mismatch between the Ollama protocol and the OpenAI-compatible protocol used by `llama-server`. This ensures requests forwarded through Lemonade are no longer rejected as malformed.
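To make the model visibility item concrete, here is a minimal Python sketch of how a client might read capabilities and the context window from an Ollama-style `/api/show` response. Treat the details as assumptions rather than confirmed Lemonade behavior: the base URL and port are placeholders for your own setup, the `model_info` and `capabilities` field names follow the upstream Ollama API and may differ in what Lemonade returns, and the helper names `fetch_model_details` and `summarize_model` are made up for illustration.

```python
import json
import urllib.request

# Assumption: address of the Ollama-compatible endpoint; adjust to your setup.
BASE_URL = "http://localhost:8000"


def fetch_model_details(model, base_url=BASE_URL):
    """POST to the Ollama-style /api/show endpoint and return the parsed JSON."""
    body = json.dumps({"model": model}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/show",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def summarize_model(details):
    """Extract the context window size and capability list from a show response.

    Assumes the upstream Ollama layout: the context length sits under
    "model_info" with an architecture-prefixed key ending in ".context_length",
    and "capabilities" is a list of strings. Missing fields yield None or [].
    """
    model_info = details.get("model_info") or {}
    context_length = next(
        (value for key, value in model_info.items()
         if key.endswith(".context_length")),
        None,
    )
    capabilities = list(details.get("capabilities") or [])
    return {"context_length": context_length, "capabilities": capabilities}
```

With a server running, something like `summarize_model(fetch_model_details("your-model-name"))` would give you a small dict you can check before sizing a prompt or deciding whether to send tool definitions. The model name here is a placeholder.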
Pro Tip: This patch was specifically tested with Zed v1.0.0. If you use Zed’s agent chat with built-in Ollama support, your coding workflow just got a whole lot more reliable!
