Text Generation Webui – v4.5

Big news for the local LLM crowd! The legendary text-generation-webui has officially been rebranded as TextGen! πŸš€ This release brings some much-needed stability and performance improvements to your local inference workflows.

Here is what’s new in this release:

  • VRAM & Performance Optimization: Peak VRAM usage during prompt logprobs forward passes has been reduced. If you are running a tight hardware setup or trying to squeeze maximum context into your GPU, this is a massive win! 🧠
  • Improved UI/UX:
      • Reading long conversations just got easier with a new sky-blue color for quoted text in light mode.
      • Bug fixes prevent chat scrolling from getting stuck on “thinking” blocks and stop tool icons from shrinking during long calls.
  • Critical Bug Fixes:
      • Gemma-4 Tool Calling: Fixed the handling of double quotes and newline characters in arguments, ensuring much more reliable agentic behavior. πŸ› οΈ
      • Token Management: Resolved BOS/EOS tokens not being set correctly for models lacking chat templates, and fixed duplicate BOS token prepending in ExLlamav3.
  • Under-the-Hood Updates:
      • The project has moved! Find its new home at `github.com/oobabooga/textgen`.
      • Includes the latest versions of `llama.cpp` and `ik_llama.cpp` for better backend support.
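The tool-calling fix is easier to appreciate with a concrete picture of the failure mode: double quotes and raw newlines inside an argument string break any serializer that builds JSON by string interpolation instead of a proper encoder. A minimal, hypothetical sketch (the function names are illustrative, not TextGen's actual code):

```python
import json

def naive_args(text: str) -> str:
    # Naive interpolation: a double quote or raw newline in `text`
    # lands unescaped inside the JSON string and makes it invalid.
    return '{"query": "%s"}' % text

def safe_args(text: str) -> str:
    # json.dumps escapes " as \" and newlines as \n automatically.
    return json.dumps({"query": text})

arg = 'say "hello"\nthen stop'

try:
    json.loads(naive_args(arg))
except json.JSONDecodeError:
    print("naive serialization breaks on quotes/newlines")

parsed = json.loads(safe_args(arg))
print(parsed["query"] == arg)  # round-trips intact: True
```

Any model emitting arguments with quotes or multi-line content exercises exactly this path, which is why agentic workloads surfaced the bug first.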
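The duplicate-BOS issue is also easy to picture: when both the chat template and the backend prepend a BOS token, the prompt starts with it twice, which can skew generation. A hypothetical guard illustrating the idea (illustrative only, not the actual ExLlamav3 patch):

```python
def ensure_single_bos(token_ids: list[int], bos_id: int) -> list[int]:
    """Return token_ids beginning with exactly one BOS token."""
    # Strip any leading run of BOS tokens, then add exactly one back.
    i = 0
    while i < len(token_ids) and token_ids[i] == bos_id:
        i += 1
    return [bos_id] + token_ids[i:]

print(ensure_single_bos([1, 1, 42, 7], bos_id=1))  # [1, 42, 7]
print(ensure_single_bos([42, 7], bos_id=1))        # [1, 42, 7]
```

Normalizing at one choke point like this, rather than trusting every template and backend to agree, is the usual way to keep BOS handling consistent across models with and without chat templates.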

If you’ve been tinkering with tool-calling models or struggling with VRAM spikes, this is a must-have update for your local stack! πŸ’»βœ¨

πŸ”— View Release