Text Generation Webui – v4.5
Big news for the local LLM crowd! The legendary text-generation-webui has officially been rebranded and is now known as TextGen! This update brings much-needed stability and performance improvements to your local inference workflows.
Here's what's new in this release:
- VRAM & Performance Optimization: Peak VRAM usage during prompt logprobs forward passes has been reduced. If you're running a tight hardware setup or trying to squeeze maximum context onto your GPU, this is a massive win!
- Improved UI/UX:
  - Reading long conversations just got easier with a new sky-blue color for quoted text in light mode.
  - Significant bug fixes prevent chat scrolling from getting stuck on “thinking” blocks and stop tool icons from shrinking during long calls.
- Critical Bug Fixes:
  - Gemma-4 Tool Calling: Fixed handling of double quotes and newline characters in arguments, ensuring much more reliable agentic behavior.
  - Token Management: Resolved issues where BOS/EOS tokens weren’t being set correctly for models lacking chat templates, and fixed duplicate BOS token prepending in ExLlamav3.
- Under-the-Hood Updates:
  - The project has moved! Find the new home at `github.com/oobabooga/textgen`.
  - Includes the latest versions of `llama.cpp` and `ik_llama.cpp` for better backend support.
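To see why the tool-calling fix matters, here is a minimal sketch (not the project's actual code; `build_tool_call` is a hypothetical helper) of why double quotes and newlines in arguments break naive string interpolation, and how proper JSON serialization handles the escaping:

```python
import json

def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize a tool call as JSON.

    Naively interpolating argument strings into a template breaks when
    they contain double quotes or newlines; json.dumps escapes them so
    the result round-trips cleanly.
    """
    return json.dumps({"name": name, "arguments": arguments})

# An argument containing both a double quote and a newline survives a round trip:
call = build_tool_call("search", {"query": 'say "hello"\nworld'})
parsed = json.loads(call)
print(parsed["arguments"]["query"])  # original string, quotes and newline intact
```

The same principle applies on the parsing side: a tool-calling implementation must unescape these characters when reading model output rather than splitting on raw quotes.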
If you’ve been tinkering with tool-calling models or struggling with VRAM spikes, this is a must-have update for your local stack!
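For context on the duplicate-BOS fix, here is a small illustrative sketch (a hypothetical `normalize_bos` helper, not the actual ExLlamav3 code) of the kind of normalization that prevents a BOS token being prepended twice:

```python
def normalize_bos(token_ids: list[int], bos_id: int, add_bos: bool = True) -> list[int]:
    """Ensure the sequence starts with exactly one BOS token.

    Drops any duplicate BOS tokens at the start, then prepends a single
    BOS if one is requested and not already present.
    """
    ids = list(token_ids)
    # Collapse repeated BOS tokens at the front into one.
    while len(ids) >= 2 and ids[0] == bos_id and ids[1] == bos_id:
        ids.pop(0)
    # Prepend BOS only if it is missing.
    if add_bos and (not ids or ids[0] != bos_id):
        ids.insert(0, bos_id)
    return ids

print(normalize_bos([1, 1, 5, 6], bos_id=1))  # duplicate collapsed
print(normalize_bos([5, 6], bos_id=1))        # BOS added
```

Duplicate BOS tokens are an easy bug to hit when both the tokenizer and the loader prepend one, and they can measurably degrade generation quality.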
