Text Generation Webui – v4.3.1
text-generation-webui v4.3.1 is officially live, and it's a massive one for anyone looking to push the boundaries of local LLM inference!
This Gradio-based web UI is essentially the “AUTOMATIC1111” equivalent for text generation, providing a comprehensive interface to run Large Language Models locally with support for multiple backends like llama.cpp, Transformers, and ExLlama.
Here's what's new in this release:
- Model & Inference Upgrades:
  - Gemma 4 Support: Full integration, including tool-calling capabilities in both the API and UI.
  - ik_llama.cpp Backend: New support via portable builds (or the `--ik` flag for full installs), offering specialized optimizations for MoE models, improved CPU inference, and highly accurate KV cache quantization.
  - Transformers Optimization: The UI now auto-detects `torch_dtype` from model configs instead of forcing bf16/f16.
  - ExLlamaV3 Fixes: Resolved Qwen3.5 MoE loading issues and fixed `ban_eos_token` functionality.
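The dtype auto-detection can be sketched roughly like this: read the `torch_dtype` field from a model's `config.json` and only fall back to a fixed dtype when it is missing (a minimal illustration, not the project's actual code; the function name and fallback value are assumptions):

```python
import json
from pathlib import Path

def detect_torch_dtype(model_dir: str, fallback: str = "float16") -> str:
    # Read torch_dtype from the model's config.json; use the fallback
    # if the file or the field is absent. (Illustrative sketch only.)
    config_path = Path(model_dir) / "config.json"
    if config_path.exists():
        config = json.loads(config_path.read_text())
        dtype = config.get("torch_dtype")
        if isinstance(dtype, str):
            return dtype
    return fallback
```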
- API Enhancements:
  - The `/v1/completions` endpoint now supports `echo` and `logprobs` parameters, returning token-level probabilities and a new `top_logprobs_ids` field.
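A hedged sketch of calling the endpoint with the new parameters, using only the standard library (the URL assumes the default local API port, and the payload follows the OpenAI-compatible completions schema):

```python
import json
import urllib.request

# Default local API address (assumption; adjust to your setup)
API_URL = "http://127.0.0.1:5000/v1/completions"

def build_completion_request(prompt: str, logprobs: int = 5, echo: bool = True) -> dict:
    # echo=True returns the prompt tokens alongside the completion;
    # logprobs=N requests the top-N log-probabilities per token position.
    return {"prompt": prompt, "max_tokens": 32, "echo": echo, "logprobs": logprobs}

def post_completion(payload: dict) -> dict:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```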
- Performance & UI Tweaks:
  - Snappier Interface: A custom Gradio fork has been optimized to shave up to 50 ms off each UI event (such as a button click).
  - Smarter Templates: Instruction templates are now detected from model metadata rather than filename patterns.
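The metadata-first detection amounts to: prefer a chat template embedded in the model files, and only fall back to filename heuristics when none is present. A minimal sketch (the key names and fallback patterns here are illustrative assumptions, not the project's actual lookup logic):

```python
def pick_instruction_template(metadata: dict, filename: str) -> str:
    # Prefer a template stored in model metadata over guessing from the
    # filename. (Illustrative sketch; key names are assumptions.)
    template = metadata.get("tokenizer.chat_template") or metadata.get("chat_template")
    if template:
        return template
    # Crude filename fallback (hypothetical patterns)
    if "llama" in filename.lower():
        return "llama-chat"
    return "alpaca"
```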
- Security & Stability:
  - Fixed a critical ACL bypass in the Gradio fork on Windows/macOS.
  - Added server-side validation for input components (Dropdown, Radio, etc.).
  - Patched an SSRF vulnerability in the superbooga extensions by validating fetched URLs against private networks.
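The general shape of such an SSRF guard is to resolve the URL's hostname and reject addresses in private, loopback, or link-local ranges before fetching. A minimal sketch under those assumptions (not the extension's actual code):

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url: str) -> bool:
    # Resolve the hostname and reject private/loopback/link-local/reserved
    # addresses to prevent server-side request forgery. (Illustrative sketch.)
    host = urlparse(url).hostname
    if not host:
        return False
    try:
        addr = ipaddress.ip_address(socket.gethostbyname(host))
    except (socket.gaierror, ValueError):
        return False
    return not (addr.is_private or addr.is_loopback
                or addr.is_link_local or addr.is_reserved)
```

Note that resolving once and fetching later still leaves a DNS-rebinding window; a production guard would also pin the resolved address for the actual request.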
Pro-tip for updating: If you're using a portable install, just download the latest version and carry over your `user_data` folder. Since version 4.0, you can actually keep `user_data` one level up (next to your install folder) to make future updates even smoother!
