Text Generation Webui – v4.3.3 – Gemma 4 support!
text-generation-webui just dropped a massive update! If you’re looking for the “AUTOMATIC1111” experience for local LLMs, this Gradio-based powerhouse is now even more capable and snappy.
Here is the breakdown of what’s new in this release:
New Model & Backend Support
- Gemma 4 Integration: Full support is officially live! You can now run Gemma 4 with full tool-calling capabilities via both the UI and the API.
- ik_llama.cpp Backend: A brand new backend option has arrived, offering much more accurate KV cache quantization (via Hadamard rotation) and specialized optimizations for MoE models and CPU inference.
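
The release notes do not spell out how Hadamard rotation helps, so here is a toy illustration of the general idea, not ik_llama.cpp's actual code. An orthonormal Hadamard transform spreads a single outlier across every coordinate, so a low-bit quantizer with one scale per vector no longer rounds the small values to zero. The vector, the 4-bit setting, and all function names below are assumptions chosen for the demo.

```python
import math

def fwht(vec):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    out = list(vec)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in out]

def quantize_roundtrip(vec, bits=4):
    """Symmetric uniform quantization with one scale per vector, then dequantize."""
    levels = 2 ** (bits - 1) - 1
    peak = max(abs(x) for x in vec)
    if peak == 0:
        return list(vec)
    step = peak / levels
    return [round(x / step) * step for x in vec]

def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy "cache row": seven small values and one large outlier.
values = [0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -0.5, 8.0]

# Quantize directly: the outlier sets the step size, so small values are lost.
plain_err = squared_error(values, quantize_roundtrip(values))

# Rotate, quantize, rotate back (the orthonormal transform is its own inverse).
restored = fwht(quantize_roundtrip(fwht(values)))
rotated_err = squared_error(values, restored)

print(f"plain error:   {plain_err:.4f}")
print(f"rotated error: {rotated_err:.4f}")
```

On this hand-picked vector the rotated path reconstructs the input with a much smaller squared error. Real KV cache data will behave differently, so treat the numbers as illustrative only.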
API & Transformer Enhancements
- Enhanced Completions: The `/v1/completions` endpoint now supports `echo` and `logprobs`, giving you deep visibility into token-level probabilities.
- Smarter Model Loading: The system now auto-detects `torch_dtype` from model configs instead of always forcing half precision, which gives you more flexibility.
- Metadata-Driven Templates: Instruction templates are now intelligently detected via model metadata instead of relying on filename patterns.
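
To try the new `echo` and `logprobs` options, a request along these lines should work. This is a minimal sketch using only the Python standard library. The address `http://127.0.0.1:5000` is an assumption about a local setup, and `token_logprobs` assumes the response follows the OpenAI legacy completions layout (`choices[0].logprobs.tokens` and `token_logprobs`), which the release notes do not confirm. All helper names are made up for this example.

```python
import json
import urllib.request

# Assumption: a local server with the API enabled at this address.
BASE_URL = "http://127.0.0.1:5000"

def build_completion_request(prompt, max_tokens=16, logprobs=3, echo=True,
                             base_url=BASE_URL):
    """Build a POST request for /v1/completions with echo and logprobs set."""
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "echo": echo,          # ask for the prompt to be returned with the output
        "logprobs": logprobs,  # number of top alternatives to report per token
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def token_logprobs(response_body):
    """Pair tokens with log-probabilities (assumes the OpenAI legacy layout)."""
    lp = response_body["choices"][0]["logprobs"]
    return list(zip(lp["tokens"], lp["token_logprobs"]))

def complete(prompt, **kwargs):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `token_logprobs(complete("The capital of France is"))` would list each token next to its log-probability, provided the response uses the assumed layout.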
Performance & UI Polish
- Snappier Interface: A custom Gradio fork has been tuned to save up to 50ms per UI event, making the whole experience feel much more responsive.
- Critical Bug Fixes: Resolved several issues including dropdown crashes, API parsing errors for non-dict JSON tool calls, and `llama.cpp` template parsing bugs.
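
The notes do not describe how the tool-call parsing fix was made. As a rough illustration of the failure mode: a model may emit valid JSON that is a list or a string rather than an object, and code that expects a dict will then crash. A defensive parser might look like the sketch below, where `parse_tool_arguments` is a hypothetical helper and not a function from the project.

```python
import json

def parse_tool_arguments(raw):
    """Return tool-call arguments as a dict, or None if the payload is unusable."""
    if isinstance(raw, dict):
        return raw
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return None
    # Valid JSON that is not an object (list, string, number) is rejected too.
    return parsed if isinstance(parsed, dict) else None
```

Returning `None` lets the caller decide whether to skip the call or report an error, rather than failing inside the parser.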
Security & Stability
- Hardened Protections: Implemented ACL/SSRF fixes for extensions, patched path-matching bypasses on Windows/macOS, and added filename sanitization to prevent manipulation during prompt file operations.
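
For readers who have not met this class of bug, here is a generic sketch of filename sanitization for file operations driven by user input. It is not the project's implementation. The helper names and the allowed character set are assumptions, and `Path.is_relative_to` needs Python 3.9 or newer.

```python
import re
from pathlib import Path

# Assumption: allow only letters, digits, dot, underscore and hyphen.
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9._-]")

def sanitize_filename(name, default="untitled"):
    """Reduce a user-supplied name to a single safe path component."""
    # Treat both separator styles as separators and keep the last component.
    base = name.replace("\\", "/").split("/")[-1]
    # Replace anything unexpected, then drop leading and trailing dots.
    cleaned = _UNSAFE_CHARS.sub("_", base).strip(".")
    return cleaned or default

def resolve_prompt_path(root, name):
    """Build a path under root and verify that it cannot escape it."""
    root = Path(root).resolve()
    candidate = (root / sanitize_filename(name)).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"unsafe path: {name!r}")
    return candidate
```

The second check is deliberately redundant: even if the sanitizer misses a case, the resolved path is still compared against the intended directory before any file is touched.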
Portable Build Upgrades
New self-contained packages are available for NVIDIA, AMD, Intel, Apple Silicon, and CPU users! Pro tip: You can now move your `user_data` folder one level up to easily share settings across multiple version installs.