Text Generation Webui – v4.3
Text-Generation-WebUI v4.3 is live!
Hey AI tinkerers & devs: a fresh update just dropped, and it's packed with performance wins, new backends, and security upgrades. Here's the lowdown:

---

**Brand-new backend: `ik_llama.cpp`**
A high-octane fork by the imatrix creator, now baked into the webui:
- New quant formats (Q4_K_M, Q6_K, etc.)
- Hadamard-based KV cache quantization, more accurate and on by default
- Built for MoE models & CPU inference (yes, really fast)
Grab it via the `textgen-portable-ik` build or the `--ik` flag!
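A quick intuition for why a Hadamard rotation helps KV-cache quantization: rotating a vector with an outlier spreads its energy evenly across dimensions, which shrinks the maximum magnitude and therefore the quantization step. A minimal pure-Python sketch of the idea (not the actual `ik_llama.cpp` implementation):

```python
def hadamard(n):
    """Build an n x n Sylvester Hadamard matrix (n must be a power of two)."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def rotate(v):
    """Apply the normalized Hadamard transform to vector v."""
    n = len(v)
    H = hadamard(n)
    scale = n ** -0.5
    return [scale * sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]

# An outlier-heavy vector: one large value dominates the quantization range.
v = [10.0, 0.0, 0.0, 0.0]
r = rotate(v)
print(r)                       # energy spread evenly: [5.0, 5.0, 5.0, 5.0]
print(max(abs(x) for x in v))  # 10.0 before the rotation
print(max(abs(x) for x in r))  # 5.0 after: half the range, finer quant steps
```

Because the transform is orthogonal, it is exactly invertible, so nothing is lost by quantizing in the rotated space and rotating back at read time.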

---

**API upgrades (OpenAI-compatible!)**
The `/v1/completions` endpoint now supports:
- `echo`: returns prompt + completion in one go
- `logprobs`: token-level log probabilities (prompt & generated)
- `top_logprobs_ids`: top token IDs per position, perfect for probing model confidence
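A hedged sketch of a request using the new fields. The endpoint path and field names come from the notes above; the host, port, and prompt are placeholders for your own local setup:

```python
import json

# Hypothetical request body for the OpenAI-compatible /v1/completions endpoint.
payload = {
    "prompt": "The capital of France is",
    "max_tokens": 5,
    "echo": True,   # return the prompt together with the completion
    "logprobs": 5,  # token-level log probabilities (prompt & generated)
}

body = json.dumps(payload)
print(body)

# To actually send it (placeholder URL, adjust to your server):
# import urllib.request
# req = urllib.request.Request(
#     "http://127.0.0.1:5000/v1/completions",
#     data=body.encode(),
#     headers={"Content-Type": "application/json"},
# )
# resp = json.loads(urllib.request.urlopen(req).read())
```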

---

**Gradio UX + security boost**
- Custom Gradio fork: ~50 ms faster UI interactions
- Fixed ACL bypass (Windows/macOS path quirks)
- Server-side validation for Dropdown/Radio/CheckboxGroup
- SSRF fix in superbooga: blocks internal/private IPs
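Blocking internal and private IPs for server-side fetches typically means resolving the host and checking every resulting address with the standard `ipaddress` module. A simplified sketch of the idea (not superbooga's actual code):

```python
import ipaddress
import socket

def is_safe_host(host: str) -> bool:
    """Reject hosts that resolve to private, loopback, or link-local addresses."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected too
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True

print(is_safe_host("127.0.0.1"))    # False: loopback
print(is_safe_host("10.0.0.5"))     # False: RFC 1918 private range
print(is_safe_host("169.254.1.1"))  # False: link-local
```

Checking the *resolved* addresses rather than the hostname string matters: a DNS name can point anywhere, including back at your own network.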

---

**Bug fixes & polish**
- `--idle-timeout` now works for encode/decode and parallel generations
- Stopping strings fixed (e.g., `<|return|>` vs `<|result|>`)
- Qwen3.5 MoE loads cleanly via ExLlamaV3_HF
- `ban_eos_token` finally works (EOS suppression at the logit level)
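Logit-level EOS suppression means setting the EOS token's logit to negative infinity before sampling, so it can never win either argmax or a softmax draw. A minimal sketch, assuming a plain list of logits and a known EOS token id:

```python
import math

def ban_eos(logits, eos_token_id):
    """Suppress EOS at the logit level: it can never be sampled."""
    banned = list(logits)            # copy so the caller's logits are untouched
    banned[eos_token_id] = -math.inf # exp(-inf) = 0 probability after softmax
    return banned

logits = [1.2, 0.3, 4.5, 2.0]        # pretend index 2 is the EOS token
banned = ban_eos(logits, eos_token_id=2)
print(banned.index(max(banned)))     # 3: the highest non-EOS logit wins instead
```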

---

**Dependency upgrades**
- `llama.cpp` bumped to latest (`a1cfb64`), with Gemma-4 support
- `ExLlamaV3` bumped to v0.0.28
- `transformers` bumped to 5.5
- Auto-detects `torch_dtype` from the model config (override with `--bf16`)
- Removed obsolete `models/config.yaml`; templates are pulled from model metadata now
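The dtype auto-detection boils down to reading `torch_dtype` from the model's `config.json` and falling back to a default when it's absent, with any CLI flag taking precedence. A simplified sketch of that logic (function name, default, and flag mapping are illustrative, not the webui's actual code):

```python
import json

def detect_dtype(config_json, override=None):
    """Pick the load dtype: CLI override wins, then config, then a default."""
    if override:                                # e.g. "--bf16" maps to "bfloat16"
        return override
    config = json.loads(config_json)
    return config.get("torch_dtype", "float16") # assumed fallback for this sketch

print(detect_dtype('{"torch_dtype": "bfloat16"}'))            # bfloat16
print(detect_dtype('{}'))                                     # float16
print(detect_dtype('{"torch_dtype": "float32"}', "bfloat16")) # bfloat16 (override)
```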

---

**Terminology update**
"Truncation length" is now "context length" in logs (more accurate, less confusing!)

---

**Portable builds: GGUF-ready & zero-install**
| Platform | Build to Use |
|----------|--------------|
| NVIDIA (old driver) | `cuda12.4` |
| NVIDIA (new driver, CUDA >13) | `cuda13.1` |
| AMD/Intel GPU | `vulkan` |
| AMD (ROCm) | `rocm` |
| CPU-only | `cpu` |
| Apple Silicon | `macos-arm64` |
| Intel Mac | `macos-x86_64` |
Updating? Just swap the folder and keep `user_data/`; you can now even move it one level up to share it across versions.

---

Let me know if you want a quick-start walkthrough on `ik_llama.cpp` or the portable builds!
