Text Generation Webui – v3.14

✨ Oobabooga’s Text Generation Web UI – v3.14 Update ✨

Hey AI Enthusiasts! 🤖 The popular local LLM interface, Text-Generation-WebUI, just dropped v3.14 with some solid improvements:

  • Multi-GPU Performance: Enhanced `bitsandbytes` support for faster inference on multi-GPU setups (8-bit and 4-bit quantization).
  • ExLlamaV3 Integration: The `/v1/internal/logits` API endpoint now works with the `exllamav3` and `exllamav3_hf` loaders, enabling more advanced integrations.
  • Qwen Support: Now supports Qwen3-Next models with ExLlamaV3 (requires `flato`).
  • llama.cpp Update: Upgraded to the latest ggml-org/llama.cpp.
  • Dependency Updates: `transformers` (v4.57), `exllamav3` (v0.0.7), and `bitsandbytes` (v0.48) all updated!
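For those curious about the new logits endpoint, here is a minimal sketch of how you might query it from Python. The endpoint path comes from the release notes; the payload fields (`prompt`, `top_logits`), the default port `5000`, and the helper name are assumptions based on the project's API conventions, so check the API docs for your install before relying on them.

```python
# Hedged sketch: hitting the /v1/internal/logits endpoint of a locally
# running text-generation-webui instance. Only the endpoint path is taken
# from the release notes; payload fields and port are assumptions.
import json
import urllib.request


def build_logits_request(prompt: str, top_logits: int = 10,
                         base_url: str = "http://127.0.0.1:5000"):
    """Build a POST request for the logits endpoint (field names assumed)."""
    payload = json.dumps({"prompt": prompt, "top_logits": top_logits}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/internal/logits",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_logits_request("The capital of France is")
    # Uncomment once a server is running locally:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))  # top logits for the next token
    print(req.full_url)
```

The network call is left commented out since it needs a running web UI with the API enabled (`--api` flag).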

Bug Fixes: Resolved issues with chat history loading and the macOS portable builds.

📦 Portable Builds Available: Grab the latest version for your setup: https://github.com/oobabooga/text-generation-webui/releases/tag/v3.14

  • NVIDIA (cuda12.4, cuda11.7)
  • AMD/Intel (vulkan)
  • CPU Only
  • Mac (Apple Silicon & Intel)

Happy generating! ✨