Text Generation Webui – v3.21

πŸš€ Text Generation WebUI v3.21 just dropped β€” and it’s lighter, faster, smarter!

The portable builds are now leaner: no more duplicated llama.cpp libraries (Python .whl files can't store symlinks 😅). The symlinks are recreated automatically on first launch instead: clean, efficient, zero hassle.

πŸ”₯ Backend upgrades galore:

  • llama.cpp → updated to ggml-org commit 5c8a717 — smoother inference, fewer crashes
  • ExLlamaV3 v0.0.18 β€” better quantization + smarter memory use
  • safetensors v0.7 β€” faster load times, tighter security
  • triton-windows 3.5.1.post22 β€” CUDA ops on Windows? Smoother than ever

πŸ“¦ Portable builds now come in 4 flavors:

  • πŸ–₯️ `cuda12.4` (NVIDIA)
  • πŸ’» `vulkan` (AMD/Intel GPUs)
  • 🧠 `cpu` (no GPU? no problem)
  • 🍏 `macos-arm64` (Apple Silicon optimized)

πŸ”„ Update? Just unzip β†’ replace only your `user_data/` folder. All your models, settings, themes β€” untouched. No reconfiguring. No stress.

Perfect for tinkerers who want power without the install drama. Grab it, unzip, and start generating πŸš€

πŸ”— View Release