Text Generation Webui – v3.18

🔥 text-generation-webui v3.18 is live — and llama.cpp just leveled up!

  • 🖥️ `--cpu-moe` flag dropped — offload MoE experts to CPU and run massive models on low-end GPUs. VRAM? Who needs it.
  • 🐧 ROCm support is HERE! AMD GPU users on Linux — rejoice. No CUDA? No problem.
  • 🍎 macOS 13 wheels retired. Time to update your OS if you're still on Ventura (13) or earlier.
  • 🚀 Backend upgrades:
      • llama.cpp → latest commit (10e9780) — smoother, faster, more stable
      • ExLlamaV3 v0.0.15 — better quants, faster attention
      • peft 0.18.* — new LoRA magic for fine-tuning lovers
      • triton-windows 3.5.1.post21 — Windows inference just got a turbo boost
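
Want to try the expert offload by hand? The first bullet roughly corresponds to a llama.cpp invocation like the one below. This is a hedged sketch: the model path and layer count are placeholders, so check `llama-server --help` on your build for the exact flags it supports.

```shell
# Sketch: push all layers to the GPU first, then pull the MoE expert
# tensors back into CPU RAM. Model path and layer count are placeholders.
./llama-server \
  -m models/your-moe-model.gguf \
  --gpu-layers 99 \
  --cpu-moe
```

Because the experts are the bulk of an MoE model's weights, this trade keeps the hot attention layers on the GPU while the big-but-sparse expert weights sit in system RAM.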

📦 Portable builds? Still the best part.

Download → unzip → run. No pip, no install.

  • NVIDIA? `cuda12.4`
  • AMD/Intel? Use `vulkan`
  • CPU-only? `cpubuilds` is your hero
  • Mac M1/M2? `macos-arm64` — all set
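
Putting the three steps together, a first run might look like this. It's a sketch only: the archive and launcher names below are placeholders, so use the actual file names from the release page for your platform.

```shell
# Hedged sketch of the portable workflow (names are placeholders):
unzip textgen-portable-cuda12.4.zip   # 1. download + unzip
cd textgen-portable                   # 2. enter the unpacked folder
./start_linux.sh                      # 3. run the platform launcher
```

No virtualenv, no pip resolve step: everything the webui needs ships inside the archive.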

🔧 Upgrading? Just swap the binary. Your `user_data/` folder stays untouched — models, configs, themes… all safe.
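
An upgrade can be as simple as unpacking the new build and carrying `user_data/` across. Here's a runnable sketch with throwaway folders standing in for the real old and new builds (folder and file names are made up for illustration):

```shell
# Throwaway stand-ins for an old and a new portable build:
mkdir -p textgen-old/user_data/models textgen-new
echo "my-settings" > textgen-old/user_data/settings.yaml

# The whole upgrade: copy user_data/ into the new build's folder.
cp -r textgen-old/user_data textgen-new/

cat textgen-new/user_data/settings.yaml   # old settings came along
```

The point is that everything stateful lives under `user_data/`, so the rest of the tree is disposable between versions.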

Go run a massive MoE model on your old laptop. The future isn't just local — it's portable. 🎒💻

🔗 View Release