Text Generation Webui – v4.9

🚀 Major Update Alert: text-generation-webui v4.9 is here!

If you’ve been looking for the “AUTOMATIC1111” experience for your local LLMs, this update is a big win for efficiency and workflow smoothness. It tightens up how the UI handles web data and adds new ways to speed up inference! 🛠️

Smart Web Search Enhancements

  • Snippet Support: The `web_search` tool now pulls text excerpts directly from search results. Your model can grab answers without the heavy lifting of parsing entire pages!
  • Token Efficiency: The `fetch_webpage` tool is much leaner now, stripping out raw URLs to keep your context window clean and focused on the actual content.
  • Polished UI: Enjoy a new loading spinner during searches and much prettier result rendering in your chat interface.
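
To picture where the token savings come from, here is a minimal Python sketch of stripping raw URLs out of fetched page text. It is an illustration only, not the project's actual code: the helper name `strip_raw_urls` and both regular expressions are assumptions.

```python
import re

# Hypothetical helper for illustration; not taken from text-generation-webui.
# Matches markdown links like [label](https://example.com/page).
_MD_LINK = re.compile(r"\[([^\]]*)\]\((?:https?://)[^)\s]*\)")
# Matches bare http(s) URLs up to the next whitespace.
_BARE_URL = re.compile(r"https?://\S+")


def strip_raw_urls(text: str) -> str:
    """Keep readable link labels but drop the URLs themselves."""
    # Replace each markdown link with just its label.
    text = _MD_LINK.sub(r"\1", text)
    # Remove any remaining bare URLs.
    text = _BARE_URL.sub("", text)
    # Collapse the runs of spaces left behind by the removals.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return "\n".join(line.rstrip() for line in text.splitlines())


sample = "See [the docs](https://example.com/docs) or https://example.com/raw for details."
cleaned = strip_raw_urls(sample)
print(cleaned)
```

Long URLs tend to tokenize poorly, so dropping them while keeping the link labels leaves more of the context window for actual page content.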

Performance & Inference Tweaks

  • MTP Speculative Decoding: New support for `draft-mtpas` is live! It auto-enables when loading MTP GGUFs (like Qwen 3.6 MoE), which can significantly boost generation speeds.
  • Live Stats: Keep an eye on generation in real time with live tokens/s and context size readouts while the model is responding.
  • Auto-mmproj Detection: No more manual hunting! The app now automatically detects and selects sibling `mmproj` files when you load a vision model.
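
As a rough idea of what sibling `mmproj` detection could look like, here is a small Python sketch. It assumes the projector is a `.gguf` file in the same folder as the model with `mmproj` in its name; the function name `find_sibling_mmproj` and the example file names are hypothetical, and the real selection logic in the app may differ.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_sibling_mmproj(model_path: Path) -> Optional[Path]:
    """Return a .gguf file next to the model whose name contains 'mmproj'.

    Assumption for this sketch: the first match in sorted order wins.
    """
    candidates = sorted(
        p
        for p in model_path.parent.glob("*.gguf")
        if "mmproj" in p.name.lower() and p != model_path
    )
    return candidates[0] if candidates else None


# Demo with made-up file names in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    model = folder / "vision-model-Q4_K_M.gguf"
    projector = folder / "mmproj-vision-model-f16.gguf"
    model.touch()
    projector.touch()
    found = find_sibling_mmproj(model)
    found_name = found.name if found else None

print(found_name)
```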

UI & Workflow Improvements

  • Drag-and-Drop: You can now drag files directly into the chat input for lightning-fast uploads. 📂
  • Refined Sidebar: A reorganized sidebar (Mode/Character/Chat) and hidden reasoning controls in simple mode mean much less clutter while you work.
  • Electron Upgrades: New “Check for updates” button, a dedicated model directory folder picker, and a handy right-click context menu for easy text copying.

Security & Stability Fixes

  • Hardened Security: CORS is now restricted to `localhost` by default, and character name loading has been sanitized to prevent path traversal attacks. 🛡️
  • Windows Reliability: Fixed the bug where `llama-server` would hang after the parent process closed on Windows.
  • Dependency Refresh: The bundled backends have been updated to the latest `llama.cpp`, `ik_llama.cpp`, and `ExLlamaV3`.
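
For anyone curious what "sanitized to prevent path traversal" means in practice, here is a minimal Python sketch of one common approach: resolve the requested path and reject it if it lands outside the allowed folder. The function name `resolve_character_file`, the `characters` folder, and the `.yaml` extension are assumptions for illustration, and the project's actual fix may work differently. It needs Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def resolve_character_file(characters_dir: Path, name: str) -> Path:
    """Map a character name to a file, refusing anything outside the folder."""
    base = characters_dir.resolve()
    # resolve() collapses ".." segments, so escapes become visible here.
    candidate = (base / f"{name}.yaml").resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"character name escapes {base.name}/: {name!r}")
    return candidate


# One ordinary name and one traversal attempt.
for name in ["Assistant", "../../etc/passwd"]:
    try:
        print(resolve_character_file(Path("characters"), name).name)
    except ValueError as err:
        print(f"rejected: {err}")
```

Checking the resolved path rather than filtering out `..` strings also catches absolute paths such as `/etc/passwd`, since joining an absolute name replaces the base folder entirely.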
