Text Generation Webui – v4.9
🚀 Major Update Alert: text-generation-webui v4.9 is here!
If you’ve been looking for the “AUTOMATIC1111” experience for your local LLMs, this update is a big win for efficiency and workflow. It brings leaner web search tools, faster inference via MTP speculative decoding, a tidier UI, and tighter default security! 🛠️
Smart Web Search Enhancements
- Snippet Support: The `web_search` tool now pulls text excerpts directly from search results. Your model can grab answers without the heavy lifting of parsing entire pages!
- Token Efficiency: The `fetch_webpage` tool is much leaner now, stripping out raw URLs to keep your context window clean and focused on the actual content.
- Polished UI: Enjoy a new loading spinner during searches and much prettier result rendering in your chat interface.
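
To see why dropping raw URLs saves context, here is a minimal Python sketch of the idea. It is not the project's actual `fetch_webpage` code: the `strip_urls` helper and both regexes are hypothetical, and it assumes the page text has already been converted to markdown-like plain text.

```python
import re

# Hypothetical patterns for illustration only; the real tool may work differently.
MARKDOWN_LINK = re.compile(r"\[([^\]]*)\]\(https?://[^)\s]*\)")
BARE_URL = re.compile(r"https?://\S+")


def strip_urls(text: str) -> str:
    """Keep the visible label of markdown links and drop bare URLs."""
    # "[label](https://...)" becomes "label"
    text = MARKDOWN_LINK.sub(r"\1", text)
    # Remove any URLs that still appear on their own
    text = BARE_URL.sub("", text)
    # Collapse the runs of spaces left behind by removed URLs
    return re.sub(r"[ \t]{2,}", " ", text).strip()


if __name__ == "__main__":
    page = "See [the docs](https://example.com/a?b=c) or https://example.com/x for more."
    print(strip_urls(page))
```

Long URLs with query strings tend to tokenize poorly, so removing them leaves more room for the readable content the model actually needs.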
Performance & Inference Tweaks
- MTP Speculative Decoding: New support for `draft-mtpas` is live! It auto-enables when loading MTP GGUFs (like Qwen 3.6 MoE), which can significantly boost generation speeds.
- Live Stats: Track generation in real time with live tokens/s and context size readouts while the model is responding.
- Auto-mmproj Detection: No more manual hunting! The app now automatically detects and selects sibling `mmproj` files when you load a vision model.
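
The sibling `mmproj` lookup can be pictured with a short Python sketch. This is an illustration under assumptions, not the app's real logic: the `find_sibling_mmproj` name is hypothetical, and it assumes the projector is a `.gguf` file in the same folder with `mmproj` in its filename.

```python
from pathlib import Path
from typing import Optional


def find_sibling_mmproj(model_path: Path) -> Optional[Path]:
    """Return a .gguf file next to the model whose name contains 'mmproj'.

    Assumption for this sketch: the first match in sorted order wins.
    """
    candidates = sorted(
        p
        for p in model_path.parent.glob("*.gguf")
        if "mmproj" in p.name.lower() and p != model_path
    )
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # Hypothetical path, used only to show the call shape.
    print(find_sibling_mmproj(Path("models/vision/model-Q4_K_M.gguf")))
```

If several projector files sit in the folder, a real implementation would probably need a smarter tie-break (for example, by precision); this sketch just picks the first name alphabetically.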
UI & Workflow Improvements
- Drag-and-Drop: You can now drag files directly into the chat input for lightning-fast uploads. 📂
- Refined Sidebar: A reorganized sidebar (Mode/Character/Chat) and hidden reasoning controls in simple mode mean much less clutter while you work.
- Electron Upgrades: New “Check for updates” button, a dedicated model directory folder picker, and a handy right-click context menu for easy text copying.
Security & Stability Fixes
- Hardened Security: CORS is now restricted to `localhost` by default, and character name loading has been sanitized to prevent path traversal attacks. 🛡️
- Windows Reliability: Fixed the bug where `llama-server` would hang after the parent process closed on Windows.
- Dependency Refresh: The bundled backends have been bumped to the latest `llama.cpp`, `ik_llama.cpp`, and `ExLlamaV3`.
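
For anyone curious what the character name sanitization protects against, here is a minimal Python sketch of a path traversal check. It is not the project's actual fix: the `resolve_character_file` helper, the `.yaml` extension, and the directory layout are all assumptions made for illustration.

```python
from pathlib import Path


def resolve_character_file(base_dir: Path, name: str) -> Path:
    """Map a character name to a file and refuse anything outside base_dir."""
    base = base_dir.resolve()
    # Resolving collapses ".." segments so the final location can be checked
    candidate = (base / f"{name}.yaml").resolve()
    # Path.is_relative_to requires Python 3.9 or newer
    if not candidate.is_relative_to(base):
        raise ValueError(f"Refusing character name outside {base}: {name!r}")
    return candidate


if __name__ == "__main__":
    chars = Path("user_data/characters")  # hypothetical folder
    print(resolve_character_file(chars, "Assistant"))
    try:
        resolve_character_file(chars, "../../secrets")
    except ValueError as err:
        print(err)
```

Checking the resolved path rather than scanning the name for `..` also catches absolute paths and other tricks that a simple string filter could miss.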
