• Ollama – v0.15.0-rc6

    🚀 Ollama v0.15.0-rc6 just dropped — and it’s a quiet hero for GPU users!

    If you’ve been hitting CUDA MMA errors when running quantized Llama models on your RTX card, breathe easy. This patch slays those sneaky crashes during inference.

    ✅ Fixed: CUDA MMA bugs in release builds

    🚫 No more mysterious GPU crashes — stable, fast, local LLMs back on track

    Perfect for devs pushing limits on NVIDIA hardware. GGUF? Still supported. API? Still sweet. Just… smoother.

    Run it hard. Run it local. 🖥️🔥

    🔗 View Release

  • Ollama – v0.15.0-rc5: llama: fix fattn-tile shared memory overflow on sm_50/52 (#13872)

🚀 Ollama v0.15.0-rc5 just landed — and it’s a quiet win for legacy GPU folks!

    If you’re rocking a GTX 900 series or Titan X (Maxwell, sm_50/52), this update fixes a sneaky shared memory overflow in Flash Attention’s tile kernel. 🛠️

    What changed?

    • Old: `nthreads=256` + `ncols=4` → blew past 48KB shared mem limit 💥
    • New: `nthreads=128` → stays safely under 48KB ✅
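The fix above amounts to halving the thread count so the per-block shared-memory footprint fits under the 48KB ceiling on sm_50/52. A back-of-envelope sketch (the 64-floats-per-thread staging figure is purely illustrative, not the kernel's real layout; only the 48KB limit and the 256 → 128 change come from the release note):

```python
# Illustrative shared-memory budget check for a tiled attention kernel.
# The per-thread staging size is a made-up placeholder for illustration.

SMEM_LIMIT = 48 * 1024  # shared-memory cap per block on sm_50/52


def smem_bytes(nthreads: int, floats_per_thread: int = 64) -> int:
    """Rough shared-memory footprint: each thread stages a slice of the tile."""
    return nthreads * floats_per_thread * 4  # 4 bytes per float32


old = smem_bytes(256)  # blows past the 48KB cap -> launch failure / overflow
new = smem_bytes(128)  # comfortably under the cap
print(old > SMEM_LIMIT, new <= SMEM_LIMIT)
```

Halving `nthreads` rather than shrinking `ncols` keeps the tile shape (and thus the math) intact while cutting the staging buffer in half.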

No flashy features — just pure, sweet stability. No more shared-memory overflows crashing inference on older NVIDIA cards.

    Perfect for tinkerers with budget rigs or vintage GPUs who refuse to give up local LLMs. Update, reload your model, and keep grinding! 🖥️🧠

    🔗 View Release

  • Wyoming Openai – Response format fix & Groq Orpheus Update (0.4.0)

    🎙️ Wyoming OpenAI v0.4.0 just dropped—and it’s a game-changer for self-hosted voice systems!

    • WAV is now default 🎧 No more crackly audio from HA’s auto-detection—pure PCM straight from OpenAI APIs. Clean, reliable, no surprises.
    • Logs finally work 📝 Debug logs now show up properly. Say goodbye to mystery missing logs and hello to real-time debugging.
    • New Groq voice 🎭 PlayAI is out, Orpheus TTS is in (canopylabs/orpheus-v1-english): open-source, LLM-powered, and emotionally expressive. Use `[laugh]` or `[whisper]` tags to shape tone. Your voice assistant just got soul.
    • Dep upgrades 🚀 OpenAI lib updated to 2.15.0, ruff & pytest refreshed for speed + stability.
    • CI security locked down 👮‍♂️ GitHub workflows now have explicit permissions—no more side-eye from your devsecops squad.
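For anyone wiring this up by hand, the WAV default corresponds to the `response_format` field of an OpenAI-compatible `/v1/audio/speech` request, and the Orpheus emotion tags ride along inside the input text. A minimal sketch of the request body (the voice name is a hypothetical placeholder; the model id comes from the release note):

```python
# Build an OpenAI-compatible text-to-speech request body.
# "response_format": "wav" is what keeps Home Assistant from having to
# guess the codec: a plain PCM payload in a WAV container, no surprises.

def build_speech_request(text: str, voice: str = "default") -> dict:
    return {
        "model": "canopylabs/orpheus-v1-english",
        "input": text,
        "voice": voice,
        "response_format": "wav",
    }


payload = build_speech_request("Good morning! [laugh] The coffee is ready.")
print(payload["response_format"])  # wav
```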

    Install via `pip install wyoming-openai`, drop it into Home Assistant, and let Orpheus sing to your smart home. 🏠✨

    v0.4.0 is live—go make your AI sound human.

    🔗 View Release

  • Ollama – v0.15.0-rc4

    Big news for local LLM folks! 🚀 Ollama v0.15.0-rc4 just dropped — and it’s got a quiet game-changer:

    `ollama config` is now `ollama launch` 🎯

    No more confusion between “configuring” and “starting” your server.

    Just run `ollama launch` to fire up your local LLM — clean, intuitive, and way more obvious.

    Your existing configs? Still there.

    Your scripts? Time to update those aliases! 🛠️

    Under the hood: smoother model loading, better stability, and a few sneaky performance tweaks.

    Next stop: stable v0.15.0 👀

    Time to refresh your workflow — your local LLM stack just got simpler.

    🔗 View Release

  • Ollama – v0.15.0-rc3: Revert “model: add MLA absorption for glm4moelite (#13810)” (#13869)

    🚨 Ollama v0.15.0-rc3 just dropped — and it’s a revert!

The team pulled back the MLA (Multi-head Latent Attention) absorption patch for GLM4-MoE-Lite (#13810) in #13869.

    Why? Stability. Compatibility. No coffee spills today. ☕🚫

    This isn’t a feature drop — it’s a strategic pause. If you’re using GLM4-MoE-Lite, stick with v0.14.x for now. The MLA integration is still in the lab — expect something smoother, smarter, and more stable soon.

Ollama’s still your go-to for local LLMs: Llama 3, Mistral, Phi-4, Gemma — all running smoothly. Just hold off on GLM4-MoE-Lite’s latest “enhancement” until the next drop.

    Keep tinkering — good things come to those who wait (and test). 🚀

    🔗 View Release

  • Ollama – v0.15.0-rc2: x/imagegen: fix image editing support (#13866)

    Big news for image gen tinkerers! 🎨 Ollama v0.15.0-rc2 just dropped with serious image editing upgrades:

    • 🛠️ Fixed a crash in `ollama show` when inspecting image generation models — no more unexpected panics!
    • 🖼️ Flux2KleinPipeline now has built-in vision support — edit images with context-aware prompts, zero extra setup.
    • 📦 Transparent PNGs? Say hello to clean outputs — they’re now auto-flattened onto a white background.
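Flattening onto white is ordinary alpha compositing with an opaque white backdrop. A minimal pure-Python sketch of what happens to each RGBA pixel (the real pipeline presumably does this with an image library, not per-pixel Python):

```python
def flatten_on_white(r: int, g: int, b: int, a: int) -> tuple:
    """Composite one RGBA pixel over an opaque white (255, 255, 255) background."""
    alpha = a / 255.0
    blend = lambda c: round(c * alpha + 255 * (1.0 - alpha))
    return (blend(r), blend(g), blend(b))


print(flatten_on_white(255, 0, 0, 255))  # opaque red stays (255, 0, 0)
print(flatten_on_white(255, 0, 0, 0))    # fully transparent becomes (255, 255, 255)
```

The payoff: downstream tools that assume three channels get clean RGB instead of an alpha channel they'd silently drop or render as black.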

    Small tweaks, massive gains for local image editing with LLMs. Perfect if you’re blending text + visuals on your machine. 🚀

    🔗 View Release

  • Ollama – v0.15.0-rc1

    Ollama v0.15.0-rc1 just dropped, and it’s a game-changer for local AI tinkerers! 🎨✨

    ImageGen got a MASSIVE upgrade — now you can edit images directly, not just generate them. Say goodbye to sketchy memory estimates; we’re now showing actual weight sizes for way more accurate predictions. (And yes, qwen_image/qwen_image_edit are temporarily out for stability — we’ll bring ‘em back stronger.)

    CLI got slicker too:

    • New `ollama config` command to breeze through integrations
    • Smoother multiline input when loading models — no more broken Enter key chaos

    Under-the-hood tweaks = faster loads, cleaner runs.

    Ready to edit images locally without the cloud? Go grab it 👇

    🔗 View Release

  • Ollama – v0.15.0-rc0: x/imagegen: remove qwen_image and qwen_image_edit models (#13827)

    🚀 Ollama v0.15.0-rc0 just dropped — and it’s a clean sweep! 🧹

    The Qwen image generation (`qwen_image`) and editing (`qwen_image_edit`) models have been temporarily removed to tidy up the codebase. Not gone forever — just taking a breather before coming back better.

    What’s new:

    • 🗑️ Deleted 15 files from `x/imagegen/models/qwen_image/` and `qwen_image_edit/`
    • 🚫 Removed CLI flags & imports tied to Qwen image models
    • ✏️ Cleaned up old comments in `cache/step.go`

    No breaking changes — just housekeeping! 💪

    In the meantime, try `flux`, `dalle3`, or `stable-diffusion` for your image gen fixes.

    Image tools are getting a glow-up — stay tuned! 🎨

    🔗 View Release

  • Ollama – v0.14.3

    🚀 Ollama v0.14.3 is live — and it’s a quiet game-changer for power users!

🖼️ Image generation now respects `OLLAMA_MODELS` — finally, your custom model directory is honored for manifests and blobs. No more hidden paths or messy defaults. Whether you’re running Ollama in containers, on a remote server, or just meticulously organizing your models, everything stays where you put it.
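The resolution logic, roughly: if `OLLAMA_MODELS` is set, manifests and blobs live under it; otherwise Ollama falls back to its stock location (shown here as `~/.ollama/models`, the documented default on Linux/macOS). A sketch of that lookup:

```python
import os


def models_dir() -> str:
    """Honor OLLAMA_MODELS if set, else fall back to the default directory."""
    return os.environ.get("OLLAMA_MODELS") or os.path.expanduser("~/.ollama/models")


def manifest_root() -> str:
    return os.path.join(models_dir(), "manifests")


def blob_root() -> str:
    return os.path.join(models_dir(), "blobs")


# Point everything at a custom directory, e.g. a mounted volume in a container.
os.environ["OLLAMA_MODELS"] = "/srv/ollama-models"
print(blob_root())  # /srv/ollama-models/blobs
```

Before this release, image-generation models ignored that variable and wrote to the default path regardless; now both model families share one root.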

    No breaking changes. Just clean, predictable, and beautifully configurable storage.

    Perfect for devs who want control without the clutter. Update, reorganize, and keep building! 🛠️

    🔗 View Release

  • Lemonade – v9.1.4

    🔥 Lemonade v9.1.4 just dropped — your local LLM game just leveled up!

    AMD GPU users, this one’s for you: GLM-4.7-Flash-GGUF now runs on ROCm & Vulkan — no more NVIDIA-only FOMO.

    ✨ New features:

    • Install for ALL USERS on Windows & Linux — perfect for shared rigs and servers.
    • Direct local GGUF paths in `pull` CLI — ditch the workarounds, point & run.
    • LFM2.5 models are live! Faster reasoning, leaner inference.
    • Perplexica added to apps — explore & benchmark models with a slick UI.
    • Server load + bench tools in dev CLI — test performance like a pro, right from terminal.
    • ROCm detection fixed for Strix Halo on Ubuntu 24.04 OEM kernel — finally, it just works.
    • Docker & Linux fixes: caching, health checks, docs — all polished.
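One way a `pull` command can distinguish a local GGUF file from a registry model id is the file's magic bytes: every GGUF file opens with the four ASCII bytes `GGUF`, per the format spec. A hedged sketch of such a check (this illustrates the GGUF format, not necessarily Lemonade's actual detection code):

```python
import os
import tempfile

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file, per the format spec


def is_local_gguf(arg: str) -> bool:
    """True if a pull argument points at a real GGUF file on disk."""
    if not os.path.isfile(arg):
        return False  # not an existing path -> treat it as a registry model id
    with open(arg, "rb") as f:
        return f.read(4) == GGUF_MAGIC


# Quick demo with a throwaway file carrying the GGUF header.
with tempfile.NamedTemporaryFile(suffix=".gguf", delete=False) as tmp:
    tmp.write(GGUF_MAGIC + b"\x00" * 16)
print(is_local_gguf(tmp.name))         # True
print(is_local_gguf("llama3:latest"))  # False: looks like a model id
```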

    📦 Cleaned up the recipe system and added libasound2t64 for smoother audio integration. A warm welcome to new contributors @sofiageo, @goodtiding5, and @ItzCrazyKn!

    Upgrade now — your local LLM stack just got a serious power-up. 🚀

    🔗 View Release