Category: AI

AI Releases

  • Crankboy App – v1.1.1

    CrankBoy v1.1.1 just landed – and it's the quiet hero your Playdate's been waiting for 🎮💙

    Fixed a nasty startup crash on older Linux distros (Ubuntu 20.04, we see you). No more “why won't it launch?!” – just instant GB/GBC nostalgia.

    Under the hood:

    • Smoother button responses with subtle UI polish
    • Security deps updated (your ROMs stay safe, no funny business 😉)
    • Better error logs = faster fixes thanks to sharp-eyed community reports

    No flashy features – just a rock-solid, buttery-smooth emulator so you can focus on the real magic: pixel-perfect gameplay and chiptune battles.

    Grab it. Boot up your favorite game. And let the retro vibes roll. 🕹️

    🔗 View Release

  • Text Generation Webui – v3.20

    🎨 Image Generation is LIVE in Text-Generation-WebUI v3.20!

    Now generate images right inside your LLM UI with `diffusers` – Z-Image-Turbo supported, 4-bit/8-bit quantization, `torch.compile` optimization, and PNGs that auto-stash your generation params. Gallery? Check. Live progress bar? Yep. OpenAI-compatible image API? Absolutely 🤖✨
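
The "PNGs auto-stash your generation params" bit means the parameters travel inside the image file itself. A minimal sketch of the round trip, assuming the params live in a PNG text chunk (the chunk key `"parameters"` here is an illustrative guess, not confirmed from the release notes):

```python
# Sketch: stash generation params in a PNG text chunk, then read them back.
# The key "parameters" is an assumption for illustration.
from io import BytesIO

from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Simulate a generated image whose params were stashed at save time.
meta = PngInfo()
meta.add_text("parameters", "prompt: a red fox, steps: 8, seed: 42")

buf = BytesIO()
Image.new("RGB", (64, 64)).save(buf, format="PNG", pnginfo=meta)

# Later: recover the parameters from the file itself.
buf.seek(0)
reloaded = Image.open(buf)
print(reloaded.text["parameters"])  # prompt: a red fox, steps: 8, seed: 42
```

Because the metadata rides along in the file, any PNG pulled from the gallery can reproduce its own settings.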

    ⚡ Faster text gen too!

    `flash_attention_2` is now ON by default for Transformers models – smoother, quicker responses.

    📦 Smaller Linux CUDA builds – download faster, run just as hard.

    🔧 llama.cpp updated to latest (0a540f9) + ExLlamaV3 v0.0.17 for better inference stability and speed.

    🖼️ Prompt magic upgrade!

    Pass `bos_token` and `eos_token` directly into Jinja2 templates – perfect for Seed-OSS-36B-Instruct and similar models.
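
In practice that means a chat template can reference the tokens as plain variables. A simplified stand-in template (the `<seed:...>` token strings are illustrative, not the actual Seed-OSS-36B-Instruct template):

```python
# Sketch: a Jinja2 chat template that receives bos_token/eos_token directly.
# Template shape and token strings are illustrative assumptions.
from jinja2 import Template

chat_template = Template(
    "{{ bos_token }}{% for m in messages %}"
    "[{{ m.role }}] {{ m.content }}{{ eos_token }}"
    "{% endfor %}"
)

prompt = chat_template.render(
    bos_token="<seed:bos>",
    eos_token="<seed:eos>",
    messages=[{"role": "user", "content": "hello"}],
)
print(prompt)  # <seed:bos>[user] hello<seed:eos>
```

Models whose templates hard-reference these tokens no longer need them spliced in by hand.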

    🚀 Portable builds now include:

    • NVIDIA: `cuda12.4`
    • AMD/Intel: `vulkan`
    • CPU only: `cpu`
    • Mac (Apple Silicon): `macos-arm64`

    💾 Updating? Just replace the app – keep your `user_data/` folder and all your models, LoRAs, and settings intact.
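
A sketch of that update dance with stand-in directories (paths and file names here are invented for illustration; only the keep-`user_data/`, swap-the-rest pattern comes from the release notes):

```python
# Sketch: replace the app directory while preserving user_data/.
# Directory layout below is a simulation, not the real install tree.
import shutil
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
app = root / "text-generation-webui"               # existing install
new_build = root / "text-generation-webui-v3.20"   # freshly unpacked release

# Simulate the old install (with user data) and the new build.
(app / "user_data").mkdir(parents=True)
(app / "server.py").write_text("new app" [:0] + "old app")
(app / "user_data" / "settings.yaml").write_text("my settings")
new_build.mkdir()
(new_build / "server.py").write_text("new app")

# The update: set user_data/ aside, swap the app, put user_data/ back.
shutil.move(str(app / "user_data"), str(root / "user_data.bak"))
shutil.rmtree(app)
shutil.move(str(new_build), str(app))
shutil.move(str(root / "user_data.bak"), str(app / "user_data"))

print((app / "server.py").read_text())                    # new app
print((app / "user_data" / "settings.yaml").read_text())  # my settings
```

Same idea on a real install: move `user_data/` out, drop in the new build, move it back.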

    Go make art. Or let the AI do it for you. 😎🖼️

    🔗 View Release

  • Ollama – v0.13.2-rc2: ggml: handle all streams (#13350)

    🚀 Ollama v0.13.2-rc2 just dropped – and it's a quiet win for stability!

    The big fix? ggml now handles all GPU/CPU streams properly. No more leaked buffers or misaligned memory. Think of it as finally tidying up your AI workshop so every tensor has its place.

    ✨ Why you'll care:

    • Smoother inference on multi-GPU setups
    • Fewer crashes during heavy async loads
    • Better memory cleanup = longer, happier sessions

    If you've been battling weird memory hiccups with Llama 3 or DeepSeek-R1 on Linux/macOS/Windows – this is your upgrade. Quiet change, huge impact. 💨

    Upgrade now and run like a champ.

    🔗 View Release

  • Lemonade – v9.0.8

    🚀 Lemonade v9.0.8 just dropped – and it's a game-changer for local LLM folks!

    • FLM server hostname? Now configurable. No more fighting hardcoded defaults – deploy how you want. 🎯
    • Override the `llama-server` path via env vars – perfect for custom builds, containers, or weird dev setups. 🛠️
    • CPU backend is LIVE! Run LLMs on CPU without a GPU – ideal for dev, testing, or low-power machines. 🖥️
    • Debate Arena v2 is here! Smarter, smoother multi-model debates with better eval – test personalities like a pro. 💬🧠
    • Huge props to @bitgamm for their first contribution – welcome to the crew! 👏

    GGUF + ONNX? Check. OpenAI API compat? Check. Windows & Linux? Double check.

    Time to spin up your next local LLM experiment – faster, freer, and more flexible than ever. 🚀

    🔗 View Release

  • Ollama – v0.13.2

    🚀 Ollama v0.13.2 just dropped – and it's a quiet hero update!

    ✅ Multi-GPU CUDA setups? Finally detected properly. No more leaving GPUs on the bench.

    🧠 DeepSeek-V3.1's “thinking” mode? Fixed – it won't randomly activate when disabled (goodbye, phantom pondering).

    Huge props to our new contributors: 👏 @chengcheng84 & @nathan-hook – welcome to the crew! First PRs = nailed it.

    Smooth sailing ahead. Update now and run your models faster, cleaner, and with zero GPU drama.

    🔗 Full details: [v0.13.1…v0.13.2-rc0]

    🔗 View Release

  • Lemonade – v9.0.7

    🔥 Lemonade v9.0.7 just dropped – and it's chaos in the best way.

    Introducing Debate Arena: run 8 LLMs at once in your browser and watch them argue like AI philosophers on caffeine. Ministral-3 vs SmolLM3? Phi4 roasting LFM2? Pure digital TED Talk madness.

    โœจ Whatโ€™s new:

    • 🎤 `llm-debate.html` – drop it in your browser, hit play, and enjoy the AI showdown.
    • 🚀 Load up to 8 GGUF models simultaneously with `lemonade-server serve --max-loaded-models 8`.
    • 🛠️ Fixed web publishing, updated deps to GitHub's latest, and unveiled the Lemonade Manager (Phase 1) – sleeker, faster, smarter.

    💻 Grab the `.msi` (Windows) or `.deb` (Linux), fire it up, and let your GPU do the talking.

    No cloud. No limits. Just pure local LLM mayhem. 🤖💥

    Check it out: https://github.com/lemonade-sdk/lemonade/blob/main/examples/demos/llm-debate.html

    🔗 View Release

  • Ollama – v0.13.2-rc0: ggml update to b7108 (#12992)

    Ollama v0.13.2-rc0 just dropped – and it's a speed demon 🚀

    The big win? ggml updated to b7108, powering faster, leaner LLM inference across the board.

    Hereโ€™s whatโ€™s new:

    • ✅ Top-k sampling optimized – smarter token selection, especially on big-vocab models.
    • ✅ Metal argsort fixed – M-series chips now run smoother than ever 🍏
    • ✅ BakLLaVA image-to-text regression patched – multimodal models are back in business.
    • 🚨 Projector metadata warning – if you're using multimodal GGUF files, double-check your metadata.
    • ⚠️ Vulkan fixes temporarily reverted – stability first, speed later.

    This is a release candidate – stable enough for daily use, fresh enough to feel the gains. If you're on Apple Silicon, this is your upgrade.

    Update now and keep those models rolling. 🤖💻

    🔗 View Release

  • MLX-LM – v0.28.4

    🚀 mlx-lm v0.28.4 is live – and it's a beast!

    New models? Oh yeah:

    ✅ Minimax-M2, Kimi Linear, Trinity/AfMoE, Ministral3

    ✅ DeepSeek V3.2 – now in the fold

    ✅ Kimi K2 & OLMo3 fixed for seamless loading

    Performance got a turbo boost:

    🚀 Batching in server mode = faster multi-request handling

    💡 Multi-prompt cache now holds multiple prompts at once (chat apps, rejoice!)

    🧠 DWQ (Dynamic Weight Quantization) – run massive models with less memory, same punch

    Fixed the niggles:

    🔧 Adapter loading typo? Gone.

    🧩 `parallel_residual` now works on GPTNeoX

    📦 SentencePiece dependency added – no more tokenizer fails!

    Under the hood:

    🔄 Switched to GitHub Actions for smoother CI

    💬 Better type hints – mypy fans, you're welcome

    🧪 Flaky tests squashed + LoRA fusion now plays nice with non-affine quantization

    Big shoutout to new contributors: @jyork03, @spotbot2k, @sriting, @tnadav, @Deekshith-Dade – welcome to the crew! 🎉

    Upgrade. Tweak. Crush your next LLM project. 💪

    🔗 View Release

  • Lemonade – v9.0.6

    🚀 Lemonade v9.0.6 just dropped – and it's a game-changer for local LLM folks!

    Now you can load multiple models at once – LLMs, embeddings, and rerankers – all running in parallel. No more restarting to switch contexts. 🤖🧠

    ✨ New goodies:

    • Run concurrent requests across models → smoother, faster workflows
    • Linux logs? Less spam. More chill. 🐧
    • `run` command now works even if the server's already up – no more “port in use” headaches
    • Selective tray unloading keeps RAM sane (bye-bye, memory bloat!)
    • Better docs + venv testing + more robust system info

    Try the live demo: open `examples/demos/multi-model-tester.html` in your browser and juggle 3 models like a pro.

    Perfect for devs running RAG pipelines, local agents, or just tinkering with multiple models side-by-side.

    Full changelog: [v9.0.5…v9.0.6](link)

    🔗 View Release

  • ComfyUI – v0.3.77

    ComfyUI v0.3.77 is live – quiet release, huge quality-of-life wins! 🛠️

    • Fixed critical crashes when loading workflows with missing or corrupted custom nodes โ€” no more sudden dead ends.
    • Smarter memory management for big image batches, especially on low-VRAM setups – less OOM, more generating.
    • Crisp node labels on high-DPI displays (finally, no more blurry text!).
    • Updated deps to patch security gaps and keep the backend rock-solid.

    If custom nodes or memory hiccups have been ruining your flow – update now. No flashy features, just smoother, more stable AI tinkering. 💡

    Keep those workflows alive!

    🔗 View Release