Ollama – v0.15.0-rc6
Ollama v0.15.0-rc6 just dropped, and it's a quiet hero for GPU users!
If you've been hitting CUDA MMA errors when running quantized Llama models on your RTX card, breathe easy. This patch slays those sneaky crashes during inference.
Fixed: CUDA MMA bugs in release builds
No more mysterious GPU crashes; stable, fast, local LLMs back on track
Perfect for devs pushing limits on NVIDIA hardware. GGUF? Still supported. API? Still sweet. Just… smoother.
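If you want a quick smoke test after upgrading, something like the following works against a local install (the model tag below is just an example; any quantized GGUF tag from the Ollama library will do, and 11434 is the default API port):

```shell
# Run a quantized model straight from the CLI
# (example tag; substitute whatever quantized model you use)
ollama run llama3:8b-instruct-q4_K_M "Say hello in one sentence."

# Or exercise the local REST API directly
curl http://localhost:11434/api/generate -d '{
  "model": "llama3:8b-instruct-q4_K_M",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'
```

If inference completes without the GPU falling over mid-generation, the MMA fix is doing its job.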
Run it hard. Run it local.
