• Ollama – v0.15.0-rc0: x/imagegen: remove qwen_image and qwen_image_edit models (#13827)

    🚀 Ollama v0.15.0-rc0 just dropped — and it’s a clean sweep! 🧹

    The Qwen image generation (`qwen_image`) and editing (`qwen_image_edit`) models have been temporarily removed to tidy up the codebase. Not gone forever — just taking a breather before coming back better.

    What’s new:

    • 🗑️ Deleted 15 files from `x/imagegen/models/qwen_image/` and `qwen_image_edit/`
    • 🚫 Removed CLI flags & imports tied to Qwen image models
    • ✏️ Cleaned up old comments in `cache/step.go`

    No breaking changes — just housekeeping! 💪

    In the meantime, try `flux`, `dalle3`, or `stable-diffusion` for your image gen fixes.

    Image tools are getting a glow-up — stay tuned! 🎨

    🔗 View Release

  • Ollama – v0.14.3

    🚀 Ollama v0.14.3 is live — and it’s a quiet game-changer for power users!

    🖼️ Image generation now respects `OLLAMA_MODELS` — finally, your custom model directory is honored for manifests and blobs. No more hidden paths or messy defaults. Whether you’re running Ollama in containers, on a remote server, or just meticulously organizing your models, everything stays where you put it.

    No breaking changes. Just clean, predictable, and beautifully configurable storage.

    Perfect for devs who want control without the clutter. Update, reorganize, and keep building! 🛠️
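    Curious what "respects `OLLAMA_MODELS`" means in practice? A minimal Python sketch of the env-var-first lookup, assuming the common per-user default of `~/.ollama/models` — illustrative only, not Ollama's actual code:

    ```python
    import os
    from pathlib import Path

    def models_dir() -> Path:
        """Resolve the model storage root: OLLAMA_MODELS wins if set,
        otherwise fall back to the per-user default location."""
        override = os.environ.get("OLLAMA_MODELS")
        if override:
            return Path(override).expanduser()
        return Path.home() / ".ollama" / "models"

    # Manifests and blobs both live under the same resolved root.
    manifests = models_dir() / "manifests"
    blobs = models_dir() / "blobs"
    ```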

    🔗 View Release

  • Lemonade – v9.1.4

    🔥 Lemonade v9.1.4 just dropped — your local LLM game just leveled up!

    AMD GPU users, this one’s for you: GLM-4.7-Flash-GGUF now runs on ROCm & Vulkan — no more NVIDIA-only FOMO.

    ✨ New features:

    • Install for ALL USERS on Windows & Linux — perfect for shared rigs and servers.
    • Direct local GGUF paths in `pull` CLI — ditch the workarounds, point & run.
    • LFM2.5 models are live! Faster reasoning, leaner inference.
    • Perplexica added to apps — explore & benchmark models with a slick UI.
    • Server load + bench tools in dev CLI — test performance like a pro, right from terminal.
    • ROCm detection fixed for Strix Halo on Ubuntu 24.04 OEM kernel — finally, it just works.
    • Docker & Linux fixes: caching, health checks, docs — all polished.
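    How might a `pull` command tell a local GGUF file from a registry model name? One common trick is checking for the GGUF magic bytes — `GGUF` really is the four-byte header from the GGUF spec, but the function itself is a sketch, not Lemonade's actual code:

    ```python
    import os

    GGUF_MAGIC = b"GGUF"  # the four bytes every GGUF file starts with

    def looks_like_local_gguf(target: str) -> bool:
        """Heuristic a pull command might use: treat the argument as a
        local GGUF checkpoint only if it's an existing file with the
        GGUF magic header; otherwise fall through to registry lookup."""
        if not os.path.isfile(target):
            return False
        with open(target, "rb") as f:
            return f.read(4) == GGUF_MAGIC
    ```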

    📦 Cleaned up the recipe system, added the libasound2t64 dependency for smoother audio integration — and a warm welcome to our new contributors: @sofiageo, @goodtiding5, @ItzCrazyKn!

    Upgrade now — your local LLM stack just got a serious power-up. 🚀

    🔗 View Release

  • Ollama – v0.14.3-rc3: model: add lfm2 architecture and LFM2.5-1.2B-Thinking support (#13792)

    Big news for AI tinkerers! 🚀

    Ollama v0.14.3-rc3 just dropped with native support for the brand-new LFM2 architecture and its first model: LFM2.5-1.2B-Thinking — a lean 1.2B parameter model built for reasoning, not just generation.

    🧠 Think step-by-step problem solving, code reasoning, and complex QA — all running locally with zero cloud latency.

    Pull it in seconds:

    `ollama pull lfm2.5:1.2b-thinking`

    No more waiting for APIs — now you’ve got a tiny, thinking LLM on your machine. Perfect for dev experiments, edge deployments, or just geeking out in privacy.

    #Ollama #LLMs #LocalAI #LFM2

    🔗 View Release

  • ComfyUI – v0.10.0

    ComfyUI v0.10.0 just dropped—and it’s a game-changer 🎨⚡

    • Native WebUI Integration: Drag & drop your Stable Diffusion WebUI models directly. No more conversion headaches.
    • Dynamic Prompts in Nodes: Use `{prompt}`, `{seed}`, or `{CFG}` inside inputs—batch test variations without cloning nodes.
    • 30% Faster Workflows: Smarter node caching = quicker loads on massive pipelines.
    • New “Batch Sampler” Node: Generate 50+ variations in one go—randomize seeds, styles, CFG—all from a single node.
    • Dark Mode Upgrades: Smoother, higher contrast—perfect for late-night prompt tinkering.
    • Linux ARM64 Support: Raspberry Pi 5, ARM64 servers, or an M-series Mac running Linux? You’re now fully supported. 🍏🧠

    Pro tip: Combine Batch Sampler + Dynamic Prompts to auto-generate character concepts in seconds. Perfect for artists, devs, and AI tinkerers.
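    The placeholder idea is easy to picture with plain string templating — a toy sketch, not ComfyUI's implementation (the `expand` helper and its defaults are hypothetical):

    ```python
    import random

    def expand(template, prompt, seed=None, cfg=7.5):
        """Substitute {prompt}, {seed}, and {CFG} placeholders inside a
        node input; a missing seed gets randomized, which is what makes
        batch variation testing cheap."""
        if seed is None:
            seed = random.randrange(2**32)
        return template.format(prompt=prompt, seed=seed, CFG=cfg)

    # One template, many variations — no cloned nodes needed.
    batch = [
        expand("a portrait of {prompt}, seed={seed}, cfg={CFG}",
               prompt="a knight", seed=s)
        for s in range(3)
    ]
    ```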

    Update now. The nodes are alive. 🎮✨

    🔗 View Release

  • Ollama – v0.14.3-rc2

    🚀 Ollama v0.14.3-rc2 just dropped — and it’s a quiet hero for your RAM!

    💥 Bug squashed: image generation models no longer get loaded into memory during model deletion. They now stay out of your way until you actually call them.

    🧠 Why it rocks:

    • Less RAM bloat = faster model swaps
    • Smoother performance on laptops & tiny servers
    • Cleaner shutdowns + smarter cleanup of unused vision models

    Perfect if you’re juggling multimodal AI or running vision models in prod. Still a release candidate, but solid — keep those GPUs cool and your memory free! 🖥️✨
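    The fix boils down to a classic lazy-loading pattern: keep metadata around, load weights only on first use, and let deletion skip the load entirely. A minimal sketch of that idea — illustrative only, not Ollama's code:

    ```python
    class LazyModel:
        """Hold only metadata until inference actually happens, so that
        operations like delete never pay the cost of loading weights."""

        def __init__(self, name, loader):
            self.name = name
            self._loader = loader    # invoked only on first generate()
            self._weights = None

        @property
        def loaded(self):
            return self._weights is not None

        def generate(self, prompt):
            if self._weights is None:   # first real use triggers the load
                self._weights = self._loader()
            return f"{self.name}({prompt})"

        def delete(self):
            # Removing manifests/blobs needs no weights in memory at all.
            self._weights = None
            self._loader = None
    ```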

    🔗 View Release

  • MLX-LM – v0.30.4

    MLX LM v0.30.4 just dropped and it’s a beast 🚀

    • AWQ/GPTQ weight transforms now live — convert quantized models in one line.
    • Nemotron Super 49B v1.5 and GLM4 MoE Lite added — big brains, bigger performance on Apple silicon.
    • Batch generation? Fixed. MambaCache, CacheList, IQuestLoopCoder — all smoothed out.
    • New continuous batching server benchmark — measure your throughput like a pro.
    • LongCat Flash now supports sharding + extended context — longer prompts, zero headaches.
    • GPT-OSS & Minimax tensor sharding — distributed inference just got way easier.
    • SwiGLU compiled, Falcon H1 embeddings fixed, tokenizer errors now warn instead of crash.
    • Huge shoutout to new contributors: Eric, Nikhil, Solarpunkin, Evanev7 & Andrew! 🎉

    All powered by the latest MLX + smarter caching. Upgrade, benchmark, and go build something wild.
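    That "warn instead of crash" tokenizer change follows a familiar degrade-gracefully pattern — here's a generic sketch (not MLX-LM's actual code; the config key and default below are hypothetical):

    ```python
    import warnings

    def load_tokenizer_config(raw):
        """On a malformed tokenizer config, emit a warning and fall back
        to a usable default instead of raising and killing the run."""
        try:
            eos = raw["eos_token"]
        except (KeyError, TypeError):
            warnings.warn("tokenizer config missing eos_token; using default")
            eos = "</s>"
        return {"eos_token": eos}
    ```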

    🔗 View Release

  • MLX-LM – v0.30.3

    MLX LM v0.30.3 just dropped and it’s a beast 🚀

    • AWQ & GPTQ quantization now fully supported — load quantized models like it’s nothing.
    • New models: IQuest Coder V1 Loop (code gen on steroids) + GLM4 MoE Lite (lightweight but mighty).
    • Nemotron Super 49B v1.5 and Falcon H1 with tied embeddings & muP scaling — optimized for peak performance.
    • Batching got a massive overhaul: sliding window + cache handling fixed, `CacheList`/`ArraysCache` now batchable, empty caches? Handled.
    • First-ever server benchmark for continuous batching — real-world throughput numbers, not just synthetic microbenchmarks.
    • LongCat Flash now sharded + extended context — generate longer texts without choking.
    • Minimax tensor sharding + GPT-OSS sharding — scale your models smarter, not harder.
    • SwiGLU fixed, tokenizer errors now use `warnings`, MLX updated to latest — all the polish you didn’t know you needed.

    Massive thanks to @ericcurtin, @nikhilmitrax, @tibbes, @solarpunkin, @AndrewTan517, and @Evanev7 for the wins!

    Update. Run. Build something wild. 🤖💻

    🔗 View Release

  • Ollama – v0.14.3-rc1: MLX – dynamic loading of mlx-c (#13735)

    🚀 Ollama v0.14.3-rc1 just dropped — and it’s a game-changer for Mac & Linux tinkerers!

    MLX is now dynamically loaded via `dlopen` — meaning:

    ✅ Ollama starts even if MLX isn’t installed

    ✅ Swap MLX paths on the fly (perfect for custom builds or multi-env setups)

    ✅ Graceful fallbacks — no more crashing if dependencies are missing

    No more “why won’t it start?!” headaches. Just pure, flexible local LLM power.

    Perfect if you’re running M-series Macs or Linux with custom CUDA/MLX builds.

    Tests fixed, reviews addressed — clean, stable, and ready for your next experiment.

    Try it out. If MLX isn’t there? Ollama just shrugs… and keeps going. 😎
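    That dlopen-and-fall-back pattern is easy to sketch with Python's `ctypes` — an analogue of the idea, not Ollama's actual implementation, and the library name below is hypothetical:

    ```python
    import ctypes

    def load_optional(libname):
        """dlopen-style optional dependency: return a handle if the
        shared library is present, None otherwise — so startup never
        fails just because an accelerator library is missing."""
        try:
            return ctypes.CDLL(libname)
        except OSError:
            return None

    mlx = load_optional("libmlxc.so")   # hypothetical library name
    if mlx is None:
        # Graceful fallback: keep running without MLX acceleration.
        pass
    ```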

    🔗 View Release

  • Ollama – v0.14.3-rc0

    🚀 Ollama v0.14.3-rc0 just dropped — and macOS users, this one’s for you!

    No more ghost processes after rebooting. The Ollama app now properly shuts down during logouts and restarts — clean, quiet, and respectful of your system’s power management. 🍎💤

    Under the hood:

    • Smoother background cleanup
    • Better memory & resource handling on shutdown
    • Minor stability tweaks (zero breaking changes)

    This is a release candidate — stable, tested, and perfect for Mac folks tired of unresponsive apps after a reboot.

    Grab it if you run Ollama locally and value a clean, hassle-free experience. 🛠️✨

    🔗 View Release