Ollama – v0.18.4-rc1

🚀 Ollama v0.18.4-rc1 is here, and it's packing a subtle but smart update!

πŸ” What’s new?

Ollama now warns you if your server context length is below 64k tokens when running local models. Why? Because newer LLMs (like Llama 3.1, Mistral Large, DeepSeek-R1) are built for long contexts, and running them with too little context can lead to truncated outputs or weird behavior. This warning helps you avoid those gotchas before they bite! 💡
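If the warning pops up, one fix is to raise the context window per request. Here's a minimal sketch using the official `ollama` Python client (the model tag and 64k value below are illustrative, not something this release mandates):

```python
# Minimal sketch: requesting a 64k-token context window per call.
# Assumes `pip install ollama` and a running Ollama server; the model
# tag and context size are illustrative.
import ollama

response = ollama.chat(
    model="llama3.1:8b-instruct-q4_K_M",
    messages=[{"role": "user", "content": "Summarize this long transcript..."}],
    options={"num_ctx": 65536},  # ask for a 64k-token context window
)
print(response["message"]["content"])
```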

πŸ› οΈ Bonus: While the full changelog is still loading on GitHub, this RC likely includes:

  • Stability tweaks for model loading
  • Improved error messages (especially around context handling)
  • Minor CLI/web UI polish

📌 Pro tip: If you're using large-context models (e.g., `llama3.1:8b-instruct-q4_K_M`), double-check your `OLLAMA_MAX_LOADED_MODELS` and context settings (the `OLLAMA_CONTEXT_LENGTH` server variable, or the per-request `num_ctx` option). This warning is here to help you optimize!
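Not sure what context length a model was trained for? One way to sanity-check is `ollama show`; here's a sketch with the Python client (assuming a recent client where `show()` exposes a `modelinfo` mapping, whose exact key names vary by model family):

```python
# Sketch: look up a model's trained context length via ollama.show().
# Key names in `modelinfo` differ by architecture (e.g. "llama.context_length"),
# so this scans for any "*.context_length" entry.
import ollama

info = ollama.show("llama3.1:8b-instruct-q4_K_M")
for key, value in info["modelinfo"].items():
    if key.endswith(".context_length"):
        print(f"{key} = {value}")  # e.g. llama.context_length = 131072
```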

🔗 Grab the RC: v0.18.4-rc1 on GitHub

💬 Join the convo: Ollama Discord

Let us know if you spot any quirks or love the warning; feedback helps shape the final release! 🙌

🔗 View Release