Releases: ollama/ollama

v0.3.9

31 Aug 19:26
a1cef4d

What's Changed

  • Fixed an error that would occur when running Ollama on Linux machines with the ARM architecture
  • Ollama will now show an improved error message when attempting to run unsupported models
  • Fixed issue where Ollama would not auto-detect the chat template for Llama 3.1 models
  • OLLAMA_HOST will now work with URLs that contain paths (see the sketch below)
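
A minimal sketch of the OLLAMA_HOST change, assuming the ollama Python client accepts the same host values as the CLI; the reverse-proxy URL below is hypothetical:

import os
import ollama

# Ollama served behind a reverse proxy under a path rather than at the root
# (hypothetical URL). OLLAMA_HOST values that include a path are now accepted.
os.environ['OLLAMA_HOST'] = 'https://example.com/ollama'

client = ollama.Client(host=os.environ['OLLAMA_HOST'])
print(client.list())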

Full Changelog: v0.3.8...v0.3.9

v0.3.8

28 Aug 01:09
93ea924

What's Changed

  • Fixed an error where the ollama CLI could not be found on the PATH after upgrading Ollama on Windows

Full Changelog: v0.3.7...v0.3.8

v0.3.7

20 Aug 17:45
0f92b19

New Models

  • Hermes 3: Hermes 3 is the latest version of the flagship Hermes series of LLMs by Nous Research, which includes support for tool calling.
  • Phi 3.5: A lightweight AI model with 3.8 billion parameters whose performance overtakes similarly sized and larger models.
  • SmolLM: A family of small models with 135M, 360M, and 1.7B parameters, trained on a new high-quality dataset.

What's Changed

  • CUDA 12 support: improving performance by up to 10% on newer NVIDIA GPUs
  • Improved performance of ollama pull and ollama push on slower connections
  • Fixed issue where setting OLLAMA_NUM_PARALLEL would cause models to be reloaded on lower VRAM systems
  • Ollama on Linux is now distributed as a tar.gz file, which contains the ollama binary along with required libraries.

Full Changelog: v0.3.6...v0.3.7

v0.3.6

13 Aug 20:27
4c4fe3f

What's Changed

  • Fixed issue where /api/embed would return an error instead of loading the model when the input field was not provided.
  • ollama create can now import Phi-3 models from Safetensors
  • Added progress information to ollama create when importing GGUF files
  • Ollama will now import GGUF files faster by minimizing file copies

Full Changelog: v0.3.5...v0.3.6

v0.3.5

11 Aug 20:50
15c2d8f

What's Changed

  • Fixed "Incorrect function" error when downloading models on Windows
  • Fixed issue where temporary files would not be cleaned up
  • Fixed a rare error on startup caused by invalid model data
  • Ollama will now provide an error instead of crashing on Windows when running models that are too large to fit into total memory

Full Changelog: v0.3.4...v0.3.5

v0.3.4

06 Aug 16:28
de4fc29

New embedding models

  • BGE-M3: a large embedding model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity.
  • BGE-Large: a large embedding model trained in English.
  • Paraphrase-Multilingual: A multilingual embedding model trained on parallel data for 50 languages.

New embedding API with batch support

Ollama now supports a new API endpoint /api/embed for embedding generation:

curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": ["Why is the sky blue?", "Why is the grass green?"]
}'

This API endpoint supports new features:

  • Batches: generate embeddings for several documents in one request
  • Normalized embeddings: embeddings are now normalized, improving similarity results
  • Truncation: inputs longer than the model's context length are truncated by default; setting the new truncate parameter to false returns an error instead (see the sketch after this list)
  • Metrics: responses include load_duration, total_duration and prompt_eval_count metrics
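
As a sketch (not from the official docs), the same request can be made from Python with the requests library; the model name and inputs mirror the curl example above, and truncate is set to false so over-length inputs raise an error rather than being silently cut:

import requests

# Batch request to the /api/embed endpoint shown above. Setting "truncate"
# to false makes the server return an error if an input exceeds the model's
# context length instead of truncating it.
resp = requests.post(
    'http://localhost:11434/api/embed',
    json={
        'model': 'all-minilm',
        'input': ['Why is the sky blue?', 'Why is the grass green?'],
        'truncate': False,
    },
)
data = resp.json()

print(len(data['embeddings']))   # one embedding per input, in order
print(data['total_duration'], data['load_duration'], data['prompt_eval_count'])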

See the API documentation for more details and examples.

What's Changed

  • Fixed initial slow download speeds on Windows
  • NUMA support will now be autodetected by Ollama to improve performance
  • Fixed issue where the /api/embed endpoint would sometimes return embedding results out of order

Full Changelog: v0.3.3...v0.3.4

v0.3.3

02 Aug 16:37
ce1fb44

What's Changed

  • The /api/embed endpoint now returns statistics: total_duration, load_duration, and prompt_eval_count
  • Added usage metrics to the /v1/embeddings OpenAI compatibility API (see the sketch after this list)
  • Fixed issue where /api/generate would respond with an empty string if provided a context
  • Fixed issue where /api/generate would return an incorrect value for context
  • /show modelfile will now render MESSAGE commands correctly
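
For the /v1/embeddings change, a minimal sketch using the official openai Python package pointed at Ollama's OpenAI-compatible endpoint (the api_key value is a placeholder; Ollama does not check it):

from openai import OpenAI

# Point the OpenAI client at Ollama's compatibility endpoint.
client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

resp = client.embeddings.create(
    model='all-minilm',
    input=['Why is the sky blue?'],
)

# New in this release: usage metrics are populated in the response.
print(resp.usage)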

Full Changelog: v0.3.2...v0.3.3

v0.3.2

01 Aug 01:14
4c14855

What's Changed

  • Fixed issue where ollama pull would not resume download progress
  • Fixed issue where phi3 would report an error on older versions

Full Changelog: v0.3.1...v0.3.2

v0.3.1

30 Jul 04:18
5d66578

Gemma 2

Google's Gemma 2 is now available in a new small 2B parameter model!

ollama run gemma2:2b

New models

  • Gemma 2 2B: A new 2B parameter model by Google DeepMind

What's Changed

  • Added support for the min_p sampling option (see the sketch after this list)
  • ollama create will now autodetect required stop parameters when importing certain models
  • Ollama on Windows will now show better error messages if required files are missing
  • Fixed issue where /save would cause parameters to be saved incorrectly
  • OpenAI-compatible API will now return a finish_reason of tool_calls if a tool call occurred.
  • Performance and reliability improvements when downloading models using ollama pull
  • Ollama's Linux install script will now return a better error on unsupported CUDA versions
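
A minimal sketch of the new min_p option, passed through the standard options object of the /api/generate endpoint (the 0.05 threshold is only an illustrative value):

import requests

# min_p keeps only tokens whose probability is at least min_p times the
# probability of the most likely token.
resp = requests.post(
    'http://localhost:11434/api/generate',
    json={
        'model': 'gemma2:2b',
        'prompt': 'Why is the sky blue?',
        'stream': False,
        'options': {'min_p': 0.05},
    },
)
print(resp.json()['response'])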

Full Changelog: v0.3.0...v0.3.1

v0.3.0

25 Jul 01:57
bbf8f10

[Image: Ollama selecting the right tool for the job, holding up a hammer to nail wooden boards]

Tool support

Ollama now supports tool calling with popular models such as Llama 3.1. This enables a model to answer a given prompt using tool(s) it knows about, making it possible for models to perform more complex tasks or interact with the outside world.

Example tools include:

  • Functions and APIs
  • Web browsing
  • Code interpreter
  • much more!

To use tools, provide the tools field when using Ollama's Chat API:

import ollama

response = ollama.chat(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'What is the weather in Toronto?'}],

    # provide a weather checking tool to the model
    tools=[{
      'type': 'function',
      'function': {
        'name': 'get_current_weather',
        'description': 'Get the current weather for a city',
        'parameters': {
          'type': 'object',
          'properties': {
            'city': {
              'type': 'string',
              'description': 'The name of the city',
            },
          },
          'required': ['city'],
        },
      },
    },
  ],
)

print(response['message']['tool_calls'])
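
A possible follow-up (a hedged sketch, not part of the release notes): execute the requested tool locally and return its output to the model as a tool message. This continues the example above; get_current_weather is a hypothetical local function you would implement yourself.

def get_current_weather(city):
    # Hypothetical local implementation; a real version would call a weather API.
    return f'The current weather in {city} is 20 degrees and sunny.'

available_functions = {'get_current_weather': get_current_weather}

# Rebuild the conversation: the user turn plus the assistant turn containing the tool call.
messages = [
    {'role': 'user', 'content': 'What is the weather in Toronto?'},
    response['message'],
]

for call in response['message'].get('tool_calls') or []:
    function_to_call = available_functions[call['function']['name']]
    result = function_to_call(**call['function']['arguments'])
    # Pass the tool output back to the model as a 'tool' role message.
    messages.append({'role': 'tool', 'content': result})

final_response = ollama.chat(model='llama3.1', messages=messages)
print(final_response['message']['content'])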

New models

  • Llama 3.1: a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes with support for tool calling.
  • Mistral Large 2: Mistral's new 123B flagship model that is significantly more capable in code generation, tool calling, mathematics, and reasoning, with a 128k context window and support for dozens of languages.
  • Firefunction v2: An open weights function calling model based on Llama 3, competitive with GPT-4o function calling capabilities.
  • Llama-3-Groq-Tool-Use: A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.

What's Changed

  • Fixed duplicate error message when running ollama create

Full Changelog: v0.2.8...v0.3.0