The Ollama team has released a significant update that brings native tool calling, structured JSON output, and enhanced vision model support to local AI inference. This release closes one of the biggest feature gaps between local and cloud-hosted model APIs.

Key Features

Native Tool Calling: Models can now request calls to user-defined functions through a standardized interface. You define tools with JSON schemas, and compatible models will generate structured function calls — just like the OpenAI function calling API, but running entirely on your hardware.
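As a concrete illustration, here is a minimal sketch of building a tool-calling request for the local server. It assumes Ollama is running at the default http://localhost:11434 with a tool-capable model pulled; the model name "llama3.1" and the weather tool are illustrative, not part of the release notes.

```python
import json
import urllib.request

# A tool is described with a JSON schema, much like OpenAI function calling.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}

def build_chat_request(model, messages, tools):
    """Assemble a /api/chat payload that includes the new 'tools' parameter."""
    return {"model": model, "messages": messages, "tools": tools, "stream": False}

def send_chat(payload, host="http://localhost:11434"):
    """POST the payload to a local Ollama server (requires a running server)."""
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(
    "llama3.1",
    [{"role": "user", "content": "What is the weather in Paris?"}],
    [WEATHER_TOOL],
)
```

When the model decides to use a tool, the response message carries a list of structured calls (name plus arguments) instead of free text, which your code can then execute.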

Structured Output: Force models to respond in valid JSON matching a specific schema. No more parsing hacks or retry loops for structured data extraction.
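A sketch of what schema-constrained output looks like in practice, assuming the chat endpoint's format parameter accepts a JSON schema (as in recent Ollama releases); the extraction task and model name are illustrative.

```python
import json

# Target shape for the model's reply: a person record with two required fields.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

payload = {
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Extract the person: Ada Lovelace, 36."}],
    "format": PERSON_SCHEMA,  # constrains the reply to valid JSON matching the schema
    "stream": False,
}

def parse_reply(response):
    """With 'format' set, the reply content parses directly; no retry loop."""
    return json.loads(response["message"]["content"])
```

Because the server enforces the schema, a single json.loads on the reply replaces the usual parse-validate-retry scaffolding.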

Vision Improvements: Better support for multimodal models that accept both text and image inputs, with faster image preprocessing and reduced memory overhead.
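For multimodal models, images ride along in the chat message's images field as base64-encoded data. A minimal sketch, assuming that documented field; the model name "llava" and the placeholder image bytes are illustrative.

```python
import base64

def build_vision_request(model, prompt, image_bytes):
    """Attach a base64-encoded image to a chat message for a multimodal model."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": prompt,
            "images": [base64.b64encode(image_bytes).decode("ascii")],
        }],
        "stream": False,
    }

# Stand-in bytes for a real image file (normally: open("photo.png", "rb").read()).
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
payload = build_vision_request("llava", "What is in this image?", fake_png)
```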

What This Enables

Local AI agents are now significantly easier to build. With tool calling, a self-hosted model can drive database queries, API calls, file reads, and code execution (your application runs each requested tool locally), all without sending data to external services. Combined with structured output, you can build reliable data extraction pipelines entirely on-premises.
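The agent pattern this enables can be sketched as a short loop: send the conversation, execute any tool calls the model emits, append the results, and repeat until the model answers in plain text. Here send_chat is a stand-in for a POST to /api/chat, and the handler names are illustrative.

```python
def run_agent(send_chat, model, messages, tools, handlers, max_turns=5):
    """Loop until the model replies without tool calls.

    handlers maps tool names to local Python callables that actually do
    the work (query a database, hit an API, read a file, and so on).
    """
    for _ in range(max_turns):
        reply = send_chat({"model": model, "messages": messages,
                           "tools": tools, "stream": False})
        msg = reply["message"]
        messages.append(msg)
        calls = msg.get("tool_calls")
        if not calls:                 # plain-text answer: the agent is done
            return msg["content"]
        for call in calls:            # run each requested tool locally
            fn = call["function"]
            result = handlers[fn["name"]](**fn["arguments"])
            # Feed the tool's result back so the model can use it.
            messages.append({"role": "tool", "content": str(result)})
    raise RuntimeError("agent did not finish within max_turns")
```

Everything in the loop, including the tool execution, stays on your machine; the only moving part is the local inference server.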

Frequently Asked Questions

Which models support tool calling in Ollama?

Models specifically fine-tuned for function calling work best. Check the Ollama model library for models tagged with tool-calling support.

Does this work with the existing Ollama API?

Yes, it extends the existing /api/chat endpoint with a new tools parameter. Existing code continues to work without changes.