vluzrmos / ollama
PHP client for the Ollama/OpenAI API
Requires
- php: >=5.6.0
- ext-curl: *
- ext-json: *
- guzzlehttp/guzzle: ^6.5
Requires (Dev)
- ext-xdebug: *
- phpunit/phpunit: 5.7.*
README
PHP client for the Ollama native API and OpenAI-compatible endpoints, with support for PHP 5.6+.
This library provides:
- Native Ollama API access
- OpenAI-compatible API access
- Reusable model configuration objects
- Typed message helpers for chat, tools, and multimodal inputs
- Streaming and async requests
- Tool calling and tool execution helpers
- High-level chat orchestration with automatic tool recursion
- Agent and agent-group abstractions
- Image helpers for vision models
- Embeddings, model management, and runtime utilities
Features
- Compatible with PHP 5.6+
- Native Ollama client
- OpenAI-compatible client
- Sync and async APIs
- Streaming responses
- Chat completions and text completions
- Embeddings
- Vision and image message helpers
- Function calling and tool execution
- Built-in tool registry and result conversion
- High-level chat session wrapper
- Agent and agent group abstractions
- Model lifecycle operations for Ollama
- Response wrappers with array-style access
- Configurable HTTP client options and Bearer token support
Requirements
- PHP >= 5.6.0
- ext-curl
- ext-json
- guzzlehttp/guzzle ^6.5
Installation
```bash
composer require vluzrmos/ollama
```
Quick Start
Ollama Native API
```php
<?php

require_once 'vendor/autoload.php';

use Vluzrmos\Ollama\Ollama;
use Vluzrmos\Ollama\Models\Message;

$ollama = new Ollama('http://localhost:11434');

$response = $ollama->chat([
    'model' => 'llama3.2',
    'messages' => [
        Message::system('You are a helpful assistant.'),
        Message::user('Hello!'),
    ],
]);

echo $response['message']['content'];
```
OpenAI-Compatible API
```php
<?php

require_once 'vendor/autoload.php';

use Vluzrmos\Ollama\OpenAI;
use Vluzrmos\Ollama\Models\Message;

$openai = new OpenAI('http://localhost:11434/v1', 'ollama');

$response = $openai->chat('llama3.2', [
    Message::system('You are a helpful assistant.'),
    Message::user('Hello!'),
]);

echo $response['choices'][0]['message']['content'];
```
Core Concepts
Model
The Model class lets you define a reusable model plus default request parameters.
```php
<?php

use Vluzrmos\Ollama\Models\Model;

$model = (new Model('llama3.2'))
    ->setTemperature(0.8)
    ->setTopP(0.9)
    ->setNumCtx(4096)
    ->setNumPredict(512)
    ->setSeed(42)
    ->setStop(['END']);
```
Supported convenience setters include:
- `setTemperature()`
- `setTopP()`
- `setTopK()`
- `setRepeatPenalty()`
- `setSeed()`
- `setNumCtx()`
- `setNumPredict()`
- `setStop()`
- `setParameter()` and `setOption()` for custom values
Message
The Message helper supports standard chat roles, tool messages, and multimodal payloads.
```php
<?php

use Vluzrmos\Ollama\Models\Message;

$messages = [
    Message::system('You are a helpful assistant.'),
    Message::user('Explain embeddings in one paragraph.'),
    Message::assistant('Embeddings are vector representations...'),
    Message::tool('{"status":"ok"}', 'my_tool'),
];
```
Available constructors:
- `Message::system($content)`
- `Message::user($content, array $images = null)`
- `Message::assistant($content)`
- `Message::tool($content, $toolName)`
- `Message::image($text, $imageUrl, $role = 'user')`
Response Wrappers
Responses are returned as wrapper objects that implement array-style access and can also be converted back to raw arrays.
- `Response`
- `ResponseMessage`
- `ResponseEmbedding`
Useful helpers include:
- `toArray()`
- `getMessages()`
- `getToolCalls()`
- `hasToolCalls()`
- `getEmbeddings()`
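A minimal sketch of how a wrapper might be used in practice, based only on the array-style access and the helpers listed above (the request itself mirrors the chat examples elsewhere in this README and requires a running Ollama server):

```php
<?php

use Vluzrmos\Ollama\Models\Message;
use Vluzrmos\Ollama\Ollama;

$client = new Ollama('http://localhost:11434');

$response = $client->chat([
    'model' => 'llama3.2',
    'messages' => [Message::user('Hello!')],
]);

// Array-style access into the decoded payload
echo $response['message']['content'];

// Inspect tool calls before deciding how to continue the conversation
if ($response->hasToolCalls()) {
    $calls = $response->getToolCalls();
}

// Convert back to a plain array, e.g. for logging
$raw = $response->toArray();
```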
Ollama Client
Class: Vluzrmos\Ollama\Ollama
Default base URL: http://localhost:11434
Main Operations
```php
<?php

use Vluzrmos\Ollama\Ollama;

$client = new Ollama('http://localhost:11434');
```
Text generation
```php
$response = $client->generate([
    'model' => 'llama3.2',
    'prompt' => 'Why is the sky blue?',
]);

echo $response['response'];
```
Chat completions
```php
$response = $client->chat([
    'model' => 'llama3.2',
    'messages' => [
        Message::user('Tell me a short joke'),
    ],
]);

echo $response['message']['content'];
```
Embeddings
```php
$response = $client->embeddings([
    'model' => 'all-minilm',
    'input' => 'Text for embedding',
]);

$vector = $response['embeddings'][0];
```
Model and Runtime Management
The native Ollama client exposes the following operations:
- `listModels()` and `listModelsAsync()`
- `showModel()` and `showModelAsync()`
- `copyModel()` and `copyModelAsync()`
- `deleteModel()` and `deleteModelAsync()`
- `pullModel()` and `pullModelAsync()`
- `pushModel()` and `pushModelAsync()`
- `createModel()` and `createModelAsync()`
- `listRunningModels()` and `listRunningModelsAsync()`
- `version()` and `versionAsync()`
- `blobExists()` and `blobExistsAsync()`
- `pushBlob()` and `pushBlobAsync()`
Example:
```php
$models = $client->listModels();
$info = $client->showModel('llama3.2');
$version = $client->version();
```
Authentication
If your Ollama-compatible endpoint requires a Bearer token:
```php
$client->setApiToken('your-token');
```
OpenAI-Compatible Client
Class: Vluzrmos\Ollama\OpenAI
Default base URL: http://localhost:11434/v1
Default API key: ollama
```php
<?php

use Vluzrmos\Ollama\OpenAI;

$client = new OpenAI('http://localhost:11434/v1', 'ollama');
```
Supported Endpoints
- `/v1/chat/completions`
- `/v1/completions`
- `/v1/embeddings`
- `/v1/models`
- `/v1/models/{model}`
High-Level Convenience Methods
- `chat()` and `chatAsync()`
- `chatStream()` and `chatStreamAsync()`
- `complete()` and `completeAsync()`
- `completeStream()` and `completeStreamAsync()`
- `embed()` and `embedAsync()`
- `listModels()` and `listModelsAsync()`
- `retrieveModel()` and `retrieveModelAsync()`
Low-Level Request Methods
- `chatCompletions()` and `chatCompletionsAsync()`
- `completions()` and `completionsAsync()`
- `embeddings()` and `embeddingsAsync()`
Chat Example
```php
<?php

use Vluzrmos\Ollama\Models\Message;
use Vluzrmos\Ollama\Models\Model;

$model = (new Model('qwen2.5:3b'))->setTemperature(0.6);

$response = $client->chat($model, [
    Message::system('You are a helpful assistant that replies in English.'),
    Message::user('Summarize the advantages of vector databases.'),
]);

echo $response['choices'][0]['message']['content'];
```
Text Completion Example
```php
$response = $client->complete('llama3.2', 'Write a tagline for an AI product.', [
    'max_tokens' => 60,
    'temperature' => 0.7,
]);

echo $response['choices'][0]['text'];
```
Embedding Example
```php
$response = $client->embed('all-minilm', [
    'First text',
    'Second text',
]);

echo count($response['data'][0]['embedding']);
```
Streaming
Both clients support streamed responses through a callback that receives decoded chunks.
Ollama Streaming
```php
$client->generate([
    'model' => 'llama3.2',
    'prompt' => 'Tell me a short story',
    'stream' => true,
], function ($chunk) {
    if (isset($chunk['response'])) {
        echo $chunk['response'];
    }
});
```
OpenAI Streaming
```php
$client->chatStream('llama3.2', [
    Message::user('Tell me a short story'),
], function ($chunk) {
    if (isset($chunk['choices'][0]['delta']['content'])) {
        echo $chunk['choices'][0]['delta']['content'];
    }
});
```
Async Requests
Async variants are available on both clients and return Guzzle promises.
Examples:
- `generateAsync()`
- `chatAsync()`
- `embeddingsAsync()`
- `chatCompletionsAsync()`
- `completeAsync()`
- `embedAsync()`
```php
$promise = $client->chatAsync('llama3.2', [
    Message::user('Hello'),
]);

$promise->then(function ($response) {
    echo $response['choices'][0]['message']['content'];
});
```
Concurrent Requests (Pool)
Use the pool() method to send multiple requests concurrently via Guzzle's Pool. Each yielded value must be a callable that returns a promise (not a promise directly).
```php
<?php

use Vluzrmos\Ollama\OpenAI;
use Vluzrmos\Ollama\Models\Model;

$openai = new OpenAI('http://localhost:11434/v1', 'ollama');
$model = new Model('llama3.2');

$prompts = [
    'What is the capital of France?',
    'Who won the FIFA World Cup in 2018?',
    'What is the tallest mountain in the world?',
];

function make_requests(OpenAI $openai, $model, array $prompts)
{
    foreach ($prompts as $index => $prompt) {
        yield $index => function () use ($openai, $model, $prompt) {
            return $openai->completeAsync($model, $prompt, [
                'max_tokens' => 100,
            ]);
        };
    }
}

$pool = $openai->pool(make_requests($openai, $model, $prompts), [
    'concurrency' => 4,
    'fulfilled' => function ($response, $index) use ($prompts) {
        echo "Prompt: '{$prompts[$index]}'\n";
        echo "Answer: " . $response['choices'][0]['text'] . "\n\n";
    },
    'rejected' => function ($reason, $index) use ($prompts) {
        echo "Error for: '{$prompts[$index]}'\n";
        echo "Reason: " . $reason->getMessage() . "\n\n";
    },
]);

$pool->promise()->wait();
```
Pool configuration options:
| Option | Type | Description |
|---|---|---|
| `concurrency` | int | Maximum number of requests to send concurrently |
| `fulfilled` | callable | Callback invoked when a request completes. Receives `($response, $index)` |
| `rejected` | callable | Callback invoked when a request fails. Receives `($reason, $index)` |
Important: Each yielded value must be a closure that returns a promise (e.g. wrapping an `*Async()` call), not the promise itself. Passing promises directly will throw an `InvalidArgumentException`.
The pool() method is available on both OpenAI and Ollama clients. You can use any async method (chatAsync(), completeAsync(), embedAsync(), etc.) inside the generator.
Vision and Images
Use Message::image() together with ImageHelper for vision-capable models.
```php
<?php

use Vluzrmos\Ollama\Models\Message;
use Vluzrmos\Ollama\Utils\ImageHelper;

$image = ImageHelper::encodeImageUrl(__DIR__ . '/sample.jpg');

$response = $client->chat('qwen2.5vl:3b', [
    Message::image('Describe this image.', $image),
]);
```
Available helpers:
- `ImageHelper::encodeImage()`
- `ImageHelper::encodeImages()`
- `ImageHelper::encodeImageUrl()`
- `ImageHelper::encodeImagesUrl()`
- `ImageHelper::isValidImage()`
- `ImageHelper::getImageInfo()`
JSON Mode and Structured Output
OpenAI-compatible chat requests accept `response_format` options, including `json_object` and `json_schema`.
```php
$response = $client->chat('llama3.2', [
    Message::system('Always respond with valid JSON.'),
    Message::user('List 3 primary colors'),
], [
    'response_format' => [
        'type' => 'json_schema',
        'json_schema' => [
            'name' => 'primary_colors',
            'strict' => true,
            'schema' => [
                'type' => 'object',
                'properties' => [
                    'colors' => [
                        'type' => 'array',
                        'items' => ['type' => 'string'],
                    ],
                ],
                'required' => ['colors'],
            ],
        ],
    ],
]);
```
Tools and Function Calling
The library includes a complete tool system with registration, serialization, execution, and conversion of tool results back into chat messages.
Creating a Custom Tool
```php
<?php

use Vluzrmos\Ollama\Exceptions\ToolExecutionException;
use Vluzrmos\Ollama\Tools\AbstractTool;

class CalculatorTool extends AbstractTool
{
    public function getName()
    {
        return 'calculator';
    }

    public function getDescription()
    {
        return 'Performs basic mathematical operations';
    }

    public function getParametersSchema()
    {
        return [
            'type' => 'object',
            'properties' => [
                'operation' => [
                    'type' => 'string',
                    'enum' => ['add', 'subtract', 'multiply', 'divide'],
                ],
                'a' => ['type' => 'number'],
                'b' => ['type' => 'number'],
            ],
            'required' => ['operation', 'a', 'b'],
        ];
    }

    public function execute(array $arguments)
    {
        $a = $arguments['a'];
        $b = $arguments['b'];
        $operation = $arguments['operation'];

        switch ($operation) {
            case 'add':
                return $a + $b;
            case 'subtract':
                return $a - $b;
            case 'multiply':
                return $a * $b;
            case 'divide':
                if ($b == 0) {
                    throw new ToolExecutionException('Division by zero');
                }
                return $a / $b;
        }

        throw new ToolExecutionException('Unsupported operation');
    }
}
```
ToolManager
```php
<?php

use Vluzrmos\Ollama\Tools\ToolManager;

$toolManager = new ToolManager();
$toolManager->registerTool(new CalculatorTool());
```
Available operations:
- `registerTool()`
- `unregisterTool()`
- `getTool()`
- `hasTool()`
- `listTools()`
- `toArray()`
- `jsonSerialize()`
- `executeTool()`
- `executeToolCalls()`
- `decodeToolCallArguments()`
- `toolCallResultsToMessages()`
- `getStats()`
OpenAI-Compatible Tool Calling Flow
```php
$toolManager = new ToolManager();
$toolManager->registerTool(new CalculatorTool());

$messages = [
    Message::user('What is 15 + 27?'),
];

$response = $client->chatCompletions([
    'model' => 'llama3.2',
    'messages' => $messages,
    'tools' => $toolManager,
]);

$toolCalls = $response['choices'][0]['message']['tool_calls'];
$results = $toolManager->executeToolCalls($toolCalls);
$toolMessages = $toolManager->toolCallResultsToMessages($results);

$finalResponse = $client->chatCompletions([
    'model' => 'llama3.2',
    'messages' => array_merge($messages, [$response['choices'][0]['message']], $toolMessages),
]);
```
For dynamic values such as the current date or current time, do not include the answer itself in the system prompt or conversation context if you expect a tool call. If the model already sees that value in the prompt, it may answer directly and skip the tool. Add an explicit instruction to use the tool for time-sensitive questions and, when your OpenAI-compatible provider supports it, consider passing `tool_choice` for deterministic tool selection.
Built-In Tools
- `TimeTool`
- `BrasilAPI\CEPInfoTool`
- `BrasilAPI\CNPJInfoTool`
- `BrasilAPI\FeriadosNacionaisTool`
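A built-in tool can be registered exactly like a custom one. A short sketch, assuming the `Vluzrmos\Ollama\Tools\TimeTool` namespace used in the agent example later in this README, and reusing the OpenAI-compatible `$client` from the tool calling flow above:

```php
<?php

use Vluzrmos\Ollama\Models\Message;
use Vluzrmos\Ollama\Tools\TimeTool;
use Vluzrmos\Ollama\Tools\ToolManager;

$toolManager = new ToolManager();
$toolManager->registerTool(new TimeTool());

// The manager serializes into the tools payload of a chat request
$response = $client->chatCompletions([
    'model' => 'llama3.2',
    'messages' => [Message::user('What time is it in UTC?')],
    'tools' => $toolManager,
]);
```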
High-Level Chat Orchestrator
Class: Vluzrmos\Ollama\Chat\Chat
This wrapper maintains conversation history, registers tools, sends messages through a chat adapter, and automatically re-enters the model when tool calls are returned.
```php
<?php

use Vluzrmos\Ollama\Chat\Chat;
use Vluzrmos\Ollama\Chat\OpenAIClientAdapter;
use Vluzrmos\Ollama\Models\Message;
use Vluzrmos\Ollama\Models\Model;
use Vluzrmos\Ollama\OpenAI;

$chat = new Chat(new OpenAIClientAdapter(new OpenAI()));
$chat->withModel(new Model('qwen2.5:3b'));
$chat->addMessage(Message::system('You are a helpful assistant.'));

$response = $chat->chat([
    Message::user('What time is it in UTC?'),
]);
```
Main operations:
- `withModel()`
- `addMessage()`
- `addMessages()`
- `registerTool()`
- `registerTools()`
- `getModel()`
- `getMessages()`
- `getTools()`
- `chat()`
Agents
The agent layer builds on top of the clients and tool manager to provide reusable role-based assistants.
Agent
Class: Vluzrmos\Ollama\Agents\Agent
An agent has:
- name
- description
- system instructions
- a model
- a client adapter
- optional tools
- default request options
```php
<?php

use Vluzrmos\Ollama\Agents\Agent;
use Vluzrmos\Ollama\Agents\OpenAIClientAdapter;
use Vluzrmos\Ollama\OpenAI;
use Vluzrmos\Ollama\Tools\TimeTool;

$agent = new Agent(
    'Assistant',
    new OpenAIClientAdapter(new OpenAI()),
    'qwen2.5:3b',
    'You are a helpful assistant.',
    'General purpose assistant',
    [new TimeTool()],
    ['temperature' => 0.4]
);

$response = $agent->processQuery('What time is it in UTC?');
```
Agent operations:
- `processQuery()`
- `canHandle()`
- `addTool()`
- `removeTool()`
- `getTools()`
- `setOptions()`
- `getOptions()`
AgentGroup
Class: Vluzrmos\Ollama\Agents\AgentGroup
An agent group uses a selector prompt to either:
- answer directly
- route the query to the most appropriate specialized agent
Available operations:
- `processQuery()`
- `addAgent()`
- `removeAgent()`
- `getAgents()`
- `getAgent()`
- `getTools()`
- `setSelectorInstructions()`
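A hedged sketch of routing between agents with an `AgentGroup`. Only `addAgent()`, `setSelectorInstructions()`, and `processQuery()` come from the operation list above; the no-argument constructor and the `$supportAgent`/`$salesAgent` variables are assumptions for illustration:

```php
<?php

use Vluzrmos\Ollama\Agents\AgentGroup;

// $supportAgent and $salesAgent are Agent instances built as shown earlier
$group = new AgentGroup();
$group->addAgent($supportAgent);
$group->addAgent($salesAgent);

$group->setSelectorInstructions(
    'Route billing questions to Sales and technical questions to Support.'
);

// The selector either answers directly or delegates to the best-matching agent
$response = $group->processQuery('My invoice shows the wrong amount.');
```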
Client Adapters
The agent system includes adapters for both client styles:
- `Vluzrmos\Ollama\Agents\OpenAIClientAdapter`
- `Vluzrmos\Ollama\Agents\OllamaClientAdapter`
HTTP Client Configuration
Both top-level clients accept HTTP options in the constructor and expose the underlying HttpClient through getHttpClient().
```php
<?php

use Vluzrmos\Ollama\Ollama;
use Vluzrmos\Ollama\OpenAI;

$ollama = new Ollama('http://localhost:11434', [
    'timeout' => 60,
    'connect_timeout' => 10,
    'verify_ssl' => false,
]);

$openai = new OpenAI('http://localhost:11434/v1', 'ollama', [
    'timeout' => 120,
]);
```
Error Handling
Custom exceptions include:
- `OllamaException`
- `HttpException`
- `RequiredParameterException`
- `ToolExecutionException`
```php
<?php

use Vluzrmos\Ollama\Exceptions\OllamaException;
use Vluzrmos\Ollama\Models\Message;

try {
    $response = $ollama->chat([
        'model' => 'non-existent-model',
        'messages' => [Message::user('Hello')],
    ]);
} catch (OllamaException $e) {
    echo $e->getMessage();
}
```
Examples
The repository includes working examples for the main library features:
- `examples/basic_usage.php`
- `examples/openai_usage.php`
- `examples/advanced_chat.php`
- `examples/agents_demo.php`
- `examples/simple_agent.php`
- `examples/tool_execution_demo.php`
- `examples/simple_tool_test.php`
- `examples/test_tool_manager.php`
- `examples/brasilapi.php`
Custom example tools are available in:
- `examples/tools/CalculatorTool.php`
- `examples/tools/WeatherTool.php`
Testing
Run the test suite with Docker:
```bash
docker build -t ollama-php56 .

docker run -it --rm \
  -e OPENAI_API_URL="http://localhost:11434/v1" \
  -e OLLAMA_API_URL="http://localhost:11434" \
  -e RUN_INTEGRATION_TESTS=1 \
  -e TEST_MODEL="llama3.2:1b" \
  ollama-php56
```
Docker Development Environment
The included Docker image can also be used to run examples or custom commands.
```bash
docker build -t ollama-php .
docker run -it --rm ollama-php
```
Supported environment variables:
- `OPENAI_API_URL`, default `http://localhost:11434/v1`
- `OLLAMA_API_URL`, default `http://localhost:11434`
- `RUN_INTEGRATION_TESTS`, default `1`
- `TEST_MODEL`, default `qwen2.5:3b`
- `TEST_VISION_MODEL`, default `qwen2.5vl:3b`
Run a custom command:
```bash
docker run -it --rm ollama-php php examples/basic_usage.php
```
Mount your local project into the container if needed:
```bash
docker run -it --rm -v /path/to/your/project:/app ollama-php php your-script.php
```
License
MIT
Contributing
Contributions are welcome. See CONTRIBUTING.md for project guidelines.