README

A unified AI interface for Laravel — LLMs, embeddings, vector databases, RAG pipelines, agents, and more.

Installation
Configuration
Quick Start
LLM Providers
Embedding Providers
Vector Databases
RAG Pipeline
Agents & Tool Calling
Memory
Image Generation
Speech Processing
Document Ingestion & Chunking
Testing
Events & Observability
Architecture

Installation

composer require manik/neuro

Publish the configuration:

php artisan vendor:publish --tag=neuro-config

Run migrations (for persistent memory and document storage):

php artisan migrate

Configuration

Add the following to your .env file based on the providers you use:

LLM Providers

# OpenAI
AI_DEFAULT_LLM=openai
OPENAI_API_KEY=sk-...
OPENAI_LLM_MODEL=gpt-4o

# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_LLM_MODEL=claude-3-5-sonnet-20241022

# Google Gemini
GEMINI_API_KEY=...
GEMINI_LLM_MODEL=gemini-2.0-flash

# Local Ollama
AI_DEFAULT_LLM=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3

# xAI Grok
XAI_API_KEY=...
XAI_LLM_MODEL=grok-2

# Mistral
MISTRAL_API_KEY=...
MISTRAL_LLM_MODEL=mistral-large-latest

# Cohere
COHERE_API_KEY=...
COHERE_LLM_MODEL=command-r-plus

Embedding Providers

# Default
AI_DEFAULT_EMBEDDING=openai
OPENAI_API_KEY=sk-...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small

# Local Ollama (no API key needed)
AI_DEFAULT_EMBEDDING=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=llama3

# Gemini
GEMINI_API_KEY=...
GEMINI_EMBEDDING_MODEL=text-embedding-004

# Mistral
MISTRAL_API_KEY=...
MISTRAL_EMBEDDING_MODEL=mistral-embed

# Cohere
COHERE_API_KEY=...
COHERE_EMBEDDING_MODEL=embed-english-v3.0

Vector Databases

# Default
AI_DEFAULT_VECTOR=qdrant
QDRANT_HOST=http://localhost:6333

# Pinecone
AI_DEFAULT_VECTOR=pinecone
PINECONE_API_KEY=...
PINECONE_ENVIRONMENT=...
PINECONE_INDEX_HOST=https://...pinecone.io

# pgvector
AI_DEFAULT_VECTOR=pgvector
PGVECTOR_CONNECTION=pgsql
PGVECTOR_TABLE=vector_embeddings
PGVECTOR_DIMENSIONS=1536

# Weaviate
AI_DEFAULT_VECTOR=weaviate
WEAVIATE_HOST=http://localhost:8080

# Milvus (v2 API)
AI_DEFAULT_VECTOR=milvus
MILVUS_HOST=http://localhost:19530
MILVUS_TOKEN=your_token

# Chroma
AI_DEFAULT_VECTOR=chroma
CHROMA_HOST=http://localhost:8000

# Vector Collections (optional — overrides the default collection name per driver)
AI_DEFAULT_VECTOR_COLLECTION=default
QDRANT_COLLECTION=my_qdrant_collection
PINECONE_COLLECTION=my_pinecone_index
PGVECTOR_COLLECTION=my_pgvector_collection
WEAVIATE_COLLECTION=MyWeaviateClass
MILVUS_COLLECTION=my_milvus_collection
CHROMA_COLLECTION=my_chroma_collection

All vector database methods accept a collection name as the first argument. If omitted or null, the driver falls back to the per-driver collection config, then to AI_DEFAULT_VECTOR_COLLECTION, then to 'default':

Neuro::vector()->driver()->defaultCollection();     // resolves the fallback chain
Neuro::vector()->upsert('my_collection', $records); // explicit name
Neuro::vector()->upsert(null, $records);            // uses config fallback
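
The fallback order above can be sketched in plain PHP. This is a minimal illustration, not the package's code: `resolveCollection()` and its parameters are hypothetical names, and the real driver reads these values from config rather than taking them as arguments.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper mirroring the documented fallback order:
 * explicit name, then per-driver collection, then global default, then 'default'.
 */
function resolveCollection(?string $explicit, ?string $driverCollection, ?string $globalDefault): string
{
    // The first non-empty candidate wins.
    foreach ([$explicit, $driverCollection, $globalDefault] as $candidate) {
        if ($candidate !== null && $candidate !== '') {
            return $candidate;
        }
    }

    // Nothing configured anywhere: fall back to the literal 'default'.
    return 'default';
}
```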

Temperature & Generation Settings

Configure temperature, max tokens, and timeout per provider:

# OpenAI
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=4096
OPENAI_TIMEOUT=60

# Anthropic
ANTHROPIC_TEMPERATURE=0.7
ANTHROPIC_MAX_TOKENS=4096

# Gemini
GEMINI_TEMPERATURE=0.7
GEMINI_MAX_TOKENS=4096

# Ollama
OLLAMA_TEMPERATURE=0.7
OLLAMA_MAX_TOKENS=4096
OLLAMA_TIMEOUT=120

# Grok (xAI)
XAI_TEMPERATURE=0.7

# Mistral
MISTRAL_TEMPERATURE=0.7

# Cohere
COHERE_TEMPERATURE=0.7

Override temperature at runtime:

$response = Neuro::chat()
    ->provider('openai')
    ->model('gpt-4o')
    ->message('Explain Laravel')
    ->options(['temperature' => 0.3])
    ->chat();

System Instructions

System prompts work the same way across all drivers. Pass a message whose role is 'system':

$response = Neuro::chat()
    ->message(['role' => 'system', 'content' => 'You are a helpful assistant.'])
    ->message('Explain Laravel')
    ->chat();

The driver automatically handles the provider-specific format:

Driver	API Field
OpenAI	`messages[].role = system` (native)
Anthropic	Top-level `system` field
Gemini	`systemInstruction` field
Cohere	`preamble` field
Ollama	`messages[].role = system` (native)
Grok	`messages[].role = system` (native)
Mistral	`messages[].role = system` (native)
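
As a rough illustration of what this mapping involves, the sketch below separates system messages from the rest of the history, which is the first step for providers that take the system prompt as a separate field. `splitSystemMessages()` is a hypothetical helper; the package's drivers may handle this differently.

```php
<?php
declare(strict_types=1);

/**
 * Splits a chat history into combined system text and the remaining messages.
 * Illustrative only; not the package's internal code.
 *
 * @param array<int, array{role: string, content: string}> $messages
 * @return array{system: ?string, messages: array<int, array{role: string, content: string}>}
 */
function splitSystemMessages(array $messages): array
{
    $system = [];
    $rest = [];

    foreach ($messages as $message) {
        if ($message['role'] === 'system') {
            $system[] = $message['content'];
        } else {
            $rest[] = $message;
        }
    }

    return [
        // Several system messages are joined into one block of text.
        'system' => $system === [] ? null : implode("\n\n", $system),
        'messages' => $rest,
    ];
}

$payload = splitSystemMessages([
    ['role' => 'system', 'content' => 'You are a helpful assistant.'],
    ['role' => 'user', 'content' => 'Explain Laravel'],
]);
```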

Embedding Dimensions

Control embedding output dimensions to ensure compatibility with your vector database:

# OpenAI (default: 1536)
OPENAI_EMBEDDING_DIMENSIONS=768

# Gemini (default: 768 — use 768 to match Qdrant collections)
GEMINI_EMBEDDING_DIMENSIONS=768

# Ollama
OLLAMA_EMBEDDING_DIMENSIONS=4096

# Mistral
MISTRAL_EMBEDDING_DIMENSIONS=1024

# Cohere
COHERE_EMBEDDING_DIMENSIONS=1024
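
A mismatch between the embedding length and the dimension a collection was created with usually only shows up as an error at upsert time. A small guard can catch it earlier. The sketch below is a hypothetical helper (`assertDimensions()` is not part of the package) that checks records shaped like the upsert examples in this README.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical guard: throws if any record's vector length differs from
 * the dimension the collection was created with.
 *
 * @param array<int, array{id: string, vector: array<int, float>}> $records
 */
function assertDimensions(array $records, int $expected): void
{
    foreach ($records as $record) {
        $actual = count($record['vector']);

        if ($actual !== $expected) {
            throw new InvalidArgumentException(
                sprintf('Record %s has %d dimensions, expected %d.', $record['id'], $actual, $expected)
            );
        }
    }
}
```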

RAG Configuration

AI_RAG_CHUNK_STRATEGY=recursive
AI_RAG_CHUNK_SIZE=1000
AI_RAG_CHUNK_OVERLAP=200
AI_RAG_TOP_K=5
AI_RAG_MIN_SCORE=0.0
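
To see how chunk size and overlap interact, the sketch below computes character ranges for fixed-size chunks: with a size of 1000 and an overlap of 200, each chunk starts 800 characters after the previous one. It assumes both settings are counted in characters and ignores the paragraph and sentence boundaries that a recursive strategy would respect. `chunkOffsets()` is a hypothetical helper, not a package function.

```php
<?php
declare(strict_types=1);

/**
 * Returns [start, end) character ranges for fixed-size chunks with overlap.
 *
 * @return array<int, array{0: int, 1: int}>
 */
function chunkOffsets(int $length, int $size, int $overlap): array
{
    if ($size <= 0 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Require size > 0 and 0 <= overlap < size.');
    }

    // Each new chunk begins this many characters after the previous one.
    $stride = $size - $overlap;
    $offsets = [];

    for ($start = 0; $start < $length; $start += $stride) {
        $offsets[] = [$start, min($start + $size, $length)];

        // Stop once a chunk has reached the end of the text.
        if ($start + $size >= $length) {
            break;
        }
    }

    return $offsets;
}

$ranges = chunkOffsets(2500, 1000, 200);
```

For a 2500-character document this yields three ranges, the last one shorter than the configured size.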

Cache & Rate Limiting

AI_CACHE_ENABLED=false
AI_CACHE_STORE=redis
AI_CACHE_TTL=3600

AI_RATE_LIMIT_ENABLED=false
AI_RATE_LIMIT_MAX=60
AI_RATE_LIMIT_DECAY=60
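
The two rate-limit settings read naturally as "at most AI_RATE_LIMIT_MAX calls per AI_RATE_LIMIT_DECAY seconds". That reading is an assumption, and so is the fixed-window algorithm below; the package may well delegate to Laravel's own rate limiter instead. `FixedWindowLimiter` is a hypothetical class that only shows how the two numbers could work together.

```php
<?php
declare(strict_types=1);

/** Hypothetical fixed-window limiter: $max attempts per $decaySeconds. */
final class FixedWindowLimiter
{
    private int $windowStart = 0;
    private int $count = 0;

    public function __construct(private int $max, private int $decaySeconds)
    {
    }

    /** Returns true if a call at Unix time $now is allowed. */
    public function attempt(int $now): bool
    {
        // Start a fresh window once the previous one has expired.
        if ($now - $this->windowStart >= $this->decaySeconds) {
            $this->windowStart = $now;
            $this->count = 0;
        }

        if ($this->count >= $this->max) {
            return false;
        }

        $this->count++;

        return true;
    }
}
```

Passing the clock in as an argument keeps the class easy to test; production code would typically call `time()` at the call site.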

Quick Start

use Manik\Neuro\Facades\Neuro;

Chat Completion

$response = Neuro::chat()
    ->provider('openai')
    ->model('gpt-4o')
    ->message('Explain Laravel to a beginner')
    ->chat();

// $response['content'] => string
// $response['role'] => 'assistant'

Streaming Chat

$stream = Neuro::chat()
    ->provider('openai')
    ->message('Write a poem about Laravel')
    ->stream();

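foreach ($stream as $chunk) {
    echo $chunk['content'];

    // Flush PHP's output buffer only if one is active, then push to the client
    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();
}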
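
If you also need the complete text once streaming finishes (for logging or storing in memory), you can accumulate the chunks as they arrive. The sketch below uses a stand-in generator in place of a real provider stream; `fakeStream()` and `collectStream()` are hypothetical names, and the only assumption about the stream is the `content` key shown above.

```php
<?php
declare(strict_types=1);

/** Stand-in for a provider stream; each chunk carries a 'content' key. */
function fakeStream(): Generator
{
    foreach (['Laravel ', 'is ', 'elegant.'] as $piece) {
        yield ['content' => $piece];
    }
}

/** Concatenates the 'content' of every chunk into one string. */
function collectStream(iterable $stream): string
{
    $full = '';

    foreach ($stream as $chunk) {
        // Chunks without content (for example, a final metadata chunk) are skipped.
        $full .= $chunk['content'] ?? '';
    }

    return $full;
}

$text = collectStream(fakeStream());
```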

Embeddings

$result = Neuro::chat()
    ->text('The text to embed')
    ->embed('openai');

// $result['embedding'] => array of floats
// $result['dimensions'] => int

Batch embed:

$results = Neuro::chat()
    ->embedBatch(['text one', 'text two', 'text three'], 'openai');

Vector Search

$results = Neuro::vector()
    ->driver('qdrant')
    ->search('my_collection', $vector, ['top_k' => 10]);

LLM Providers

Use the Neuro::llm() method to access the LLM manager directly:

$driver = Neuro::llm()->driver('openai');
$response = $driver->chat([['role' => 'user', 'content' => 'Hello!']]);

Supported Providers

Provider	Driver Key	Chat	Stream	Tools	Config Key
OpenAI	`openai`	✅	✅	✅	`ai.llm.openai`
Anthropic	`anthropic`	✅	✅	✅	`ai.llm.anthropic`
Google Gemini	`gemini`	✅	✅	❌	`ai.llm.gemini`
Ollama	`ollama`	✅	✅	✅	`ai.llm.ollama`
xAI Grok	`grok`	✅	✅	✅	`ai.llm.grok`
Mistral	`mistral`	✅	✅	✅	`ai.llm.mistral`
Cohere	`cohere`	✅	✅	✅	`ai.llm.cohere`

Tool Calling

$response = Neuro::chat()
    ->provider('openai')
    ->model('gpt-4o')
    ->message('What is the weather in Paris?')
    ->tools([
        [
            'type' => 'function',
            'function' => [
                'name' => 'get_weather',
                'description' => 'Get the weather for a city',
                'parameters' => [
                    'type' => 'object',
                    'properties' => [
                        'city' => ['type' => 'string'],
                    ],
                ],
            ],
        ],
    ]);
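
Once the model asks for a tool, your application has to run it. This README does not show what the response contains in that case, so the sketch below assumes a tool call shaped like OpenAI's function-calling format (a function name plus a JSON string of arguments). `dispatchToolCall()` and the handler map are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Looks up the handler for a tool call and invokes it with the decoded arguments.
 *
 * @param array{function: array{name: string, arguments: string}} $toolCall Assumed shape.
 * @param array<string, callable> $handlers Tool name => PHP callable.
 */
function dispatchToolCall(array $toolCall, array $handlers): mixed
{
    $name = $toolCall['function']['name'];

    if (!isset($handlers[$name])) {
        throw new InvalidArgumentException("Unknown tool: {$name}");
    }

    // Arguments arrive as a JSON string; decode into an associative array.
    $arguments = json_decode($toolCall['function']['arguments'], true, 512, JSON_THROW_ON_ERROR);

    // String keys are spread as named arguments (PHP 8.0+).
    return $handlers[$name](...$arguments);
}

$handlers = [
    'get_weather' => fn (string $city): string => "Sunny in {$city}",
];

$result = dispatchToolCall(
    ['function' => ['name' => 'get_weather', 'arguments' => '{"city":"Paris"}']],
    $handlers
);
```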

Custom Ollama Setup

Ollama runs locally with no API key required:

# Install Ollama
brew install ollama

# Pull a model
ollama pull llama3

# Run the server
ollama serve

AI_DEFAULT_LLM=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_LLM_MODEL=llama3

$response = Neuro::chat()
    ->provider('ollama')
    ->model('llama3')
    ->message('Hello, how are you?')
    ->chat();

Embedding Providers

Provider	Driver Key	Default Dimensions	Config Key
OpenAI	`openai`	1536	`ai.embedding.openai`
Ollama	`ollama`	4096	`ai.embedding.ollama`
Gemini	`gemini`	768	`ai.embedding.gemini`
Mistral	`mistral`	1024	`ai.embedding.mistral`
Cohere	`cohere`	1024	`ai.embedding.cohere`

Override dimensions via .env to match your vector database:

OPENAI_EMBEDDING_DIMENSIONS=768
GEMINI_EMBEDDING_DIMENSIONS=768

$vector = Neuro::embedding()
    ->driver('openai')
    ->embed('Your text here');

Vector Databases

Supported Databases

Database	Driver Key	Create	Upsert	Search	Delete	Filter	Config Key
Qdrant	`qdrant`	✅	✅	✅	✅	✅	`ai.vector.qdrant`
Pinecone	`pinecone`	❌	✅	✅	✅	✅	`ai.vector.pinecone`
pgvector	`pgvector`	✅	✅	✅	✅	✅	`ai.vector.pgvector`
Weaviate	`weaviate`	✅	✅	✅	✅	✅	`ai.vector.weaviate`
Milvus	`milvus`	✅	✅	✅	✅	✅	`ai.vector.milvus`
Chroma	`chroma`	✅	✅	✅	✅	✅	`ai.vector.chroma`

Basic Usage

// Get a vector driver
$vector = Neuro::vector()->driver('qdrant');

// Create a collection
$vector->createCollection('knowledge', 1536);

// Upsert vectors
$vector->upsert('knowledge', [
    [
        'id' => '1',
        'vector' => [0.1, 0.2, ...],
        'payload' => ['text' => 'Some content', 'source' => 'docs'],
    ],
]);

// Search
$results = $vector->search('knowledge', [0.1, 0.2, ...], [
    'top_k' => 10,
    'filter' => ['source' => 'docs'],
]);

// Delete
$vector->delete('knowledge', '1');
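
Conceptually, a search with `top_k` and `filter` ranks the stored vectors by similarity to the query and keeps the best matches whose payload satisfies the filter. The brute-force sketch below shows that idea in plain PHP using cosine similarity. It is not how any of the supported databases work internally (they use approximate indexes and may use a different metric), and both function names are hypothetical.

```php
<?php
declare(strict_types=1);

/** Cosine similarity of two equal-length vectors; 0.0 if either is all zeros. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/** Returns the $topK best-scoring records whose payload matches every filter key. */
function searchRecords(array $records, array $query, int $topK, array $filter = []): array
{
    $hits = [];

    foreach ($records as $record) {
        foreach ($filter as $key => $expected) {
            if (($record['payload'][$key] ?? null) !== $expected) {
                continue 2; // payload does not match; skip this record
            }
        }

        $hits[] = [
            'id' => $record['id'],
            'score' => cosineSimilarity($query, $record['vector']),
            'payload' => $record['payload'],
        ];
    }

    // Highest score first.
    usort($hits, fn (array $x, array $y): int => $y['score'] <=> $x['score']);

    return array_slice($hits, 0, $topK);
}
```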

pgvector Setup

-- Add the pgvector extension to your PostgreSQL database
CREATE EXTENSION IF NOT EXISTS vector;

// Then use it in your code
Neuro::vector()->driver('pgvector')
    ->createCollection('embeddings', 1536);

Neuro::vector()->driver('pgvector')
    ->upsert('embeddings', [
        [
            'id' => 'doc_1',
            'vector' => [0.1, 0.2, ...],
            'payload' => ['text' => 'Hello world'],
        ],
    ]);

Qdrant Setup

# With Docker
docker run -p 6333:6333 qdrant/qdrant

// Then use it in your code
Neuro::vector()->driver('qdrant')
    ->createCollection('documents', 1536);

RAG Pipeline

The RAG (Retrieval-Augmented Generation) pipeline retrieves relevant context from a vector store and passes it to the LLM along with the question.

Basic RAG

$response = Neuro::rag()
    ->collection('knowledge_base')
    ->question('What is Laravel?')
    ->answer();

// $response['answer'] => string (the LLM's answer with context)
// $response['sources'] => array (the retrieved chunks)
// $response['tokens'] => array (token usage)

Using the Pipeline Directly

$pipeline = Neuro::rag()->pipeline();

$response = $pipeline
    ->collection('knowledge_base')
    ->question('What is Laravel?')
    ->topK(10)
    ->minScore(0.5)
    ->answer();

// Just search without LLM
$results = $pipeline->search();

The RAG Flow

User Question
    ↓
[1] Embed the question
    ↓
[2] Search vector database for similar content
    ↓
[3] Build context from retrieved chunks
    ↓
[4] Send question + context to LLM
    ↓
Answer with sources
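
Step [3] is the least obvious part of the flow, so here is a minimal sketch of it: filter the retrieved chunks by score, keep the top K, and fold them into a prompt. The prompt wording, the `text` and `score` keys, and `buildPrompt()` itself are assumptions for illustration; the package's actual prompt template is not shown in this README.

```php
<?php
declare(strict_types=1);

/**
 * Builds an LLM prompt from a question and scored search hits.
 *
 * @param array<int, array{text: string, score: float}> $hits
 */
function buildPrompt(string $question, array $hits, int $topK = 5, float $minScore = 0.0): string
{
    // Drop weak matches, then keep the $topK highest-scoring chunks.
    $kept = array_filter($hits, fn (array $hit): bool => $hit['score'] >= $minScore);
    usort($kept, fn (array $a, array $b): int => $b['score'] <=> $a['score']);
    $kept = array_slice($kept, 0, $topK);

    // Number each chunk so the answer can refer back to its sources.
    $context = '';
    foreach ($kept as $i => $hit) {
        $context .= sprintf("[%d] %s\n", $i + 1, $hit['text']);
    }

    return "Answer using only the context below.\n\nContext:\n{$context}\nQuestion: {$question}";
}
```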

Agents & Tool Calling

Creating an Agent

$agent = Neuro::agent('openai')
    ->session('user-123')
    ->maxSteps(5)
    ->tool('get_time', function () {
        return now()->toDateTimeString();
    }, 'Get the current date and time')
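    ->tool('calculate', function (float $a, string $op, float $b) {
        return match ($op) {
            '+' => $a + $b,
            '-' => $a - $b,
            '*' => $a * $b,
            '/' => $a / $b,
            // Reject anything else with a clear message
            default => throw new \InvalidArgumentException("Unsupported operator: {$op}"),
        };
    }, 'Perform a calculation');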

$result = $agent->run('What time is it?');
// $result['response'] => string
// $result['steps'] => int
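
The interplay between `maxSteps` and tools is easier to see in a stripped-down loop. The sketch below is an assumption about the general technique, not the package's agent: the model is a plain callable that either asks for a tool or returns an answer, and `runAgent()` and the reply shape are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Minimal agent loop: call the model, run any requested tool, repeat.
 *
 * @param callable $model Receives the message list; returns ['tool' => ..., 'args' => [...]] or ['content' => ...].
 * @param array<string, callable> $tools
 * @return array{response: string, steps: int}
 */
function runAgent(callable $model, array $tools, string $input, int $maxSteps = 5): array
{
    $messages = [['role' => 'user', 'content' => $input]];

    for ($step = 1; $step <= $maxSteps; $step++) {
        $reply = $model($messages);

        // A plain answer ends the loop.
        if (!isset($reply['tool'])) {
            return ['response' => $reply['content'], 'steps' => $step];
        }

        // Otherwise run the requested tool and feed its result back to the model.
        $output = $tools[$reply['tool']](...($reply['args'] ?? []));
        $messages[] = ['role' => 'tool', 'content' => (string) $output];
    }

    // The step limit stops a model that keeps requesting tools.
    return ['response' => 'Step limit reached.', 'steps' => $maxSteps];
}

// Scripted stand-in for an LLM: request the tool once, then answer.
$model = function (array $messages): array {
    $last = end($messages);

    return $last['role'] === 'tool'
        ? ['content' => "It is {$last['content']}."]
        : ['tool' => 'get_time', 'args' => []];
};

$result = runAgent($model, ['get_time' => fn (): string => '12:00'], 'What time is it?');
```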

Registering Tools via Manager

use Manik\Neuro\Facades\Neuro;

$driver = Neuro::llm()->driver('openai');
$driver->tools($messages, [
    [
        'type' => 'function',
        'function' => [
            'name' => 'search_web',
            'description' => 'Search the web for information',
            'parameters' => [
                'type' => 'object',
                'required' => ['query'],
                'properties' => [
                    'query' => ['type' => 'string', 'description' => 'Search query'],
                ],
            ],
        ],
    ],
]);

Memory

Available Drivers

Driver	Key	Storage	Description
Session	`session`	Laravel session	Per-request/session memory
Conversation	`conversation`	In-memory	Runtime conversation history
Persistent	`persistent`	Database	Long-term persistent storage

Usage

// Using the memory manager
Neuro::memory()->driver('session')->add('session-1', [
    'role' => 'user',
    'content' => 'Hello!',
]);

$history = Neuro::memory()->driver('session')->get('session-1');
// Returns array of messages, limited by config

Neuro::memory()->driver('session')->clear('session-1');
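
For intuition, the add / get / clear contract with a message limit can be modelled with a plain array. `ArrayMemory` below is a hypothetical stand-in, not one of the package's drivers, and the limit semantics (keep the most recent N messages) are an assumption about what "limited by config" means.

```php
<?php
declare(strict_types=1);

/** Hypothetical in-memory store with the same add/get/clear shape as above. */
final class ArrayMemory
{
    /** @var array<string, array<int, array{role: string, content: string}>> */
    private array $sessions = [];

    /** @param int $limit Maximum messages returned by get(); must be at least 1. */
    public function __construct(private int $limit = 20)
    {
    }

    public function add(string $sessionId, array $message): void
    {
        $this->sessions[$sessionId][] = $message;
    }

    public function get(string $sessionId): array
    {
        // Return only the most recent $limit messages.
        return array_slice($this->sessions[$sessionId] ?? [], -$this->limit);
    }

    public function clear(string $sessionId): void
    {
        unset($this->sessions[$sessionId]);
    }
}
```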

Persistent Memory

// Requires running the migration
Neuro::memory()->driver('persistent')->add('user-456', [
    'role' => 'user',
    'content' => 'Remember my name is John',
]);

$history = Neuro::memory()->driver('persistent')->get('user-456');

Image Generation

OpenAI DALL-E

$result = Neuro::image()
    ->driver('openai')
    ->generate('A serene mountain landscape at sunset', [
        'size' => '1024x1024',
        'quality' => 'hd',
    ]);

// $result['url'] => string
// $result['revised_prompt'] => string

Edit Image

$result = Neuro::image()
    ->driver('openai')
    ->edit('/path/to/image.png', 'Add a rainbow to the sky');

Variations

$result = Neuro::image()
    ->driver('openai')
    ->variations('/path/to/image.png', ['n' => 3]);

Speech Processing

Text-to-Speech

use Illuminate\Support\Facades\Storage;

$audioContent = Neuro::speech()
    ->driver('openai')
    ->synthesize('Hello, welcome to Laravel AI!', [
        'voice' => 'alloy',
        'model' => 'tts-1',
    ]);

// Save to file
Storage::put('audio/welcome.mp3', $audioContent);

Speech-to-Text

$transcription = Neuro::speech()
    ->driver('openai')
    ->transcribe('/path/to/audio.mp3');

// $transcription['text'] => string

Document Ingestion & Chunking

Supported Formats

Format	Auto-detected
`.txt`	✅
`.md`	✅
`.html`	✅
`.csv`	✅
`.json`	✅

Ingest a Document

Neuro::rag()->ingestion()
    ->ingestFromPath(storage_path('docs/laravel-intro.md'), 'knowledge_base');

Ingest Raw Content

Neuro::rag()->ingestion()
    ->ingestRaw("# Laravel\nLaravel is a PHP framework...", 'knowledge_base', [
        'source' => 'manual',
        'author' => 'John',
    ]);

Chunking Strategies

Strategy	Class	Description
Fixed Size	`FixedSizeChunking`	Split by character count with overlap
Recursive	`RecursiveChunking`	Split by paragraphs → sentences → chars
Semantic	`SemanticChunking`	Split by headings and blank lines
Sliding Window	`SlidingWindowChunking`	Overlapping windows with stride

use Manik\Neuro\RAG\Chunking\SemanticChunking;

Neuro::rag()->ingestion()
    ->setChunkStrategy(new SemanticChunking)
    ->ingestRaw($markdownContent, 'docs');
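
To make "split by headings and blank lines" concrete, here is a minimal sketch of that idea. It is not the `SemanticChunking` class itself, whose implementation is not shown here; `splitByHeadings()` is a hypothetical function and assumes Markdown-style `#` headings.

```php
<?php
declare(strict_types=1);

/**
 * Splits Markdown text before each heading line and at blank lines.
 *
 * @return array<int, string>
 */
function splitByHeadings(string $markdown): array
{
    // Break before a line starting with 1-6 '#' characters, or at one or more blank lines.
    $parts = preg_split('/\n(?=#{1,6} )|\n\s*\n/', $markdown) ?: [];

    // Trim whitespace and drop empty fragments.
    $parts = array_map('trim', $parts);

    return array_values(array_filter($parts, fn (string $part): bool => $part !== ''));
}

$chunks = splitByHeadings("# Laravel\nLaravel is a PHP framework.\n\n## Routing\nRoutes map URLs.");
```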

Full Ingestion Pipeline

Document
    ↓ Chunking (FixedSize / Recursive / Semantic / SlidingWindow)
Chunks
    ↓ Embedding (via configured embedding provider)
Vector Embeddings
    ↓ Upsert to Vector Store
Stored in Collection

Testing

Fake Responses

use Manik\Neuro\Facades\Neuro;

// Enable fake mode
Neuro::fake();

// All chat calls now return fake responses
$response = Neuro::chat()
    ->message('This will not hit the API')
    ->chat();

// $response['content'] === 'fake response'

Fake Embeddings

Neuro::fake();

$result = Neuro::chat()
    ->text('Test text')
    ->embed('openai');

// Returns a zeroed-out embedding vector with 1536 dimensions

Events & Observability

Events

Event	Description	Payload
`MessageSending`	Before an LLM call is made	provider, model, messages, options
`MessageReceived`	After an LLM response	provider, model, response, latency
`EmbeddingCreated`	After embedding is generated	provider, model, text, dimensions
`VectorStored`	After vectors are upserted	provider, collection, record_count
`DocumentIndexed`	After a document is indexed	collection, document, chunk_count

use Illuminate\Support\Facades\Event;
use Illuminate\Support\Facades\Log;
use Manik\Neuro\Events\MessageReceived;

Event::listen(MessageReceived::class, function (MessageReceived $event) {
    Log::info('LLM call completed', [
        'provider' => $event->provider,
        'model' => $event->model,
        'latency' => $event->latency,
    ]);
});

Observability Configuration

// config/neuro.php
'observability' => [
    'track_cost' => env('AI_TRACK_COST', false),
    'track_tokens' => env('AI_TRACK_TOKENS', false),
    'track_latency' => env('AI_TRACK_LATENCY', false),
    'store' => env('AI_OBSERVABILITY_STORE', 'log'),
],

Architecture

┌───────────────────────────────────────────────────────────────────────────────┐
│                               Facade (Neuro::)                                │
├───────────────────────────────────────────────────────────────────────────────┤
│                                   AIClient                                    │
├───────────────────────────────────────────────────────────────────────────────┤
│                                 NeuroManager                                  │
├─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┤
│ LLM     │ Embed   │ Vector  │ Image   │ Speech  │ RAG     │ Memory  │ Agent   │
│ Manager │ Mgr     │ Mgr     │ Mgr     │ Mgr     │ Mgr     │ Mgr     │         │
├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Driver  │ Driver  │ Driver  │ Driver  │ Driver  │ Pipeline│ Drivers │ Agent   │
│ AI      │ AI      │ DB      │ AI      │ AI      │ +Chunk  │ Session │ +Tools  │
│ Anthrop │ Ollama  │ Qdrant  │ DALL-E  │ TTS     │ +Ingest │ Persist │         │
│ Gemini  │ Mistral │ Pinecn  │         │ STT     │ +Rerank │         │         │
│ Ollama  │ Cohere  │ pgvec   │         │         │         │         │         │
│ Others  │         │ Weav.   │         │         │         │         │         │
└─────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘

Extending with Custom Drivers

You can register custom drivers at runtime:

// Custom LLM driver
Neuro::llm()->extend('my-provider', function ($app) {
    return new MyCustomDriver(config('ai.llm.my-provider'));
});

// Custom embedding driver
Neuro::embedding()->extend('my-embedder', function ($app) {
    return new MyEmbedder(config('ai.embedding.my-embedder'));
});

// Custom vector driver
Neuro::vector()->extend('my-vector-db', function ($app) {
    return new MyVectorDB(config('ai.vector.my-vector-db'));
});

Then add the corresponding config to config/neuro.php and use it:

$response = Neuro::chat()
    ->provider('my-provider')
    ->message('Hello')
    ->chat();
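
This README does not show the interface a custom driver must implement, so the class below is only a guess at its minimum shape. The `chat()` signature is modelled on the `$driver->chat([...])` call in the LLM Providers section and the `role` / `content` response keys from Quick Start; `EchoDriver` and its `prefix` option are hypothetical. Check the package's driver contract before relying on it.

```php
<?php
declare(strict_types=1);

/** Hypothetical driver that echoes the last message back; useful as a local stub. */
final class EchoDriver
{
    public function __construct(private array $config = [])
    {
    }

    /**
     * @param array<int, array{role: string, content: string}> $messages
     * @return array{role: string, content: string}
     */
    public function chat(array $messages, array $options = []): array
    {
        // Fall back to empty content when no messages were passed.
        $last = end($messages) ?: ['content' => ''];

        return [
            'role' => 'assistant',
            'content' => ($this->config['prefix'] ?? 'echo: ') . $last['content'],
        ];
    }
}

$reply = (new EchoDriver(['prefix' => '> ']))->chat([['role' => 'user', 'content' => 'Hello']]);
```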

License

The MIT License (MIT). See LICENSE for more information.
