README

Build related content links using vector embeddings and pgvector for Laravel.

Features

🔗 Pre-computed Related Links - Related content is calculated on save, not on every page load
🚀 Fast Lookups - Related links are read from precomputed rows with a plain indexed query instead of a real-time similarity search
🔄 Cross-Model Relationships - Find related content across different model types (Blog → Events → Questions)
🧠 Multiple Embedding Providers - Support for OpenAI and Ollama
📦 Queue Support - Process embeddings in the background
🔍 Semantic Search - Search content by meaning, not just keywords

Requirements

PHP 8.3+
Laravel 11, 12, or 13
PostgreSQL with pgvector extension

Installation

1. Install pgvector extension in PostgreSQL

CREATE EXTENSION IF NOT EXISTS vector;

The migration runs this automatically, but CREATE EXTENSION requires a privileged database user. On managed Postgres (RDS, Cloud SQL, Supabase, etc.) enable the extension up front — as shown above or via the provider's dashboard — so the migration only needs to create the tables.

2. Install the package via Composer

composer require vlados/laravel-related-content

3. Publish the config and migrations

php artisan vendor:publish --tag="related-content-config"
php artisan vendor:publish --tag="related-content-migrations"
php artisan migrate

4. Configure your environment

# Embedding provider (openai or ollama)
RELATED_CONTENT_PROVIDER=openai

# OpenAI settings
OPENAI_API_KEY=your-api-key
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_EMBEDDING_DIMENSIONS=1536

# Or Ollama settings
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text

Usage

1. Add the trait to your models

use Vlados\LaravelRelatedContent\Concerns\HasRelatedContent;

class BlogPost extends Model
{
    use HasRelatedContent;

    /**
     * Define which fields should be embedded.
     */
    public function embeddableFields(): array
    {
        return ['title', 'excerpt', 'content'];
    }
}

2. Configure models for cross-model relationships

In config/related-content.php:

'models' => [
    \App\Models\BlogPost::class,
    \App\Models\Event::class,
    \App\Models\Question::class,
],

3. Related content is automatically synced on save

$post = BlogPost::create([
    'title' => 'Electric Vehicle Charging Guide',
    'content' => '...',
]);

// Embedding is generated and related content is found automatically

4. Retrieve related content

// Get all related content
$related = $post->getRelatedModels();

// Get related content of a specific type
$relatedEvents = $post->getRelatedOfType(Event::class);

// Get the raw relationship with similarity scores (this model as source only)
$post->relatedContent()->with('related')->get();

getRelatedModels() and getRelatedOfType() are the complete views — they merge both directions (where this model is the source or the related target). The raw relatedContent() relation returns only the rows where this model is the source; because each pair is stored once (and refreshed on re-sync), prefer the helper methods when you want the full set.

5. Use in Blade templates

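{{-- Use the merged helper for the check too: relatedContent alone misses links where this model is the target --}}
@php
    $related = $post->getRelatedModels(5);
@endphp

@if($related->isNotEmpty())
    <div class="related-content">
        <h3>Related Content</h3>
        @foreach($related as $item)
            <a href="{{ $item->url }}">{{ $item->title }}</a>
        @endforeach
    </div>
@endif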

Artisan Commands

Rebuild embeddings and related content

# Process models missing embeddings (default behavior)
php artisan related-content:rebuild

# Process a specific model (missing only)
php artisan related-content:rebuild "App\Models\BlogPost"

# Force regenerate all embeddings
php artisan related-content:rebuild --force

# Process synchronously (instead of queuing)
php artisan related-content:rebuild --sync

# With custom chunk size
php artisan related-content:rebuild --chunk=50

Semantic Search

You can also use the package for semantic search:

use Vlados\LaravelRelatedContent\Services\RelatedContentService;

$service = app(RelatedContentService::class);

// Search across all embeddable models
$results = $service->search('electric vehicle charging');

// Search specific model types
$results = $service->search('charging stations', [
    \App\Models\Event::class,
    \App\Models\BlogPost::class,
]);

// By default search returns the closest N matches regardless of distance.
// Pass a minimum similarity (0-1) to filter out weak matches:
$results = $service->search('charging stations', [], limit: 10, threshold: 0.5);
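
As a rough illustration of what a 0-1 similarity threshold means, the sketch below computes cosine similarity in plain PHP and drops scores under a cut-off. It assumes the package scores matches by cosine similarity (pgvector's `<=>` operator returns cosine distance, which is 1 minus this value). The helpers `cosineSimilarity()` and `filterByThreshold()` are hypothetical and are not part of the package API.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity of two equal-length numeric vectors.
 * Hypothetical helper for illustration; not part of the package.
 */
function cosineSimilarity(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);

    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    // A zero vector has no direction, so treat it as unrelated.
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }

    return $dot / sqrt($normA * $normB);
}

/**
 * Keep only the scores at or above the threshold, preserving keys.
 */
function filterByThreshold(array $scores, float $threshold): array
{
    return array_filter($scores, static fn ($score): bool => $score >= $threshold);
}

$query = [1.0, 0.0];
$scores = [
    'same direction' => cosineSimilarity($query, [2.0, 0.0]),
    'orthogonal' => cosineSimilarity($query, [0.0, 3.0]),
];

// With a threshold of 0.5 only the 'same direction' entry remains.
$matches = filterByThreshold($scores, 0.5);
```

Vector length does not affect the score, only direction does, which is why a longer document on the same topic can still rank as a close match.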

Configuration

return [
    // Embedding provider: 'openai' or 'ollama'
    'provider' => env('RELATED_CONTENT_PROVIDER', 'openai'),

    // Provider-specific settings
    'providers' => [
        'openai' => [
            'api_key' => env('OPENAI_API_KEY'),
            'base_url' => env('OPENAI_BASE_URL', 'https://api.openai.com/v1'),
            'model' => env('OPENAI_EMBEDDING_MODEL', 'text-embedding-3-small'),
            'dimensions' => env('OPENAI_EMBEDDING_DIMENSIONS', 1536),
        ],
        'ollama' => [
            'base_url' => env('OLLAMA_BASE_URL', 'http://localhost:11434'),
            'model' => env('OLLAMA_EMBEDDING_MODEL', 'nomic-embed-text'),
            'dimensions' => env('OLLAMA_EMBEDDING_DIMENSIONS', 768),
        ],
    ],

    // Maximum related items per model
    'max_related_items' => 10,

    // Minimum similarity threshold (0-1)
    'similarity_threshold' => 0.5,

    // Queue settings
    'queue' => [
        'connection' => 'default',
        'name' => 'default',
    ],

    // Models to include in cross-model relationships
    'models' => [],

    // Database table names
    'tables' => [
        'embeddings' => 'embeddings',
        'related_content' => 'related_content',
    ],
];

Embedding dimensions

The active provider's dimensions value is the single source of truth. It sizes the vector column when the migration runs and determines the length of every stored vector, so the two can never drift. The top-level dimensions key is only a fallback used when the active provider does not define its own.

Because the column width is fixed at migration time, changing the effective dimension count (or switching to a provider with a different one) requires a fresh migration of the embeddings table — re-run related-content:rebuild afterwards to regenerate the vectors.
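
The resolution rule described above can be sketched in plain PHP. This is only an illustration of the documented behaviour (provider value first, top-level `dimensions` as fallback), not the package's actual code; `effectiveDimensions()` and `assertVectorLength()` are hypothetical names.

```php
<?php

declare(strict_types=1);

/**
 * Resolve the dimension count from a config array shaped like
 * config/related-content.php. Hypothetical helper for illustration.
 */
function effectiveDimensions(array $config): int
{
    $provider = (string) ($config['provider'] ?? '');

    // The active provider's value wins; the top-level key is only a fallback.
    $value = $config['providers'][$provider]['dimensions']
        ?? $config['dimensions']
        ?? null;

    if ($value === null) {
        throw new RuntimeException('No embedding dimensions configured.');
    }

    // env() returns strings when the variable is set, so cast explicitly.
    return (int) $value;
}

/**
 * Guard against storing a vector that does not fit the migrated column.
 */
function assertVectorLength(array $vector, int $dimensions): void
{
    if (count($vector) !== $dimensions) {
        throw new LengthException(sprintf(
            'Expected %d dimensions, got %d.',
            $dimensions,
            count($vector)
        ));
    }
}

$config = [
    'provider' => 'ollama',
    'dimensions' => 1536,
    'providers' => [
        'openai' => ['dimensions' => '1536'],
        'ollama' => ['dimensions' => 768],
    ],
];

$dimensions = effectiveDimensions($config); // resolves the ollama value
```

A check like `assertVectorLength()` makes a provider switch without a fresh migration fail loudly instead of surfacing later as a database error.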

Disabling

Set RELATED_CONTENT_ENABLED=false (or leave the provider's credentials empty) and the package degrades gracefully: no jobs are dispatched and no embeddings are written, so existing rows are left untouched.

Events

The package dispatches events you can listen to:

use Vlados\LaravelRelatedContent\Events\RelatedContentSynced;

class HandleRelatedContentSynced
{
    public function handle(RelatedContentSynced $event): void
    {
        // $event->model - The model that was synced
    }
}

How It Works

1. On Model Save: When a model with HasRelatedContent is saved, a job is dispatched.
2. Generate Embedding: The job generates a vector embedding from the model's embeddable fields.
3. Find Similar: pgvector finds similar content across all configured models.
4. Store Links: The related content relationships are stored in the related_content table.
5. Fast Retrieval: Displaying related content is a simple database lookup (no API calls).

Bidirectional Relationships

Related content works in both directions automatically. When a new BlogPost is saved and finds an Event as related, the Event will also show the BlogPost in its related content - without needing to re-sync the Event.

This is achieved by querying both directions:

Forward: where this model is the source
Reverse: where this model is the related target

Results are deduplicated and sorted by similarity score.
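
The two-direction lookup can be sketched over plain arrays. This is a minimal illustration, not the package's query: the row shape (`source_type`, `source_id`, `related_type`, `related_id`, `similarity`) and the function name `mergeRelated()` are assumptions made for the example.

```php
<?php

declare(strict_types=1);

/**
 * Collect everything linked to one model, whichever side of the row it is on.
 * Hypothetical helper; the row keys are assumed, not taken from the package.
 */
function mergeRelated(string $type, int $id, array $rows, ?int $limit = null): array
{
    $best = [];

    foreach ($rows as $row) {
        if ($row['source_type'] === $type && $row['source_id'] === $id) {
            // Forward: this model is the source, the other side is the target.
            $other = [$row['related_type'], $row['related_id']];
        } elseif ($row['related_type'] === $type && $row['related_id'] === $id) {
            // Reverse: this model is the target, the other side is the source.
            $other = [$row['source_type'], $row['source_id']];
        } else {
            continue;
        }

        // Deduplicate by "type:id", keeping the highest similarity seen.
        $key = $other[0] . ':' . $other[1];

        if (!isset($best[$key]) || $row['similarity'] > $best[$key]['similarity']) {
            $best[$key] = [
                'type' => $other[0],
                'id' => $other[1],
                'similarity' => $row['similarity'],
            ];
        }
    }

    $result = array_values($best);

    // Highest similarity first.
    usort($result, static fn (array $a, array $b): int => $b['similarity'] <=> $a['similarity']);

    return $limit === null ? $result : array_slice($result, 0, $limit);
}

$rows = [
    ['source_type' => 'BlogPost', 'source_id' => 1, 'related_type' => 'Event', 'related_id' => 7, 'similarity' => 0.9],
    ['source_type' => 'Question', 'source_id' => 3, 'related_type' => 'BlogPost', 'related_id' => 1, 'similarity' => 0.7],
    ['source_type' => 'Event', 'source_id' => 7, 'related_type' => 'BlogPost', 'related_id' => 1, 'similarity' => 0.8],
    ['source_type' => 'BlogPost', 'source_id' => 2, 'related_type' => 'Event', 'related_id' => 7, 'similarity' => 0.95],
];

$forPost = mergeRelated('BlogPost', 1, $rows);
$forEvent = mergeRelated('Event', 7, $rows, 1);
```

In this sample the Event appears for BlogPost 1 through a forward row and the Question through a reverse row, so neither needs its own re-sync to show up.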

When a model is re-synced (because its embeddable content changed), every link incident to it is rebuilt in both directions, so similarity scores never go stale. Links to models that are no longer mutually similar are rebuilt when those models are next synced; to refresh the whole graph at once, run related-content:rebuild --force.

Search accuracy at scale

The embeddings table uses an HNSW index, which performs approximate nearest-neighbour search. When you mix several model types and filter by type (or by the similarity threshold), pgvector may occasionally return fewer than max_related_items candidates because the type filter is applied after the index narrows the search. If you rely on cross-model results over a large dataset, raise hnsw.ef_search for the session or consider the iterative index scans available in pgvector 0.8+.
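
The two tuning options mentioned above map to pgvector session settings. The sketch below only builds the SQL strings; `hnswSessionStatements()` is a hypothetical helper, and the value 100 is an arbitrary example rather than a recommendation. The setting names `hnsw.ef_search` (default 40, range 1 to 1000) and `hnsw.iterative_scan` (`off`, `strict_order`, `relaxed_order`, pgvector 0.8+) come from pgvector itself.

```php
<?php

declare(strict_types=1);

/**
 * Build the SET statements that tune HNSW search for the current session.
 * Hypothetical helper for illustration; not part of the package.
 */
function hnswSessionStatements(int $efSearch, ?string $iterativeScan = null): array
{
    if ($efSearch < 1 || $efSearch > 1000) {
        throw new InvalidArgumentException('hnsw.ef_search must be between 1 and 1000.');
    }

    $statements = ["SET hnsw.ef_search = {$efSearch}"];

    if ($iterativeScan !== null) {
        $allowed = ['off', 'strict_order', 'relaxed_order'];

        if (!in_array($iterativeScan, $allowed, true)) {
            throw new InvalidArgumentException('Unknown hnsw.iterative_scan mode.');
        }

        // Requires pgvector 0.8 or newer.
        $statements[] = "SET hnsw.iterative_scan = {$iterativeScan}";
    }

    return $statements;
}

$statements = hnswSessionStatements(100, 'strict_order');
```

In a Laravel app each string could be passed to `DB::statement()` on the same connection before running the search. Use `SET LOCAL` inside a transaction if the change should not outlive it.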

Performance

Embedding Generation: ~200-500ms per model (depends on text length and provider)
Related Content Lookup: ~5ms (simple database query)
Storage: ~6KB per embedding (1536 dimensions x 4 bytes)
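
The storage figure can be reproduced with a little arithmetic. pgvector documents each vector as taking 4 bytes per dimension plus 8 bytes of overhead; the sketch below applies that formula. The function names are hypothetical, and the estimate ignores row overhead and the HNSW index, so treat it as a lower bound.

```php
<?php

declare(strict_types=1);

/**
 * Bytes used by one pgvector value: 4 bytes per dimension plus an 8-byte header.
 */
function vectorBytes(int $dimensions): int
{
    return 4 * $dimensions + 8;
}

/**
 * Rough size of the raw vectors for a number of rows, in mebibytes.
 */
function estimateVectorStorageMiB(int $dimensions, int $rows): float
{
    return vectorBytes($dimensions) * $rows / (1024 * 1024);
}

$openAiBytes = vectorBytes(1536); // 4 * 1536 + 8
$ollamaBytes = vectorBytes(768);  // 4 * 768 + 8
$hundredThousand = estimateVectorStorageMiB(1536, 100_000);
```

At the default 768 dimensions the Ollama provider therefore stores roughly half as much per row as OpenAI at 1536.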

License

MIT License. See LICENSE for more information.

Credits

Vladislav Stoitsov
