drupal/ai_recipe_content_search_vector

Provides basic configuration for implementing RAG search within site content.

Package info

git.drupalcode.org/project/ai_recipe_content_search_vector.git

Type: drupal-recipe

pkg:composer/drupal/ai_recipe_content_search_vector

1.x-dev 2026-06-11 15:15 UTC

This package is auto-updated.

Last update: 2026-06-11 13:16:00 UTC


README

Sets up a vector search index over all site nodes using Search API and the AI Search module. Provides the foundation for RAG (Retrieval-Augmented Generation) features such as AI-powered chatbots and semantic search.

Requirements

  • Drupal 10.5+ or 11.2+
  • AI module with a default provider configured for the embeddings operation type (e.g. OpenAI text-embedding-3-small or text-embedding-ada-002), set at /admin/config/ai/settings → Default Providers
  • A vector-capable database backend supported by search_api_ai_search (e.g. PostgreSQL with the pgvector extension)

Apply

composer require drupal/ai_recipe_content_search_vector
php core/scripts/drupal recipe recipes/contrib/ai_recipe_content_search_vector
drush cache:rebuild

If no default embeddings provider is configured, applying the recipe aborts and rolls back. Configure a default embeddings provider, then re-run the recipe command.

What it does

  • Installs the node, search_api, and ai_search modules.
  • Creates a search_index view mode for every node type by cloning the existing Default view display. This view mode is used to render nodes into clean HTML for embedding.
  • Creates a Search API server (content_vector) using the search_api_ai_search backend with the following embedding strategy:
    • Strategy: contextual chunks
    • Chunk size: 1000 characters, minimum overlap: 100 characters
    • Contextual content contribution: up to 30% of each chunk
    • Vector similarity metric: cosine similarity
    • Storage collection: content_database_index
  • Creates a Search API index (content_vector) that indexes all node bundles and all languages, with three fields:

    Field          Type    Purpose
    rendered_item  text    Rendered HTML (anonymous, search_index view mode); main content for embedding
    title          text    Node title; contextual content
    url            string  Absolute URL; contextual content, passed through to search results
  • Index options: nodes are indexed immediately on save (index_directly), cron processes up to 5 pending items per run as a fallback, failed items are removed from the index (delete_on_fail), and reference changes are tracked.
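
The cosine similarity metric named in the server configuration above can be illustrated with a small sketch. In practice the vector database computes this, not application code; the function names cosine_similarity and rank_chunks and the toy two-dimensional vectors are assumptions made for illustration and are not part of the recipe or the AI Search module.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Return (chunk_id, score) pairs, most similar chunk first.

    chunk_vecs is a hypothetical mapping of chunk id -> embedding vector.
    """
    scored = [(cid, cosine_similarity(query_vec, vec))
              for cid, vec in chunk_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy embeddings; real embedding vectors have hundreds of dimensions.
    chunks = {"about-page#0": [0.0, 1.0], "pricing-page#0": [1.0, 0.2]}
    print(rank_chunks([1.0, 0.0], chunks))
```

Because cosine similarity depends only on the angle between vectors, a long chunk and a short chunk on the same topic can score equally well against a query.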

Indexing content

After applying the recipe, trigger an initial full index:

drush search-api:index content_vector

Or visit Configuration → Search API → Content Vector and click Index now. Progress can be monitored on the same page.

New and updated nodes are re-indexed automatically on save. To re-index all content (e.g. after changing chunk settings or the embedding model):

drush search-api:clear content_vector
drush search-api:index content_vector

Customising the index

Adding or excluding content types

Visit Configuration → Search API → Content Vector → Edit and adjust the datasource bundle selection under Node → Bundles.

Changing what gets embedded

Edit the search_index view display for a content type at Structure → Content types → [type] → Manage display → Search index. Remove fields that should not be embedded, or add fields relevant to search quality.

Adjusting chunk settings

Edit the content_vector Search API server at Configuration → Search API → Content Vector server → Edit. Changing chunk size or overlap requires re-indexing all content.
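
To see why a change to chunk size or overlap invalidates every stored embedding, the following sketch splits text by character count with the recipe's defaults (1000 characters, 100 overlap, contextual content capped at 30% of the chunk). It is a simplified illustration under those assumptions, not the AI Search module's actual chunking code; chunk_text and its parameters are hypothetical names.

```python
def chunk_text(body, title="", url="", chunk_size=1000, overlap=100,
               context_ratio=0.30):
    """Split body into overlapping chunks, each prefixed with context.

    The context (title and URL) is truncated to context_ratio of
    chunk_size; the remaining budget is filled with body text.
    """
    context = "\n".join(part for part in (title, url) if part)
    if context:
        context += "\n"
    context = context[: int(chunk_size * context_ratio)]

    body_budget = chunk_size - len(context)  # body characters per chunk
    step = body_budget - overlap             # how far each chunk advances
    if step <= 0:
        raise ValueError("overlap must be smaller than the body budget")

    chunks = []
    start = 0
    while start < len(body):
        chunks.append(context + body[start:start + body_budget])
        if start + body_budget >= len(body):
            break  # the last chunk reached the end of the body
        start += step
    return chunks


if __name__ == "__main__":
    text = "".join(str(i % 10) for i in range(2500))
    print(len(chunk_text(text)))                              # default settings
    print(len(chunk_text(text, chunk_size=500, overlap=50)))  # smaller chunks
```

Every chunk boundary moves when either setting changes, so embeddings computed under the old settings no longer correspond to the text the index would produce.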

Cost note

Every node save triggers one or more embeddings API calls — one per chunk produced from the rendered content. Long nodes produce more chunks. Bulk re-indexing a large content library will incur proportional API cost. Review your provider's per-token pricing before running a full re-index in production.
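
A rough way to size a full re-index before running it is to count the chunks each node would produce, since each chunk costs one embeddings call. The sketch below assumes character-based chunking with the recipe's defaults and ignores the space taken by contextual content, so real counts may be somewhat higher; estimate_chunks and estimate_reindex_calls are hypothetical helper names, and the input lengths are example values, not measurements.

```python
import math


def estimate_chunks(length, chunk_size=1000, overlap=100):
    """Estimate how many chunks a rendered node of `length` characters yields."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if length <= 0:
        return 0
    if length <= chunk_size:
        return 1
    step = chunk_size - overlap  # new characters covered by each extra chunk
    return 1 + math.ceil((length - chunk_size) / step)


def estimate_reindex_calls(lengths, chunk_size=1000, overlap=100):
    """Sum the estimated embeddings calls over all rendered node lengths."""
    return sum(estimate_chunks(n, chunk_size, overlap) for n in lengths)


if __name__ == "__main__":
    # Example rendered lengths in characters for three nodes.
    rendered_lengths = [500, 2500, 12000]
    print(estimate_reindex_calls(rendered_lengths))
```

Multiplying the result by the average tokens per chunk and the provider's per-token price gives an order-of-magnitude cost figure.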

Issue queue

Bugs and feature requests: https://www.drupal.org/project/issues/ai_recipe_content_search_vector