Generate product feed for AI Crawlers

Type: magento2-module

pkg:composer/orangecat/feed

1.0.0 2025-11-30 16:15 UTC

This package is auto-updated.

Last update: 2025-11-30 17:12:16 UTC


README

The Orangecat Feed module generates structured product feeds optimized for AI crawlers and Large Language Models (LLMs). By providing data in a machine-readable format (JSON), it helps AI agents index, understand, and retrieve your product catalog information accurately.

Key Features

  • AI-Optimized Output: Generates feeds in JSON, a structured format that AI training and retrieval systems can parse directly.
  • Automated Generation: Uses Magento's Cron system to regenerate feeds on a schedule you define, so the published data stays current.
  • Chunking Support: Capable of handling large catalogs by splitting feeds into smaller, manageable chunks (e.g., 500 products per file).
  • Customizable Content:
    • Select specific product attributes to include (e.g., description, price, stock status).
    • Option to include product images.
    • FAQ Integration: Seamlessly integrates with the Orangecat Faqs module to include product-specific FAQs in the feed, providing richer context for AI models.
  • Multistore Support: Generates separate feeds for each store view, respecting localizations and currency settings.
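To make the customizable content options concrete, here is a minimal Python sketch of how one feed entry might be assembled. The field names (sku, name, price, image, faqs), the input dict layout, and the build_record helper are illustrative assumptions; the module itself is a Magento 2 (PHP) extension and its real payload schema is not documented here.

```python
import json

# Fields treated as always present in this sketch (SKU, name, price).
ALWAYS_INCLUDED = ("sku", "name", "price")


def build_record(product, selected_attributes, include_image=False, faqs=None):
    """Build one feed entry from a product dict (hypothetical layout)."""
    # Start with the fields that are always included.
    record = {key: product[key] for key in ALWAYS_INCLUDED}

    # Copy only the attributes that were selected and actually exist.
    for attribute in selected_attributes:
        if attribute in product:
            record[attribute] = product[attribute]

    # The main image URL is optional.
    if include_image and product.get("image"):
        record["image"] = product["image"]

    # FAQs are embedded only when some were supplied.
    if faqs:
        record["faqs"] = faqs

    return record


if __name__ == "__main__":
    sample = {
        "sku": "MUG-001",
        "name": "Blue Mug",
        "price": 9.5,
        "description": "A ceramic mug.",
        "image": "https://example.com/media/mug.jpg",
    }
    entry = build_record(
        sample,
        selected_attributes=["description"],
        include_image=True,
        faqs=[{"question": "Is it dishwasher safe?", "answer": "Yes."}],
    )
    print(json.dumps(entry, indent=2))
```

Keeping the always-included fields separate from the selected ones means an empty attribute selection still yields a usable entry.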

Configuration

You can configure the module settings in Stores > Configuration > Orange Cat > Product Feed for AI.

General Settings

  • Enable Feed Generation: Turn the automatic feed generation on or off.
  • Cron Schedule: Define how often the feed should be regenerated using standard cron expression syntax (default: daily at 2 AM, which in cron syntax is 0 2 * * *).
  • Feed Filename: Set a custom base name for your feed files (e.g., products).
  • Products Per Chunk: Define the number of products per JSON file. Set to 0 to generate a single large file.
  • Product Attributes: Select which Magento product attributes to include in the feed payload. SKU, Name, and Price are always included.
  • Include Product Image: Toggle the inclusion of the main product image URL.
  • Include FAQs: If the Orangecat Faqs module is installed, this option allows you to embed related FAQs directly into the product data object.
  • Output Format: Choose the output format:
    • JSON: Standard structured data.
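The Products Per Chunk setting can be illustrated with a short Python sketch: a positive value splits the catalog into files of that size, and 0 produces a single file. The function names and the numbered file-naming pattern (products_1.json and so on) are assumptions for illustration only; the module's actual naming scheme is not documented here.

```python
def chunk_products(products, products_per_chunk):
    """Split a product list into chunks; 0 means one single chunk."""
    if products_per_chunk < 0:
        raise ValueError("products_per_chunk must be 0 or greater")
    if products_per_chunk == 0:
        # A value of 0 disables chunking: everything goes into one file.
        return [list(products)]
    return [
        products[start:start + products_per_chunk]
        for start in range(0, len(products), products_per_chunk)
    ]


def chunk_filenames(base_name, chunk_count):
    """Hypothetical naming: one plain file, or numbered files per chunk."""
    if chunk_count <= 1:
        return [f"{base_name}.json"]
    return [f"{base_name}_{index}.json" for index in range(1, chunk_count + 1)]


if __name__ == "__main__":
    catalog = [{"sku": f"SKU-{n}"} for n in range(1200)]
    chunks = chunk_products(catalog, 500)
    for name, chunk in zip(chunk_filenames("products", len(chunks)), chunks):
        print(name, len(chunk))
```

With 1,200 products and 500 per chunk, this yields three chunks of 500, 500, and 200 products.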

Log Cleanup

  • Log Retention (Days): Automatically clean up old generation logs after a specified number of days to save database space.
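As a rough illustration of retention-based cleanup, the Python sketch below selects log entries older than the configured number of days. The created_at field and the expired_logs helper are hypothetical; how the module actually stores and prunes its logs is not described here.

```python
from datetime import datetime, timedelta, timezone


def expired_logs(log_entries, retention_days, now=None):
    """Return the entries whose timestamp is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [entry for entry in log_entries if entry["created_at"] < cutoff]


if __name__ == "__main__":
    reference = datetime(2025, 11, 30, tzinfo=timezone.utc)
    logs = [
        {"id": 1, "created_at": reference - timedelta(days=45)},
        {"id": 2, "created_at": reference - timedelta(days=5)},
    ]
    stale = expired_logs(logs, retention_days=30, now=reference)
    print([entry["id"] for entry in stale])
```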

For Developers

Feed Location

Generated feeds are stored in the pub/media/feed directory (or similar public path depending on configuration), making them easily accessible for external crawlers via HTTP.
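From a consumer's point of view, reading a chunked feed amounts to loading every JSON file that shares the base name and merging the results. The Python sketch below does this for a local copy of the feed directory. It assumes each file holds either a JSON array of products or a single JSON object, and that chunk files share the configured base name as a prefix; both are assumptions, since the file layout is not specified here.

```python
import json
from pathlib import Path


def load_feed(directory, base_name="products"):
    """Merge all feed files matching the base name into one product list."""
    products = []
    # sorted() gives a stable, repeatable order (lexical, not numeric).
    for path in sorted(Path(directory).glob(f"{base_name}*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Accept either a list of products or a single product object.
        products.extend(data if isinstance(data, list) else [data])
    return products


if __name__ == "__main__":
    feed = load_feed("pub/media/feed")
    print(f"Loaded {len(feed)} products")
```

If the directory does not exist or contains no matching files, the function returns an empty list rather than raising an error.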

Extensibility

The module uses a modular architecture for data collection:

  • Data Collectors: You can implement additional data collectors to inject custom data into the feed by extending the module's service layer.
  • Events: The module dispatches events during feed generation so that other modules can modify the feed data.
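The collector-and-event idea can be sketched in a language-agnostic way. The Python example below is not the module's API: FeedBuilder, add_collector, and on_record_built are hypothetical names, and in Magento such hooks are normally wired through PHP classes and observers declared in events.xml. It only shows the pattern of collectors contributing fields and listeners adjusting the finished record.

```python
class FeedBuilder:
    """Hypothetical pipeline: collectors add data, listeners post-process it."""

    def __init__(self):
        self._collectors = []
        self._listeners = []

    def add_collector(self, collector):
        # A collector takes the source product and returns a dict of fields.
        self._collectors.append(collector)

    def on_record_built(self, listener):
        # A listener receives the finished record and may modify it in place.
        self._listeners.append(listener)

    def build(self, product):
        record = {}
        for collector in self._collectors:
            record.update(collector(product))
        # "Dispatch" the event by calling every registered listener.
        for listener in self._listeners:
            listener(record)
        return record


def core_collector(product):
    return {"sku": product["sku"], "name": product["name"], "price": product["price"]}


def warranty_collector(product):
    # Custom data injected by a third-party extension (illustrative).
    return {"warranty_months": product.get("warranty_months", 0)}


def round_price(record):
    record["price"] = round(record["price"], 2)


if __name__ == "__main__":
    builder = FeedBuilder()
    builder.add_collector(core_collector)
    builder.add_collector(warranty_collector)
    builder.on_record_built(round_price)
    print(builder.build({"sku": "MUG-001", "name": "Blue Mug", "price": 9.499}))
```

Separating collectors from listeners keeps data gathering independent of later adjustments, which is the main benefit of the modular design described above.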