subhashladumor1/laravel-ai-guard

Laravel AI Guard πŸ›‘οΈ β€” AI cost & budget control for Laravel AI SDK. Track token usage, control OpenAI & LLM spending, enforce AI budgets, and prevent unexpected billing spikes.


Latest version: 1.0.2 (released 2026-02-10 UTC)


Track costs β€’ Set budgets β€’ Never get surprised by the bill.

Laravel AI Guard is a powerful AI cost optimization package built for the Laravel AI SDK (12.x) πŸš€. It helps Laravel developers track OpenAI & LLM token usage πŸ“Š, estimate AI costs before execution ⚠️, enforce per-user or per-tenant AI budgets 🧾, and prevent unexpected AI billing spikes πŸ’₯ in production.

Designed for Laravel SaaS applications, APIs, and AI-powered platforms, Laravel AI Guard acts as a financial firewall πŸ›‘οΈ between your app and AI providers, keeping AI usage safe, predictable, and cost-efficient πŸ’Έ.

---

πŸ“‘ Quick Navigation

What's Inside • How It Works • Quick Start • Usage Examples • Configuration • Package Structure

✨ What's Inside

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                                                                                  β”‚
β”‚   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”‚
β”‚   β”‚   TRACK     β”‚  β”‚   BUDGET    β”‚  β”‚  ESTIMATE   β”‚  β”‚   BLOCK     β”‚           β”‚
β”‚   β”‚  Every call β”‚  β”‚ Per user/   β”‚  β”‚ Before you  β”‚  β”‚ Over-spend  β”‚           β”‚
β”‚   β”‚  in DB      β”‚  β”‚ tenant/app  β”‚  β”‚ call (free) β”‚  β”‚ requests    β”‚           β”‚
β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜           β”‚
β”‚                                                                                  β”‚
β”‚                        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                              β”‚
β”‚                        β”‚   🚨 KILL SWITCH        β”‚                              β”‚
β”‚                        β”‚   Disable all AI        β”‚                              β”‚
β”‚                        β”‚   in one config change  β”‚                              β”‚
β”‚                        β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                              β”‚
β”‚                                                                                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Works with: Laravel AI SDK (12.x) β€’ OpenAI β€’ Anthropic β€’ Any AI API

πŸ”„ How It Works

Request Flow (Before β†’ During β†’ After)

flowchart TD
    subgraph BEFORE["πŸ›‘οΈ BEFORE"]
        A[Request arrives] --> B{Budget OK?}
        B -->|Yes| C[Optional: Estimate cost]
        B -->|No| D[❌ Block - 402]
        C --> E[Continue]
    end

    subgraph DURING["⚑ DURING"]
        E --> F[Your app calls AI]
        F --> G[Laravel AI SDK or any API]
    end

    subgraph AFTER["πŸ“Š AFTER"]
        G --> H[Record tokens, cost, user]
        H --> I[Save to ai_usages]
        I --> J[Update ai_budgets]
    end

    BEFORE --> DURING --> AFTER

Budget Hierarchy (Checked in Order)

flowchart LR
    subgraph layers["Budget layers checked top to bottom"]
        direction TB
        A["🌍 GLOBAL<br/>Whole app limit"]
        B["🏒 TENANT<br/>Org/team limit"]
        C["πŸ‘€ USER<br/>Per-user limit"]
    end

    A --> B --> C

    C --> D{All OK?}
    D -->|Yes βœ“| E[Allow request]
    D -->|Any exceeded βœ—| F[Block - 402]

TL;DR: The Laravel AI SDK makes the AI calls. Laravel AI Guard decides whether a call is allowed and records how much it cost. They work together.
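The three layers map to limits in config/ai-guard.php. A minimal sketch of the budgets block (key names are illustrative; check the published config for the exact structure):

```php
'budgets' => [
    'period' => 'month',           // when usage counters reset
    'global' => ['limit' => 100],  // whole-app cap (USD)
    'tenant' => ['limit' => 50],   // per-tenant cap
    'user'   => ['limit' => 10],   // per-user cap
],
```

All three layers are checked on every guarded request; if any one is exceeded, the request is blocked with a 402.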

πŸ€” Why Should I Care?

     WITHOUT AI GUARD                    WITH AI GUARD
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  πŸ’Έ Surprise bill       β”‚      β”‚  πŸ“Š Full visibility     β”‚
β”‚  πŸ› Runaway loop?       β”‚  β†’   β”‚  πŸ›‘ Budget limits       β”‚
β”‚  😰 Invoice shock       β”‚      β”‚  😌 Predictable costs   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

AI APIs charge by the token. One heavy user, one bugβ€”and your bill spikes. Most apps don't track until the invoice arrives. AI Guard gives you visibility, limits, and control.

πŸ“ Under the Hood

Cost Calculation

flowchart LR
    subgraph inputs["Usage Inputs"]
        A[Input Tokens]
        B[Output Tokens]
        C[Cache Hits/Writes]
        D[Images/Audio/Video]
    end

    subgraph calculation["Calculation"]
        E["Text Cost<br/>(Standard + Long Context)"]
        F["Cache Cost<br/>(Read + Write)"]
        G["Multimodal Cost<br/>(Pixel/Second/Token)"]
    end

    subgraph result["Total"]
        H["Total Cost $"]
    end

    A --> E
    B --> E
    C --> F
    D --> G

    E --> H
    F --> H
    G --> H

Example: 500 input + 200 output tokens (gpt-4o: $0.0025/1k in, $0.01/1k out)

Step          Calculation              Result
Input cost    (500 ÷ 1000) × 0.0025    $0.00125
Output cost   (200 ÷ 1000) × 0.01      $0.00200
Total                                  $0.00325
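The same arithmetic in plain PHP (a standalone sketch for illustration, not the package's internal CostCalculator):

```php
<?php

// Per-1k-token rates for gpt-4o, as listed above.
function textCost(int $inputTokens, int $outputTokens, array $rates): float
{
    $inputCost  = ($inputTokens  / 1000) * $rates['input'];   // 0.00125
    $outputCost = ($outputTokens / 1000) * $rates['output'];  // 0.00200
    return $inputCost + $outputCost;
}

echo textCost(500, 200, ['input' => 0.0025, 'output' => 0.01]); // 0.00325
```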

Cost Optimization (Context Caching) ⚑

Laravel AI Guard supports advanced pricing models, including Context Caching (Anthropic, Gemini, OpenAI), so cache savings are tracked accurately.

Supported Pricing Dimensions:

  • Input Tokens (Standard)
  • Output Tokens (Standard)
  • Cached Input Tokens (Read from cache β€” typically ~50-90% cheaper)
  • Cache Creation Tokens (Write to cache β€” sometimes higher cost)
  • Long context (e.g. >200k tokens β€” premium input_long / output_long rates)
  • Modality-specific: image tokens, audio tokens, per image, per second video, per minute transcription, TTS per 1M characters, web search per 1k calls, embeddings per 1k tokens

Configuration Example (config/ai-guard.php):

'claude-3-5-sonnet' => [
    'input' => 0.003,
    'output' => 0.015,
    'cache_write' => 0.00375, // +25% overhead
    'cached_input' => 0.0003, // -90% savings
],

The package automatically detects cache usage from provider responses and applies the correct lower rate.
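To see why caching matters, here is the arithmetic for a 10,000-token prompt at the claude-3-5-sonnet rates above (a standalone sketch; the package applies these rates for you):

```php
<?php

$rates = [
    'input'        => 0.003,   // standard input, per 1k tokens
    'cache_write'  => 0.00375, // first call: write to cache (+25%)
    'cached_input' => 0.0003,  // later calls: read from cache (-90%)
];

$tokens = 10000;

$uncached  = ($tokens / 1000) * $rates['input'];        // $0.0300 per uncached call
$firstCall = ($tokens / 1000) * $rates['cache_write'];  // $0.0375 one-time write
$cachedHit = ($tokens / 1000) * $rates['cached_input']; // $0.0030 per cached call
```

After the one-time write overhead, every subsequent hit on that context costs a tenth of the uncached price.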

Supported Providers, Models & Cost Coverage πŸ“

Pricing is aligned with official 2026 API docs for accurate cost calculation across Chat, Assistants, Agents, and modality-specific use cases.

Provider coverage:

  • OpenAI: GPT-5.x, GPT-4o, o1, Realtime (Audio/Text), DALL·E 3, Whisper, TTS, Web Search
  • Google Gemini: Gemini 3 Pro/Flash, 2.5 Pro/Flash, 1.5, Imagen 3, Veo (Video), Embeddings
  • Anthropic: Claude 4.5, 3.5 Sonnet, 3 Opus, Haiku, Prompt Caching, Long Context
  • xAI: Grok 4, Grok 3, Grok Beta, Web Search Tool
  • Mistral AI: Mistral Large 2, Small, Codestral, Embeddings
  • DeepSeek: DeepSeek-V3, R1 (Reasoner), Cache Hit/Miss pricing

Full Multimodal Cost Support:

  • LLM / Chat: Input, Output, Cached Input, Cache Write, Long-Context pricing
  • Agents: Web Search (per 1k calls), Code Interpreter (Session based)
  • Audio:
    • Input: Audio tokens (e.g. Gemini 2.5 Flash audio_in, GPT-4o audio_in)
    • Output: Audio tokens (e.g. GPT-4o audio_out)
    • Transcription: Per minute (Whisper)
    • TTS: Per 1M characters (OpenAI TTS)
  • Video:
    • Input: Video tokens (e.g. Gemini video_in)
    • Generation: Per second (Veo per_second_video)
  • Image:
    • Input: Image tokens (e.g. GPT-4o image_in)
    • Generation: Per image (DALLΒ·E 3, Imagen)
  • Embeddings: Per 1k tokens

Pass extended usage when recording to get accurate totals:

AIGuard::recordAndApplyBudget([
    'provider' => 'gemini',
    'model' => 'gemini-2.5-flash',
    'input_tokens' => 1000,
    'output_tokens' => 200,
    'usage' => [
        'input_tokens' => 1000,           // Text tokens
        'output_tokens' => 200,           // Text output
        'video_tokens_in' => 5000,        // Video understanding tokens
        'audio_tokens_in' => 2000,        // Audio input tokens
        'images_generated' => 1,          // Image gen quantity
        'web_search_calls' => 2,          // Per-call tool usage
    ],
    'user_id' => auth()->id(),
]);

Estimation (No API Call = No Cost)

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  AIGuard::estimate($prompt)                               β”‚
β”‚                                                           β”‚
β”‚  Input tokens  β‰ˆ  characters Γ· 4    (configurable)       β”‚
β”‚  Output tokens β‰ˆ  input Γ— 0.5       (configurable)       β”‚
β”‚                                                           β”‚
β”‚  "Write a short poem" (18 chars) β†’ ~5 in, ~3 out β†’ 8     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Kill Switch

Method                How
.env (recommended)    AI_GUARD_DISABLED=true
Config                'ai_disabled' => true

Result: Middleware returns 503 Service Unavailable β€” no AI calls get through.

πŸ’‘ 5 Ways to Reduce AI Costs

    β‘  ESTIMATE         β‘‘ BUDGET          β‘’ TRACK           β‘£ KILL SWITCH      β‘€ TAG
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Show cost   β”‚   β”‚ Set limits   β”‚   β”‚ Run report  β”‚   β”‚ Emergency   β”‚   β”‚ Break down  β”‚
β”‚ before call β”‚   β”‚ per user/    β”‚   β”‚ to see      β”‚   β”‚ stop all    β”‚   β”‚ by feature  β”‚
β”‚             β”‚   β”‚ tenant       β”‚   β”‚ where $ goesβ”‚   β”‚ AI if neededβ”‚   β”‚ (chat, etc) β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“‹ Requirements

Requirement       Version
PHP               8.1+
Laravel           10.x, 11.x, or 12.x
Laravel AI SDK    Optional (for agents/streaming)

πŸš€ Quick Start (3 Steps)

flowchart LR
    subgraph step1["Step 1"]
        A[composer require]
    end

    subgraph step2["Step 2"]
        B[publish config<br/>& migrations]
    end

    subgraph step3["Step 3"]
        C[migrate]
    end

    A --> B --> C

1. Install

composer require subhashladumor1/laravel-ai-guard

2. Publish & migrate

php artisan vendor:publish --tag=ai-guard-config
php artisan vendor:publish --tag=ai-guard-migrations
php artisan migrate

3. Optional β€” translations

php artisan vendor:publish --tag=ai-guard-lang

Creates two tables: ai_usages (tracks every request and cost) and ai_budgets (stores current usage vs limit).

βš™οΈ Configuration

Edit config/ai-guard.php after publishing:

Setting            Purpose
ai_disabled        Turn off all AI
pricing            Cost per 1k tokens per model
default_model      Fallback model (e.g. gpt-4o)
default_provider   Fallback provider (e.g. openai)
budgets            Limits (global, user, tenant) and reset period
estimation         Chars per token, output multiplier

Example .env:

AI_GUARD_DISABLED=false
AI_GUARD_GLOBAL_LIMIT=100
AI_GUARD_USER_LIMIT=10
AI_GUARD_TENANT_LIMIT=50

πŸ“– Usage Examples

With Laravel AI SDK (12.x)

sequenceDiagram
    participant App
    participant AIGuard
    participant AI

    App->>AIGuard: checkAllBudgets()
    App->>AIGuard: estimate(prompt)
    App->>AI: prompt()
    AI-->>App: response
    App->>AIGuard: recordFromResponse()
// 1. Before β€” check budget
AIGuard::checkAllBudgets(auth()->id(), $tenantId);
$estimate = AIGuard::estimate($userPrompt);

// 2. Call AI (as normal)
$response = (new YourAgent)->prompt($userPrompt);

// 3. After β€” record usage
AIGuard::recordFromResponse($response, userId: auth()->id(), tenantId: $tenantId, tag: 'chat');

Multi-model: Pass model and provider so estimate and budgets use the right cost:

$estimate = AIGuard::estimate($userPrompt, model: 'gpt-4o-mini', provider: 'openai');
AIGuard::recordFromResponse($response, userId: auth()->id(), provider: 'openai', model: 'gpt-4o-mini');

Streaming: record in ->then() callback when stream finishes.

With Any Other AI API

// Before β€” same
AIGuard::checkAllBudgets(auth()->id(), $tenantId);

// After β€” record manually
AIGuard::recordAndApplyBudget([
    'provider' => 'openai',
    'model' => 'gpt-4o',
    'input_tokens' => 400,
    'output_tokens' => 250,
    'user_id' => auth()->id(),
    'tenant_id' => $tenantId,
    'tag' => 'chat',
]);

Extended usage (audio, video, image, tools) β€” pass a usage array for accurate cost when using modalities or tools:

AIGuard::recordAndApplyBudget([
    'provider' => 'openai',
    'model' => 'gpt-4o',
    'input_tokens' => 500,
    'output_tokens' => 300,
    'usage' => [
        'input_tokens' => 500,
        'output_tokens' => 300,
        'cached_input_tokens' => 0,
        'images_generated' => 2,           // DALLΒ·E / image models
        'web_search_calls' => 5,           // agent tool calls
        'transcription_minutes' => 1.5,    // Whisper / transcribe
        'tts_characters' => 2500,         // TTS
        'embedding_tokens' => 1000,        // embeddings
        'video_seconds' => 10,             // Veo / video gen
    ],
    'user_id' => auth()->id(),
    'tag' => 'agent-with-search',
]);

Multi-model and dynamic cost (no config change)

Cost is resolved in order: per-call override β†’ runtime pricing β†’ config. So you can support many models and change costs at runtime without editing config/ai-guard.php.

1. Per-call pricing override β€” pass pricing for a single estimate or record:

// Estimate with custom cost per 1k tokens (no config entry needed)
$estimate = AIGuard::estimate($userPrompt, 'my-model', 'my-provider', [
    'input' => 0.001,
    'output' => 0.002,
]);

// Record with custom pricing when cost isn't pre-calculated
AIGuard::recordFromResponse($response, auth()->id(), $tenantId, 'openai', 'gpt-4o', 'chat', [
    'input' => 0.0025,
    'output' => 0.01,
]);

// record() can omit 'cost' and use 'pricing' to calculate
AIGuard::record([
    'provider' => 'openai',
    'model' => 'gpt-4o',
    'input_tokens' => 400,
    'output_tokens' => 250,
    'pricing' => ['input' => 0.0025, 'output' => 0.01],
    'user_id' => auth()->id(),
]);

2. Runtime pricing registry β€” register models once (e.g. in a service provider or from DB); then estimate() and recording use them automatically:

$calc = AIGuard::getCostCalculator();

// Single model
$calc->setPricing('openai', 'gpt-4o-mini', ['input' => 0.00015, 'output' => 0.0006]);

// Many models at once
$calc->setPricingMap([
    'openai' => [
        'gpt-4o' => ['input' => 0.0025, 'output' => 0.01],
        'gpt-4o-mini' => ['input' => 0.00015, 'output' => 0.0006],
    ],
    'anthropic' => [
        'claude-3-5-sonnet' => ['input' => 0.003, 'output' => 0.015],
    ],
]);

// Now estimate/record use these models without config
$estimate = AIGuard::estimate($userPrompt, 'gpt-4o-mini', 'openai');
AIGuard::checkAllBudgets(auth()->id(), $tenantId);

Add, update or remove models at runtime:

$calc = AIGuard::getCostCalculator();

// Add or update a model
$calc->setPricing('openai', 'gpt-4o', ['input' => 0.0025, 'output' => 0.01]);

// Remove a model from runtime (falls back to config, or 0 if not in config)
$calc->removePricing('openai', 'gpt-4o');

// Clear all runtime pricing
$calc->clearRuntimePricing();

Config file β€” publish and edit config/ai-guard.php to add, remove or update models permanently:

'pricing' => [
    'openai' => [
        'gpt-4o' => ['input' => 0.0025, 'output' => 0.01],
        'gpt-4o-mini' => ['input' => 0.00015, 'output' => 0.0006],
        // Add new models here
    ],
    // Add new providers here
],

Budget checks use the same cost you record (per user/tenant), so multi-model costs and budgets work together.

Middleware

Route::post('/chat', ChatController::class)->middleware('ai.guard');

Condition     Response
Over budget   402 + JSON
AI disabled   503

Artisan Commands

Command                                        Purpose
php artisan ai-guard:report                    Usage & cost report
php artisan ai-guard:report --period=month     Monthly report
php artisan ai-guard:report --days=7           Last 7 days
php artisan ai-guard:reset-budgets             Reset budgets when period ends
php artisan ai-guard:reset-budgets --dry-run   Preview only

Schedule reset: $schedule->command('ai-guard:reset-budgets')->daily();
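In an application this typically lives in the console kernel (Laravel 10) or routes/console.php (Laravel 11+). A sketch using the standard Laravel scheduler:

```php
// app/Console/Kernel.php (Laravel 10)
use Illuminate\Console\Scheduling\Schedule;

protected function schedule(Schedule $schedule): void
{
    // Roll budgets over once their period ends.
    $schedule->command('ai-guard:reset-budgets')->daily();

    // Optional: generate a monthly usage report.
    $schedule->command('ai-guard:report --period=month')->monthlyOn(1, '08:00');
}
```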

πŸ—‚οΈ Package Structure

flowchart TB
    subgraph entry["Entry Points"]
        F[AIGuard Facade]
        M[EnforceAIBudget Middleware]
        C1[ai-guard:report]
        C2[ai-guard:reset-budgets]
    end

    subgraph core["Core"]
        GM[GuardManager]
    end

    subgraph services["Services"]
        BR[BudgetResolver]
        BE[BudgetEnforcer]
        TE[TokenEstimator]
        CC[CostCalculator]
    end

    subgraph storage["Storage"]
        AU[AiUsage]
        AB[AiBudget]
    end

    F --> GM
    M --> GM
    C1 --> GM
    C2 --> GM
    GM --> BR
    GM --> BE
    GM --> TE
    GM --> CC
    BR --> AB
    BE --> AB
    CC --> AU
laravel-ai-guard/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ GuardManager.php          # Core logic
β”‚   β”œβ”€β”€ Facades/AIGuard.php
β”‚   β”œβ”€β”€ Budget/                   # BudgetResolver, BudgetEnforcer
β”‚   β”œβ”€β”€ Cost/                     # TokenEstimator, CostCalculator
β”‚   β”œβ”€β”€ Models/                   # AiUsage, AiBudget
β”‚   β”œβ”€β”€ Middleware/
β”‚   β”œβ”€β”€ Commands/
β”‚   └── Exceptions/
β”œβ”€β”€ database/migrations/
β”œβ”€β”€ lang/                         # 11 locales
└── tests/

🌍 Real-World Scenarios

1. The "Safe" Chatbot πŸ€– (OpenAI + Laravel AI SDK)

Goal: Build a chatbot that users can't abuse to run up a huge bill.
Safety check: estimate the cost before making the request.

use Subhashladumor1\LaravelAiGuard\Facades\AIGuard;
use Illuminate\Http\Request;

public function chat(Request $request) 
{
    $user = auth()->user();
    $prompt = $request->input('message');

    // 1️⃣ Run budget check (throws overflow exception if user is over limit)
    AIGuard::checkAllBudgets($user->id, $user->team_id);

    // 2️⃣ Estimate cost (OpenAI/Text is roughly 4 chars/token)
    // If the prompt is huge (e.g. paste-bin attack), stop it here.
    $estimatedCost = AIGuard::estimate($prompt, 'gpt-4o', 'openai');
    
    if ($estimatedCost > 0.50) {
        return response()->json(['error' => 'Message too long/expensive.'], 400);
    }
    
    // 3️⃣ Call AI (Laravel AI SDK simple example)
    $response = \AI::chat($prompt);

    // 4️⃣ Record actual usage
    // Tracks input, output, and updates User + Tenant budgets
    AIGuard::recordFromResponse($response, $user->id, $user->team_id, 'openai', 'gpt-4o', 'chatbot');
    
    return response()->json(['reply' => $response]);
}

2. Video Analysis Agent πŸŽ₯ (Gemini 2.5) β€” Multimodal

Goal: Analyze uploaded videos; video processing is expensive per second.
Method: use the specific video_seconds or video_tokens usage keys.

// User uploads a 30-second video clip
$videoPath = $request->file('video')->store('videos');

// Call Gemini API (Direct HTTP / Google Client - No Laravel SDK)
$geminiResponse = Http::post('https://generativelanguage.googleapis.com/...', [
    // ... payload with video data ...
]);

$result = $geminiResponse->json();

// πŸ’‘ Record complex usage:
AIGuard::recordAndApplyBudget([
    'provider' => 'gemini',
    'model' => 'gemini-2.5-flash',
    'input_tokens' => 500,        // Prompt text
    'output_tokens' => 200,       // Analysis text
    'usage' => [
        'input_tokens' => 500,
        'video_tokens_in' => 7500, // Video tokens (approx 250/sec)
        // OR use direct billing unit if supported: 'video_seconds' => 30
    ],
    'user_id' => auth()->id(),
    'tag' => 'video-analysis'
]);

3. Long Document Summarizer πŸ“„ (Claude 3.5 Sonnet + Caching)

Goal: Summarize a 100-page PDF, then reuse the cached PDF context for follow-up questions to save ~90% of input cost.
Method: track cached_input_tokens.

// 1st Call: Upload & Cache
// Anthropic returns 'cache_creation_input_tokens' (write cost)
AIGuard::recordAndApplyBudget([
    'provider' => 'anthropic',
    'model' => 'claude-3-5-sonnet',
    'input_tokens' => 50000,
    'usage' => [
        'input_tokens' => 50000,
        'cache_write_tokens' => 50000, // Expensive write
    ],
    'user_id' => auth()->id(),
]);

// 2nd Call: Ask question about PDF
// Anthropic returns 'cache_read_input_tokens' (Cheap read! ~10% cost)
AIGuard::recordAndApplyBudget([
    'provider' => 'anthropic',
    'model' => 'claude-3-5-sonnet',
    'input_tokens' => 50100, // 50k context + 100 new prompt
    'usage' => [
        'input_tokens' => 50100,
        'cached_input_tokens' => 50000, // Cheap HIT!
        'output_tokens' => 500,
    ],
    // AIGuard automatically calculates the lower bill for cached tokens
    'user_id' => auth()->id(),
]);

4. Background Data Processing βš™οΈ (DeepSeek / Mistral + Batch)

Goal: Process 10,000 rows of data nightly.
Optimisation: use a cheaper model (DeepSeek V3 / Mistral Small).

foreach ($rows as $row) {
    // Check global budget first to prevent runaway loops
    try {
        AIGuard::checkAllBudgets(null, $tenant->id); 
    } catch (\Exception $e) {
        Log::alert("Budget exceeded during batch! Stopping.");
        break;
    }

    // Call DeepSeek API directly
    $response = Http::withToken($key)->post('https://api.deepseek.com/chat/completions', [
        'model' => 'deepseek-chat',
        'messages' => [['role' => 'user', 'content' => "Analyze: " . $row->text]]
    ]);

    // Track it
    AIGuard::recordAndApplyBudget([
        'provider' => 'deepseek',
        'model' => 'deepseek-chat', 
        'input_tokens' => $response['usage']['prompt_tokens'],
        'output_tokens' => $response['usage']['completion_tokens'],
        'usage' => [
            'cached_input_tokens' => $response['usage']['prompt_cache_hit_tokens'] ?? 0, 
        ],
        'tenant_id' => $tenant->id,
        'tag' => 'nightly-batch'
    ]);
}

🌍 Multi-Language

11 locales: en, ar, es, fr, de, zh, hi, bn, pt, ru, ja

App locale used automatically. Customize: php artisan vendor:publish --tag=ai-guard-lang

🏒 Multi-Tenant (SaaS)

  • Store tenant_id on each usage
  • Set tenant budgets in config
  • Middleware reads tenant from X-Tenant-ID header or request attribute

🚧 Beta Notice: Laravel AI Guard is currently in beta. Please report any issues with cost calculation, token estimation, or edge cases by opening a GitHub issue. Community feedback is highly appreciated.

πŸ§ͺ Testing

composer install && php artisan test

πŸ“„ License

MIT. See LICENSE.