# subhashladumor1/laravel-ai-guard

Laravel AI Guard: AI cost & budget control for the Laravel AI SDK. Track token usage, control OpenAI & LLM spending, enforce AI budgets, and prevent unexpected billing spikes.
Requires
- php: ^8.1
- illuminate/support: ^10.0|^11.0|^12.0
Requires (Dev)
- php: ^8.1|^8.2|^8.3
- illuminate/cache: ^9.0|^10.0|^11.0|^12.0
- illuminate/database: ^9.0|^10.0|^11.0|^12.0
- illuminate/http: ^9.0|^10.0|^11.0|^12.0
- illuminate/session: ^9.0|^10.0|^11.0|^12.0
- illuminate/support: ^9.0|^10.0|^11.0|^12.0
- orchestra/testbench: ^8.0
- phpunit/phpunit: ^9.0|^10.0|^11.0|^12.0
Suggests
- laravel/ai: Use with the Laravel AI SDK (12.x) to track token usage, estimate AI cost, and enforce budgets automatically.
This package is auto-updated.
Last update: 2026-02-14 08:14:42 UTC
## README
Track costs • Set budgets • Never get surprised by the bill.
Laravel AI Guard is an AI cost-optimization package built for the Laravel AI SDK (12.x). It helps Laravel developers track OpenAI & LLM token usage, estimate AI costs before execution, enforce per-user or per-tenant AI budgets, and prevent unexpected AI billing spikes in production.

Designed for Laravel SaaS applications, APIs, and AI-powered platforms, Laravel AI Guard acts as a financial firewall between your app and AI providers, keeping AI usage safe, predictable, and cost-efficient.
---

## Quick Navigation

| Jump to | Jump to |
|---|---|
| What's Inside | How It Works |
| Quick Start | Usage Examples |
| Configuration | Package Structure |
## What's Inside
| TRACK | BUDGET | ESTIMATE | BLOCK |
|---|---|---|---|
| Every call in DB | Per user/tenant/app | Before you call (free) | Over-spend requests |

Kill switch: disable all AI in one config change.
Works with: Laravel AI SDK (12.x) • OpenAI • Anthropic • any AI API
## How It Works

### Request Flow (Before → During → After)

```mermaid
flowchart TD
    subgraph BEFORE["BEFORE"]
        A[Request arrives] --> B{Budget OK?}
        B -->|Yes| C["Optional: Estimate cost"]
        B -->|No| D["Block - 402"]
        C --> E[Continue]
    end
    subgraph DURING["DURING"]
        E --> F[Your app calls AI]
        F --> G[Laravel AI SDK or any API]
    end
    subgraph AFTER["AFTER"]
        G --> H[Record tokens, cost, user]
        H --> I[Save to ai_usages]
        I --> J[Update ai_budgets]
    end
    BEFORE --> DURING --> AFTER
```
### Budget Hierarchy (Checked in Order)

```mermaid
flowchart LR
    subgraph layers["Budget layers checked top to bottom"]
        direction TB
        A["GLOBAL<br/>Whole app limit"]
        B["TENANT<br/>Org/team limit"]
        C["USER<br/>Per-user limit"]
    end
    A --> B --> C
    C --> D{All OK?}
    D -->|Yes| E[Allow request]
    D -->|Any exceeded| F["Block - 402"]
```
TL;DR: the Laravel AI SDK does the AI. Laravel AI Guard decides whether you're allowed to call it and tracks how much you spent. They work together.
## Why Should I Care?

| Without AI Guard | With AI Guard |
|---|---|
| Surprise bill | Full visibility |
| Runaway loop? | Budget limits |
| Invoice shock | Predictable costs |
AI APIs charge by the token. One heavy user or one bug, and your bill spikes. Most apps don't track usage until the invoice arrives. AI Guard gives you visibility, limits, and control.
## Under the Hood

### Cost Calculation

```mermaid
flowchart LR
    subgraph inputs["Usage Inputs"]
        A[Input Tokens]
        B[Output Tokens]
        C[Cache Hits/Writes]
        D[Images/Audio/Video]
    end
    subgraph calculation["Calculation"]
        E["Text Cost<br/>(Standard + Long Context)"]
        F["Cache Cost<br/>(Read + Write)"]
        G["Multimodal Cost<br/>(Pixel/Second/Token)"]
    end
    subgraph result["Total"]
        H["Total Cost $"]
    end
    A --> E
    B --> E
    C --> F
    D --> G
    E --> H
    F --> H
    G --> H
```
Example: 500 input + 200 output tokens (gpt-4o: $0.0025/1k in, $0.01/1k out)
| Step | Calculation | Result |
|---|---|---|
| Input cost | (500 ÷ 1000) × 0.0025 | $0.00125 |
| Output cost | (200 ÷ 1000) × 0.01 | $0.00200 |
| Total | | $0.00325 |
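The same arithmetic can be sketched as a tiny standalone helper. `textCost()` is hypothetical and only illustrates the per-1k-token math using the example gpt-4o rates above; it is not part of the package API:

```php
<?php
// Sketch of per-1k-token text pricing using the example rates above.
// textCost() is an illustrative helper, not AI Guard's API.
function textCost(int $inputTokens, int $outputTokens, float $inPer1k, float $outPer1k): float
{
    return ($inputTokens / 1000) * $inPer1k + ($outputTokens / 1000) * $outPer1k;
}

echo textCost(500, 200, 0.0025, 0.01); // 0.00325
```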
### Cost Optimization (Context Caching)

Laravel AI Guard supports advanced pricing models, including context caching (Anthropic, Gemini, OpenAI), to help you track savings accurately.
Supported pricing dimensions:

- Input tokens (standard)
- Output tokens (standard)
- Cached input tokens (read from cache, typically ~50-90% cheaper)
- Cache creation tokens (write to cache, sometimes higher cost)
- Long context (e.g. >200k tokens at premium `input_long`/`output_long` rates)
- Modality-specific: image tokens, audio tokens, per image, per second of video, per minute of transcription, TTS per 1M characters, web search per 1k calls, embeddings per 1k tokens
Configuration example (`config/ai-guard.php`):

```php
'claude-3-5-sonnet' => [
    'input'        => 0.003,
    'output'       => 0.015,
    'cache_write'  => 0.00375, // +25% overhead
    'cached_input' => 0.0003,  // -90% savings
],
```
The package automatically detects cache usage from provider responses and applies the correct lower rate.
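As a rough illustration of why cache hits matter, here is the input-cost math under the example claude-3-5-sonnet rates above. `inputCost()` is a hypothetical helper for illustration only, not the package's calculator:

```php
<?php
// Sketch: input cost when tokens are fresh vs. read from cache,
// using the claude-3-5-sonnet example rates above. Hypothetical helper.
function inputCost(int $freshTokens, int $cachedTokens, float $inputRate, float $cachedRate): float
{
    return ($freshTokens / 1000) * $inputRate + ($cachedTokens / 1000) * $cachedRate;
}

echo inputCost(50000, 0, 0.003, 0.0003);  // 0.15  (all fresh)
echo "\n";
echo inputCost(0, 50000, 0.003, 0.0003);  // 0.015 (all cached: 90% cheaper)
```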
### Supported Providers, Models & Cost Coverage

Pricing is aligned with official 2026 API docs for accurate cost calculation across chat, assistants, agents, and modality-specific use cases.
| Provider | Pricing Source | Coverage |
|---|---|---|
| OpenAI | Pricing | GPT-5.x, GPT-4o, o1, Realtime (Audio/Text), DALL·E 3, Whisper, TTS, Web Search |
| Google Gemini | Pricing | Gemini 3 Pro/Flash, 2.5 Pro/Flash, 1.5, Imagen 3, Veo (Video), Embeddings |
| Anthropic | Pricing | Claude 4.5, 3.5 Sonnet, 3 Opus, Haiku, Prompt Caching, Long Context |
| xAI Grok | Models | Grok 4, Grok 3, Grok Beta, Web Search Tool |
| Mistral AI | Pricing | Mistral Large 2, Small, Codestral, Embeddings |
| DeepSeek | Pricing | DeepSeek-V3, R1 (Reasoner), Cache Hit/Miss pricing |
Full multimodal cost support:

- LLM / Chat: input, output, cached input, cache write, long-context pricing
- Agents: web search (per 1k calls), code interpreter (session based)
- Audio:
  - Input: audio tokens (e.g. Gemini 2.5 Flash `audio_in`, GPT-4o `audio_in`)
  - Output: audio tokens (e.g. GPT-4o `audio_out`)
  - Transcription: per minute (Whisper)
  - TTS: per 1M characters (OpenAI TTS)
- Video:
  - Input: video tokens (e.g. Gemini `video_in`)
  - Generation: per second (Veo `per_second_video`)
- Image:
  - Input: image tokens (e.g. GPT-4o `image_in`)
  - Generation: per image (DALL·E 3, Imagen)
- Embeddings: per 1k tokens
Pass extended usage when recording to get accurate totals:
```php
AIGuard::recordAndApplyBudget([
    'provider' => 'gemini',
    'model' => 'gemini-2.5-flash',
    'input_tokens' => 1000,
    'output_tokens' => 200,
    'usage' => [
        'input_tokens' => 1000,    // Text tokens
        'output_tokens' => 200,    // Text output
        'video_tokens_in' => 5000, // Video understanding tokens
        'audio_tokens_in' => 2000, // Audio input tokens
        'images_generated' => 1,   // Image gen quantity
        'web_search_calls' => 2,   // Per-call tool usage
    ],
    'user_id' => auth()->id(),
]);
```
### Estimation (No API Call = No Cost)

`AIGuard::estimate($prompt)` uses a simple heuristic:

- Input tokens ≈ characters ÷ 4 (configurable)
- Output tokens ≈ input × 0.5 (configurable)
- "Write a short poem" (18 chars) → ~5 in, ~3 out → 8 total
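The heuristic above can be sketched in plain PHP. `estimateTokens()` is illustrative only (it assumes the default 4 chars/token and 0.5 output multiplier described above), not the package's internal estimator:

```php
<?php
// Sketch of the estimation heuristic: chars ÷ 4 for input,
// input × 0.5 for output. Hypothetical helper, not AI Guard's internals.
function estimateTokens(string $prompt, float $charsPerToken = 4.0, float $outputMultiplier = 0.5): array
{
    $in  = (int) ceil(strlen($prompt) / $charsPerToken);
    $out = (int) ceil($in * $outputMultiplier);

    return ['input' => $in, 'output' => $out, 'total' => $in + $out];
}

print_r(estimateTokens('Write a short poem')); // 18 chars: 5 in, 3 out, 8 total
```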
### Kill Switch

| Method | How |
|---|---|
| `.env` (recommended) | `AI_GUARD_DISABLED=true` |
| Config | `'ai_disabled' => true` |
Result: the middleware returns 503 Service Unavailable; no AI calls get through.
## 5 Ways to Reduce AI Costs

1. Estimate: show the cost before the call
2. Budget: set limits per user/tenant
3. Track: run reports to see where the money goes
4. Kill switch: emergency stop for all AI if needed
5. Tag: break down spending by feature (chat, etc.)
## Requirements
| Requirement | Version |
|---|---|
| PHP | 8.1+ |
| Laravel | 10.x, 11.x, or 12.x |
| Laravel AI SDK | Optional (for agents/streaming) |
## Quick Start (3 Steps)

```mermaid
flowchart LR
    subgraph step1["Step 1"]
        A[composer require]
    end
    subgraph step2["Step 2"]
        B[publish config<br/>& migrations]
    end
    subgraph step3["Step 3"]
        C[migrate]
    end
    A --> B --> C
```
1. Install

```shell
composer require subhashladumor1/laravel-ai-guard
```

2. Publish & migrate

```shell
php artisan vendor:publish --tag=ai-guard-config
php artisan vendor:publish --tag=ai-guard-migrations
php artisan migrate
```

3. Optional: translations

```shell
php artisan vendor:publish --tag=ai-guard-lang
```
The migrations create two tables: `ai_usages` (tracks every request & cost) and `ai_budgets` (stores current usage vs. limit).
## Configuration

Edit `config/ai-guard.php` after publishing:

| Setting | Purpose |
|---|---|
| `ai_disabled` | Turn off all AI |
| `pricing` | Cost per 1k tokens per model |
| `default_model` | Fallback (e.g. `gpt-4o`) |
| `default_provider` | Fallback (e.g. `openai`) |
| `budgets` | Limits (global, user, tenant); period |
| `estimation` | Chars per token, output multiplier |
Example `.env`:

```env
AI_GUARD_DISABLED=false
AI_GUARD_GLOBAL_LIMIT=100
AI_GUARD_USER_LIMIT=10
AI_GUARD_TENANT_LIMIT=50
```
## Usage Examples

### With Laravel AI SDK (12.x)
```mermaid
sequenceDiagram
    participant App
    participant AIGuard
    participant AI
    App->>AIGuard: checkAllBudgets()
    App->>AIGuard: estimate(prompt)
    App->>AI: prompt()
    AI-->>App: response
    App->>AIGuard: recordFromResponse()
```
```php
// 1. Before: check budget
AIGuard::checkAllBudgets(auth()->id(), $tenantId);
$estimate = AIGuard::estimate($userPrompt);

// 2. Call AI (as normal)
$response = (new YourAgent)->prompt($userPrompt);

// 3. After: record usage
AIGuard::recordFromResponse($response, userId: auth()->id(), tenantId: $tenantId, tag: 'chat');
```
Multi-model: pass `model` and `provider` so estimates and budgets use the right cost:

```php
$estimate = AIGuard::estimate($userPrompt, model: 'gpt-4o-mini', provider: 'openai');

AIGuard::recordFromResponse($response, userId: auth()->id(), provider: 'openai', model: 'gpt-4o-mini');
```
Streaming: record usage in the `->then()` callback once the stream finishes.
### With Any Other AI API

```php
// Before: same budget check
AIGuard::checkAllBudgets(auth()->id(), $tenantId);

// After: record manually
AIGuard::recordAndApplyBudget([
    'provider' => 'openai',
    'model' => 'gpt-4o',
    'input_tokens' => 400,
    'output_tokens' => 250,
    'user_id' => auth()->id(),
    'tenant_id' => $tenantId,
    'tag' => 'chat',
]);
```
Extended usage (audio, video, image, tools): pass a `usage` array for accurate cost when using modalities or tools:

```php
AIGuard::recordAndApplyBudget([
    'provider' => 'openai',
    'model' => 'gpt-4o',
    'input_tokens' => 500,
    'output_tokens' => 300,
    'usage' => [
        'input_tokens' => 500,
        'output_tokens' => 300,
        'cached_input_tokens' => 0,
        'images_generated' => 2,        // DALL·E / image models
        'web_search_calls' => 5,        // agent tool calls
        'transcription_minutes' => 1.5, // Whisper / transcribe
        'tts_characters' => 2500,       // TTS
        'embedding_tokens' => 1000,     // embeddings
        'video_seconds' => 10,          // Veo / video gen
    ],
    'user_id' => auth()->id(),
    'tag' => 'agent-with-search',
]);
```
### Multi-model and dynamic cost (no config change)

Cost is resolved in this order: per-call override → runtime pricing → config. So you can support many models and change costs at runtime without editing `config/ai-guard.php`.
1. Per-call pricing override: pass pricing for a single estimate or record:

```php
// Estimate with custom cost per 1k tokens (no config entry needed)
$estimate = AIGuard::estimate($userPrompt, 'my-model', 'my-provider', [
    'input' => 0.001,
    'output' => 0.002,
]);

// Record with custom pricing when cost isn't pre-calculated
AIGuard::recordFromResponse($response, auth()->id(), $tenantId, 'openai', 'gpt-4o', 'chat', [
    'input' => 0.0025,
    'output' => 0.01,
]);

// record() can omit 'cost' and use 'pricing' to calculate it
AIGuard::record([
    'provider' => 'openai',
    'model' => 'gpt-4o',
    'input_tokens' => 400,
    'output_tokens' => 250,
    'pricing' => ['input' => 0.0025, 'output' => 0.01],
    'user_id' => auth()->id(),
]);
```
2. Runtime pricing registry: register models once (e.g. in a service provider or from the database); `estimate()` and recording then use them automatically:

```php
$calc = AIGuard::getCostCalculator();

// Single model
$calc->setPricing('openai', 'gpt-4o-mini', ['input' => 0.00015, 'output' => 0.0006]);

// Many models at once
$calc->setPricingMap([
    'openai' => [
        'gpt-4o' => ['input' => 0.0025, 'output' => 0.01],
        'gpt-4o-mini' => ['input' => 0.00015, 'output' => 0.0006],
    ],
    'anthropic' => [
        'claude-3-5-sonnet' => ['input' => 0.003, 'output' => 0.015],
    ],
]);

// Now estimate/record use these models without config entries
$estimate = AIGuard::estimate($userPrompt, 'gpt-4o-mini', 'openai');
AIGuard::checkAllBudgets(auth()->id(), $tenantId);
```
Add, update, or remove models at runtime:

```php
$calc = AIGuard::getCostCalculator();

// Add or update a model
$calc->setPricing('openai', 'gpt-4o', ['input' => 0.0025, 'output' => 0.01]);

// Remove a model from runtime pricing (falls back to config, or 0 if not in config)
$calc->removePricing('openai', 'gpt-4o');

// Clear all runtime pricing
$calc->clearRuntimePricing();
```
Config file: publish and edit `config/ai-guard.php` to add, remove, or update models permanently:

```php
'pricing' => [
    'openai' => [
        'gpt-4o' => ['input' => 0.0025, 'output' => 0.01],
        'gpt-4o-mini' => ['input' => 0.00015, 'output' => 0.0006],
        // Add new models here
    ],
    // Add new providers here
],
```
Budget checks use the same cost you record (per user/tenant), so multi-model costs and budgets work together.
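The fallback chain described above can be sketched as a small function. `resolvePricing()` is purely illustrative; the package resolves pricing internally in its cost calculator:

```php
<?php
// Sketch of the pricing resolution order: per-call override,
// then the runtime registry, then config, else zero rates.
// Hypothetical function, not the package's API.
function resolvePricing(?array $override, array $runtime, array $config, string $provider, string $model): array
{
    return $override
        ?? $runtime[$provider][$model]
        ?? $config[$provider][$model]
        ?? ['input' => 0, 'output' => 0];
}

$config = ['openai' => ['gpt-4o' => ['input' => 0.0025, 'output' => 0.01]]];

// No override and no runtime entry: falls back to the config rates
$pricing = resolvePricing(null, [], $config, 'openai', 'gpt-4o');
```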
### Middleware

```php
Route::post('/chat', ChatController::class)->middleware('ai.guard');
```
| Condition | Response |
|---|---|
| Over budget | 402 + JSON |
| AI disabled | 503 |
### Artisan Commands

| Command | Purpose |
|---|---|
| `php artisan ai-guard:report` | Usage & cost report |
| `php artisan ai-guard:report --period=month` | Monthly report |
| `php artisan ai-guard:report --days=7` | Last 7 days |
| `php artisan ai-guard:reset-budgets` | Reset when period ends |
| `php artisan ai-guard:reset-budgets --dry-run` | Preview only |
Schedule the reset: `$schedule->command('ai-guard:reset-budgets')->daily();`
## Package Structure

```mermaid
flowchart TB
    subgraph entry["Entry Points"]
        F[AIGuard Facade]
        M[EnforceAIBudget Middleware]
        C1["ai-guard:report"]
        C2["ai-guard:reset-budgets"]
    end
    subgraph core["Core"]
        GM[GuardManager]
    end
    subgraph services["Services"]
        BR[BudgetResolver]
        BE[BudgetEnforcer]
        TE[TokenEstimator]
        CC[CostCalculator]
    end
    subgraph storage["Storage"]
        AU[AiUsage]
        AB[AiBudget]
    end
    F --> GM
    M --> GM
    C1 --> GM
    C2 --> GM
    GM --> BR
    GM --> BE
    GM --> TE
    GM --> CC
    BR --> AB
    BE --> AB
    CC --> AU
```
```
laravel-ai-guard/
├── src/
│   ├── GuardManager.php       # Core logic
│   ├── Facades/AIGuard.php
│   ├── Budget/                # BudgetResolver, BudgetEnforcer
│   ├── Cost/                  # TokenEstimator, CostCalculator
│   ├── Models/                # AiUsage, AiBudget
│   ├── Middleware/
│   ├── Commands/
│   └── Exceptions/
├── database/migrations/
├── lang/                      # 11 locales
└── tests/
```
## Real-World Scenarios

### 1. The "Safe" Chatbot (OpenAI + Laravel AI SDK)

Goal: build a chatbot that users can't abuse to run up a huge bill.

Safety check: estimate the cost before the request.
```php
use Subhashladumor1\LaravelAiGuard\Facades\AIGuard;
use Illuminate\Http\Request;

public function chat(Request $request)
{
    $user = auth()->user();
    $prompt = $request->input('message');

    // 1. Run budget check (throws an exception if the user is over limit)
    AIGuard::checkAllBudgets($user->id, $user->team_id);

    // 2. Estimate cost (OpenAI text is roughly 4 chars/token).
    //    If the prompt is huge (e.g. paste-bin attack), stop it here.
    $estimatedCost = AIGuard::estimate($prompt, 'gpt-4o', 'openai');
    if ($estimatedCost > 0.50) {
        return response()->json(['error' => 'Message too long/expensive.'], 400);
    }

    // 3. Call AI (Laravel AI SDK simple example)
    $response = \AI::chat($prompt);

    // 4. Record actual usage: tracks input, output, and updates user + tenant budgets
    AIGuard::recordFromResponse($response, $user->id, $user->team_id, 'openai', 'gpt-4o', 'chatbot');

    return response()->json(['reply' => $response]);
}
```
### 2. Video Analysis Agent (Gemini 2.5, Multimodal)

Goal: analyze uploaded videos. Video processing is expensive per second.

Method: use the specific keys for `video_seconds` or video tokens.

```php
// User uploads a 30-second video clip
$videoPath = $request->file('video')->store('videos');

// Call the Gemini API (direct HTTP / Google client, no Laravel SDK)
$geminiResponse = Http::post('https://generativelanguage.googleapis.com/...', [
    // ... payload with video data ...
]);
$result = $geminiResponse->json();

// Record the multimodal usage:
AIGuard::recordAndApplyBudget([
    'provider' => 'gemini',
    'model' => 'gemini-2.5-flash',
    'input_tokens' => 500,  // Prompt text
    'output_tokens' => 200, // Analysis text
    'usage' => [
        'input_tokens' => 500,
        'video_tokens_in' => 7500, // Video tokens (approx 250/sec)
        // OR use the direct billing unit if supported:
        // 'video_seconds' => 30,
    ],
    'user_id' => auth()->id(),
    'tag' => 'video-analysis',
]);
```
### 3. Long Document Summarizer (Claude 3.5 Sonnet + Caching)

Goal: summarize a 100-page PDF. Reuse the PDF context for follow-up questions to save ~90% of input cost.

Method: track `cached_input_tokens`.

```php
// 1st call: upload & cache.
// Anthropic returns 'cache_creation_input_tokens' (write cost).
AIGuard::recordAndApplyBudget([
    'provider' => 'anthropic',
    'model' => 'claude-3-5-sonnet',
    'input_tokens' => 50000,
    'usage' => [
        'input_tokens' => 50000,
        'cache_write_tokens' => 50000, // Expensive write
    ],
    'user_id' => auth()->id(),
]);

// 2nd call: ask a question about the PDF.
// Anthropic returns 'cache_read_input_tokens' (cheap read, ~10% of the cost).
AIGuard::recordAndApplyBudget([
    'provider' => 'anthropic',
    'model' => 'claude-3-5-sonnet',
    'input_tokens' => 50100, // 50k context + 100 new prompt
    'usage' => [
        'input_tokens' => 50100,
        'cached_input_tokens' => 50000, // Cheap cache hit
        'output_tokens' => 500,
    ],
    // AI Guard automatically calculates the lower bill for cached tokens
    'user_id' => auth()->id(),
]);
```
### 4. Background Data Processing (DeepSeek / Mistral + Batch)

Goal: process 10,000 rows of data nightly.

Optimization: use a cheaper model (DeepSeek V3 / Mistral Small).

```php
foreach ($rows as $row) {
    // Check the global budget first to prevent runaway loops
    try {
        AIGuard::checkAllBudgets(null, $tenant->id);
    } catch (\Exception $e) {
        Log::alert("Budget exceeded during batch! Stopping.");
        break;
    }

    // Call the DeepSeek API directly
    $response = Http::withToken($key)->post('https://api.deepseek.com/chat/completions', [
        'model' => 'deepseek-chat',
        'messages' => [['role' => 'user', 'content' => "Analyze: " . $row->text]],
    ]);

    // Track it
    AIGuard::recordAndApplyBudget([
        'provider' => 'deepseek',
        'model' => 'deepseek-chat',
        'input_tokens' => $response['usage']['prompt_tokens'],
        'output_tokens' => $response['usage']['completion_tokens'],
        'usage' => [
            'cached_input_tokens' => $response['usage']['prompt_cache_hit_tokens'] ?? 0,
        ],
        'tenant_id' => $tenant->id,
        'tag' => 'nightly-batch',
    ]);
}
```
## Multi-Language
11 locales: en, ar, es, fr, de, zh, hi, bn, pt, ru, ja
The app locale is used automatically. To customize the strings: `php artisan vendor:publish --tag=ai-guard-lang`
## Multi-Tenant (SaaS)

- Store `tenant_id` on each usage record
- Set tenant budgets in config
- The middleware reads the tenant from the `X-Tenant-ID` header or a request attribute
Beta notice: Laravel AI Guard is currently in beta. Please report any issues with cost calculation, token estimation, or edge cases by opening a GitHub issue. Community feedback is highly appreciated.
## Testing

```shell
composer install && php artisan test
```
## License
MIT. See LICENSE.