klkvsk / clearurls
High-performance PHP library for removing tracking parameters from URLs using the ClearURLs ruleset
pkg:composer/klkvsk/clearurls
Requires
- php: >=8.3
Requires (Dev)
- phpunit/phpunit: ^11.0
This package is auto-updated.
Last update: 2025-11-28 08:30:38 UTC
README
High-performance PHP library for removing tracking parameters from URLs using the ClearURLs ruleset.
Features
- Fast - Optimized for bulk URL cleaning with pre-compiled regex patterns
- Zero Dependencies - Uses only the PHP 8.3+ standard library
- Auto-updating Rules - Fetches the latest rules from the ClearURLs project
- Well Tested - Comprehensive test suite with performance benchmarks
- Production Ready - Battle-tested rules from the ClearURLs community
- Easy Integration - Simple, one-function API
Installation
Requirements
- PHP 8.3 or higher
- Composer
Using Composer
composer require klkvsk/clearurls
The package includes pre-compiled rules (src/Rules.php), so it's ready to use immediately after installation.
Quick Start
use ClearUrls\ClearUrls;

// Create cleaner with default rules
$cleaner = ClearUrls::createDefault();

// Clean a URL
$result = $cleaner->clean('https://example.com/page?utm_source=twitter&id=123');

echo $result->url; // https://example.com/page?id=123
echo $result->wasModified ? 'Modified' : 'Unchanged'; // Modified
Usage Examples
Basic URL Cleaning
$cleaner = ClearUrls::createDefault();

// Remove UTM parameters
$result = $cleaner->clean(
    'https://example.com/page?utm_source=twitter&utm_medium=social&id=123'
);
echo $result->url; // https://example.com/page?id=123

// Remove Facebook tracking
$result = $cleaner->clean(
    'https://example.com/page?fbclid=abc123&id=456'
);
echo $result->url; // https://example.com/page?id=456

// Remove Google tracking
$result = $cleaner->clean(
    'https://www.google.com/search?q=test&ved=123&ei=456'
);
echo $result->url; // https://www.google.com/search?q=test
Handling Redirections
$cleaner = ClearUrls::createDefault();

// Extract target URL from Google redirect
$result = $cleaner->clean(
    'https://www.google.com/url?q=https://example.com/target&ved=123'
);
echo $result->url; // https://example.com/target
echo $result->wasRedirected ? 'Yes' : 'No'; // Yes

// Extract target URL from Facebook redirect
$result = $cleaner->clean(
    'https://l.facebook.com/l.php?u=https%3A%2F%2Fexample.com%2Fpage&h=abc'
);
echo $result->url; // https://example.com/page
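The redirect handling above can be pictured as a regex with one capture group that holds the embedded target. The following is a minimal sketch in plain PHP, not the library's actual code: the function name extractRedirectTarget and the pattern are hypothetical, and real patterns come from the ClearURLs ruleset.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the URL embedded in a redirect link,
 * or null when none of the patterns match.
 *
 * @param list<string> $redirectionPatterns PCRE patterns with one capture group
 */
function extractRedirectTarget(string $url, array $redirectionPatterns): ?string
{
    foreach ($redirectionPatterns as $pattern) {
        if (preg_match($pattern, $url, $m) === 1 && isset($m[1])) {
            // The captured target is usually percent-encoded inside the query string.
            return rawurldecode($m[1]);
        }
    }
    return null;
}

// Illustrative pattern only (an assumption, not taken from the ruleset).
$patterns = ['#^https?://l\.facebook\.com/l\.php\?.*?u=([^&]+)#i'];

$target = extractRedirectTarget(
    'https://l.facebook.com/l.php?u=https%3A%2F%2Fexample.com%2Fpage&h=abc',
    $patterns
);
echo $target ?? 'no redirect', "\n"; // https://example.com/page
```

Returning null for non-matching URLs lets the caller fall through to ordinary parameter stripping.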
Referral Marketing
By default, referral marketing parameters (like Amazon affiliate tags) are removed. To keep them:
$cleaner = ClearUrls::createDefault(allowReferralMarketing: true);

$result = $cleaner->clean(
    'https://www.amazon.com/dp/B08ABC123?tag=myaffiliate-20&ref=nav'
);
// With allowReferralMarketing=true: keeps 'tag' parameter
// With allowReferralMarketing=false: removes 'tag' parameter
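One way to picture this switch: referral parameters live in their own list, and that list is only added to the removal rules when referral marketing is not allowed. The sketch below is an assumption about the general idea, not the library's implementation; selectRules and the sample patterns are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: pick the parameter patterns that should be stripped.
 *
 * @param list<string> $rules             always-removed tracking parameters
 * @param list<string> $referralMarketing referral/affiliate parameters
 * @return list<string>
 */
function selectRules(array $rules, array $referralMarketing, bool $allowReferralMarketing): array
{
    // When referral parameters are allowed, only the tracking rules apply;
    // otherwise referral rules are stripped like any other tracking parameter.
    return $allowReferralMarketing
        ? $rules
        : array_merge($rules, $referralMarketing);
}

// Sample patterns for illustration only.
$rules = ['#^ref$#i'];
$referral = ['#^tag$#i'];

$strict = selectRules($rules, $referral, false);  // strips both 'ref' and 'tag'
$lenient = selectRules($rules, $referral, true);  // strips 'ref', keeps 'tag'
```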
Checking Result Status
$result = $cleaner->clean($url);

if ($result->wasModified) {
    echo "Removed tracking parameters\n";
}

if ($result->wasRedirected) {
    echo "Extracted target URL from redirect\n";
}

if ($result->wasBlocked) {
    echo "URL was blocked (completeProvider rule)\n";
}

if ($result->hadAnyAction()) {
    echo "Something was changed\n";
}
Batch Processing
$cleaner = ClearUrls::createDefault();

$urls = [
    'https://example.com/page1?utm_source=email',
    'https://example.com/page2?fbclid=abc123',
    'https://example.com/page3?gclid=xyz789',
];

foreach ($urls as $url) {
    $result = $cleaner->clean($url);
    echo "{$url} -> {$result->url}\n";
}
Custom Configuration
If you need custom rules, you can create providers manually:
use ClearUrls\ClearUrls;
use ClearUrls\Provider;

$providers = [
    new Provider(
        name: 'custom',
        urlPattern: '#^https?://example\.com#i',
        completeProvider: false,
        rules: [
            '#^tracking$#i',
            '#^session_id$#i',
        ],
        rawRules: [],
        referralMarketing: [],
        exceptions: [],
        redirections: [],
        forceRedirection: false
    ),
];

$cleaner = new ClearUrls($providers);
Performance
The library is optimized for speed:
- Pre-compiled regex patterns (no runtime compilation)
- Efficient URL parsing and rebuilding
- Minimal memory usage
- Suitable for bulk URL cleaning operations
Run performance benchmarks:
vendor/bin/phpunit tests/PerformanceTest.php
Architecture
Core Components
- ClearUrls.php - Main cleaning logic
  - URL parsing and validation
  - Provider matching
  - Rule application
  - URL rebuilding
- Provider.php - Rule provider data structure
  - Pattern matching
  - Exception handling
  - Redirection extraction
- ClearUrlResult.php - Immutable result object
  - Cleaned URL
  - Modification status
  - Action flags
- Rules.php - Auto-generated (205 providers, 734+ rules)
  - Pre-compiled regex patterns
  - Optimized data structure
  - Generated by build script
Build System (For Package Maintainers)
compile-rules.php - Rules compilation script (used by package maintainers to update rules)
- Fetches from https://rules2.clearurls.xyz/data.minify.json
- Compiles all regex patterns
- Generates optimized PHP code (src/Rules.php)
- Caches rules locally for offline builds with the --local flag
How It Works
- Build Time (package maintainers only): The build script fetches rules from ClearURLs and compiles them to PHP with pre-compiled regex patterns
- Runtime (end users): When cleaning a URL, the library:
  - Finds matching provider by URL pattern
  - Checks exceptions (URLs that should not be processed)
  - Handles redirections (extracts embedded URLs)
  - Applies raw rules (regex on entire URL)
  - Removes matching query parameters and fragments
  - Rebuilds the clean URL
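The runtime steps above can be sketched for a single provider in plain PHP. This is a simplified illustration under stated assumptions, not the library's code: cleanWithProvider and the array-based provider shape are hypothetical, every pattern is assumed to be a ready-to-use PCRE string, and fragment parameters are left untouched for brevity.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical, simplified version of the runtime steps for one provider.
 * $provider keys mirror the provider fields described in this README.
 */
function cleanWithProvider(string $url, array $provider): string
{
    // 1. Provider matching: skip URLs this provider does not cover.
    if (preg_match($provider['urlPattern'], $url) !== 1) {
        return $url;
    }
    // 2. Exceptions: matching URLs are returned untouched.
    foreach ($provider['exceptions'] as $pattern) {
        if (preg_match($pattern, $url) === 1) {
            return $url;
        }
    }
    // 3. Redirections: return the embedded target URL if one is captured.
    foreach ($provider['redirections'] as $pattern) {
        if (preg_match($pattern, $url, $m) === 1 && isset($m[1])) {
            return rawurldecode($m[1]);
        }
    }
    // 4. Raw rules: regex replacements applied to the whole URL string.
    foreach ($provider['rawRules'] as $pattern) {
        $url = preg_replace($pattern, '', $url) ?? $url;
    }
    // 5. Split off fragment and query, then drop matching parameters.
    [$withoutFragment, $fragment] = array_pad(explode('#', $url, 2), 2, null);
    [$base, $query] = array_pad(explode('?', $withoutFragment, 2), 2, null);
    $pairs = ($query === null || $query === '') ? [] : explode('&', $query);

    $kept = [];
    foreach ($pairs as $pair) {
        $name = rawurldecode(explode('=', $pair, 2)[0]);
        foreach ($provider['rules'] as $pattern) {
            if (preg_match($pattern, $name) === 1) {
                continue 2; // tracking parameter: drop it
            }
        }
        $kept[] = $pair;
    }
    // 6. Rebuild the URL from the parts that survived.
    return $base
        . ($kept !== [] ? '?' . implode('&', $kept) : '')
        . ($fragment !== null ? '#' . $fragment : '');
}

// Sample provider for illustration only.
$provider = [
    'urlPattern'   => '#^https?://example\.com#i',
    'exceptions'   => [],
    'redirections' => [],
    'rawRules'     => [],
    'rules'        => ['#^utm_[a-z]+$#i', '#^fbclid$#i'],
];

$cleaned = cleanWithProvider('https://example.com/page?utm_source=twitter&id=123#top', $provider);
echo $cleaned, "\n"; // https://example.com/page?id=123#top
```

Working on the raw key=value pairs instead of parse_str() keeps the surviving parameters byte-for-byte identical to the input, which avoids re-encoding side effects.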
Rules Format
Rules are fetched from the ClearURLs project. Each provider has:
- urlPattern: Regex to match URLs
- completeProvider: Block entire provider
- rules: Query/fragment parameters to remove
- rawRules: Regex patterns to apply to entire URL
- referralMarketing: Optional marketing parameters
- exceptions: URLs to skip
- redirections: Extract embedded URLs
- forceRedirection: Force redirect behavior
See ClearURLs documentation for details.
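To make the fields concrete, here is a rough sketch of turning one provider entry into PHP-ready patterns. The JSON shape and values are illustrative assumptions, and toPcre is a hypothetical helper; the anchored, case-insensitive form simply mirrors the '#^tracking$#i' style used in the custom provider example in this README.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: wraps a raw regex in '#' delimiters with the 'i' flag.
 * Assumes the raw regex contains no already-escaped '#'.
 */
function toPcre(string $regex, bool $anchor = false): string
{
    // Escape the delimiter so patterns containing '#' stay valid.
    $body = str_replace('#', '\#', $regex);
    if ($anchor) {
        // Parameter rules should match the whole parameter name.
        $body = '^(?:' . $body . ')$';
    }
    return '#' . $body . '#i';
}

// Sample provider entry; field names follow the list above, values are made up.
$json = <<<'JSON'
{
    "providers": {
        "example": {
            "urlPattern": "^https?:\\/\\/(?:[a-z0-9-]+\\.)*?example\\.com",
            "completeProvider": false,
            "rules": ["utm_[a-z]+", "session_id"],
            "rawRules": [],
            "referralMarketing": ["tag"],
            "exceptions": [],
            "redirections": [],
            "forceRedirection": false
        }
    }
}
JSON;

$data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
$provider = $data['providers']['example'];

$urlPattern = toPcre($provider['urlPattern']);
$rules = array_map(fn (string $r): string => toPcre($r, true), $provider['rules']);

echo $urlPattern, "\n";
echo implode("\n", $rules), "\n";
```

Doing this conversion once at build time is what allows the generated rules file to hold patterns that can be passed straight to preg_match().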
Updating Rules (For Package Maintainers)
The pre-compiled rules (src/Rules.php) are included in the package. End users don't need to update them.
For package maintainers: To update rules to the latest version from ClearURLs:
php build/compile-rules.php
This fetches rules from https://rules2.clearurls.xyz/data.minify.json and regenerates src/Rules.php.
The build script compares the SHA-256 hash of fetched rules against the previous version:
- Rules changed: Updates updatedAt timestamp in metadata → triggers new release
- Rules unchanged: Preserves existing updatedAt timestamp → no new release
Use the --local flag to skip fetching and compile from cached rules:
php build/compile-rules.php --local
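The hash comparison described above could look roughly like the following. This is a sketch under assumptions, not the actual build script: nextMetadata and the metadata array shape ('hash', 'updatedAt') are hypothetical names chosen for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: decide which updatedAt timestamp to write.
 * $previous holds the metadata from the last build, or null on a first build.
 *
 * @return array{hash: string, updatedAt: string}
 */
function nextMetadata(string $rulesJson, ?array $previous, int $now): array
{
    $hash = hash('sha256', $rulesJson);

    if ($previous !== null && hash_equals($previous['hash'], $hash)) {
        // Rules unchanged: keep the old timestamp so no release is triggered.
        return $previous;
    }

    // Rules changed (or first build): record the new hash with the current time.
    return ['hash' => $hash, 'updatedAt' => gmdate('c', $now)];
}

$first   = nextMetadata('{"providers":{}}', null, 1700000000);
$same    = nextMetadata('{"providers":{}}', $first, 1700086400);
$changed = nextMetadata('{"providers":{"x":{}}}', $first, 1700086400);

var_dump($same === $first);                             // bool(true)
var_dump($changed['updatedAt'] !== $first['updatedAt']); // bool(true)
```

Hashing the fetched payload rather than comparing timestamps means a re-run on the same upstream data produces identical metadata.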
Automated Rules Updates (GitHub Actions)
The repository includes a GitHub Actions workflow that automatically:
- Fetches the latest ClearURLs rules
- Compiles them to optimized PHP code
- Creates a new versioned release (format: 1.0.YYYYMMDD)
- Updates the CHANGELOG
- Pushes to GitHub and creates a git tag for Packagist
To trigger an update:
- Go to the repository's "Actions" tab on GitHub
- Select "Update ClearURLs Rules" workflow
- Click "Run workflow" button
- The workflow fetches rules and checks if they changed (via SHA-256 hash)
- If rules changed AND version tag doesn't exist → creates new release
- If rules unchanged → keeps existing version, no duplicate release
Version format: x.y.YYYYMMDD where:
- x.y = Major.Minor version (update manually when adding code features)
- YYYYMMDD = Date when rules were fetched from ClearURLs
This ensures Packagist automatically picks up new rule updates without manual intervention.
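As an illustration of the version scheme, the string can be assembled from the major/minor numbers and the fetch date. The helper rulesVersion is hypothetical, and the use of UTC for the date is an assumption rather than something the workflow is confirmed to do.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: builds an x.y.YYYYMMDD version string (date taken in UTC). */
function rulesVersion(int $major, int $minor, DateTimeInterface $fetchedAt): string
{
    $utc = DateTimeImmutable::createFromInterface($fetchedAt)
        ->setTimezone(new DateTimeZone('UTC'));

    return sprintf('%d.%d.%s', $major, $minor, $utc->format('Ymd'));
}

$version = rulesVersion(
    1,
    0,
    new DateTimeImmutable('2025-11-28 08:30:38', new DateTimeZone('UTC'))
);
echo $version, "\n"; // 1.0.20251128
```

Because the date sits in the patch position, a Composer constraint such as ^1.0 accepts every rules release within the 1.x line.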
Project Structure
clearurls-php/
├── src/
│   ├── ClearUrls.php          # Main library class
│   ├── Provider.php           # Provider data structure
│   ├── ClearUrlResult.php     # Result object
│   └── Rules.php              # Auto-generated rules
├── build/
│   ├── compile-rules.php      # Build script
│   └── data.min.json          # Cached rules (git-ignored)
├── tests/
│   ├── ClearUrlsTest.php      # Unit tests
│   └── PerformanceTest.php    # Benchmarks
├── examples/
│   ├── basic_usage.php        # Simple examples
│   └── advanced_usage.php     # Advanced features
├── composer.json
├── phpunit.xml
└── README.md
Testing
Install dev dependencies:
composer install
Run tests:
# All tests
vendor/bin/phpunit

# Specific test file
vendor/bin/phpunit tests/ClearUrlsTest.php

# Performance benchmarks
vendor/bin/phpunit tests/PerformanceTest.php
Supported Providers (205 total)
Major Providers
- Google (search, ads, redirects)
- Amazon (products, search, ads)
- Facebook (posts, redirects)
- YouTube (videos, tracking)
- Twitter/X (posts, tracking)
- LinkedIn (profiles, tracking)
- Reddit (posts, redirects)
- Instagram (posts, tracking)
- TikTok (videos, tracking)
- And 196+ more...
Global Rules
- UTM parameters (utm_source, utm_medium, etc.)
- Facebook tracking (fbclid)
- Google tracking (gclid, _ga)
- And many more tracking parameters
API Reference
ClearUrls Class
// Create with default rules
$cleaner = ClearUrls::createDefault(bool $allowReferralMarketing = false): self

// Create with custom providers
$cleaner = new ClearUrls(array $providers, bool $allowReferralMarketing = false)

// Clean a URL
$result = $cleaner->clean(string $url): ClearUrlResult

// Set referral marketing preference
$cleaner->setAllowReferralMarketing(bool $allow): void
ClearUrlResult Class
readonly class ClearUrlResult
{
    public string $url;          // Cleaned URL
    public bool $wasModified;    // Was URL modified?
    public bool $wasBlocked;     // Was URL blocked?
    public bool $wasRedirected;  // Was URL redirected?

    public function hadAnyAction(): bool; // Any action taken?
}
Provider Class
readonly class Provider
{
    public function __construct(
        public string $name,
        public string $urlPattern,
        public bool $completeProvider,
        public array $rules,
        public array $rawRules,
        public array $referralMarketing,
        public array $exceptions,
        public array $redirections,
        public bool $forceRedirection
    );

    public function matchesUrl(string $url): bool;
    public function getRedirection(string $url): ?string;
    public function getAllRules(bool $includeReferralMarketing): array;
}
Contributing
Contributions are welcome! Areas for contribution:
- Performance optimizations
- Additional test cases
- Documentation improvements
- Bug fixes
- Feature requests
Credits
- Original Project: ClearURLs by KevinRoebert
- Rules: Maintained by ClearURLs community
- PHP Implementation: klkvsk
License
MIT License
The rules data is provided by the ClearURLs project and is licensed under LGPL-3.0-or-later.
Links
- GitHub: https://github.com/klkvsk/clearurls-php
- Packagist: https://packagist.org/packages/klkvsk/clearurls
- ClearURLs Docs: https://docs.clearurls.xyz/
- Issue Tracker: https://github.com/klkvsk/clearurls-php/issues