cable8mm / mma-scrapers
MMA scraper library for UFC, Road FC, and Black Combat.
v0.1.1
2026-04-23 04:31 UTC
Requires
- php: ^8.4
- guzzlehttp/guzzle: ^7.10
- symfony/css-selector: ^8.0
- symfony/dom-crawler: ^8.0
Requires (Dev)
- laravel/pint: ^1.27
- phpunit/phpunit: ^12.5
README
Functions
- multi-source aggregator
- identity resolution
- normalized DB
- extensible structure
Installation
composer require cable8mm/mma-scrapers
Description
Scraper Flows
PromotionScraper
↓
EventScraper
↓
FightScraper
↓
FighterScraper
BlackCombatScraper
↓
BlackCombatParser
↓
BlackCombatFightParser
↓
FighterParser
Crawling Flows
/events
↓
/events/black-combat-12
↓
fight card
↓
fighter links
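The crawl above follows links from a listing page to an event page and then to fighter pages. A minimal sketch of the link-extraction step follows. It assumes fighter pages live under a `/fighters/` path and uses PHP's built-in DOM extension to stay self-contained; the library itself depends on symfony/dom-crawler, and the function name `extractLinks` is hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Collect the unique href values in $html that start with $prefix.
 * Hypothetical helper for illustration; not part of the library's API.
 *
 * @return string[]
 */
function extractLinks(string $html, string $prefix): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML() rejects an empty string
    }

    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if (str_starts_with($href, $prefix)) {
            $links[$href] = true; // array keys de-duplicate repeated links
        }
    }

    return array_keys($links);
}
```

Calling `extractLinks($listingHtml, '/events/')` on the listing page and then `extractLinks($eventHtml, '/fighters/')` on each event page mirrors the two hops in the flow.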
MMA aggregator's data flows
Scraper
↓
Parser
↓
DTO
↓
Matcher
↓
Deduplicator
↓
Aggregator
↓
DB (current design)
FightDTO:
FightDTO[] collected from multiple sources
↓
Group entries that describe the same fight
↓
Merge each group with FightAggregator
↓
Final FightDTO[]
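The group-then-merge step can be sketched as below. This is an illustration under stated assumptions, not the library's implementation: the `FightDTO` fields, the key format, and the "first non-null value wins" merge rule are all hypothetical. Names are lowercased with `strtolower`, which only affects ASCII letters.

```php
<?php

declare(strict_types=1);

// Hypothetical DTO: field names are assumptions, not the library's actual FightDTO.
final class FightDTO
{
    public function __construct(
        public readonly string $source,
        public readonly string $fighterA,
        public readonly string $fighterB,
        public readonly string $date, // YYYY-MM-DD
        public readonly ?string $method = null,
        public readonly ?int $round = null,
    ) {}
}

/** Build an order-independent key so "A vs B" and "B vs A" fall into one group. */
function fightKey(FightDTO $fight): string
{
    $names = [strtolower(trim($fight->fighterA)), strtolower(trim($fight->fighterB))];
    sort($names);

    return implode('|', $names) . '|' . $fight->date;
}

/**
 * Merge one group: keep the first DTO's names and date,
 * and take the first non-null value seen for each optional field.
 *
 * @param FightDTO[] $group non-empty list
 */
function mergeGroup(array $group): FightDTO
{
    $first = $group[0];
    $method = null;
    $round = null;

    foreach ($group as $fight) {
        $method ??= $fight->method;
        $round ??= $fight->round;
    }

    return new FightDTO('aggregated', $first->fighterA, $first->fighterB, $first->date, $method, $round);
}

/**
 * @param FightDTO[] $fights
 * @return FightDTO[]
 */
function aggregateFights(array $fights): array
{
    $groups = [];
    foreach ($fights as $fight) {
        $groups[fightKey($fight)][] = $fight;
    }

    return array_values(array_map(mergeGroup(...), $groups));
}
```

Matching on raw names is the weakest part of this sketch; in the design described here, names would first be resolved to fighter IDs through aliases and external IDs.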
Knowledge
Architecture
events
└ fights
├ fighters (red / blue)
├ fight_external_ids
└ (aggregated data)
fighters
├ fighter_aliases
└ fighter_external_ids
"raw → normalized → aggregated → stored"
fighters table:
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

Schema::create('fighters', function (Blueprint $table) {
    $table->id();
    $table->string('name');
    $table->string('nickname')->nullable();
    $table->string('instagram')->nullable();
    $table->string('teamname')->nullable();
    $table->string('height')->nullable();
    $table->integer('win')->nullable();
    $table->integer('lose')->nullable();
    $table->integer('draw')->nullable();
    $table->timestamps();
});
// 정찬성 = Chan Sung Jung = Korean Zombie
Schema::create('fighter_aliases', function (Blueprint $table) {
    $table->id();
    $table->foreignId('fighter_id')->constrained()->cascadeOnDelete();
    $table->string('alias')->index();
    $table->timestamps();
});
// fighter_id = 1
// source = sherdog
// external_id = 12345
Schema::create('fighter_external_ids', function (Blueprint $table) {
    $table->id();
    $table->foreignId('fighter_id')->constrained()->cascadeOnDelete();
    $table->string('source'); // sherdog, tapology
    $table->string('external_id');
    $table->unique(['source', 'external_id']);
    $table->timestamps();
});
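The two tables above suggest a lookup order for identity resolution: try the exact (source, external_id) pair first, then fall back to a known alias. A minimal in-memory sketch of that idea follows. The class `FighterIdentityResolver`, its methods, and the normalization rule are assumptions for illustration; a real implementation would query the tables instead of arrays.

```php
<?php

declare(strict_types=1);

// Hypothetical in-memory stand-in for fighter_external_ids and fighter_aliases.
final class FighterIdentityResolver
{
    /** @var array<string, int> fighter ID keyed by "source:external_id" */
    private array $externalIds = [];

    /** @var array<string, int> fighter ID keyed by normalized alias */
    private array $aliases = [];

    /** Collapse whitespace and lowercase ASCII letters so spelling variants compare equal. */
    public static function normalize(string $name): string
    {
        $collapsed = preg_replace('/\s+/', ' ', $name) ?? $name;

        return strtolower(trim($collapsed));
    }

    public function addExternalId(int $fighterId, string $source, string $externalId): void
    {
        $this->externalIds["{$source}:{$externalId}"] = $fighterId;
    }

    public function addAlias(int $fighterId, string $alias): void
    {
        $this->aliases[self::normalize($alias)] = $fighterId;
    }

    /** Resolve by external ID first, then by alias; null means "unknown fighter". */
    public function resolve(string $source, string $externalId, string $name): ?int
    {
        return $this->externalIds["{$source}:{$externalId}"]
            ?? $this->aliases[self::normalize($name)]
            ?? null;
    }
}
```

With fighter 1 registered under sherdog ID 12345 and the aliases 정찬성, Chan Sung Jung, and Korean Zombie, a record from another source that only carries the name "korean zombie" would still resolve to fighter 1.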
Schema::create('events', function (Blueprint $table) {
    $table->id();
    $table->string('name');
    $table->string('location')->nullable();
    $table->dateTime('event_date')->index();
    $table->timestamps();
});
Schema::create('event_external_ids', function (Blueprint $table) {
    $table->id();
    $table->foreignId('event_id')->constrained()->cascadeOnDelete();
    $table->string('source');
    $table->string('external_id');
    $table->unique(['source', 'external_id']);
});
Schema::create('fights', function (Blueprint $table) {
    $table->id();
    $table->foreignId('event_id')->constrained()->cascadeOnDelete();
    $table->foreignId('fighter_red_id')->constrained('fighters');
    $table->foreignId('fighter_blue_id')->constrained('fighters');
    $table->string('weight_class')->nullable();
    $table->string('method')->nullable();
    $table->integer('round')->nullable();
    $table->string('time')->nullable();
    $table->foreignId('winner_id')->nullable()->constrained('fighters')->nullOnDelete();
    $table->dateTime('fight_date')->index();
    $table->enum('status', ['scheduled', 'live', 'finished'])->default('scheduled');
    $table->boolean('is_title_fight')->default(false);
    $table->timestamps();

    // Key constraint (intended to handle the same fight arriving with red/blue reversed).
    // Note: a unique index treats (A, B, date) and (B, A, date) as different rows,
    // so reversed pairs are only caught if the application also checks both orders.
    $table->unique([
        'fighter_red_id',
        'fighter_blue_id',
        'fight_date',
    ]);
});
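One way to catch a reversed pair before it reaches the table is to compare fights by an order-independent key. The sketch below is an assumption about how that check could look, not the library's code; `fightPairKey` and `isDuplicateFight` are hypothetical names, and the `$seen` array stands in for a database lookup.

```php
<?php

declare(strict_types=1);

/** Build a key that is identical for (red, blue) and (blue, red) on the same date. */
function fightPairKey(int $redId, int $blueId, string $fightDate): string
{
    $low = min($redId, $blueId);
    $high = max($redId, $blueId);

    return "{$low}-{$high}@{$fightDate}";
}

/**
 * Report whether this pairing was already seen on this date, in either order.
 * Records the pairing when it is new.
 *
 * @param array<string, true> $seen keys already recorded (in-memory stand-in for a query)
 */
function isDuplicateFight(array &$seen, int $redId, int $blueId, string $fightDate): bool
{
    $key = fightPairKey($redId, $blueId, $fightDate);

    if (isset($seen[$key])) {
        return true;
    }

    $seen[$key] = true;

    return false;
}
```

Checking in the application keeps the red and blue columns meaningful as corner assignments, whereas always storing the lower ID first would make the unique index sufficient but lose that information.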
Schema::create('fight_external_ids', function (Blueprint $table) {
    $table->id();
    $table->foreignId('fight_id')->constrained()->cascadeOnDelete();
    $table->string('source');
    $table->string('external_id');
    $table->unique(['source', 'external_id']);
});
Test
composer test