ankane / disco
Recommendations for PHP using collaborative filtering
Requires
- php: >= 8.1
- ankane/libmf: ^0.1.0
Requires (Dev)
- phpunit/phpunit: ^10
Last update: 2024-10-09 10:02:10 UTC
🔥 Recommendations for PHP using collaborative filtering
- Supports user-based and item-based recommendations
- Works with explicit and implicit feedback
- Uses high-performance matrix factorization
Installation
Run:
composer require ankane/disco
Add scripts to composer.json to download the shared library:
"scripts": {
    "post-install-cmd": "Disco\\Library::check",
    "post-update-cmd": "Disco\\Library::check"
}
And run:
composer install
Getting Started
Create a recommender
use Disco\Recommender;

$recommender = new Recommender();
If users rate items directly, this is known as explicit feedback. Fit the recommender with:
$recommender->fit([
    ['user_id' => 1, 'item_id' => 1, 'rating' => 5],
    ['user_id' => 2, 'item_id' => 1, 'rating' => 3]
]);
IDs can be integers or strings
If users don’t rate items directly (for instance, they’re purchasing items or reading posts), this is known as implicit feedback. Leave out the rating.
$recommender->fit([
    ['user_id' => 1, 'item_id' => 1],
    ['user_id' => 2, 'item_id' => 1]
]);
Each user_id/item_id combination should only appear once.
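Raw event logs (purchases, page views) often contain the same user/item pair more than once. One way to meet the uniqueness requirement is to collapse repeats before fitting. The sketch below is plain PHP and does not call Disco; `uniqueInteractions` is a hypothetical helper name, and keeping the last row for a repeated pair is an assumption about what you want, not something the library prescribes.

```php
<?php

// Hypothetical helper (not part of Disco): keep one row per user_id/item_id pair.
// When a pair repeats, the last row seen replaces the earlier one.
function uniqueInteractions(array $rows): array
{
    $seen = [];
    foreach ($rows as $row) {
        // Encode the pair as a string so it can be used as an array key
        $key = json_encode([$row['user_id'], $row['item_id']]);
        $seen[$key] = $row;
    }
    return array_values($seen);
}

$events = [
    ['user_id' => 1, 'item_id' => 'a'],
    ['user_id' => 1, 'item_id' => 'a'], // repeat purchase
    ['user_id' => 2, 'item_id' => 'a'],
];

$data = uniqueInteractions($events);
// $data can now be passed to $recommender->fit($data)
```

For explicit feedback you may prefer to average the ratings of a repeated pair instead of keeping the last one.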
Get user-based recommendations - “users like you also liked”
$recommender->userRecs($userId);
Get item-based recommendations - “users who liked this item also liked”
$recommender->itemRecs($itemId);
Use the count option to specify the number of recommendations (default is 5)
$recommender->userRecs($userId, count: 3);
Get predicted ratings for specific users and items
$recommender->predict([['user_id' => 1, 'item_id' => 2], ['user_id' => 2, 'item_id' => 4]]);
Get similar users
$recommender->similarUsers($userId);
Examples
MovieLens
Load the data
use Disco\Data;

$data = Data::loadMovieLens();
Create a recommender and get similar movies
$recommender = new Recommender(factors: 20);
$recommender->fit($data);
$recommender->itemRecs('Star Wars (1977)');
Storing Recommendations
Save recommendations to your database.
Alternatively, you can store only the factors and use a library like pgvector-php. See an example.
Algorithms
Disco uses high-performance matrix factorization.
- For explicit feedback, it uses stochastic gradient descent
- For implicit feedback, it uses coordinate descent
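To give a feel for what stochastic gradient descent does in matrix factorization, here is a textbook single-rating update in plain PHP. It is only an illustration of the general technique: it is not LIBMF's implementation, and the function name, learning rate, and regularization values are made up for the example.

```php
<?php

// Textbook SGD step for one observed rating (illustrative, not LIBMF's code).
// $p is the user's factor vector, $q is the item's factor vector.
// Both are updated in place; the prediction error before the update is returned.
function sgdStep(array &$p, array &$q, float $rating, float $lr = 0.05, float $reg = 0.01): float
{
    // Predicted rating is the dot product of the two factor vectors
    $pred = 0.0;
    foreach ($p as $k => $pk) {
        $pred += $pk * $q[$k];
    }
    $err = $rating - $pred;

    // Move each factor along the error gradient, with L2 regularization
    foreach ($p as $k => $pk) {
        $qk = $q[$k];
        $p[$k] += $lr * ($err * $qk - $reg * $pk);
        $q[$k] += $lr * ($err * $pk - $reg * $qk);
    }
    return $err;
}

$userFactors = [0.1, 0.1];
$itemFactors = [0.1, 0.1];

$firstError = sgdStep($userFactors, $itemFactors, 5.0);
$lastError = $firstError;
for ($epoch = 0; $epoch < 500; $epoch++) {
    $lastError = sgdStep($userFactors, $itemFactors, 5.0);
}
// After many steps the prediction error for this rating is much smaller than at the start
```

A real trainer loops over all ratings in random order for several epochs, which is what the epochs setting controls.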
Specify the number of factors and epochs
new Recommender(factors: 8, epochs: 20);
If recommendations look off, try changing factors. The default is 8, but 3 could be good for some applications and 300 good for others.
Validation
Pass a validation set with:
$recommender->fit($data, validationSet: $validationSet);
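The README does not say how to build the validation set. A simple approach is to shuffle the data and hold out a fraction of it. The sketch below is plain PHP; `trainValidationSplit`, the 20% ratio, and the seed are assumptions made for illustration and are not part of Disco.

```php
<?php

// Hypothetical helper (not part of Disco): shuffle rows and hold out a fraction.
function trainValidationSplit(array $rows, float $validationRatio = 0.2, int $seed = 42): array
{
    mt_srand($seed); // seed the generator so the split is repeatable
    shuffle($rows);

    $validationSize = (int) floor(count($rows) * $validationRatio);
    $validationSet = array_slice($rows, 0, $validationSize);
    $trainSet = array_slice($rows, $validationSize);

    return [$trainSet, $validationSet];
}

// Ten toy ratings, one per user
$data = [];
for ($user = 1; $user <= 10; $user++) {
    $data[] = ['user_id' => $user, 'item_id' => 1, 'rating' => ($user % 5) + 1];
}

[$trainSet, $validationSet] = trainValidationSplit($data);
// With Disco installed: $recommender->fit($trainSet, validationSet: $validationSet);
```

A purely random split can leave some users or items out of the training set entirely, so check coverage on small datasets.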
Cold Start
Collaborative filtering suffers from the cold start problem. It’s unable to make good recommendations without data on a user or item, which is problematic for new users and items.
$recommender->userRecs($newUserId); // returns empty array
There are a number of ways to deal with this, but here are some common ones:
- For user-based recommendations, show new users the most popular items.
- For item-based recommendations, make content-based recommendations.
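The first suggestion, falling back to popular items, can be sketched in plain PHP. `mostPopularItems` is a hypothetical helper, and counting interactions is just one possible definition of "popular"; neither comes from Disco.

```php
<?php

// Hypothetical helper (not part of Disco): rank items by number of interactions.
function mostPopularItems(array $rows, int $count = 5): array
{
    $counts = [];
    foreach ($rows as $row) {
        $itemId = $row['item_id'];
        $counts[$itemId] = ($counts[$itemId] ?? 0) + 1;
    }
    arsort($counts); // highest count first

    return array_slice(array_keys($counts), 0, $count);
}

$data = [
    ['user_id' => 1, 'item_id' => 'a'],
    ['user_id' => 1, 'item_id' => 'b'],
    ['user_id' => 2, 'item_id' => 'a'],
    ['user_id' => 2, 'item_id' => 'c'],
    ['user_id' => 3, 'item_id' => 'a'],
    ['user_id' => 3, 'item_id' => 'c'],
];

$popular = mostPopularItems($data, 2);

// With Disco installed, a fallback could look like:
// $recs = $recommender->userRecs($newUserId);
// if (count($recs) === 0) { /* show $popular instead */ }
```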
Reference
Get ids
$recommender->userIds();
$recommender->itemIds();
Get the global mean
$recommender->globalMean();
Get factors
$recommender->userFactors($userId);
$recommender->itemFactors($itemId);
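Once you have factor vectors, a common way to compare two users or two items is cosine similarity. The sketch below assumes the factors come back as plain arrays of floats of equal length, which this README does not state, and uses made-up vectors in place of real output. `cosineSimilarity` is a hypothetical helper, and whether Disco uses this measure internally is not covered here.

```php
<?php

// Hypothetical helper (not part of Disco): cosine similarity of two equal-length vectors.
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $k => $value) {
        $dot += $value * $b[$k];
        $normA += $value * $value;
        $normB += $b[$k] * $b[$k];
    }
    // A zero vector has no direction, so treat its similarity as 0
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }
    return $dot / (sqrt($normA) * sqrt($normB));
}

// Made-up factor vectors standing in for itemFactors() output
$itemA = [1.0, 2.0];
$itemB = [2.0, 4.0]; // same direction as $itemA
$itemC = [-2.0, 1.0]; // orthogonal to $itemA

$sameDirection = cosineSimilarity($itemA, $itemB);
$orthogonal = cosineSimilarity($itemA, $itemC);
```

Vectors pointing the same way score close to 1 and orthogonal vectors score 0, regardless of their lengths.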
Credits
Thanks to LIBMF for providing high-performance matrix factorization.
History
View the changelog
Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/ankane/disco-php.git
cd disco-php
composer install
composer test