skyeng / php-lemmatizer
PHP Lemmatizer is a lemmatization library to retrieve a base form from an English inflected word.
Requires
- php: >=5.6
Requires (Dev)
- phpunit/phpunit: ^5.7
This package is not auto-updated.
Last update: 2024-11-20 19:08:12 UTC
README
Brief description
Lemmatizes English words.
Owners
- Code owner: Дмитрий Южаков
- Product owner: Сергей Сафонов
- Team: Platform Core
PHP Lemmatizer is a PHP lemmatization library that retrieves the base form (lemma) of an inflected English word.
It is inspired by JavaScript Lemmatizer, but its return values differ from that library's.
Installation
With Composer
$ composer require skyeng/php-lemmatizer
{
    "require": {
        "skyeng/php-lemmatizer": "^1.0"
    }
}
Usage
<?php

use Skyeng\Lemmatizer;
use Skyeng\Lemma;

// Require Composer's autoloader.
require_once __DIR__ . "/vendor/autoload.php";

$lemmatizer = new Lemmatizer();

// Retrieve lemmas for a given part of speech.
// The part of speech can be Lemma::POS_VERB, Lemma::POS_NOUN,
// Lemma::POS_ADJECTIVE or Lemma::POS_ADVERB.
$lemmas = $lemmatizer->getLemmas('desks', Lemma::POS_NOUN);
// => [ new Lemma('desk', Lemma::POS_NOUN) ]

// Irregular inflected forms are supported as well.
$lemmas = $lemmatizer->getLemmas('went', Lemma::POS_VERB);
// => [ new Lemma('go', Lemma::POS_VERB) ]

$lemmas = $lemmatizer->getLemmas('better', Lemma::POS_ADJECTIVE);
// => [ new Lemma('better', Lemma::POS_ADJECTIVE), new Lemma('good', Lemma::POS_ADJECTIVE) ]

// When multiple base forms are found, all of them are returned.
$lemmas = $lemmatizer->getLemmas('leaves', Lemma::POS_NOUN);
// => [ new Lemma('leave', Lemma::POS_NOUN), new Lemma('leaf', Lemma::POS_NOUN) ]

// Retrieve lemmas without specifying a part of speech.
$lemmas = $lemmatizer->getLemmas('sitting');
// => [ new Lemma('sit', Lemma::POS_VERB), new Lemma('sitting', Lemma::POS_ADJECTIVE) ]

// Retrieve only the lemma strings, without parts of speech in the returned value.
$lemmas = $lemmatizer->getOnlyLemmas('desks', Lemma::POS_NOUN);
// => [ 'desk' ]

$lemmas = $lemmatizer->getOnlyLemmas('coded', Lemma::POS_VERB);
// => [ 'code' ]

$lemmas = $lemmatizer->getOnlyLemmas('leaves');
// => [ 'leave', 'leaf' ]
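The calls above handle one word at a time. As a rough sketch of how they might be applied to running text, the helper below splits a string into words and collects the candidate base forms for each. The function name lemmatizeText and the simple regex tokenizer are assumptions made for illustration and are not part of the library; only getOnlyLemmas() comes from the usage shown above.

```php
<?php

use Skyeng\Lemmatizer;

// Require Composer's autoloader (assumes the package is installed).
require_once __DIR__ . '/vendor/autoload.php';

/**
 * Hypothetical helper, not part of php-lemmatizer: maps each word of a text
 * to the candidate base forms returned by getOnlyLemmas().
 * Repeated words share a single entry in the result.
 */
function lemmatizeText(Lemmatizer $lemmatizer, $text)
{
    // Naive tokenizer (an assumption): lowercase the text and split on
    // anything that is not an ASCII letter or an apostrophe.
    $words = preg_split("/[^a-z']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    $result = [];
    foreach ($words as $word) {
        // No part of speech is passed, so every candidate is kept.
        $result[$word] = $lemmatizer->getOnlyLemmas($word);
    }

    return $result;
}

$lemmatizer = new Lemmatizer();
print_r(lemmatizeText($lemmatizer, 'She went past the desks'));
```

Because no part of speech is passed, a word may come back with several candidates; choosing among them would need a part-of-speech tagger, which is outside the scope of this library.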
Limitations
// The lemmatizer returns a word unchanged when it is not in its dictionary index.
$lemmas = $lemmatizer->getLemmas('MacBooks');
// => [ new Lemma('MacBooks', Lemma::POS_NOUN) ]
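Since an unknown word is handed back as-is, calling code may want to notice when that happens. The sketch below is one possible check. The helper isUnchanged is hypothetical, and it assumes getOnlyLemmas() mirrors getLemmas() for unknown words, which the example above does not confirm. Note that the check cannot tell an unknown word apart from a word that is already its own base form.

```php
<?php

use Skyeng\Lemmatizer;

// Require Composer's autoloader (assumes the package is installed).
require_once __DIR__ . '/vendor/autoload.php';

/**
 * Hypothetical helper, not part of php-lemmatizer: true when the only
 * candidate lemma is the input word itself.
 */
function isUnchanged($word, array $lemmas)
{
    // array_values() guards against non-sequential keys.
    return array_values($lemmas) === [$word];
}

$lemmatizer = new Lemmatizer();

foreach (['MacBooks', 'desks'] as $word) {
    $lemmas = $lemmatizer->getOnlyLemmas($word);

    if (isUnchanged($word, $lemmas)) {
        // Either the word is missing from the dictionary index
        // or it is already a base form.
        echo $word . ": returned as-is\n";
    } else {
        echo $word . ': ' . implode(', ', $lemmas) . "\n";
    }
}
```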
Contribution
- Fork it ( https://github.com/skyeng/php-lemmatizer )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request