coral-media/php-ir

Information Retrieval algorithms (vector space, similarity, clustering)

pkg:composer/coral-media/php-ir

v0.5.2 2025-12-22 04:34 UTC

Last update: 2025-12-22 04:37:31 UTC


README

A focused PHP library implementing classical Information Retrieval (IR) algorithms, based on the Stanford IR book, Introduction to Information Retrieval.

The goal of this project is to provide a correct, deterministic, and explainable IR core suitable for search, clustering, and recommendation systems.

License

MIT

Scope

  • Vector space model (dense and sparse vectors)
  • Similarity measures (cosine, Euclidean)
  • Term weighting:
    • Term Frequency (TF)
    • Inverse Document Frequency (IDF)
    • TF-IDF
    • BM25 (probabilistic ranking)
  • Clustering (K-Means, K-Means++)
  • Deterministic, explainable algorithms
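
To make the term weighting and similarity items above concrete, here is a minimal standalone sketch of TF-IDF weighting and cosine similarity over sparse vectors (term => weight arrays). It does not use this library's API: the function names (termFrequencies, inverseDocumentFrequencies, tfIdfVector, cosineSimilarity) are hypothetical, and the unsmoothed idf = log(N / df) variant is an assumption; the library may use a different weighting scheme.

```php
<?php
declare(strict_types=1);

// Illustrative only: these function names are hypothetical and are not
// taken from the coral-media/php-ir API.

/** Raw term frequencies for one tokenized document (term => count). */
function termFrequencies(array $tokens): array
{
    return array_count_values($tokens);
}

/** Unsmoothed idf_t = log(N / df_t) over a tokenized corpus. */
function inverseDocumentFrequencies(array $corpus): array
{
    $df = [];
    foreach ($corpus as $tokens) {
        // Count each term at most once per document.
        foreach (array_unique($tokens) as $term) {
            $df[$term] = ($df[$term] ?? 0) + 1;
        }
    }
    $n = count($corpus);
    $idf = [];
    foreach ($df as $term => $count) {
        $idf[$term] = log($n / $count);
    }
    return $idf;
}

/** Sparse TF-IDF vector (term => tf * idf); unknown terms get weight 0. */
function tfIdfVector(array $tokens, array $idf): array
{
    $vector = [];
    foreach (termFrequencies($tokens) as $term => $tf) {
        $vector[$term] = $tf * ($idf[$term] ?? 0.0);
    }
    return $vector;
}

/** Cosine similarity of two sparse vectors; 0.0 if either norm is zero. */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    foreach ($a as $term => $weight) {
        $dot += $weight * ($b[$term] ?? 0.0);
    }
    $normA = sqrt(array_sum(array_map(fn ($w) => $w * $w, $a)));
    $normB = sqrt(array_sum(array_map(fn ($w) => $w * $w, $b)));
    if ($normA == 0.0 || $normB == 0.0) {
        return 0.0;
    }
    return $dot / ($normA * $normB);
}

// Usage: a toy corpus of three pre-tokenized documents.
$corpus = [
    ['information', 'retrieval', 'search'],
    ['vector', 'space', 'search'],
    ['clustering', 'vector', 'space'],
];
$idf = inverseDocumentFrequencies($corpus);
$vectors = array_map(fn ($doc) => tfIdfVector($doc, $idf), $corpus);
$similarity = cosineSimilarity($vectors[1], $vectors[2]);
```

Storing only non-zero weights keeps the dot product proportional to the number of terms in the shorter document, which is the usual reason for preferring sparse vectors in text retrieval.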

Non-Goals

  • NLP pipelines
  • Transformers / embeddings
  • Semantic vector databases
  • Dataset abstractions
  • Framework integrations

This library intentionally focuses on classical IR, not modern NLP.

Philosophy

This library favors:

  • Mathematical correctness
  • Explicit, inspectable abstractions
  • Deterministic behavior
  • Minimal hidden state
  • Native acceleration via Zephir (optional, future-facing)
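
As one way to picture deterministic behavior with minimal hidden state, the sketch below runs K-Means++ seeding with an explicitly passed, seeded random source, so the same seed always yields the same initial centroids. Everything here is an assumption for illustration: seededRandom, squaredDistance, and kMeansPlusPlusSeeds are hypothetical names, not this library's API, and the Park-Miller generator is just a simple stand-in that assumes 64-bit PHP integers and non-negative seeds.

```php
<?php
declare(strict_types=1);

// Illustrative only: hypothetical helpers, not the coral-media/php-ir API.

/** Park-Miller generator returning floats in [0, 1); state lives in the closure. */
function seededRandom(int $seed): callable
{
    $state = max(1, $seed % 2147483647);
    return function () use (&$state): float {
        $state = ($state * 48271) % 2147483647;
        return ($state - 1) / 2147483646;
    };
}

/** Squared Euclidean distance between two dense vectors of equal length. */
function squaredDistance(array $a, array $b): float
{
    $sum = 0.0;
    foreach ($a as $i => $value) {
        $diff = $value - $b[$i];
        $sum += $diff * $diff;
    }
    return $sum;
}

/** K-Means++ seeding: pick $k initial centroids by D^2-weighted sampling. */
function kMeansPlusPlusSeeds(array $points, int $k, callable $random): array
{
    $points = array_values($points);
    $n = count($points);
    if ($k < 1 || $k > $n) {
        throw new InvalidArgumentException('k must be between 1 and the number of points');
    }

    // First centroid: uniform choice.
    $centroids = [$points[(int) floor($random() * $n)]];
    $dist = array_fill(0, $n, INF);

    while (count($centroids) < $k) {
        // Update each point's squared distance to its nearest centroid.
        $latest = $centroids[count($centroids) - 1];
        $total = 0.0;
        foreach ($points as $i => $point) {
            $dist[$i] = min($dist[$i], squaredDistance($point, $latest));
            $total += $dist[$i];
        }

        // Sample the next centroid with probability proportional to D^2.
        $threshold = $random() * $total;
        $chosen = $n - 1;
        $cumulative = 0.0;
        foreach ($dist as $i => $d) {
            $cumulative += $d;
            if ($cumulative > $threshold) {
                $chosen = $i;
                break;
            }
        }
        $centroids[] = $points[$chosen];
    }
    return $centroids;
}

// Usage: identical seeds give identical initial centroids.
$points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]];
$first = kMeansPlusPlusSeeds($points, 3, seededRandom(42));
$second = kMeansPlusPlusSeeds($points, 3, seededRandom(42));
```

Passing the random source as an argument, rather than relying on a global generator, keeps the only source of variation visible at the call site.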

API Stability

New functionality will be added in a backward-compatible manner.

Releasing

Project versions are tracked using a .version file and Git tags.

To bump the version:

./scripts/bump-version.sh {major|minor|patch}
git commit -am "chore(release): bump version"