jfcherng/php-levenshtein-distance

Calculate the Levenshtein distance and edit progresses between two strings.

4.0.2 2022-05-31 13:28 UTC


Calculate the Levenshtein distance and the edit progresses (the list of edit operations) between two strings. If you do not need the edit path, PHP's built-in levenshtein() function may be enough, but note that it compares bytes rather than UTF-8 characters.
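To illustrate the difference from the built-in function, here is a minimal sketch that compares levenshtein() with a character-level distance. The helper utf8Levenshtein() is hypothetical and not part of this library; it is a plain dynamic-programming implementation that returns only the distance, and it assumes PHP 7.4+ for mb_str_split().

```php
<?php

/**
 * Character-level Levenshtein distance for UTF-8 strings (hypothetical helper).
 * Plain two-row dynamic programming; returns the distance only, no edit path.
 */
function utf8Levenshtein(string $a, string $b): int
{
    $aChars = mb_str_split($a, 1, 'UTF-8');
    $bChars = mb_str_split($b, 1, 'UTF-8');
    $bLen = count($bChars);

    // distances from the empty prefix of $a to every prefix of $b
    $prev = range(0, $bLen);

    foreach ($aChars as $i => $aChar) {
        $curr = [$i + 1];
        foreach ($bChars as $j => $bChar) {
            $cost = $aChar === $bChar ? 0 : 1;
            $curr[] = min(
                $prev[$j + 1] + 1, // deletion
                $curr[$j] + 1,     // insertion
                $prev[$j] + $cost  // copy or replacement
            );
        }
        $prev = $curr;
    }

    return $prev[$bLen];
}

// one character differs, but its UTF-8 encoding differs in two bytes
echo levenshtein('訂', '订'), PHP_EOL;     // 2 (bytes)
echo utf8Levenshtein('訂', '订'), PHP_EOL; // 1 (characters)
```

The built-in function is still the simpler choice when the input is known to be single-byte text and only the distance is needed.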

Features

  • UTF-8-ready.
  • Reports the full edit progresses, not just the distance.

Installation

$ composer require jfcherng/php-levenshtein-distance

Example

See demo.php.

<?php

include __DIR__ . '/vendor/autoload.php';

use Jfcherng\Diff\LevenshteinDistance as LD;

$old = '自訂取代詞語模組';
$new = '自订取代词语模组!';

$calculator = new LD(
    true, // calculate edit progresses?
    // progress options
    LD::PROGRESS_OP_AS_STRING | LD::PROGRESS_PATCH_MODE
);

$results = $calculator->calculate($old, $new);

// same result as above, but via an internal singleton instance
$results = LD::staticCalculate(
    $old, // old string
    $new, // new string
    true, // calculate edit progresses?
    // progress options
    LD::PROGRESS_OP_AS_STRING | LD::PROGRESS_PATCH_MODE
);

// [
//     'distance' => 5,
//     'progresses' => [
//         ['ins', 8, '!', 1],
//         ['rep', 7, '组', 1],
//         ['cpy', 6, '模', 1],
//         ['rep', 5, '语', 1],
//         ['rep', 4, '词', 1],
//         ['cpy', 3, '代', 1],
//         ['cpy', 2, '取', 1],
//         ['rep', 1, '订', 1],
//         ['cpy', 0, '自', 1],
//     ],
// ]
var_dump($results);

$results = LD::staticCalculate(
    $old, // old string
    $new, // new string
    true, // calculate edit progresses?
    // progress options
    LD::PROGRESS_OP_AS_STRING | LD::PROGRESS_PATCH_MODE | LD::PROGRESS_MERGE_NEIGHBOR
);

// [
//     'distance' => 5,
//     'progresses' => [
//         ['ins', 8, '!', 1],
//         ['rep', 7, '组', 1],
//         ['cpy', 6, '模', 1],
//         ['rep', 4, '词语', 2],
//         ['cpy', 2, '取代', 2],
//         ['rep', 1, '订', 1],
//         ['cpy', 0, '自', 1],
//     ],
// ]
var_dump($results);
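One possible use of the patch-mode output above is to rebuild the new string from the old one. The following is only a sketch: applyProgresses() is a hypothetical helper, not part of this library. It treats each position as an offset into the string being patched, relies on the entries being ordered from the end of the string to the start (as in the output above), and handles only the operations that appear in that output. It assumes PHP 7.4+ for mb_str_split().

```php
<?php

/**
 * Rebuild the new string from the old one (hypothetical helper, not part of the library).
 *
 * Assumes string operations and patch mode, i.e. entries shaped like
 * [operation, position, text, length], ordered from the end of the string to the start.
 */
function applyProgresses(string $old, array $progresses): string
{
    // split into UTF-8 characters so positions count characters, not bytes
    $chars = mb_str_split($old, 1, 'UTF-8');

    foreach ($progresses as [$operation, $position, $text, $length]) {
        $replacement = mb_str_split($text, 1, 'UTF-8');

        switch ($operation) {
            case 'cpy': // characters are kept as they are
                break;
            case 'rep': // overwrite $length characters starting at $position
                array_splice($chars, $position, $length, $replacement);
                break;
            case 'ins': // insert before $position without removing anything
                array_splice($chars, $position, 0, $replacement);
                break;
            default: // operations not shown in the output above are left unhandled
                throw new InvalidArgumentException("Unhandled operation: {$operation}");
        }
    }

    return implode('', $chars);
}

$progresses = [
    ['ins', 8, '!', 1],
    ['rep', 7, '组', 1],
    ['cpy', 6, '模', 1],
    ['rep', 4, '词语', 2],
    ['cpy', 2, '取代', 2],
    ['rep', 1, '订', 1],
    ['cpy', 0, '自', 1],
];

echo applyProgresses('自訂取代詞語模組', $progresses), PHP_EOL; // 自订取代词语模组!
```

Because the edits are applied from the end of the string backwards, an insertion never shifts the positions of entries that are processed later.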

Progress Options

  • LD::PROGRESS_NO_COPY: Do not include COPY operations in the progresses.
  • LD::PROGRESS_MERGE_NEIGHBOR: Merge adjacent progresses that share the same operation where possible.
  • LD::PROGRESS_OP_AS_STRING: Report the operation in each progress as a string (such as 'cpy', 'ins', 'rep') instead of an int.
  • LD::PROGRESS_PATCH_MODE: Replace the edit position for the new string with the corresponding string.
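The options are bit flags and are combined with |, as in the example. To make the effect of LD::PROGRESS_NO_COPY concrete, the sketch below filters copy entries out of a progress list that was computed with string operations. withoutCopies() is a hypothetical helper written for illustration; passing the flag to the calculator is the documented way to get this result.

```php
<?php

/**
 * Drop copy entries from a progress list (hypothetical helper, not part of the library).
 * Only recognizes the string form 'cpy' shown in this README's example output.
 */
function withoutCopies(array $progresses): array
{
    $kept = array_filter($progresses, static function (array $progress): bool {
        return $progress[0] !== 'cpy';
    });

    // reindex so the result is a plain list again
    return array_values($kept);
}

$edits = withoutCopies([
    ['ins', 8, '!', 1],
    ['rep', 7, '组', 1],
    ['cpy', 6, '模', 1],
    ['rep', 4, '词语', 2],
    ['cpy', 2, '取代', 2],
    ['rep', 1, '订', 1],
    ['cpy', 0, '自', 1],
]);

echo count($edits), PHP_EOL; // 4
```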

Returned Progresses

Each progress is an array of four elements:

  1. The operation.
  2. The edit position for the old string.
  3. The edit position for the new string. Or the corresponding string if LD::PROGRESS_PATCH_MODE is used.
  4. The edit length.
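Since each progress is a positional array, destructuring it into named variables tends to make consuming code easier to read. The following sketch uses a hypothetical describeProgress() helper (not part of this library) to turn one entry into a readable line; the output format is arbitrary.

```php
<?php

/**
 * Turn one progress entry into a readable line (hypothetical helper).
 * The third element is a position, or a string when patch mode is used.
 */
function describeProgress(array $progress): string
{
    [$operation, $position, $positionOrText, $length] = $progress;

    return sprintf('%s @%d len=%d -> %s', $operation, $position, $length, $positionOrText);
}

echo describeProgress(['rep', 4, '词语', 2]), PHP_EOL; // rep @4 len=2 -> 词语
```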