frantzmiccoli / phphoneticindexing
Phonetic indexing for PHP, uses the standard library for English, Cologne phonetics for German and a custom algorithm for French
pkg:composer/frantzmiccoli/phphoneticindexing
Requires (Dev)
- phpunit/phpunit: ^7.5
This package is not auto-updated.
Last update: 2025-10-21 20:24:42 UTC
README
Scope
This library provides phonetic indexing methods for several languages.
- English: uses the PHP Standard Library soundex()
- German: uses a custom implementation of the Cologne phonetic indexing algorithm. https://en.wikipedia.org/wiki/Cologne_phonetics
- French: uses a custom algorithm (see below)
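For readers unfamiliar with phonetic indexing, here is a minimal illustration using PHP's built-in soundex(), the function named above for English. It calls the standard library directly rather than this package, and the variable names are only illustrative.

```php
<?php
// soundex() maps a word to a four-character key; English words that
// sound alike are meant to share the same key.
$robert = soundex('Robert'); // "R163"
$rupert = soundex('Rupert'); // "R163"

// Two spellings can be treated as phonetic matches when their keys are equal.
var_dump($robert === $rupert); // bool(true)

// Another pair that shares a key: "Knuth" and "Kant" both give "K530".
var_dump(soundex('Knuth') === soundex('Kant')); // bool(true)
```

The same idea applies to the German and French indexes: two words are considered phonetically close when their index strings are equal.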
Installation
composer require frantzmiccoli/phphoneticindexing
Usage
require 'vendor/autoload.php'; // Composer autoloader
use PhPhoneticIndexing\GetPhoneticIndex;
$getPhoneticIndex = new GetPhoneticIndex();
var_dump($getPhoneticIndex->getPhoneticIndex('carabine', 'fr')); // karabyn
To support additional languages, add them with
$getPhoneticIndex->addLanguage().
French implementation
| Root class | Pattern | Replacement | Example | 
|---|---|---|---|
| z | [aeiouy]s[aeiouy] | z | hasard | 
| 3 | è | 3 | très | 
| 3 | é | 3 | était | 
| 3 | ai | 3 | était | 
| 3 | e[rtx]$ | 3 | est rentrer |
| 3 | ^est$ | 3 | est rentrer |
| 3 | e[rt] following letter kept | 3 | errance |
| 3 | es[^$] following letter kept | 3 | brest |
| 3 | ez$ | 3 | est rentrer |
| o | o | o | orange | 
| o | au | o | aubagne | 
| a | a | a | abracadabra |
| a | oi[e] | a | oie | 
| b | b | b | abolition | 
| b | p | b | problème | 
| 1 | [iu][nm]([^mnaeiouy123]) | 1 | obtint emprunt | 
| 1 | ement | 1 | lentement | 
| - | ent$ | - | vouent | 
| 1 | en | 1 | enfant | 
| 1 | em | 1 | emprunter | 
| 1 | an | 1 | enfant | 
| f | f | f | fenêtre | 
| f | ph | f | sophisme | 
| f | v | v | savourer | 
| e | e[^$] | e | fenêtre | 
| e | eu | e | eux | 
| e | o?eu? | e | oeuvre oedème |
| 2 | o[nm][^nmaeiouy123] | 2 | attention ombre | 
| j | j | j | juger | 
| j | g[ei] | j | juger gironde | 
| j | ch | j | chercher | 
| j | sh | j | sherpa | 
| y | ill | i | briller | 
| y | i | i | cession | 
| y | y | i | cession | 
| s | s | s | sérieux | 
| s | c[ei] | s | cession | 
| s | ç | s | ça | 
| k | g[^ei] | k | gué gardien | 
| k | k | k | karaté | 
| k | c | k | caramel | 
| k | qu | k | que | 
| u | ou | u | oublie | 
| u | u | u | ubuesque | 
| - | [depqrstwxz]$ | - | camp | 
| - | e$ | - | oedème aiment |
| - | h | - | habituer | 
- Remove digits and convert the input to lower case.
- Apply the substitutions in the order given in the table.
- Remove duplicates.
- Remove the - placeholders.
- Optionally, remove the vowel classes aeiouy123.
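The five steps above can be sketched as follows. This is not the library's code: the function name sketchFrenchIndex(), the RULES constant and the $dropVowels flag are hypothetical, only a handful of rules from the table are included, and they are applied as plain sequential regex substitutions. Two further assumptions: patterns that mention a neighbouring letter (such as [aeiouy]s[aeiouy] or c[ei]) are written with lookarounds so that the neighbour is kept, and "remove duplicates" is read as collapsing consecutive repeated characters. Accented input is left out of scope, so the output will differ from the real index for many words.

```php
<?php
// A small subset of the table's rules, kept in the table's relative order.
// Each entry is [regex pattern, replacement].
const RULES = [
    ['/(?<=[aeiouy])s(?=[aeiouy])/', 'z'], // s between vowels (hasard)
    ['/au/', 'o'],                         // aubagne
    ['/ph/', 'f'],                         // sophisme
    ['/ch/', 'j'],                         // chercher
    ['/c(?=[ei])/', 's'],                  // cession
    ['/c/', 'k'],                          // caramel
    ['/qu/', 'k'],                         // que
    ['/ou/', 'u'],                         // oublie
    ['/[depqrstwxz]$/', '-'],              // silent final letter (camp)
    ['/e$/', '-'],                         // silent final e
    ['/h/', '-'],                          // silent h (habituer)
];

function sketchFrenchIndex(string $word, bool $dropVowels = false): string
{
    // Step 1: lower case, then drop digits, since 1, 2 and 3 are
    // reserved as class symbols in the table.
    $s = preg_replace('/[0-9]/', '', strtolower($word));

    // Step 2: apply every substitution in order.
    foreach (RULES as [$pattern, $replacement]) {
        $s = preg_replace($pattern, $replacement, $s);
    }

    // Step 3: collapse runs of the same character (assumed meaning
    // of "remove duplicates").
    $s = preg_replace('/(.)\1+/', '$1', $s);

    // Step 4: remove the "-" placeholders left by silent letters.
    $s = str_replace('-', '', $s);

    // Step 5 (optional): keep only the consonant skeleton.
    if ($dropVowels) {
        $s = preg_replace('/[aeiouy123]/', '', $s);
    }

    return $s;
}

var_dump(sketchFrenchIndex('hasard'));           // string(4) "azar"
var_dump(sketchFrenchIndex('Camp'));             // string(3) "kam"
var_dump(sketchFrenchIndex('philosophe'));       // string(7) "filozof"
var_dump(sketchFrenchIndex('philosophe', true)); // string(4) "flzf"
```

Rule order matters even in this reduced set: c[ei] has to run before the generic c rule, otherwise "cession" would start with k instead of s.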
Side note
Part of this was developed during live programming sessions. Unfortunately the quality is awful, but the links are here:
- Live Programming: Phonetic indexing (1/4) - project motivation and existing solutions overview https://youtu.be/l8BGkOEwCcw
- Live Programming: Phonetic indexing (2/4) - existing PHP code to support German https://youtu.be/0f-9BMp0Md4
- Live Programming: Phonetic indexing (3/4) - adapting to French language, theory and tests https://youtu.be/nFFQpKIvXeY
- Live Programming: Phonetic indexing (4/4) - French language implementation https://youtu.be/Jz365DtN9f0