oblak / syllabizer
Splits Serbian words into syllables (podela reči na slogove). Supports Latin and Cyrillic.
v1.0.0
2026-06-07 13:53 UTC
Requires
- php: >=8.1
- ext-mbstring: *
Requires (Dev)
- oblak/wordpress-coding-standard: ^1.2
- phpunit/phpunit: ^10.5
This package is auto-updated.
Last update: 2026-06-07 13:55:45 UTC
README
Syllabizer
Split Serbian words into syllables (podela reči na slogove).
Installation
You can install the package via Composer:
$ composer require oblak/syllabizer
Usage
<?php

require __DIR__ . '/vendor/autoload.php';

use Oblak\Syllabizer;

$syllabizer = new Syllabizer();

$syllabizer->syllabize('jednak');  // ['jed', 'nak']
$syllabizer->syllabize('tramvaj'); // ['tram', 'vaj']
$syllabizer->syllabize('pidžama'); // ['pi', 'dža', 'ma']
$syllabizer->syllabize('mačka');   // ['ma', 'čka']

// Cyrillic works just as well
$syllabizer->syllabize('сломљен'); // ['слом', 'љен']

// Syllabic R (slogotvorno r) is a nucleus of its own
$syllabizer->syllabize('brzo'); // ['br', 'zo']
$syllabizer->syllabize('rđa');  // ['r', 'đa']

// Count the syllables
count($syllabizer->syllabize('slogovnik')); // 3
syllabize() accepts a string or any Stringable, and returns an ordered array
of syllables. Joining the result reproduces the original word exactly:
$word = 'doneti';
implode('', $syllabizer->syllabize($word)) === $word; // true
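The round-trip property above is easy to check over a whole word list. The following is a minimal sketch, not part of the package: `roundTripFailures()` is a hypothetical helper, and the two closures are stand-in splitters used only so the example runs without the library installed. In a real project you would presumably pass `[$syllabizer, 'syllabize']` as the callable instead.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: returns the words whose syllables do NOT join back
 * to the original string. $syllabize is any callable(string): array.
 */
function roundTripFailures(callable $syllabize, array $words): array
{
    return array_values(array_filter(
        $words,
        static fn (string $word): bool => implode('', $syllabize($word)) !== $word
    ));
}

// Stand-in splitters for demonstration only (they are not real syllabifiers).
// The first cuts every two characters and loses nothing.
$lossless = static fn (string $word): array => mb_str_split($word, 2);
// The second lowercases first, so it cannot reproduce capitalised input.
$lossy = static fn (string $word): array => mb_str_split(mb_strtolower($word), 2);

var_dump(roundTripFailures($lossless, ['doneti', 'сломљен'])); // empty array
var_dump(roundTripFailures($lossy, ['Doneti', 'brzo']));       // only 'Doneti'
```

An empty result means every word survived the split unchanged, which is what the guarantee above promises for `syllabize()`.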
Joining syllables
tokenize() is a convenience wrapper that returns the syllables as a single string,
joined by a separator (a hyphen by default):
$syllabizer->tokenize('doneti');  // 'do-ne-ti'
$syllabizer->tokenize('сломљен'); // 'слом-љен'

// Pass any separator you like
$syllabizer->tokenize('doneti', '·'); // 'do·ne·ti'
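Since the joined form is described as a wrapper around the syllable array, the same string can be built by hand with `implode()`. The sketch below assumes exactly that and nothing more: `joinSyllables()` is a hypothetical stand-in, not the package's `tokenize()`, and it takes an already-split word so that it runs without the library.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the joining step: glues syllables together
 * with a separator, a hyphen by default.
 *
 * @param list<string> $syllables
 */
function joinSyllables(array $syllables, string $separator = '-'): string
{
    return implode($separator, $syllables);
}

echo joinSyllables(['do', 'ne', 'ti']), PHP_EOL;      // do-ne-ti
echo joinSyllables(['слом', 'љен']), PHP_EOL;         // слом-љен
echo joinSyllables(['do', 'ne', 'ti'], '·'), PHP_EOL; // do·ne·ti
```

Keeping the array form around is useful when you need both the count and the display string; the joined form is only for presentation.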
How it works
The library follows the standard pedagogical rules for Serbian syllabification:
- Both scripts: Latin and Cyrillic input are supported. The Latin digraphs lj, nj and dž (in any case) count as a single consonant and are never split, just like their Cyrillic counterparts љ, њ, џ.
- Vowels carry syllables: the number of syllables equals the number of vowels (a e i o u), plus any syllabic R.
- Syllabic R: an r with no neighbouring vowel (between consonants, or word-initial before a consonant) becomes a syllable nucleus: pr‑st, tr‑ka, r‑vač.
- Consonant clusters: a single consonant opens the following syllable (li‑va‑da); within a cluster the boundary falls between two sonants (or‑la, tram‑vaj) or between a plosive and a following non‑approximant (lop‑ta, sred‑stvo); otherwise the whole cluster opens the next syllable (la‑sta, je‑dva, sve‑tlost).
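To make the rules above concrete, here is a minimal sketch of how they could be combined. It is not the library's source code: every function and constant name is hypothetical, it handles Latin script only, the phoneme classes are assumed from traditional Serbian grammar, and taking the first qualifying boundary in a cluster is an assumption of this sketch. It may well disagree with the package on words not listed in this README.

```php
<?php
declare(strict_types=1);

// Illustrative sketch only, NOT the library's implementation. Latin script only.
// The phoneme classes below are assumptions based on traditional grammar.
const VOWELS       = ['a', 'e', 'i', 'o', 'u'];
const SONANTS      = ['j', 'l', 'lj', 'm', 'n', 'nj', 'r', 'v'];
const PLOSIVES     = ['b', 'p', 'd', 't', 'g', 'k'];
const APPROXIMANTS = ['j', 'v', 'l', 'lj', 'r'];

/** Splits a word into letters, keeping the digraphs lj, nj, dž whole. */
function sketchLetters(string $word): array
{
    preg_match_all('/lj|nj|dž|./iu', $word, $matches);
    return $matches[0];
}

/** Case-insensitive membership test for a single letter. */
function sketchIn(array $class, string $letter): bool
{
    return in_array(mb_strtolower($letter), $class, true);
}

/** A nucleus is a vowel, or an r with no vowel directly before or after it. */
function sketchIsNucleus(array $letters, int $i): bool
{
    if (sketchIn(VOWELS, $letters[$i])) {
        return true;
    }
    if (mb_strtolower($letters[$i]) !== 'r') {
        return false;
    }
    $vowelBefore = $i > 0 && sketchIn(VOWELS, $letters[$i - 1]);
    $vowelAfter  = $i + 1 < count($letters) && sketchIn(VOWELS, $letters[$i + 1]);
    return !$vowelBefore && !$vowelAfter;
}

/** Splits a Latin-script word into syllables using the rules listed above. */
function syllabizeSketch(string $word): array
{
    $letters = sketchLetters($word);
    $nuclei  = [];
    foreach (array_keys($letters) as $i) {
        if (sketchIsNucleus($letters, $i)) {
            $nuclei[] = $i;
        }
    }

    $syllables = [];
    $start     = 0;
    for ($k = 0; $k + 1 < count($nuclei); $k++) {
        $from = $nuclei[$k] + 1; // first letter after this nucleus
        $to   = $nuclei[$k + 1]; // index of the next nucleus
        $cut  = $from;           // default: the whole cluster opens the next syllable
        for ($i = $from; $i + 1 < $to; $i++) {
            [$a, $b] = [$letters[$i], $letters[$i + 1]];
            $twoSonants   = sketchIn(SONANTS, $a) && sketchIn(SONANTS, $b);
            $plosiveBreak = sketchIn(PLOSIVES, $a) && !sketchIn(APPROXIMANTS, $b);
            if ($twoSonants || $plosiveBreak) {
                $cut = $i + 1; // assumption: take the first boundary that qualifies
                break;
            }
        }
        $syllables[] = implode('', array_slice($letters, $start, $cut - $start));
        $start       = $cut;
    }
    $syllables[] = implode('', array_slice($letters, $start));

    return $syllables;
}

echo implode('-', syllabizeSketch('sredstvo')), PHP_EOL; // sred-stvo
echo implode('-', syllabizeSketch('svetlost')), PHP_EOL; // sve-tlost
echo implode('-', syllabizeSketch('brzo')), PHP_EOL;     // br-zo
```

Tokenising into letters before applying any rule is what keeps lj, nj and dž from being split: once a digraph is a single array element, no boundary can fall inside it. Because the syllables are slices of that same letter array, joining them always reproduces the input.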
Testing
$ composer test
Coding standards
$ composer lint      # check
$ composer lint:fix  # auto-fix
License
The MIT License (MIT). Please see the License File for more information.