kraenzle-ritter / markdown-to-tei
Flexible Markdown to TEI XML converter with customizable conventions
v1.1.1
2025-08-12 14:25 UTC
Requires
- php: >=8.1
- league/commonmark: ^2.4
- symfony/css-selector: ^6.3|^7.0
- symfony/dom-crawler: ^6.3|^7.0
Requires (Dev)
- phpunit/phpunit: ^10.0|^11.0
- squizlabs/php_codesniffer: ^3.7
README
A flexible PHP-based converter that transforms Markdown into TEI-XML with support for extended conventions and customizable mappings.
Features
- Markdown to TEI-XML conversion with full TEI P5 compliance
- Extended conventions (e.g., [] to <supplied>, {} to <unclear>)
- Flexible configuration system for custom rules and mappings
- Configurable TEI metadata (title, author, language, etc.)
- TEI P5 standard compliant output with proper namespaces
- High performance processing of large documents
- Fully tested with comprehensive PHPUnit test suite
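For orientation, the sketch below assembles the kind of minimal TEI P5 skeleton that namespaced output with title, author, and language metadata implies. It is not the converter's own code, and the converter's actual output may be structured differently. The helper name buildTeiSkeleton and the placeholder text in publicationStmt and sourceDesc are assumptions made for illustration; the namespace URI is the standard TEI one.

```php
<?php
// Hypothetical helper, not part of markdown-to-tei: builds a minimal TEI P5
// skeleton by hand to illustrate namespaced output with basic metadata.
function buildTeiSkeleton(string $title, string $author, string $language, string $bodyXml): string
{
    // Escape metadata so characters such as & and < stay valid in XML.
    $esc = fn (string $s): string => htmlspecialchars($s, ENT_XML1 | ENT_QUOTES, 'UTF-8');

    return implode("\n", [
        '<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:lang="' . $esc($language) . '">',
        '  <teiHeader>',
        '    <fileDesc>',
        '      <titleStmt>',
        '        <title>' . $esc($title) . '</title>',
        '        <author>' . $esc($author) . '</author>',
        '      </titleStmt>',
        // Placeholder statements (assumed): TEI requires these two elements.
        '      <publicationStmt><p>Unpublished</p></publicationStmt>',
        '      <sourceDesc><p>Converted from Markdown</p></sourceDesc>',
        '    </fileDesc>',
        '  </teiHeader>',
        '  <text>',
        // $bodyXml is inserted as-is and must already be well-formed TEI.
        '    <body>' . $bodyXml . '</body>',
        '  </text>',
        '</TEI>',
    ]);
}

echo buildTeiSkeleton('My Document', 'Author Name', 'en', '<p>Hello</p>'), "\n";
```

Building the string by hand keeps the sketch dependency-free; a real implementation would more likely use a DOM API so that well-formedness is guaranteed.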
Installation
composer require kraenzle-ritter/markdown-to-tei
For development:
git clone https://github.com/kraenzle-ritter/markdown-to-tei.git
cd markdown-to-tei
composer install
Basic Usage
<?php
require_once 'vendor/autoload.php';

use MarkdownToTei\Converter;
use MarkdownToTei\Config\ConversionConfig;

$config = new ConversionConfig();
$converter = new Converter($config);

$markdown = "# Title\n\nThis is a [supplied text] with **important** text.";
$teiXml = $converter->convert($markdown);

echo $teiXml;
Extended Conventions
Overview of Conventions and Mappings
Markdown/HTML | TEI-XML Conversion |
---|---|
[supplied text] | <supplied>supplied text</supplied> |
{unclear text} | <unclear>unclear text</unclear> |
(( editorial_note )) | <note type="editorial">editorial note</note> |
--deleted-- | <del>deleted</del> |
++added++ | <add>added</add> |
[text](url) | <ref target="url">text</ref> |
<h1>Heading</h1> | <head type="chapter">Heading</head> |
<h2>Heading</h2> | <head type="section">Heading</head> |
<li>Item</li> | <item>Item</item> |
<ul>...</ul> | <list>...</list> |
<ol>...</ol> | <list type="ordered">...</list> |
<blockquote>Quote</blockquote> | <quote>Quote</quote> |
<em>text</em> / *text* | <hi rend="italic">text</hi> |
<strong>text</strong> / **text** | <hi rend="bold">text</hi> |
<code>code</code> / `code` | <code>code</code> |
<hr> / --- | <milestone/> |
\|p.123\| | <pb n="123"/> |
This table shows the most important standard conventions and HTML-to-TEI mappings. You can add your own rules via configuration.
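To make the delimiter conventions in the table concrete, here is a minimal sketch that rewrites a few of them with plain regular expressions. It is an illustration only: applyInlineConventions and its patterns are hypothetical and are not taken from the converter's rule set, which may match differently. The sketch also does no XML escaping of the input.

```php
<?php
// Illustrative only: a hypothetical re-implementation of a few inline
// conventions from the table, not the converter's actual rules.
function applyInlineConventions(string $text): string
{
    $rules = [
        // (( note )) -> editorial note; surrounding whitespace is trimmed
        '/\(\(\s*(.+?)\s*\)\)/' => '<note type="editorial">$1</note>',
        // [text] -> supplied, unless it opens a Markdown link [text](url)
        '/\[([^\]]+)\](?!\()/' => '<supplied>$1</supplied>',
        // {text} -> unclear
        '/\{([^}]+)\}/' => '<unclear>$1</unclear>',
        // --text-- -> deletion
        '/--(.+?)--/' => '<del>$1</del>',
        // ++text++ -> addition
        '/\+\+(.+?)\+\+/' => '<add>$1</add>',
    ];

    // preg_replace applies the patterns one after another, in array order.
    // On a regex error it returns null, so fall back to the untouched text.
    return preg_replace(array_keys($rules), array_values($rules), $text) ?? $text;
}

echo applyInlineConventions('A [gap] and {blur} with --old-- ++new++'), "\n";
```

The negative lookahead in the second rule is the one design choice worth noting: without it, the bracket convention would also consume the label of an ordinary Markdown link.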
Advanced Configuration
<?php
// Configure TEI metadata
$config->setTeiSetting('title', 'My Document');
$config->setTeiSetting('author', 'Author Name');
$config->setTeiSetting('language', 'en');

// Add custom conventions
$config->addConvention('page_break', [
    'pattern' => '/\|p\.(\d+)\|/',
    'replacement' => '<pb n="$1"/>',
    'type' => 'regex'
]);

// Custom HTML to TEI mappings
$config->addMapping('h1', 'head[@type="chapter"]');
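A convention entry of type regex can be tried out in isolation before it is registered. The sketch below assumes such an entry behaves like a plain preg_replace call with the given pattern and replacement; that is an assumption about the library, not documented behavior, and applyRegexConvention is a hypothetical helper rather than library API.

```php
<?php
// Hypothetical stand-in for how a 'regex' convention entry could be applied.
// The array keys mirror the addConvention() example; the function itself is
// not part of markdown-to-tei.
function applyRegexConvention(array $convention, string $text): string
{
    if (($convention['type'] ?? null) !== 'regex') {
        throw new InvalidArgumentException('This sketch only handles regex conventions.');
    }

    // preg_replace returns null when the pattern is invalid or matching fails.
    $result = @preg_replace($convention['pattern'], $convention['replacement'], $text);
    if ($result === null) {
        throw new InvalidArgumentException('Invalid pattern: ' . $convention['pattern']);
    }

    return $result;
}

$pageBreak = [
    'pattern' => '/\|p\.(\d+)\|/',
    'replacement' => '<pb n="$1"/>',
    'type' => 'regex',
];

echo applyRegexConvention($pageBreak, 'end of page |p.12| next page'), "\n";
```

Running a pattern against a few sample strings this way catches escaping mistakes, such as an unescaped pipe, before they reach a full conversion.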
Examples
- example.php: Basic functionality demonstration
- examples/advanced_config.php: Advanced configuration options
- examples/file_conversion.php: File-based conversion
- examples/manuscript_edition.php: Critical edition with special conventions
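File-based conversion amounts to reading a Markdown file, converting the string, and writing the result. The following sketch shows that flow with a generic callable in place of the converter so it runs on its own; it is not the content of examples/file_conversion.php, and convertFile is a hypothetical helper.

```php
<?php
// Hypothetical helper: reads a Markdown file, passes its contents through any
// converter callable, and writes the result. Not part of the library.
function convertFile(string $inputPath, string $outputPath, callable $convert): void
{
    $markdown = @file_get_contents($inputPath);
    if ($markdown === false) {
        throw new RuntimeException("Cannot read $inputPath");
    }

    if (file_put_contents($outputPath, $convert($markdown)) === false) {
        throw new RuntimeException("Cannot write $outputPath");
    }
}
```

With the converter from Basic Usage, the callable would wrap the convert() method, for example fn (string $md): string => $converter->convert($md).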
Testing
Run the test suite:
composer test
The suite contains 27 tests, all of which pass.
Requirements
- PHP 8.1 or higher
- Composer for dependency management
Dependencies
- league/commonmark - Robust Markdown parsing
- symfony/dom-crawler - Reliable HTML/XML manipulation
- symfony/css-selector - CSS selector support
License
This project is licensed under the MIT License - see the LICENSE file for details.