kraenzle-ritter/markdown-to-tei

Flexible Markdown to TEI XML converter with customizable conventions

v1.1.1 2025-08-12 14:25 UTC

Last update: 2025-08-12 14:26:08 UTC


A flexible PHP-based converter that transforms Markdown into TEI-XML with support for extended conventions and customizable mappings.

Features

  • Markdown to TEI-XML conversion with TEI P5 compliant, correctly namespaced output
  • Extended conventions (e.g., [] to <supplied>, {} to <unclear>)
  • Flexible configuration system for custom rules and mappings
  • Configurable TEI metadata (title, author, language, etc.)
  • Efficient processing of large documents
  • Covered by a PHPUnit test suite

Installation

composer require kraenzle-ritter/markdown-to-tei

For development:

git clone https://github.com/kraenzle-ritter/markdown-to-tei.git
cd markdown-to-tei
composer install

Basic Usage

<?php
require_once 'vendor/autoload.php';

use MarkdownToTei\Converter;
use MarkdownToTei\Config\ConversionConfig;

$config = new ConversionConfig();
$converter = new Converter($config);

$markdown = "# Title\n\nThis is a [supplied text] with **important** text.";
$teiXml = $converter->convert($markdown);

echo $teiXml;
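
Before writing the result to disk it can be useful to confirm that it parses as XML and carries the TEI namespace. The following is a minimal sketch using PHP's DOM extension. The helper name looksLikeTei is hypothetical and not part of this package, and the check assumes the converter emits a complete document rooted at <TEI>.

```php
<?php
// Hypothetical helper, not part of the package: returns true when $xml is
// well-formed and its root element is <TEI> in the TEI P5 namespace.
function looksLikeTei(string $xml): bool
{
    // Collect libxml parse errors quietly instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // loadXML() rejects an empty string, so guard against it first.
    $ok = $xml !== '' && $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok || $doc->documentElement === null) {
        return false;
    }

    $root = $doc->documentElement;
    return $root->localName === 'TEI'
        && $root->namespaceURI === 'http://www.tei-c.org/ns/1.0';
}
```

Calling looksLikeTei($teiXml) after convert() gives a quick sanity check. It does not validate against the TEI schema.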

Extended Conventions

Overview of Conventions and Mappings

Markdown/HTML                     TEI-XML Conversion
[supplied text]                   <supplied>supplied text</supplied>
{unclear text}                    <unclear>unclear text</unclear>
(( editorial_note ))              <note type="editorial">editorial note</note>
--deleted--                       <del>deleted</del>
++added++                         <add>added</add>
[text](url)                       <ref target="url">text</ref>
<h1>Heading</h1>                  <head type="chapter">Heading</head>
<h2>Heading</h2>                  <head type="section">Heading</head>
<li>Item</li>                     <item>Item</item>
<ul>...</ul>                      <list>...</list>
<ol>...</ol>                      <list type="ordered">...</list>
<blockquote>Quote</blockquote>    <quote>Quote</quote>
<em>text</em> / *text*            <hi rend="italic">text</hi>
<strong>text</strong> / **text**  <hi rend="bold">text</hi>
<code>code</code> / `code`        <code>code</code>
<hr> / ---                        <milestone/>
|p.123|                           <pb n="123"/>

This table shows the most important standard conventions and HTML-to-TEI mappings. You can add your own rules via configuration.
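
To make the text-level conventions in the table concrete, here is a rough sketch of how a few of them could be expressed as regular expressions with preg_replace. The function applyConventions and its patterns are assumptions modelled on the table, not the package's internal rules. In particular, the lookahead that keeps [text](url) links from being treated as supplied text is only one possible way to separate the two.

```php
<?php
// Illustrative only: these patterns are assumptions based on the table above,
// not the rules the package actually ships with.
function applyConventions(string $text): string
{
    $rules = [
        // |p.123| becomes a page break
        '/\|p\.(\d+)\|/'         => '<pb n="$1"/>',
        // [text] becomes <supplied>, unless it starts a Markdown link [text](url)
        '/\[([^\[\]]+)\](?!\()/' => '<supplied>$1</supplied>',
        // {text} becomes <unclear>
        '/\{([^{}]+)\}/'         => '<unclear>$1</unclear>',
        // --text-- becomes <del>, ++text++ becomes <add>
        '/--(.+?)--/'            => '<del>$1</del>',
        '/\+\+(.+?)\+\+/'        => '<add>$1</add>',
    ];

    // Patterns are applied in order; fall back to the input if PCRE fails.
    return preg_replace(array_keys($rules), array_values($rules), $text) ?? $text;
}

echo applyConventions('A [lost] word, {faded} ink, --old-- ++new++ text. |p.12|');
// A <supplied>lost</supplied> word, <unclear>faded</unclear> ink, <del>old</del> <add>new</add> text. <pb n="12"/>
```

A real implementation also has to keep --deleted-- apart from the --- horizontal rule. This sketch does not attempt that.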

Advanced Configuration

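<?php
require_once 'vendor/autoload.php';

use MarkdownToTei\Converter;
use MarkdownToTei\Config\ConversionConfig;

$config = new ConversionConfig();

// Configure TEI metadata
$config->setTeiSetting('title', 'My Document');
$config->setTeiSetting('author', 'Author Name');
$config->setTeiSetting('language', 'en');

// Add custom conventions
$config->addConvention('page_break', [
    'pattern' => '/\|p\.(\d+)\|/',
    'replacement' => '<pb n="$1"/>',
    'type' => 'regex'
]);

// Custom HTML to TEI mappings
$config->addMapping('h1', 'head[@type="chapter"]');

// Pass the configured object to the converter
$converter = new Converter($config);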
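
The mapping target 'head[@type="chapter"]' reads as an element name followed by attribute predicates. The sketch below shows one way such a string could be split into its parts. The function parseMappingTarget is hypothetical, and the package may interpret mapping targets differently.

```php
<?php
// Hypothetical helper: splits a mapping target such as 'head[@type="chapter"]'
// into an element name and an attribute map. Not the package's own parser.
function parseMappingTarget(string $target): array
{
    // An element name followed by zero or more [@name="value"] predicates.
    $shape = '/^([A-Za-z][\w.-]*)((?:\[@[\w:-]+="[^"]*"\])*)$/';
    if (!preg_match($shape, $target, $m)) {
        throw new InvalidArgumentException("Unsupported mapping target: $target");
    }

    $attributes = [];
    preg_match_all('/\[@([\w:-]+)="([^"]*)"\]/', $m[2] ?? '', $pairs, PREG_SET_ORDER);
    foreach ($pairs as $pair) {
        $attributes[$pair[1]] = $pair[2];
    }

    return ['element' => $m[1], 'attributes' => $attributes];
}
```

Under these assumptions, parseMappingTarget('head[@type="chapter"]') yields the element head with the attribute type set to chapter.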

Examples

  1. example.php: Basic functionality demonstration
  2. examples/advanced_config.php: Advanced configuration options
  3. examples/file_conversion.php: File-based conversion
  4. examples/manuscript_edition.php: Critical edition with special conventions
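
For orientation, file-based conversion can be as simple as reading a Markdown file, converting it, and writing the result. The wrapper below is a hypothetical sketch, not the content of examples/file_conversion.php. It takes the conversion step as a callable so it does not depend on any particular converter setup.

```php
<?php
// Hypothetical wrapper: reads Markdown from $source, passes it through
// $convert, and writes the returned string to $target.
function convertFile(string $source, string $target, callable $convert): void
{
    $markdown = file_get_contents($source);
    if ($markdown === false) {
        throw new RuntimeException("Cannot read $source");
    }

    $tei = $convert($markdown);

    if (file_put_contents($target, $tei) === false) {
        throw new RuntimeException("Cannot write $target");
    }
}
```

With the $converter from Basic Usage, a call might look like convertFile('letter.md', 'letter.xml', fn (string $md) => $converter->convert($md)); the file names here are placeholders.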

Testing

Run the test suite:

composer test

The suite contains 27 tests, all of which currently pass.

Requirements

  • PHP 8.1 or higher
  • Composer for dependency management

Dependencies

  • league/commonmark - Robust Markdown parsing
  • symfony/dom-crawler - Reliable HTML/XML manipulation
  • symfony/css-selector - CSS selector support

License

This project is licensed under the MIT License - see the LICENSE file for details.