chemstrucml / csml-parser
PHP parser and SVG renderer for Chemical Structure Markup Language (CSML)
Requires
- php: ^8.4
- ext-dom: *
- ext-libxml: *
- ext-simplexml: *
Requires (Dev)
- laravel/pint: ^1.19
- phpstan/phpstan: ^2.1
- phpunit/phpunit: ^12.5
- rector/rector: ^2.0
- symfony/var-dumper: ^8.0
This package is auto-updated.
Last update: 2026-04-11 22:00:26 UTC
A PHP 8.4+ package that parses Chemical Structure Markup Language (CSML) documents and renders them as SVG images.
CSML is an XML-based markup language for describing molecular structures with full topological, geometric, and repeating-unit information. This package provides the reference PHP implementation for parsing and visualizing CSML documents.
Installation
composer require chemstrucml/csml-parser
Requirements
- PHP 8.4 or higher
- ext-simplexml, ext-dom, ext-libxml
Quick Start
    use ChemStrucML\Csml\CsmlRenderer;

    $renderer = new CsmlRenderer();

    // Render from a file
    $svg = $renderer->renderFile('benzene.csml');

    // Render from a string
    $svg = $renderer->renderString('<csml version="0.1">
      <molecule id="water" name="Water">
        <atom id="O1" element="O" />
        <atom id="H1" element="H" />
        <atom id="H2" element="H" />
        <bond from="O1" to="H1" order="single" />
        <bond from="O1" to="H2" order="single" />
      </molecule>
    </csml>');

    // Save to file
    file_put_contents('molecule.svg', $svg);
Usage
Parse and Render Separately
    use ChemStrucML\Csml\CsmlRenderer;

    $renderer = new CsmlRenderer();

    // Parse to inspect the document model
    $document = $renderer->parse($csmlString);

    echo $document->version;                    // "0.1"
    echo $document->molecules[0]->name;         // "Benzene"
    echo count($document->molecules[0]->atoms); // 6

    // Render to SVG
    $svg = $renderer->render($document);
Custom Rendering Configuration
    use ChemStrucML\Csml\CsmlRenderer;
    use ChemStrucML\Csml\Config\RenderConfig;

    $config = new RenderConfig(
        bondLength: 80.0,            // Bond length in SVG pixels (default: 60.0)
        bondWidth: 2.0,              // Stroke width (default: 1.5)
        fontSize: 16.0,              // Atom label font size (default: 14.0)
        fontFamily: 'monospace',     // Font family (default: 'Arial, Helvetica, sans-serif')
        padding: 30.0,               // SVG padding (default: 20.0)
        showAllCarbons: false,       // Show carbon labels in skeletal mode (default: false)
        showImplicitHydrogens: true, // Show implicit H labels (default: true)
        useColoredAtoms: true,       // CPK coloring for atoms (default: true)
        backgroundColor: 'white',    // SVG background (default: 'transparent')
    );

    $renderer = new CsmlRenderer(config: $config);
    $svg = $renderer->renderFile('molecule.csml');
Using the Parser Directly
    use ChemStrucML\Csml\Parser\XmlParser;

    $parser = new XmlParser();
    $document = $parser->parseFile('molecule.csml');

    foreach ($document->molecules as $molecule) {
        echo "Molecule: {$molecule->name}\n";
        echo "  Atoms: " . count($molecule->atoms) . "\n";
        echo "  Bonds: " . count($molecule->bonds) . "\n";
        echo "  Rings: " . count($molecule->rings) . "\n";
    }
Supported CSML Features
| Feature | Elements | Status |
|---|---|---|
| Atoms | <atom>, <atom-list> | Supported |
| Bonds | <bond>, <bond-chain> | Supported |
| Ring systems | <ring>, <fused-ring> | Supported |
| Groups & fragments | <group>, <group-ref>, <anchor>, <attach> | Supported |
| Repeat units | <repeat>, <connector>, <cap> | Supported |
| Branching | <branch> | Supported |
| Copolymers | <copolymer> | Supported |
| Coordinates | <coordinates>, <point> | Supported |
| Metadata | <meta> | Supported |
| Implicit hydrogens | implicit-h="auto" | Supported |
| Stereochemistry | chirality, stereo attributes | Parsed |
| Geometry constraints | <angle>, <torsion>, <length> | Parsed |
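The table lists implicit-h="auto" without describing how the hydrogen count is derived. A common convention is to subtract the sum of an atom's bond orders from a default valence for its element. The sketch below assumes that rule for neutral atoms only; the function name `implicitHydrogens()` and the valence table are hypothetical and are not taken from the package.

```php
<?php
declare(strict_types=1);

/**
 * Estimate the implicit hydrogen count of a neutral atom as
 * (default valence - sum of bond orders), never below zero.
 *
 * @param string $element    Element symbol, e.g. "C" or "O"
 * @param int[]  $bondOrders Order of each explicit bond on the atom (1, 2 or 3)
 */
function implicitHydrogens(string $element, array $bondOrders): int
{
    // Hypothetical default valences; the package's real table may differ.
    $defaultValence = ['H' => 1, 'C' => 4, 'N' => 3, 'O' => 2, 'F' => 1, 'Cl' => 1];

    $valence = $defaultValence[$element] ?? null;
    if ($valence === null) {
        return 0; // Unknown element: add no hydrogens rather than guess.
    }

    return max(0, $valence - array_sum($bondOrders));
}

// The oxygen of 1-hexanol carries one single bond, so it gets one hydrogen.
echo implicitHydrogens('O', [1]), "\n";
```

Charged atoms and aromatic bonds would need extra rules that this sketch leaves out.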
Bond Rendering Styles
- Single, double, triple bonds with proper parallel line offset
- Aromatic rings with inscribed circle
- Wedge bonds (filled triangle for stereo-up)
- Hatch bonds (dashed lines for stereo-down)
- Dashed bonds (hydrogen bonds, partial bonds)
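The parallel line offset used for double bonds can be pictured as shifting the bond axis along its unit normal by half the gap in each direction. The following is a minimal geometric sketch of that idea, not the renderer's actual code; the helper `doubleBondLines()` and its `$gap` parameter are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Return the two parallel segments of a double bond drawn between
 * ($x1, $y1) and ($x2, $y2). Each segment is [x1, y1, x2, y2].
 *
 * @param float $gap Distance between the two lines, in SVG pixels (assumed)
 */
function doubleBondLines(float $x1, float $y1, float $x2, float $y2, float $gap = 4.0): array
{
    $dx = $x2 - $x1;
    $dy = $y2 - $y1;
    $length = hypot($dx, $dy);

    if ($length == 0.0) {
        throw new InvalidArgumentException('Bond endpoints must differ.');
    }

    // Unit normal to the bond axis, scaled to half the gap.
    $nx = -$dy / $length * ($gap / 2);
    $ny = $dx / $length * ($gap / 2);

    return [
        [$x1 + $nx, $y1 + $ny, $x2 + $nx, $y2 + $ny],
        [$x1 - $nx, $y1 - $ny, $x2 - $nx, $y2 - $ny],
    ];
}

// A horizontal bond of the default length (60 px) becomes two lines 4 px apart.
foreach (doubleBondLines(0.0, 0.0, 60.0, 0.0) as [$ax, $ay, $bx, $by]) {
    printf("<line x1=\"%.1f\" y1=\"%.1f\" x2=\"%.1f\" y2=\"%.1f\" />\n", $ax, $ay, $bx, $by);
}
```

A triple bond could reuse the same normal vector: keep the central line and add one offset line on each side.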
Atom Rendering
The renderer follows skeletal formula conventions by default:
- Carbon atoms are not labeled (unless they carry a charge, isotope, or explicit label)
- Non-carbon atoms display their element symbol with CPK coloring (O = red, N = blue, S = yellow, etc.)
- Implicit hydrogens are appended to the element symbol, with counts greater than one shown as subscripts (e.g. NH, OH, NH₂)
- Charges are rendered as superscripts (e.g. O⁻, NH₃⁺)
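The labeling conventions above can be illustrated with a small formatter that builds labels such as NH₂ or NH₃⁺. This is a sketch under assumptions: it uses Unicode subscript and superscript characters, whereas the real renderer may emit SVG markup instead, and the function name `atomLabel()` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Build a plain-text atom label: element symbol, implicit hydrogens
 * (count as a subscript when above one), then the charge as a superscript.
 */
function atomLabel(string $element, int $hydrogens = 0, int $charge = 0): string
{
    $subscripts = ['0' => '₀', '1' => '₁', '2' => '₂', '3' => '₃', '4' => '₄',
                   '5' => '₅', '6' => '₆', '7' => '₇', '8' => '₈', '9' => '₉'];
    $superscripts = ['0' => '⁰', '1' => '¹', '2' => '²', '3' => '³', '4' => '⁴',
                     '5' => '⁵', '6' => '⁶', '7' => '⁷', '8' => '⁸', '9' => '⁹'];

    $label = $element;

    if ($hydrogens > 0) {
        $label .= 'H';
        if ($hydrogens > 1) {
            $label .= strtr((string) $hydrogens, $subscripts);
        }
    }

    if ($charge !== 0) {
        $magnitude = abs($charge);
        if ($magnitude > 1) {
            $label .= strtr((string) $magnitude, $superscripts);
        }
        $label .= $charge > 0 ? '⁺' : '⁻';
    }

    return $label;
}

echo atomLabel('N', 2), "\n";    // NH₂
echo atomLabel('N', 3, 1), "\n"; // NH₃⁺
echo atomLabel('O', 0, -1), "\n"; // O⁻
```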
CSML Examples
Benzene
    <csml version="0.1">
      <molecule id="benzene" name="Benzene">
        <atom-list element="C" prefix="C" from="1" to="6" />
        <ring id="benz" size="6" aromatic="true">
          <member atom="C1" /><member atom="C2" /><member atom="C3" />
          <member atom="C4" /><member atom="C5" /><member atom="C6" />
        </ring>
      </molecule>
    </csml>
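In the benzene example, the ring members refer to C1 through C6 even though only an `<atom-list>` is declared, which suggests that the list expands to one atom per index with the prefix prepended. The sketch below shows that expansion using SimpleXML. It is an interpretation of the example, not the package's parser, and `expandAtomList()` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/**
 * Expand an <atom-list> element into a map of atom ID => element symbol.
 * Assumes IDs are formed as prefix + index for every index in [from, to].
 *
 * @return array<string, string>
 */
function expandAtomList(string $xml): array
{
    $node = new SimpleXMLElement($xml);

    $element = (string) $node['element'];
    $prefix = (string) $node['prefix'];
    $from = (int) $node['from'];
    $to = (int) $node['to'];

    $atoms = [];
    for ($i = $from; $i <= $to; $i++) {
        $atoms[$prefix . $i] = $element;
    }

    return $atoms;
}

$atoms = expandAtomList('<atom-list element="C" prefix="C" from="1" to="6" />');
echo implode(' ', array_keys($atoms)), "\n"; // C1 C2 C3 C4 C5 C6
```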
1-Hexanol
    <csml version="0.1">
      <molecule id="1-hexanol" name="1-Hexanol">
        <atom-list element="C" prefix="C" from="1" to="6" />
        <atom id="O1" element="O" />
        <bond-chain atoms="C1 C2 C3 C4 C5 C6" order="single" />
        <bond from="C6" to="O1" order="single" />
      </molecule>
    </csml>
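The `<bond-chain>` element in this example reads naturally as shorthand for one `<bond>` between each pair of consecutive atoms. A minimal sketch of that reading follows; the helper `expandBondChain()` is hypothetical, and the package may represent bonds differently.

```php
<?php
declare(strict_types=1);

/**
 * Expand a whitespace-separated atom list into pairwise bonds,
 * assuming each atom is bonded to the next one in the list.
 *
 * @return array<int, array{from: string, to: string, order: string}>
 */
function expandBondChain(string $atoms, string $order = 'single'): array
{
    $ids = preg_split('/\s+/', trim($atoms), -1, PREG_SPLIT_NO_EMPTY);

    $bonds = [];
    for ($i = 0; $i < count($ids) - 1; $i++) {
        $bonds[] = ['from' => $ids[$i], 'to' => $ids[$i + 1], 'order' => $order];
    }

    return $bonds;
}

foreach (expandBondChain('C1 C2 C3 C4 C5 C6') as $bond) {
    echo "{$bond['from']}-{$bond['to']} ({$bond['order']})\n";
}
```

Under this reading, six chain atoms yield five bonds, and the explicit `<bond from="C6" to="O1">` adds the sixth.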
Naphthalene (Fused Rings)
    <csml version="0.1">
      <molecule id="naphthalene" name="Naphthalene">
        <atom-list element="C" prefix="C" from="1" to="10" />
        <ring id="ring_a" size="6" aromatic="true">
          <member atom="C1" /><member atom="C2" /><member atom="C3" />
          <member atom="C4" /><member atom="C5" /><member atom="C6" />
        </ring>
        <ring id="ring_b" size="6" aromatic="true">
          <member atom="C5" /><member atom="C6" /><member atom="C7" />
          <member atom="C8" /><member atom="C9" /><member atom="C10" />
        </ring>
        <fused-ring rings="ring_a ring_b" shared-atoms="C5 C6" />
      </molecule>
    </csml>
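In this example the `shared-atoms` attribute of `<fused-ring>` matches the atoms that appear in both rings' member lists. That relationship can be checked with plain SimpleXML, as in the sketch below. It does not use the package, and `sharedRingAtoms()` is a hypothetical helper written for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Return the atom IDs that two rings of a CSML document have in common,
 * in the order they appear in the first ring.
 *
 * @return string[]
 */
function sharedRingAtoms(string $csml, string $ringA, string $ringB): array
{
    $doc = new SimpleXMLElement($csml);

    // Collect member atom IDs per ring ID.
    $members = [];
    foreach ($doc->xpath('//ring') as $ring) {
        $id = (string) $ring['id'];
        foreach ($ring->member as $member) {
            $members[$id][] = (string) $member['atom'];
        }
    }

    return array_values(array_intersect($members[$ringA] ?? [], $members[$ringB] ?? []));
}

$naphthalene = <<<'XML'
<csml version="0.1">
  <molecule id="naphthalene" name="Naphthalene">
    <ring id="ring_a" size="6" aromatic="true">
      <member atom="C1" /><member atom="C2" /><member atom="C3" />
      <member atom="C4" /><member atom="C5" /><member atom="C6" />
    </ring>
    <ring id="ring_b" size="6" aromatic="true">
      <member atom="C5" /><member atom="C6" /><member atom="C7" />
      <member atom="C8" /><member atom="C9" /><member atom="C10" />
    </ring>
  </molecule>
</csml>
XML;

echo implode(' ', sharedRingAtoms($naphthalene, 'ring_a', 'ring_b')), "\n"; // C5 C6
```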
Why CSML?
| Feature | CSML | SMILES | MOL/SDF | CML | InChI |
|---|---|---|---|---|---|
| Human-readable | Yes | Yes | No | Yes | No |
| Polymer repeat units | Yes | No | Partial | No | No |
| Copolymer patterns | Yes | No | No | No | No |
| Reusable fragments | Yes | No | No | Partial | No |
| Extensible (namespaces) | Yes | No | No | Yes | No |
| End-group specification | Yes | No | No | No | No |
CSML separates topology (what is connected to what) from geometry (angles, lengths) and from presentation (how to draw it), making it uniquely suited for polymer science, materials engineering, and structural chemistry applications where existing formats fall short.
License
MIT