soluble/metadata

Extract metadata from database queries

1.3.2 2018-12-22 11:25 UTC

This package is auto-updated.

Last update: 2024-10-20 06:38:44 UTC


README

PHP Version PHP Version Build Status codecov Scrutinizer Quality Score Latest Stable Version Total Downloads License

soluble-metadata is a low level library currently focusing on MySQL which extracts metadata from an sql query with extensibility, speed and portability in mind.

Use cases

You can take advantage of soluble/metadata to format/render resulting query data according to their type (when rendering an html table, generating an excel sheet...), for basic validation (max lengths, decimals)...

Features

  • Extract metadata information from an SQL query (datatypes,...)
  • Common API across various driver implementations.
  • Rely on native database driver information (does not parse the query in PHP)
  • Works even when the query does not return results (empty resultset).
  • Carefully tested with different implementations (libmariadb, mysqlnd, libmysql, pdo_mysql).

Under the hood, the metadata extraction relies on the driver methods mysqli_stmt::result_metadata() and PDO::getColumnMeta(). Although the soluble-metadata API unify their usage and type detection, differences still exists for more advanced features. A specific effort has been made in the documentation to distinguish possible portability issues when switching from one driver to another. Keep that in mind when using it.

Requirements

  • PHP engine 7.1+ (v1.2.0), 7.0+ and 5.4 (v1.0.0)
  • Mysqli or PDO_mysql extension enabled (Mysqli exposes more features)

Documentation

Installation

Instant installation via composer.

$ composer require soluble/metadata

Most modern frameworks will include composer out of the box, but ensure the following file is included:

<?php
// include the Composer autoloader
require 'vendor/autoload.php';

Basic example

<?php

use Soluble\Metadata\Reader;
use Soluble\Datatype\Column\Type as DataType;


$conn = new \mysqli($hostname,$username,$password,$database);
$conn->set_charset($charset);

$metaReader = new Reader\MysqliMetadataReader($conn);

$sql = "select * from `my_table`";

try {
    $md = $metaReader->getColumnsMetadata($sql);
} catch (\Soluble\Metadata\Exception\InvalidQueryException $e) {
    // ...
}

foreach($md as $column_name => $col_def) {
   
   $datatype = $col_def->getDatatype();
   
   echo $column_name . "\t" . $datatype . "\t";
   
   echo ($col_def->isNullable() ? 'Y' : 'N') . '\t';
   
   switch ($datatype) {
       case DataType::TYPE_STRING:  // equivalent to 'string'
           echo $col_def->getCharacterOctetLength() . "\t";
           break;
       case DataType::TYPE_INTEGER:
           echo ($col_def->isNumericUnsigned() ? 'Y' : 'N') . "\t";
           break;
       case DataType::TYPE_DECIMAL:
           echo ($col_def->isNumericUnsigned() ? 'Y' : 'N') . "\t";
           echo $col->getNumericPrecision() . "\t";  // For DECIMAL(5,2) -> precision = 5 
           echo $col->getNumericScale() . "\t";      // For DECIMAL(5,2) -> scale = 2
           break;
           
       // ...see the doc for more possibilitities
   }
   
   echo $col_def->getNativeType() . PHP_EOL;
}  

Could print something like :

...

Usage

Step 1. Initiate a metadata reader

  • For Mysqli: send the existing mysqli connection to the MysqlMetadataReader :

    <?php
    use Soluble\Metadata\Reader;
    
    $conn = new \mysqli($hostname,$username,$password,$database);
    $conn->set_charset($charset);
    
    $reader = new Reader\MysqliMetadataReader($conn);
  • For Pdo_mysql: send the existing pdo_mysql connection to the PdoMysqlReader :

    <?php
    use Soluble\Metadata\Reader;
    
    $conn = new \PDO("mysql:host=$hostname", $username, $password, [
                \PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
    ]);
    
    $reader = new Reader\PDOMysqlMetadataReader($conn);

Step 2. Extract metadata from an SQL query

<?php

//....

$reader = new Reader\MysqliMetadataReader($conn);

$sql = "
         SELECT `p`.`post_id`,
                `p`.`title` AS `post_title` 
                `p`.`created_at`,
                'constant' AS `constant_col`,
                1 + 2 AS `computed_col`, 
                null as `null_col`
                COUNT(`c`.*) as `nb_comments`,
                MAX(`c`.`created_at`) as latest_comment
                 
            FROM `post` AS `p`
            LEFT OUTER JOIN `comment` as `c`  
                 ON `c`.`post_id` = `p`.`post_id`
            GROUP BY `p`.`post_id`, `p`.`title`, 
                     `p`.`created_at`, `constant_col`, 
                     `computed_col`, `null_col`     
       ";


try {    
    $meta = $reader->getColumnsMetadata($sql);
} catch (\Soluble\Metadata\Exception\InvalidQueryException $e) { 
    //...
}

/*
  The resulting ColumnsMetadata will contain something like:

  [
     "post_id"        => '<Soluble\Datatype\Column\Definition\IntegerColumn>',
     "post_title"     => '<Soluble\Datatype\Column\Definition\StringColumn>',
     "created_at"     => '<Soluble\Datatype\Column\Definition\DatetimeColumn>',
     "constant_col"   => '<Soluble\Datatype\Column\Definition\StringColumn>',
     "computed_col"   => '<Soluble\Datatype\Column\Definition\IntegerColumn>',
     "null_col"       => '<Soluble\Datatype\Column\Definition\NullColumn>',
     "nb_comments"    => '<Soluble\Datatype\Column\Definition\IntegerColumn>',
     "latest_comment" => '<Soluble\Datatype\Column\Definition\DateTimeColumn>'
     
  ]
    
*/  

Alternatively, when you want to get the metadata from a table you can use the helper method $reader->getTableMetadata($table).

Step 3: Getting column type (4 options)

<?php

// ...

$meta = $reader->getColumnsMetadata($sql);

// Retrieve a specific column (i.e. 'post_title')
// Note the parameter is the column alias if defined 

$col = $meta->getColumn('post_title'); 

                                       
// Type detection
// ----------------------

// Option 1, type detection by datatype name
// ------------------------------------------

echo $col->getDatatype(); // -> 'string' (equivalent to Soluble\Datatype\Column\Type::TYPE_STRING)  

/* 
   The normalized datatypes are defined in the 
   Soluble\Datatype\Column\Type::TYPE_(*) and can be :
   'string', 'integer', 'decimal', 'float', 'boolean', 
   'datetime', 'date', 'time', 'bit', 'spatial_geometry'
*/


// Option 2, type detection by classname
// --------------------------------------

if ($col instanceof \Soluble\Datatype\Column\IntegerColumn) {
    // ... could be also BitColumn, BlobColumn, BooleanColumn
    // ... DateColumn, DateTimeColumn, DecimalColumn, FloatColumn
    // ... GeometryColumn, IntegerColumn, StringColumn, TimeColumn,
    // ... NullColumn
}

// Option 3, type detection by interface (more generic)
// -----------------------------------------------------

if ($col instanceof \Soluble\Datatype\Column\NumericColumnInterface) {
   // ... for example NumericColumnInterface 
   // ... includes DecimalColumn, FloatColumn, IntegerColumn
}

// Option 4, type detection by helper functions (more generic)
// -----------------------------------------------------------

$col->isText();     // Whether the column contains text (CHAR, VARCHAR, ENUM...)
$col->isNumeric();  // Whether the column is numeric (INT, DECIMAL, FLOAT...)
$col->isDatetime(); // Whether the column is a datetime (DATETIME)
$col->isDate();     // Whther the column is a date (DATE)

Step 4: Getting datatype extra information

The following methods are supported and portable between mysqli and PDO_mysql drivers:

<?php

// ...

// For all types
// -------------

echo $col->getOrdinalPosition(); // -> 2 (column position in the query)
echo $col->isNullable() ? 'nullable' : 'not null';
echo $col->isPrimary() ? 'PK' : '';  // Many columns may have the primary flag
                                     // The meaning of it depends on your query

// For integer and decimal types
echo $col->isNumericUnsigned();   // Whether the numeric value is unsigned.

// For decimal based types
// -----------------------

echo $col->getNumericPrecision(); // For DECIMAL(5,2) -> 5 is the precision
echo $col->getNumericScale();     // For DECIMAL(5,2) -> 2 is the scale

// For character/blob based types
// ------------------------------

echo $col->getCharacterOctetLength();  // Octet length (in multibyte context length might differs)
 

Getting column specifications.

The following methods are also portable.

<?php

// ...

echo $col->getAlias(); // Column alias name -> "post_title" (or column name if not aliased)

echo $col->isComputed(); // Whenever there's no table linked (for GROUP, constants, expression...)

echo $col->getTableAlias(); // Originating table alias -> "p" (or table name if not aliased)
                            // If empty, the column is computed (constant, group,...)

The methods used in the example below gives different results with pdo_mysql and mysqli drivers. Use them with care if portability is required !!!

<?php

// ...

echo $col->getTableName();  // Originating table -> "post"
                            // (*) PDO_mysql always return the table alias if aliased 

echo $col->getName();  // Column original name -> "title". 
                       // (*) PDO_mysql always return the alias if aliased

echo $col->getNativeType(); // Return the column definition native type
                            // i.e: BIGINT, SMALLINT, VARCHAR, ENUM
                            // (*) PDO_mysql consider 
                            //        - ENUM, SET and VARCHAR as CHAR
                            
echo $col->isGroup(); // Whenever the column is part of a group (MIN, MAX, AVG,...)
                      // (*) PDO_mysql is not able to retrieve group information
                      // (*) Mysqli: detection of group is linked to the internal driver
                      //     Check your driver with mysqli_get_client_version().
                      //       - mysqlnd detects:
                      //          - COUNT, MIN, MAX
                      //       - libmysql detects:
                      //          - COUNT, MIN, MAX, AVG, GROUP_CONCAT
                      //       - libmariadb detects:
                      //          - COUNT, MIN, MAX, AVG, GROUP_CONCAT and growing    


// For numeric types
// -----------------

echo $col->isAutoIncrement();   // Only make sense for primary keys.
                                // (*) Unsupported with PDO_mysql

echo $col->isNumericUnsigned(); // Whether the numeric value is unsigned.
                                // (*) Unsupported with PDO_mysql

Unsupported methods

Those methods are still unsupported on both mysqli and PDO_mysql implementations but kept as reference

<?php

// ... 

echo $col->getColumnDefault(); // Always return null
echo $col->getCharacterMaximumLength();  // Returns $col->getCharacterOctetLength()
                                         // and does not (yet) handle multibyte aspect.

API

AbstractMetadataReader

Use the Reader\AbstractMetadataReader::getColumnsMetadata($sql) to extract query metadata.

<?php

use Soluble\Metadata\Reader;
use PDO;

$conn = new PDO("mysql:host=$hostname", $username, $password, [
            PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
]);

$reader = new Reader\PdoMysqlMetadataReader($conn);

$sql = "select id, name from my_table";

$columnsMeta = $reader->getColumnsMetadata($sql);

ColumnsMetadata

The Soluble\Metadata\ColumnsMetadata allows to iterate over column information or return a specific column as an Soluble\Datatype\Column\Definition\AbstractColumnDefinition.

<?php

$reader = new Reader\PdoMysqlMetadataReader($conn);

$sql = "select id, name from my_table";

$columnsMeta = $reader->getColumnsMetadata($sql);

foreach ($columnsMeta as $col_name => $col_def) {
    echo $coldev->getDatatype() . PHP_EOL; 
}

$col = $columnsMeta->getColumn('id');
echo $col->getDatatype();

AbstractColumnDefinition

Metadata information is stored as an Soluble\Datatype\Column\Definition\AbstractColumnDefinition object on which :

AbstractColumnDefinition implementations

Here's the list of concrete implementations for Soluble\Datatype\Column\Definition\AbstractColumnDefinition.

They can be used as an alternative way to check column datatype. For example

use Soluble\Datatype\Column\Definition;

if ($coldef instanceof Definition\DateColumnInterface) {

    // equivalent to
    // if ($coldef->isDate()) {

    $date = new \DateTime($value);
    echo $value->format('Y');
} elseif ($coldef instanceof Definition\NumericColumnInterface) {
    echo number_format($value, $coldef->getNumericPrecision);
}

Supported readers

Currently only pdo_mysql and mysqli drivers are supported.

Future ideas

  • Implement more drivers (pgsql...), contributions welcome !!!

Contributing

Contribution are welcome see contribution guide

Notes

Currently metadata are read from the underlying database driver by executing a query with a limit 0 (almost no performance penalty). This ensure your query is always correctly parsed (even crazy ones) with almost no effort.

The underlying driver methods mysqli_stmt::result_metadata(), PDO::getColumnMeta() used respectively by the metadata readers Mysql and PdoMysql are marked as experimental and subject to change on the PHP website. In practice, they haven't changed since 5.4 and are stable. In case of a change in the php driver, it should be very easy to add a specific driver.

Sadly there is some differences between PDO_mysql and mysqli in term of features. Generally the best is to use mysqli instead of pdo. PDO lacks some features like detection of autoincrement, enum, set, unsigned, grouped column and does not distinguish between table/column aliases and their original table/column names.

If you want to rely on this specific feature (aliases) have a look to alternatives like phpmyadmin sql-parser.

Also if you are looking for a more advanced metadata reader (but limited to table - not a query), have a look to the soluble-schema project which share the same datatype standards while exposing more information like foreign keys,... in a more portable way.

Coding standards