frictionlessdata/datapackage

A utility library for working with Data Packages

v1.0.0 2021-08-29 15:50 UTC

Last update: 2024-10-29 04:53:13 UTC


README

A utility library for working with Data Packages in PHP.

Features summary and Usage guide

Installation

composer require frictionlessdata/datapackage

Optionally, to create zip files you will need the PHP zip extension. On Ubuntu, it can be installed with: sudo apt-get install php-zip

Package

Load a data package conforming to the specs

use frictionlessdata\datapackage\Package;
$package = Package::load("tests/fixtures/multi_data_datapackage.json");

Iterate over the resources and the data

foreach ($package as $resource) {
    echo $resource->name();
    foreach ($resource as $row) {
        var_dump($row);  // a row may be an array, so dump it rather than echo it
    }
}

Get all the data as an array (loads all the data into memory, not recommended for large data sets)

foreach ($package as $resource) {
    var_dump($resource->read());
}
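
For large data sets, one alternative to read() is to consume the resource iterator in fixed-size batches. The sketch below is not part of the library: chunkRows is a hypothetical helper written in plain PHP, and it assumes only that a resource can be iterated with foreach, as shown above.

```php
<?php
// Hypothetical helper (not part of frictionlessdata/datapackage):
// groups rows from any iterable into arrays of at most $size rows,
// so the caller handles one batch at a time.
function chunkRows(iterable $rows, int $size): \Generator
{
    if ($size < 1) {
        throw new \InvalidArgumentException("size must be at least 1");
    }
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield $batch;  // final, possibly smaller, batch
    }
}

// Usage with a resource (assumes $resource is iterable, as in the examples above):
// foreach (chunkRows($resource, 1000) as $batch) { /* process $batch */ }
```

Whether this actually reduces memory use depends on the resource streaming its rows lazily, which is assumed here rather than confirmed.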

All data and schemas are validated, and exceptions are thrown if any problems are found.

Validate the data explicitly and get a list of errors

Package::validate("tests/fixtures/simple_invalid_datapackage.json");  // array of validation errors
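
Because loading throws on invalid input, calling code may want to catch the failure rather than let it propagate. The following is a minimal sketch: tryLoad is a hypothetical helper, and it catches PHP's generic Throwable because the library's specific exception classes are not named here.

```php
<?php
// Hypothetical helper: runs $loader and returns [result, null] on success
// or [null, error message] if anything is thrown.
function tryLoad(callable $loader): array
{
    try {
        return [$loader(), null];
    } catch (\Throwable $e) {
        return [null, $e->getMessage()];
    }
}

// Usage (Package::load as shown above):
// [$package, $error] = tryLoad(function () {
//     return \frictionlessdata\datapackage\Package::load("datapackage.json");
// });
// if ($error !== null) { echo "Could not load package: $error\n"; }
```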

Load a zip file

$package = Package::load('http://datahub.io/opendatafortaxjustice/eucountrydatawb/r/datapackage_zip.zip');

Provide read options, which are passed through to the tableschema-php Table::read method

$package = Package::load('http://datahub.io/opendatafortaxjustice/eucountrydatawb/r/datapackage_zip.zip');
foreach ($package as $resource) {
    $resource->read(["cast" => false]);
}

The package object has some useful methods to access and manipulate the resources

$package = Package::load("tests/fixtures/multi_data_datapackage.json");
$package->resources();  // array of resource name => Resource object (see below for Resource class reference)
$package->getResource("first-resource");  // Resource object matching the given name
$package->removeResource("first-resource");
// add a tabular resource
$package->addResource("tabular-resource-name", [
    "profile" => "tabular-data-resource",
    "schema" => [
        "fields" => [
            ["name" => "id", "type" => "integer"],
            ["name" => "name", "type" => "string"]
        ]
    ],
    "path" => [
        "tests/fixtures/simple_tabular_data.csv",
    ]
]);

Create a new package from scratch

$package = Package::create([
    "name" => "datapackage-name",
    "profile" => "tabular-data-package"
]);
// add a resource
$package->addResource("resource-name", [
    "profile" => "tabular-data-resource",
    "schema" => [
        "fields" => [
            ["name" => "id", "type" => "integer"],
            ["name" => "name", "type" => "string"]
        ]
    ],
    "path" => "tests/fixtures/simple_tabular_data.csv"
]);
// save the package descriptor to a file
$package->saveDescriptor("datapackage.json");
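
For orientation, a descriptor for the package built above would be expected to look roughly like the JSON printed by the sketch below. It follows the descriptor layout of the Data Package specification and is built in plain PHP without the library. The exact file that saveDescriptor writes may differ, for example in key order or added defaults.

```php
<?php
// Plain-PHP illustration of a tabular data package descriptor.
// Assumption: resources are listed under "resources", each carrying its own name.
$descriptor = [
    "name" => "datapackage-name",
    "profile" => "tabular-data-package",
    "resources" => [
        [
            "name" => "resource-name",
            "profile" => "tabular-data-resource",
            "schema" => [
                "fields" => [
                    ["name" => "id", "type" => "integer"],
                    ["name" => "name", "type" => "string"],
                ],
            ],
            "path" => "tests/fixtures/simple_tabular_data.csv",
        ],
    ],
];

// Print the descriptor as readable JSON, keeping slashes in the path unescaped.
echo json_encode($descriptor, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```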

Save the entire datapackage including any local data to a zip file

$package->save("datapackage.zip");
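
Since creating zip files depends on the optional PHP zip extension, a script may want to check for it before saving. This is a small sketch using only built-in PHP functions. zipExtensionAvailable is a hypothetical helper name, and how the library itself behaves when the extension is missing is not covered here.

```php
<?php
// Returns true when PHP's zip extension (which provides ZipArchive) is loaded.
function zipExtensionAvailable(): bool
{
    return extension_loaded("zip") && class_exists("ZipArchive");
}

if (!zipExtensionAvailable()) {
    echo "PHP zip extension not loaded; saving to a zip file will likely fail.\n";
}
// When the check passes, the zip prerequisite for
// $package->save("datapackage.zip") is met.
```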

Resource

Resource objects can be accessed from a Package as described above

$resource = $package->getResource("resource-name");

or instantiated directly

use frictionlessdata\datapackage\Resource;
$resource = Resource::create([
    "name" => "my-resource",
    "profile" => "tabular-data-resource",
    "path" => "tests/fixtures/simple_tabular_data.csv",
    "schema" => ["fields" => [["name" => "id", "type" => "integer"], ["name" => "name", "type" => "string"]]]
]);

Iterating over or reading the resource yields the combined rows from all of its path or data elements

foreach ($resource as $row) {}  // iterating
$resource->read();  // get all the data as an array

Contributing

Please read the contribution guidelines: How to Contribute