ondram/simple-google-reader

Super simple Google Spreadsheets and Google Docs reader

1.0.1 2024-12-23 11:17 UTC

README

Latest Stable Version Coverage Status GitHub Actions Build Status

PHP library providing a simple way to load data from Google Spreadsheets and Google Docs.

The aim is to provide universal low-level access to the data, without any additional features like data manipulation or formatting. It means you can implement any domain object mapping or data processing in your application.

This library is a wrapper for google/apiclient, with minimal additional dependency footprint. It is intended to be easily integrated to any framework or pure PHP.

Installation

Install using Composer:

$ composer require ondram/simple-google-reader

Usage

  1. Obtain service account credentials for your project
  2. In service account details in IAM admin console open Keys settings and add JSON keys. Download generated JSON file with credentials (save for example as google_client.json).
  3. Optional: You can setup domain-wide delegation access for the service account. This is done in Google Workspace Admin. In that case the service account can impersonate any domain user.
  4. Enable required APIs in Google Cloud Console for your project:
  5. Share the intended document with your service account (or if you use domain-wide delegation, then with some user account), copy document ID (from the URL)
  6. Make sure to install any package implementing PSR-6 caching
  7. Prepare cache and initialize Google Client:
<?php declare(strict_types=1);

use Cache\Adapter\Filesystem\FilesystemCachePool;
use Cocur\Slugify\Slugify;
use League\Flysystem\Adapter\Local;
use League\Flysystem\Filesystem;
use OndraM\SimpleGoogleReader\Spreadsheets\SpreadsheetsReader;

require_once __DIR__ . '/vendor/autoload.php';

// Create instance of Cache, in this case we use FilesystemCache
$cache = new FilesystemCachePool(new Filesystem(new Local(__DIR__ . '/data')));

// Instantiate the Google Client with your credentials
$client = new \Google\Client();
$client->setAuthConfig(__DIR__ . '/data/google_client.json');
// If you service account has domain-wide delegation access, you need to use setSubject to set the name of the user
// which will the service account impersonate. This user also must have right to access the spreadsheet.
$client->setSubject('foo@bar.cz');

// see below for spreadsheets and docs usage

Reading spreadsheets

In Google Cloud Console, do not forget to enable Google Sheets API.

// $client is the instance from above example

$client->addScope(\Google\Service\Sheets::SPREADSHEETS);

// Create instance of Slugify, needed for spreadsheets
$slugify = new Slugify(['rulesets' => ['default', 'czech']]);

$reader = new SpreadsheetsReader($client, $slugify, $cache);

$rows = $reader->readById('pasteHereGoogleSpreadsheedId', '[optional sheet name]'/*, optional cache TTL*/);
foreach ($rows as $row) {
    echo $row['first_column'];
    echo $row['second_column'];
}

For spreadsheets, it is required that the first row contains column names. The library will use these names (converted to slugs) as keys in the associative array. Consider table:

This will be read as:

[
    ['first_column' => 'Value 1', 'second_column' => 'Foo'],
    ['first_column' => 'Value 2', 'second_column' => 'Bar'],
]

Empty rows are skipped. There is currently (intentional) limitation to read columns A:Z only.

Reading documents

As plaintext

In Google Cloud Console, do not forget to enable Google Docs API.

// $client is the instance from above example

$client->addScope(\Google\Service\Docs::DOCUMENTS_READONLY);

$docsReader = new DocsReader($client, $cache);

$document = $docsReader->readAsPlaintext('pasteHereGoogleDocsId'/*, optional cache TTL*/);

This will read the whole document as plain text. Only text elements are included, other elements like tables are ignored. Also, any document formatting is ignored.

As HTML

To read document as HTML, do not forget to enable Google Drive API in Google Cloud console.

$client->addScope(\Google\Service\Docs::DRIVE_READONLY);

$html = $docsReader->readAsHtml('pasteHereGoogleDocsId'/*, optional cache TTL*/);

Note the output will be quite bloated-HTML, with many inline styles and nested elements. It may be useful to apply on the output some HTML sanitizer like symfony/html-sanitizer to remove unwanted elements and attributes.

Testing

Tests in this library are mainly integration, meaning they require real Google API access. To run them, you must download and store JSON credentials for you service account to tests/google_client.json file.

The tests then use this table and this document to read example data.

$ composer test

Changelog

For latest changes see CHANGELOG.md file. This project follows Semantic Versioning.