ondram / simple-google-reader
Super simple Google Spreadsheets and Google Docs reader
Requires
- php: ^8.1
- cocur/slugify: ^4.3
- google/apiclient: ^2.13
- psr/simple-cache: ^1.0
- psr/simple-cache-implementation: ^1.0 || ^2.0 || ^3.0
Requires (Dev)
- cache/array-adapter: ^1.2
- cache/filesystem-adapter: ^1.1
- ergebnis/composer-normalize: ^2.20
- php-parallel-lint/php-parallel-lint: ^1.3
- phpstan/phpstan: ^1.10
- phpunit/phpunit: ^10.5
- symfony/css-selector: ^6.0 || ^7.0
- symfony/dom-crawler: ^6.0 || ^7.0
- symfony/var-dumper: ^6.0 || ^7.0
README
PHP library providing a simple way to load data from Google Spreadsheets and Google Docs.
The aim is to provide universal low-level access to the data, without any additional features like data manipulation or formatting. It means you can implement any domain object mapping or data processing in your application.
This library is a wrapper for google/apiclient, with minimal additional dependency footprint. It is intended to be easily integrated to any framework or pure PHP.
Installation
Install using Composer:
$ composer require ondram/simple-google-reader
Usage
- Obtain service account credentials for your project
- In service account details in IAM admin console open Keys settings and add JSON keys. Download generated JSON file with credentials (save for example as
google_client.json
). - Optional: You can setup domain-wide delegation access for the service account. This is done in Google Workspace Admin. In that case the service account can impersonate any domain user.
- Enable required APIs in Google Cloud Console for your project:
- Google Sheets API if you plan reading Spreadsheets
- Google Docs API to read Docs
- Google Drive API if you need to read Docs as HTML
- Share the intended document with your service account (or if you use domain-wide delegation, then with some user account), copy document ID (from the URL)
- Make sure to install any package implementing PSR-6 caching
- Prepare cache and initialize Google Client:
<?php declare(strict_types=1); use Cache\Adapter\Filesystem\FilesystemCachePool; use Cocur\Slugify\Slugify; use League\Flysystem\Adapter\Local; use League\Flysystem\Filesystem; use OndraM\SimpleGoogleReader\Spreadsheets\SpreadsheetsReader; require_once __DIR__ . '/vendor/autoload.php'; // Create instance of Cache, in this case we use FilesystemCache $cache = new FilesystemCachePool(new Filesystem(new Local(__DIR__ . '/data'))); // Instantiate the Google Client with your credentials $client = new \Google\Client(); $client->setAuthConfig(__DIR__ . '/data/google_client.json'); // If you service account has domain-wide delegation access, you need to use setSubject to set the name of the user // which will the service account impersonate. This user also must have right to access the spreadsheet. $client->setSubject('foo@bar.cz'); // see below for spreadsheets and docs usage
Reading spreadsheets
In Google Cloud Console, do not forget to enable Google Sheets API.
// $client is the instance from above example $client->addScope(\Google\Service\Sheets::SPREADSHEETS); // Create instance of Slugify, needed for spreadsheets $slugify = new Slugify(['rulesets' => ['default', 'czech']]); $reader = new SpreadsheetsReader($client, $slugify, $cache); $rows = $reader->readById('pasteHereGoogleSpreadsheedId', '[optional sheet name]'/*, optional cache TTL*/); foreach ($rows as $row) { echo $row['first_column']; echo $row['second_column']; }
For spreadsheets, it is required that the first row contains column names. The library will use these names (converted to slugs) as keys in the associative array. Consider table:
This will be read as:
[ ['first_column' => 'Value 1', 'second_column' => 'Foo'], ['first_column' => 'Value 2', 'second_column' => 'Bar'], ]
Empty rows are skipped. There is currently (intentional) limitation to read columns A:Z only.
Reading documents
As plaintext
In Google Cloud Console, do not forget to enable Google Docs API.
// $client is the instance from above example $client->addScope(\Google\Service\Docs::DOCUMENTS_READONLY); $docsReader = new DocsReader($client, $cache); $document = $docsReader->readAsPlaintext('pasteHereGoogleDocsId'/*, optional cache TTL*/);
This will read the whole document as plain text. Only text elements are included, other elements like tables are ignored. Also, any document formatting is ignored.
As HTML
To read document as HTML, do not forget to enable Google Drive API in Google Cloud console.
$client->addScope(\Google\Service\Docs::DRIVE_READONLY); $html = $docsReader->readAsHtml('pasteHereGoogleDocsId'/*, optional cache TTL*/);
Note the output will be quite bloated-HTML, with many inline styles and nested elements. It may be useful to apply on the output some HTML sanitizer like symfony/html-sanitizer to remove unwanted elements and attributes.
Testing
Tests in this library are mainly integration, meaning they require real Google API access.
To run them, you must download and store JSON credentials for you service account to tests/google_client.json
file.
The tests then use this table and this document to read example data.
$ composer test
Changelog
For latest changes see CHANGELOG.md file. This project follows Semantic Versioning.