codeinc / pdf2txt-client
A PHP client for the pdf2txt service
v1.5
2024-02-24 01:28 UTC
Requires
- php: >=8.3
- php-http/discovery: ^1.19
- php-http/multipart-stream-builder: ^1.3
- psr/http-client: ^1.0
Requires (Dev)
- php-http/guzzle7-adapter: ^1.0
- phpunit/phpunit: ^11
- spatie/ray: ^1.41
README
This repository contains a PHP 8.3+ library for converting PDF files to text using the pdf2txt service.
Installation
The recommended way to install the library is through Composer:
composer require codeinc/pdf2txt-client
Usage
This client requires a running instance of the pdf2txt service. The service can be run locally using Docker or deployed to a server.
Examples
Extracting text from a local file:
use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';

try {
    // convert
    $client = new Pdf2TxtClient($apiBaseUri);
    $stream = $client->extract(
        $client->createStreamFromFile($localPdfPath)
    );

    // display the text
    echo (string)$stream;
} catch (Exception $e) {
    // handle exception
}
With additional options:
use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\ConvertOptions;
use CodeInc\Pdf2TxtClient\Format;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';

// extract pages 2 to 3 only and request JSON output
$convertOption = new ConvertOptions(
    firstPage: 2,
    lastPage: 3,
    format: Format::json
);

try {
    $client = new Pdf2TxtClient($apiBaseUri);

    // convert
    $jsonResponse = $client->extract(
        $client->createStreamFromFile($localPdfPath),
        $convertOption
    );

    // decode the JSON response and display it
    $decodedJson = $client->processJsonResponse($jsonResponse);
    var_dump($decodedJson);
} catch (Exception $e) {
    // handle exception
}
Saving the extracted text to a file:
use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';
$destinationTextPath = '/path/to/local/file.txt';

try {
    $client = new Pdf2TxtClient($apiBaseUri);

    // convert
    $stream = $client->extract(
        $client->createStreamFromFile($localPdfPath)
    );

    // save the text to a file
    $client->saveStreamToFile($stream, $destinationTextPath);
} catch (Exception $e) {
    // handle exception
}
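Converting several files in one run follows the same pattern as the single-file example. The sketch below is one possible way to structure such a loop. The helpers `textPathFor()` and `convertAll()` are hypothetical and not part of the library; the conversion itself is passed in as a callable so that a single failing file does not abort the whole batch.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): derives the destination
 * .txt path for a PDF inside the given output directory.
 */
function textPathFor(string $pdfPath, string $outputDir): string
{
    return rtrim($outputDir, '/') . '/' . pathinfo($pdfPath, PATHINFO_FILENAME) . '.txt';
}

/**
 * Hypothetical helper (not part of the library): runs $convert for every
 * PDF and collects failures instead of stopping at the first one.
 *
 * @param string[] $pdfPaths
 * @param callable(string, string): void $convert receives the PDF path and the text path
 * @return array<string, string> failed PDF path => error message
 */
function convertAll(array $pdfPaths, string $outputDir, callable $convert): array
{
    $failures = [];
    foreach ($pdfPaths as $pdfPath) {
        try {
            $convert($pdfPath, textPathFor($pdfPath, $outputDir));
        } catch (\Throwable $e) {
            // remember the error and continue with the next file
            $failures[$pdfPath] = $e->getMessage();
        }
    }
    return $failures;
}
```

With the client from the examples above, `$convert` could be a closure that passes `$client->createStreamFromFile($pdfPath)` to `$client->extract()` and hands the resulting stream to `$client->saveStreamToFile()` together with the text path. The returned array then lists the files that could not be converted.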
License
The library is published under the MIT license (see LICENSE file).