README

A Laravel package for integrating the Gemini Text-to-Speech API and the Google Cloud Text-to-Speech API.

Requirements

PHP: 8.4 or higher
Google Cloud account with access to Gemini API and/or Cloud Text-to-Speech API

Installation

You can install the package via composer:

composer require moe-mizrak/laravel-google-text-to-speech

You can publish the config file with:

php artisan vendor:publish --tag="laravel-google-text-to-speech"

Configuration

After publishing the configuration file, you can set your Google Cloud credentials and other settings in the config/laravel-google-text-to-speech.php file.

The published config file looks like this:

return [
    'driver' => env('GOOGLE_TEXT_TO_SPEECH_DRIVER', TextToSpeechDriverType::GEMINI->value), // Options: 'gemini', 'cloud'
    'api_endpoint' => env('GOOGLE_TEXT_TO_SPEECH_API_ENDPOINT', 'generativelanguage.googleapis.com'), // For Gemini API use 'generativelanguage.googleapis.com', for Google Cloud API use 'texttospeech.googleapis.com'
    'cloud' => [
        'credentials' => env('GOOGLE_TEXT_TO_SPEECH_CREDENTIALS'), // The path to the Google Cloud credentials JSON file.
    ],
    'gemini' => [
        'api_key' => env('GOOGLE_GEMINI_API_KEY'), // Your Gemini API key
        'model' => env('GOOGLE_GEMINI_MODEL', 'gemini-2.5-flash-preview-tts'), // The Gemini model to use for Text-to-Speech synthesis.
        'temperature' => env('GOOGLE_GEMINI_TEMPERATURE', 0.85),
    ],
];

Note

If you are using Google Cloud Text-to-Speech API:

Go to the Google Cloud Console to create and download service account credentials with the permissions required for the Text-to-Speech API.
Save the downloaded JSON file and set its path in the cloud.credentials config field.

If you are using Gemini Text-to-Speech API:

Go to the Google Cloud Console and select the project where the Gemini API is enabled (or create a project).
Create a service account with the roles needed to access the Gemini API.
Add a new key on the Keys tab and use it as the gemini.api_key config value.

Usage

There are 2 drivers for Google Text-to-Speech API:

gemini: Uses Gemini Text-to-Speech API.
cloud: Uses Google Cloud Text-to-Speech API.

Gemini Text-to-Speech API is the newer and more advanced API (premium voices), while Google Cloud Text-to-Speech API is the traditional API.

Note

You can set the driver in the config file so that the package uses the desired API automatically. For the selected driver, set credentials (cloud) or api_key (gemini), and set api_endpoint to the matching host.
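Because the driver and the endpoint have to agree, a small check can catch a mismatch early. The following is a minimal sketch, not part of the package: `endpointForDriver` is a hypothetical helper, and the two host names are taken from the comments in the published config file.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not provided by the package): returns the API host
 * that the config comments associate with each driver.
 */
function endpointForDriver(string $driver): string
{
    return match ($driver) {
        'gemini' => 'generativelanguage.googleapis.com',
        'cloud'  => 'texttospeech.googleapis.com',
        // Any other value is not one of the two documented drivers.
        default  => throw new InvalidArgumentException("Unknown driver: {$driver}"),
    };
}
```

In a Laravel app, the result could be compared with the configured api_endpoint value at boot time to fail fast on a misconfigured .env file.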

Synthesize Text

This is an example of how to use the synthesizeSpeech method:

For Gemini Text-to-Speech API:

$textData = new GeminiTextData(
    text: 'Laplace Demon: the hypothetical entity that, with perfect knowledge of the present, could predict all future events based on causal determinism.',
);

$voiceData = new GeminiVoiceData(
    voiceName: 'Algieba',
    modelName: 'gemini-2.5-flash-preview-tts',
);

$audioConfigData = new GeminiAudioConfigData;

$geminiSynthesizeData = new GeminiSynthesizeData(
    $textData,
    $voiceData,
    $audioConfigData,
);

$response = GoogleTextToSpeech::synthesizeSpeech($geminiSynthesizeData);

$response will contain the synthesized audio content (bytes). It can be saved as an audio file as follows:
```
file_put_contents('output.pcm', $response);
```

Note

Gemini Text-to-Speech API currently supports only .pcm audio format.

After saving the output as a .pcm file, you can convert it to other audio formats (like .wav or .mp3) using tools like ffmpeg.
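If ffmpeg is not available, raw PCM can also be wrapped in a WAV container directly in PHP, since a WAV file is a 44-byte header followed by the samples. This is a minimal sketch, not part of the package: `pcmToWav` is a hypothetical helper, and the defaults (24 kHz, mono, 16-bit little-endian) are an assumption about the Gemini output that should be checked against the model you use.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: prepends a canonical 44-byte RIFF/WAVE header to raw PCM.
 * Default parameters are assumptions about the audio returned by Gemini TTS.
 */
function pcmToWav(
    string $pcm,
    int $sampleRate = 24000,
    int $channels = 1,
    int $bitsPerSample = 16,
): string {
    $blockAlign = intdiv($channels * $bitsPerSample, 8); // bytes per sample frame
    $byteRate   = $sampleRate * $blockAlign;             // bytes per second
    $dataSize   = strlen($pcm);

    $header = 'RIFF'
        . pack('V', 36 + $dataSize)   // size of everything after this field
        . 'WAVE'
        . 'fmt '
        . pack('V', 16)               // size of the fmt chunk for PCM
        . pack('v', 1)                // audio format 1 = uncompressed PCM
        . pack('v', $channels)
        . pack('V', $sampleRate)
        . pack('V', $byteRate)
        . pack('v', $blockAlign)
        . pack('v', $bitsPerSample)
        . 'data'
        . pack('V', $dataSize);       // size of the sample data

    return $header . $pcm;
}
```

With this in place, the response could be saved as `file_put_contents('output.wav', pcmToWav($response));`. If playback sounds too fast, too slow, or noisy, the assumed sample rate or bit depth is probably wrong.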

Tip

Check the GeminiTextData, GeminiAudioConfigData, and GeminiVoiceData classes for more options.

For Cloud Text-to-Speech API:

$textData = new CloudTextData(
    text: 'Laplace Demon: the hypothetical entity that, with perfect knowledge of the present, could predict all future events based on causal determinism.',
    isSsml: false,
);

$voiceData = new CloudVoiceData(
    languageCode: 'en-US',
    voiceName: 'en-US-Wavenet-D',
);

$audioConfigData = new CloudAudioConfigData(
    audioEncoding: AudioEncoding::MP3,
);

$cloudSynthesizeData = new CloudSynthesizeData(
    $textData,
    $voiceData,
    $audioConfigData
);

$response = GoogleTextToSpeech::synthesizeSpeech($cloudSynthesizeData);

$response will contain the synthesized audio content (bytes). It can be saved as an audio file as follows:
```
file_put_contents('output.mp3', $response);
```
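`file_put_contents` returns `false` on failure rather than throwing, so a failed write can go unnoticed. The following is a minimal sketch of a stricter save step; `saveAudio` is a hypothetical helper, not part of the package, and it assumes the response is a plain string of bytes as described above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: writes audio bytes to disk and throws if anything goes wrong.
 * Returns the number of bytes written.
 */
function saveAudio(string $bytes, string $path): int
{
    if ($bytes === '') {
        throw new RuntimeException('No audio content to save.');
    }

    // Create the target directory if it does not exist yet.
    $dir = dirname($path);
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Could not create directory: {$dir}");
    }

    // LOCK_EX avoids interleaved writes when two requests target the same file.
    $written = file_put_contents($path, $bytes, LOCK_EX);
    if ($written === false || $written !== strlen($bytes)) {
        throw new RuntimeException("Could not write audio file: {$path}");
    }

    return $written;
}
```

In a Laravel app the same idea applies to the Storage facade; the plain-PHP version here just keeps the example independent of the framework.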

Tip

Check the CloudTextData, CloudVoiceData, and CloudAudioConfigData classes for more options.

List Voices

This is an example of how to use the listVoices method:

$response = GoogleTextToSpeech::listVoices(languageCode: 'en-US');

$response will contain a list/array of available voices for the specified language code.

Warning

The listVoices method works only with the Google Cloud Text-to-Speech API. It is not supported for the Gemini Text-to-Speech API.

Contributing

Your contributions are welcome! If you'd like to improve this project, simply create a pull request with your changes. Your efforts help enhance its functionality and documentation.

If you find this project useful, please consider ⭐ it to show your support!

Authors

This project is created and maintained by Moe Mizrak.

License

Laravel Google Text-to-Speech is open-source software licensed under the MIT license.
