keboola / php-csv-db-import
Handles loading of large bulk data into database tables.
Installs: 42 958
Dependents: 3
Suggesters: 0
Security: 0
Stars: 4
Watchers: 17
Forks: 0
Open Issues: 14
Requires
- php: ^7.4|^8
- aws/aws-sdk-php: ^3.11
- keboola/csv: ^1.1
- tracy/tracy: ^2.3
Requires (Dev)
- keboola/coding-standard: ^14.0
- keboola/phpunit-retry-annotations: *
- phpstan/phpstan: ^1.9
- phpunit/phpunit: ^9.5
Versions
- dev-master
- 6.0.0
- 5.3.0
- 5.2.0
- 5.1.1
- 5.1.0
- 5.0.4
- 5.0.3
- 5.0.2
- 5.0.1
- 5.0.0
- 4.1.2
- 4.1.1
- 4.1.0
- 4.0.0
- 3.0.1
- 3.0.0
- 2.4.4
- 2.4.3
- 2.4.2
- 2.4.1
- 2.4.0
- 2.3.1
- 2.3.0
- 2.2.1
- 2.2.0
- 2.1.0
- 2.0.0
- 1.5.1
- 1.5.0
- 1.4.1
- 1.4.0
- 1.3.11
- 1.3.10
- 1.3.9
- 1.3.8
- 1.3.7
- 1.3.6
- 1.3.5
- 1.3.4
- 1.3.3
- 1.3.2
- 1.3.1
- 1.3.0
- 1.2.0
- 1.1.1
- 1.1.0
- 1.0.9
- 1.0.8
- 1.0.7
- 1.0.6
- 1.0.5
- 1.0.4
- 1.0.3
- 1.0.2
- 1.0.1
- 1.0.0
- dev-mj-remove-snowflake-engine
- dev-KBC-2862-user-error-when-load-fails-to-native-types
- dev-ujovlado-license-and-readme
- dev-zajca-update-php-csv
- dev-zajca-snflk-dr-2222
- dev-KBC-1976-idea-improve-performance-of-ups
- dev-odbc-2211
- dev-tf-remove-unnecessary-snflk-driver-download
- dev-martin-KBC-190-retry-on-result-not-found
- dev-tf-KBC-190-retry-on-result-not-found
- dev-roman-fix-connection-error-KBC-89
- dev-roman-snowflake-query-tagging-KBC-89
- dev-roman-update-snowflake-odbc-version
- dev-zajca-kbc-50
- dev-vojta-xdebug
- dev-erik-connection-create
- dev-erik-password
- dev-martin-snflk-binding
- dev-piv-typo
- dev-martin-snowflake-perf-debug
- dev-martin-cleanup
This package is auto-updated.
Last update: 2024-11-17 13:50:08 UTC
README
Handles loading of large bulk data into database tables.
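The package is published on Packagist under the name shown above and can be installed with Composer:
composer require keboola/php-csv-db-import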
Supported engines:
- AWS Redshift
- Snowflake
Data can be loaded:
- from CSV files stored in AWS S3
- from another Redshift table in the same database
Features
- Full load - the destination table is truncated before the load
- Incremental load - data are merged
- Primary key dedup for all engines
- Convert empty values to NULL (using the convertEmptyValuesToNull option; see the sketch below)
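The following standalone PHP sketch illustrates the convertEmptyValuesToNull semantics described above. It models the option's effect on a single parsed CSV row and is only an illustration of the behavior, not the package's internal code; the column and row values are made up for the example.
<?php
// Illustration of convertEmptyValuesToNull: for the listed columns,
// an empty CSV string is imported as SQL NULL instead of ''.
// This models the effect on one parsed row; it is not the package's
// internal implementation.

$convertEmptyValuesToNull = ['price', 'discount']; // option value: affected columns

$row = ['id' => '1', 'name' => '', 'price' => '', 'discount' => '0'];

foreach ($convertEmptyValuesToNull as $column) {
    if (array_key_exists($column, $row) && $row[$column] === '') {
        $row[$column] = null; // loaded as NULL in the destination table
    }
}

var_dump($row); // 'name' stays '' (not listed), 'price' becomes NULL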
Development
Preparation
- Create an AWS S3 bucket and IAM user using the aws-services.json CloudFormation template.
- Create a Redshift cluster.
- Download the Snowflake ODBC driver (.deb) and place it in the root of the repository (./snowflake-odbc.deb).
- Create a .env file. Use the outputs of the aws-services CloudFormation stack to fill in the variables, and add your Redshift credentials.
REDSHIFT_HOST=
REDSHIFT_PORT=5439
REDSHIFT_USER=
REDSHIFT_DATABASE=
REDSHIFT_PASSWORD=
SNOWFLAKE_HOST=
SNOWFLAKE_PORT=
SNOWFLAKE_USER=
SNOWFLAKE_PASSWORD=
SNOWFLAKE_DATABASE=
SNOWFLAKE_WAREHOUSE=
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_S3_BUCKET=
AWS_REGION=
Upload test fixtures to S3:
docker-compose run php php ./tests/loadS3.php
Redshift settings
A user and a database are required for the tests. You can create them:
CREATE USER keboola_db_import PASSWORD 'YOUR_PASSWORD';
CREATE DATABASE keboola_db_import;
GRANT ALL ON DATABASE keboola_db_import TO keboola_db_import;
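One way to execute these statements is with psql against the cluster endpoint from your .env file; the admin user, database, and file name below are placeholders for whatever you configured when creating the cluster:
psql -h $REDSHIFT_HOST -p $REDSHIFT_PORT -U <master_user> -d <database> -f redshift-setup.sql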
Snowflake settings
A role, user, database, and warehouse are required for the tests. You can create them:
CREATE ROLE "KEBOOLA_DB_IMPORT";
CREATE DATABASE "KEBOOLA_DB_IMPORT";
GRANT ALL PRIVILEGES ON DATABASE "KEBOOLA_DB_IMPORT" TO ROLE "KEBOOLA_DB_IMPORT";
CREATE WAREHOUSE "KEBOOLA_DB_IMPORT" WITH WAREHOUSE_SIZE = 'XSMALL' WAREHOUSE_TYPE = 'STANDARD' AUTO_SUSPEND = 3600 AUTO_RESUME = TRUE;
GRANT USAGE ON WAREHOUSE "KEBOOLA_DB_IMPORT" TO ROLE "KEBOOLA_DB_IMPORT" WITH GRANT OPTION;
CREATE USER "KEBOOLA_DB_IMPORT"
PASSWORD = 'YOUR_PASSWORD'
DEFAULT_ROLE = "KEBOOLA_DB_IMPORT";
GRANT ROLE "KEBOOLA_DB_IMPORT" TO USER "KEBOOLA_DB_IMPORT";
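These statements can be run, for example, through the snowsql CLI as a sufficiently privileged user (e.g. one with the ACCOUNTADMIN role); the account identifier, user, and file name below are placeholders:
snowsql -a <account_identifier> -u <admin_user> -f snowflake-setup.sql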
Tests Execution
Run the tests with the following command:
docker-compose run --rm tests
Redshift, Snowflake, and S3 credentials have to be provided (see the .env file above).
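During development you may want to run only part of the suite. Assuming the phpunit binary installed via require-dev, a filtered run through the php service could look like this (the test class name is a placeholder):
docker-compose run --rm php php ./vendor/bin/phpunit --filter <TestClassName>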