pgloader v3.6.1
This release contains three major themes: usual maintenance and bug fixing, support for new database systems as sources and targets, and support for Citus distribution.
New Documentation System
The documentation has also received considerable attention and is now available at https://pgloader.readthedocs.io/en/latest/. It includes a new introduction page with a Feature Matrix showing the feature coverage you can expect for each source database type. Please read https://pgloader.readthedocs.io/en/latest/intro.html for more details.
PostgreSQL as a source
pgloader v3.6.1 now has integrated support for PostgreSQL as a source database system, a target database system, or both. Migrating from PostgreSQL to another PostgreSQL instance is best done with PostgreSQL tools such as logical replication or backup and restore facilities. That said, pgloader also supports PostgreSQL derivatives such as Redshift and Citus, so "from PostgreSQL to PostgreSQL" is to be read with that in mind.
Support for Citus distribution keys
pgloader v3.6.1 includes support for Citus distribution, documented at https://pgloader.readthedocs.io/en/latest/ref/pgsql-citus-target.html. When your target database is Citus, pgloader supports:
- distribution key declaration right in the pgloader command, allowing for the next items,
- distribution key integration in the target schema, done on the fly by following foreign key definitions and adding the distribution key where needed (tables and constraints),
- backfilling of the data from their sources, using SQL JOINs when migrating the data from the source to the target system.
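As a rough sketch of what declaring distribution keys can look like, the command below is modeled on the example in the Citus target documentation linked above. The connection strings, table names and column names are placeholders chosen for illustration and are not part of this release.

```
load database
     from pgsql:///source_db              -- placeholder source connection
     into pgsql://localhost:9700/target   -- placeholder Citus coordinator

     -- companies already carries its own key
     distribute companies using id

     -- campaigns already has a company_id column
     distribute campaigns using company_id

     -- ads has no company_id column: pgloader adds it to the target
     -- schema and backfills it by joining through campaigns
     distribute ads using company_id from campaigns
;
```

In this sketch, the `from` clause names the table that the foreign keys lead through to reach the distribution key, which is what lets the missing column be added and backfilled during the migration.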
Support for Redshift
pgloader v3.6.1 implements Redshift support both as a source and as a target database system. When Redshift is the target, pgloader takes care of dumbing down the data types compared to PostgreSQL, and needs the user to provide an S3 setup for uploading intermediary files: Redshift can COPY from S3 files, not from standard input on the connection like PostgreSQL would.
When Redshift is the source, pgloader uses SELECT queries, which allows it to fetch all the data over the same network protocol, as usual.
AFTER SCHEMA EXECUTE SQL
pgloader v3.6.1 implements new support for running SQL queries in between its handling of the schema and the data parts of the migration. This allows custom post-processing of the schema to happen before the data is loaded into the target database.
Improvements and bug fixes
The existing support for MS SQL, MySQL, SQLite, CSV and other formats has received improvements and bug fixes. The CSV parser, for instance, is now able to consider a subset of the target table columns when using a CSV header in the file, or columns and fields in different orders.
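To illustrate the CSV header case, here is a minimal sketch of a load command that relies on the header line to name the fields. The file name, connection string and table name are hypothetical; the assumption is that the header names match columns of the target table, which may be only a subset of those columns or appear in a different order.

```
LOAD CSV
     FROM 'people.csv'                -- hypothetical file, first line is the header
     INTO postgresql:///pgloader      -- placeholder connection string
          TARGET TABLE people         -- hypothetical target table

     WITH csv header,
          fields terminated by ','
;
```

Target columns that are not named in the header are simply not part of the load, so they would need a default value or accept NULL.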
Sponsoring pgloader
Some of the improvements in this release were made possible thanks to our sponsors! If you need new pgloader features, please talk to me about them, I'll be happy to make it happen! You can contribute to pgloader with either your time and skills, or money. See https://pgloader.io/moral-licence/ for more information and details.