We recently had the need to move one of our Django app’s back-end from MySQL to PostgreSQL. We wanted to try some of PgSQL’s features in that particular context, and so far, we are quite pleased! Unfortunately the migration process was not as painless as we would have liked. At least, not until we found Philip Southam’s cool py-mysql2pgsql tool.
August 2018: Please note that this post was written for an older version of Django. Changes in the code might be necessary to adapt it to the latest versions and best practices.
Being an improved port of Max Lapshin’s ruby-written mysql2pgsql, py-mysql2pgsql allows you to connect to a MySQL database and dump it’s contents into a PostgreSQL compatible dump file or directly pipe it into and already running Postgre Server. It handles large data sets (millions of rows) with more ease than its Ruby inspirator.
So we went ahead and tried the tool but, as it so often happens, the process wasn’t smooth and when migrating a table’s constraints we got this error:
ERROR: there is no unique constraint matching given keys for referenced table
Ouch! Apparently py-mysql2pgsql was having trouble with some of our Foreign Keys, making them refer tables not yet imported. We investigated a bit and found that someone had already coded a workaround to this here by letting you choose the order in which the tables will be imported. So this is the process we followed:
# Download and install anentropic's py-mysql2pgsql modified version $ sudo python anatropics-py-mysql2pgsql/setup.py install # Run py-mysql2pgsql so it creates a blank configuration file # Edit the file with the connection data for the MySQL server and optionally # the Postgre server. # Put the tables you want to export (all, eventually) in the "only_tables" # section of the config file. Here you have to tweak the order of the tables # until you solve all dependencies issues. It may take a bit of trial-and-error # but unless you have a massive amount of tables it shouldn't take long # Run py-mysql2pgsql with your config file $ py-mysql2pgsql -v -f my_config.yml
And that should be it. Hopefully you now have a fully functional Postgre database (or a very compatible and cool dump file) with all your data.
Like what you read?
Subscribe to our newsletter and get updates on Deep Learning, NLP, Computer Vision & Python.