Database Migrations in Flyway

The concept of data migrations has been around for a long time and in the past, they have been projects that are separated from the main project, mostly with SQL script to change the tables, update data, etc. But application frameworks such as Ruby on Rail and Django have had them for a long time. In the JVM world, Flyway appears to be the leader in application-integrated database migrations.

Flyway integrates with Maven and Gradle, which makes using it turn out to be like a command-line tool. Once you get it setup in Gradle, for instance, it’s as easy as running gradle flywayMigrate -i and your database it updated. Flyway also has Spring Boot integration which runs all the migrations on application startup. This is the easiest method and is what I have using in my sample application.

Flyway keeps track of what migrations have run in the database in a table called schema_version. If you run the sample application, this is what the schema_version table looks like:

This table shows two migrations – one with a description of “Create Tables” and another with “insert data”. They are derived from the name of the file – the “script name” (which you can see in the script column). Flyway has pretty strict conventions about naming your migrations, which helps it parse and put information into this table. The names of these two files are V1__Create_tables.sql and Most of this naming convention is configurable but I’d argue that it’s a lot of work to roll your own.

The V1__Create_tables.sql file is just a common SQL file with multiple statements. Flyway runs these statements in transactions so if one command fails, it backs them all out and you are not left with the database in a bad state. I think it’s important to note that there is nothing special about these SQL statements. There are no specific Flyway code or anything. It just runs plain SQL.

The file is different. This is a Java-based migration. I used this to parse a CSV file and put it into the database (which I think is a common task). This uses Spring’s JDBC Template, which makes it really easy to do inserts. But, if you aren’t using Spring, you can use the raw connection and get PreparedStatements from that connection, but JdbcTemplate is much easier to work with. One thing I don’t care for in the Java migrations is the class name. Since the file name is required to be, then the class is named V2__insert_data. I think this a poor name for the class. I wish you could set these with properties inside the class instead of just using the filename.

Both of these migrations are Versioned migrations, meaning they are only designed to run once. Flyway also supported Repeatable migrations, which run every time those files change and the Flyway task is run. It does this by checking the checksum on the file. Going back to the schema_version table, note that the SQL migration has a checksum and the Java migration does not. By default, Java migrations do not have checksums but you can generate it by implementing MigrationChecksumProvider in your class and generating it your own. That seems dangerous to me, so use repeatable Java migrations with caution.

So now you have an overview of using Flyway in your Java projects. Feel free to checkout my sample application and play around with it. I think it’s a simple application that can solve a lot of problems.


About the Author

Mike Hostetler profile.

Mike Hostetler

Principal Technologist

Mike has almost 20 years of experience in technology. He started in networking and Unix administration, and grew into technical support and QA testing. But he has always done some development on the side and decided a few years ago to pursue it full-time. His history of working with users gives Mike a unique perspective on writing software.

One thought on “Database Migrations in Flyway

  1. Jojo says:

    Nice. Did something similar, I just generalized that so I can create a csv import for every table. My Java based mygration just needs to inherit a base class and the childclass is not more than 5 lines of code.
    Pass the table name, and Nr of columns (this still can be improved so I don’t need to rely on that, but I hacked that up all yesterday), add the csv data loader and implement a hook from the parent (one liner). Done. Looks real sexy already. Only thing that I don’t think is perfekt is the fact I added the number of columns as argument for the (?,?,?…) part of the query.

Leave a Reply

Your email address will not be published. Required fields are marked *

Related Blog Posts
Retrofit2: Get the body from an error response
Retrofit2 is a nice library for making HTTP rest requests. It includes a static utility (CallUtils) for getting the result from your request, but if the api you’re calling doesn’t return a 2xx request it […]
Airflow Logging: Task logs to Elasticsearch
This is part three of a five-part series addressing Airflow at an enterprise scale. I will update these with links as they are published. Airflow: Planning a Deployment Airflow + Helm: Simple Airflow Deployment More […]
Using Nix as a Professional
How to use Nix as a tool to optimize developer time with real-life examples.
Enterprise Auth for Airflow: Azure AD
This is part three of a five-part series addressing Airflow at an enterprise scale. I will update these with links as they are published. Airflow: Planning a Deployment Airflow + Helm: Deploying the Chart Without […]