Jun 15, 2020

Consistent Python environments with Poetry and pre-commit hooks

Clean and Consistent Environments

Regardless of the programming language you are working in, it can sometimes be a struggle to maintain a clean codebase and a consistent development environment for all members of your team, especially if your teammates are split between platforms or use different editors.  However, you can simplify this process with three straightforward strategies: 1) set up your Git attributes appropriately, 2) use Poetry to manage your development environment, and 3) enforce a coding standard through pre-commit hooks.   Below, I’ll dive into each strategy in more detail. Check out my sample apologies repo to see how this works in practice.

Git Attributes

If your teammates work in the same codebase from both UNIX-like platforms (macOS or Linux) and Windows, then it’s especially important to set up Git attributes to manage line endings.  However, even if you’re only working on a single platform, it’s still a good idea.  Git makes it fairly easy to shoot yourself in the foot, and diagnosing a problem related to line endings can be confusing.  To manage line endings, set up your .gitattributes file appropriately.  This Stack Overflow answer has a good discussion of the options, but for new projects it’s really as simple as grabbing the appropriate content from this collection of .gitattributes templates.  For an existing project, you may also need to do a one-time normalization step as described in this GitHub documentation.
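If you suspect that an existing project already has inconsistent line endings, a short script can confirm it before you normalize anything.  This is a minimal sketch, not part of Git or of my sample repo; the function names count_line_endings and has_mixed_line_endings are made up for illustration.

```python
from typing import Dict


def count_line_endings(data: bytes) -> Dict[str, int]:
    """Count CRLF, LF-only, and CR-only line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Every CRLF contains one "\n" and one "\r", so subtract those out
    # to get the bare LF and bare CR counts.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


def has_mixed_line_endings(data: bytes) -> bool:
    """Return True when more than one line-ending style appears."""
    counts = count_line_endings(data)
    return sum(1 for n in counts.values() if n > 0) > 1
```

Read files in binary mode (for instance with pathlib.Path.read_bytes) before passing them in, because Python's text mode translates line endings on read and would hide the problem.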

Poetry in Motion

One of the more difficult things to manage for any Python project is the dependencies and the resulting development environment.  Most people rely on Python virtual environments, but then you still need to make sure that everyone on your team is using the same setup.  There are a variety of different mechanisms available (for instance, our own Mike Hostetler blogged about direnv back in June of 2019).  My favorite solution is Poetry.  Poetry is a single tool that is used both to manage project dependencies and to construct and utilize a virtual environment based on those dependencies.  It also manages the process of publishing code to a repository such as PyPI.  Your teammates simply need to install a Python interpreter and the Poetry tool itself (which is as simple as brew install poetry on macOS), and then everything else happens auto-magically from there.

For instance, if you clone my apologies repo, you can do this:

localhost:~/projects/repos/apologies> poetry install
Creating virtualenv apologies-pSRSS4B3-py3.7 in /Users/kpronovici/Library/Caches/pypoetry/virtualenvs
Installing dependencies from lock file
- Installing chardet (3.0.4)
- Installing idna (2.9)
- Installing markupsafe (1.1.1)
- Installing sphinx-autoapi (1.2.1)
- Installing tox (3.14.5)
- Installing apologies (0.1.6)

Once that’s done, Poetry manages your virtual environment for you.  You can use poetry add or poetry remove to add and remove dependencies, which are tracked in pyproject.toml.  The virtual environment is automatically updated to include those dependencies.  If new dependencies are added, developers can refresh their environment using poetry install.

Even better, developer-only dependencies can be added with the --dev switch.  This means that any tool you want your developers to have access to can be managed by Poetry.  For instance, in my project, the developer dependencies include Pylint.  I can run Pylint out of the Poetry virtualenv using poetry run pylint.
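Commands launched through poetry run execute with the interpreter from the project's virtualenv.  If you ever want to confirm that, a small check like the one below can help.  This is a sketch under the assumption that the environment was created by venv or virtualenv; the function name in_virtualenv is hypothetical.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv (and recent virtualenv releases) leave sys.base_prefix pointing at
    # the base installation while sys.prefix points at the environment.
    # Older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


if __name__ == "__main__":
    print(f"interpreter: {sys.executable}")
    print(f"virtualenv active: {in_virtualenv()}")
```

Saved under a name of your choosing (say, check_env.py) and run with poetry run python check_env.py, this should report an interpreter path inside Poetry's virtualenv directory.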

Poetry goes a long way toward making dependencies and virtual environments simple to use, and it’s worth your time to look into it.

Pre-Commit Hooks for Standards

Once you’ve made it easy for all of your teammates to be working with a consistent development environment, turn your attention to code consistency.   I take a two-pronged approach to coding standards, with some tools focused on code formatting and other tools focused on code quality.

For code formatting, I rely on Black and isort.  Black ensures that everyone’s code looks the same, while isort makes sure that imports are sorted and grouped in a consistent manner.  With some care, you can configure isort so that it formats import statements in exactly the same way as Black would, avoiding conflicts.

For code quality, I rely on Pylint and MyPy.   Pylint enforces a coding standard and is also a general linter.  MyPy is a static type checker for Python.  Between these tools, I can catch most problems.  Some people find Pylint to be more trouble than it’s worth, and use the Flake8 linter instead.
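To give a sense of what a static type checker adds, here is a small illustrative example (not taken from my sample repo; the names SCORES, lookup_score, and score_plus_bonus are made up).  The lookup can return None, and a checker such as MyPy reports any use of that result as an int unless the None case is handled first.

```python
from typing import Dict, Optional

SCORES: Dict[str, int] = {"alice": 12, "bob": 7}


def lookup_score(name: str) -> Optional[int]:
    """Return the player's score, or None if the player is unknown."""
    return SCORES.get(name)


def score_plus_bonus(name: str, bonus: int) -> int:
    """Add a bonus to the player's score, treating unknown players as zero."""
    score = lookup_score(name)
    # Without this check, `score + bonus` would fail at runtime only for
    # unknown players.  A static type checker flags the unchecked
    # Optional[int] before the code ever runs.
    if score is None:
        return bonus
    return score + bonus
```

The point is that the bug is caught at check time, on every code path, rather than only when a test happens to exercise the unknown-player case.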

Once you configure your code formatting and code quality tools, the next thing you need to do is make sure that everyone uses those tools consistently.  One approach is to apply the checks during your continuous integration (CI) process, failing the build if standards are not met.  This is important, but I consider it a fallback.  Instead, I recommend that you apply these checks as pre-commit hooks.   This way, non-standard code never has a chance to enter the repository.

To manage pre-commit hooks, I use the pre-commit package.   This package relies on a file called .pre-commit-config.yaml in the root of your repository.  When a new developer joins the team and clones the repository, they can enable all of the pre-commit hooks using poetry run pre-commit install.  For my sample apologies repo, a commit now looks like this:

localhost:~/projects/repos/apologies> git commit -m "Release v0.1.6" pyproject.toml
[master 8e9e6e6] Release v0.1.6
1 file changed, 1 insertion(+), 1 deletion(-)

If any of the hooks fails, then the commit won’t complete.  For instance, if Black updates formatting, then the newly updated files will be left modified in the working tree and will need to be staged and added to the commit.  Or, if Pylint finds errors, the developer will need to fix those errors before committing.

The pre-commit package has a list of supported tools and knows how to create a Python virtualenv to install and run those tools.  Since I already have Poetry to manage my virtualenv, that’s overkill for me.  Instead, I configure everything as a “local” hook, and execute the tools via poetry run.  That way, developers have an easy way to run the exact same checks outside of the pre-commit hook, both from the command-line and within an IDE like IntelliJ.
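Because every tool runs through poetry run, the same checks are easy to script outside of the hook.  The sketch below is one way to do that, not the configuration from my sample repo.  The tool names come from this post, but the specific flags, the src path, and the names CHECKS and run_checks are assumptions that you would adapt to your own project.

```python
import subprocess
from typing import Callable, List, Sequence

# Assumed commands: the flags and the "src" path are placeholders,
# not the exact invocations used by any particular project.
CHECKS: List[List[str]] = [
    ["poetry", "run", "black", "--check", "."],
    ["poetry", "run", "isort", "--check-only", "."],
    ["poetry", "run", "mypy", "src"],
    ["poetry", "run", "pylint", "src"],
]


def run_checks(
    checks: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = subprocess.call,
) -> List[str]:
    """Run every check and return the command lines that exited non-zero."""
    failures = []
    for command in checks:
        # Keep going after a failure so the developer sees every problem at once.
        if runner(command) != 0:
            failures.append(" ".join(command))
    return failures
```

Calling run_checks(CHECKS) from a script and exiting non-zero when the returned list is non-empty gives a single entry point that could serve a developer's terminal and a CI job alike.  The runner parameter exists only so the function can be exercised without the tools installed.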

MyPy and Pylint can sometimes be fairly slow, especially for large projects.  If this is the case, you might need to rely on the CI system to enforce these standards instead.  Everything is a compromise, so choose the tools and process that add the most value for your team.

About the Author

Object Partners profile.