terriko: (Default)
[personal profile] terriko
This is crossposted from Curiousity.ca, my personal maker blog. If you want to link to this post, please use the original link since the formatting there is usually better.


This is part of my series on “best practices in practice” where I talk about best practices and related tools I use as an open source software developer and project maintainer. These can be specific tools, checklists, workflows, whatever. Some of these have been great, some of them have been not so great, but I’ve learned a lot. I wanted to talk a bit about the usability and assumptions made in various tools and procedures, especially relative to the wider conversations we need to have about open source maintainer burnout, mentoring new contributors, and improving the security and quality of software.





I was just out at Google Summer of Code Mentor Summit, which is a gathering of open source mentors associated with Google’s program. Everyone there regularly works with new contributors who have varying levels of ability and experience, and we want to maintain codebases that have good quality, so one of the sessions I attended was about tools and practices for code quality. Pre-commit is one of the tools that came up in that session that I use regularly, so I’d like to talk about it today. This is a tool I wouldn’t have thought to look for on my own, but someone else recommended it to me and did the initial config for my project, so I’m happy to pay that forward by recommending it to others.





Pre-commit helps you run checks before your code can be checked in to git. Your project provides a config file listing whatever tools it recommends you use. Once you’ve got pre-commit installed, you can tell it to use that file, and then those checks will run when you type `git commit`. If a check fails, the commit halts so you can fix the problem before you “save” the code. By default it only runs on files you changed, and project maintainers can tune it to skip files that aren’t compliant yet, so you don’t generally get stuck fixing other people’s technical debt unless that’s something the maintainers chose to do.
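
To make that flow concrete, here is a minimal sketch in Python of the general idea: run each check on only the changed files, and refuse the commit if anything fails. This is a toy model and not how pre-commit is actually implemented; the two check functions and the file names are hypothetical.

```python
from typing import Callable, Dict, List

# A "check" takes a file's text and returns a list of problems (empty = pass).
Check = Callable[[str], List[str]]


def no_trailing_whitespace(text: str) -> List[str]:
    """Hypothetical check: flag lines that end in spaces or tabs."""
    return [
        f"line {number}: trailing whitespace"
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip()
    ]


def ends_with_newline(text: str) -> List[str]:
    """Hypothetical check: flag non-empty files missing a final newline."""
    return [] if not text or text.endswith("\n") else ["missing final newline"]


def run_checks(changed_files: Dict[str, str], checks: List[Check]) -> int:
    """Run every check on every changed file; return 0 to allow the commit."""
    failed = False
    for path, text in changed_files.items():
        for check in checks:
            for problem in check(text):
                print(f"{path}: {problem}")
                failed = True
    return 1 if failed else 0


# Only the files touched by this commit are passed in (contents are made up).
changed = {
    "good.py": "x = 1\n",
    "bad.py": "y = 2  \nz = 3",
}
exit_code = run_checks(changed, [no_trailing_whitespace, ends_with_newline])
print("commit blocked" if exit_code else "commit allowed")
```

A non-zero exit code is what stops the commit in this sketch, which is the same convention git hooks in general rely on.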





Under the hood there’s some magic happening to make sure it can install, set up, and configure the tools. It does tell you what’s happening on the command line, but it’s worlds better than having to install them all yourself, and it puts it into a separate environment so you don’t have to worry about needing slightly different versions for different projects. Honestly, the only time I’ve had trouble with this tool was when I was using it in a weird environment behind a proxy and some combination of things meant that pre-commit was unable to set up tools for me. I think that’s more of a failure of the environment than of the tool, and it’s been shockingly easy to set up and use on every other development machine where I’ve used it. One command to install pre-commit, then one command to set it up for each project where I use it.





I’m sure there are some programmers who are incredibly disciplined and manage to run all required checks themselves manually, but I am not the sort of person who memorizes huge arrays of commands and flags and remembers to run them Every Single Time. I am the sort of person who writes scripts to automate stuff because I will forget. Before pre-commit I would have had a shell script to do the thing, but now I don’t have to write those for projects that already have a config file ready for me. Thus, pre-commit speaks to the heart of how I work as a developer. I got into computers because I could make them do the boring stuff.





Image Description: A photo of the package locker in a US shared mailbox. A label around the keyhole reads “open” with arrows and then says “key will remain in lock after opening door” — it’s a great example of design that doesn’t rely on users remembering to do the right thing (in this case, giving back the key for future use)




Pre-commit also speaks to the heart of my computer security philosophy: any security that relies on humans getting things 100% right 100% of the time is doomed to fail eventually. And although a lot of this blog is about knitting and fountain pens and my hobby work, I want to remind you that I’m not just some random person on the internet when it comes to talking about computer security: I have a PhD in web security policy and I work professionally as an open source security researcher. Helping people write and maintain better code is a large portion of my day job. A lot of the most effective work in security involves making it easy and “default” for people to make the most secure choices. (See the picture above for a more physical example of the design philosophy that ensures users do the right thing.)





Using pre-commit takes a bunch of failure points out of our code quality and security process and makes it easier for developers to do the right thing. For my current work open source project, we recommend people install it and use it on their local systems, then we run it again in our continuous integration system and require the checks to pass there before the code can be merged into the main branch.





As a code contributor:






  • I like that pre-commit streamlines the whole process of setting up tools. I just type `pre-commit install` in the directory of code I intend to modify and it does the work.




  • I can read the `.pre-commit-config.yaml` file to find a list of recommended tools and configurations for a project all in one place. Good if you’re suspicious of installing and using random things without looking them up, but also great for learning about projects or about new tools that might help you with code quality in other projects.




  • It only runs on files I changed, so the fixes it recommends are usually relevant to me and not someone else’s technical debt haunting me.




  • It never forgets to run a check (unless I explicitly tell it to skip one).




  • It helps me fix any issues it finds before they go into git, so I don’t feel obliged to fuss around with my git history to hide my mistakes. Git history is extremely obnoxious to fuss with and I prefer to do it as infrequently as humanly possible.




  • It also subtly makes me feel more professional to know that all the basic checks are handled before I even make a pull request. I’ve been involved in open source so long that I mostly don’t care about my coding mistakes being public knowledge, but I know from mentoring others that a lot of people find the idea of making a mistake in public very hard, and they want to be better than the average contributor from the get-go. This is definitely a way to make your contributions look better than average!




  • It gives me nearly immediate, local feedback if my code is going to need fixes before it can be merged. I like that I get feedback usually before my brain has moved on to the next problem, so it fits into my personal mental flow before I even go to look at another window.




  • It can get you feedback considerably faster than waiting for checks to run in a continuous integration system. If you’re lucky, a system like GitHub Actions can get you feedback within a few minutes on quick linter-style checks, but if the system is backed up, or if you’re a new contributor to a project and someone has to approve things before they run (to make sure you’re not just running a cryptominer or other malicious code in their test system!), it can take hours or days to get feedback. Being able to fix things before the tests run can save a lot of time!
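
If you haven’t seen a pre-commit config file before, the sketch below shows roughly what one looks like and how you might skim it for the list of tools. The config text is illustrative only: the repository URLs are my assumption of where these tools are published, and the `rev` values are placeholders rather than real versions. The line scan is deliberately naive and is not a real YAML parser.

```python
import re
from typing import List, Tuple

# Illustrative config text. The rev values are placeholders, not real versions,
# and the repository URLs are assumptions about where each tool is hosted.
CONFIG_TEXT = """\
repos:
  - repo: https://github.com/psf/black
    rev: PLACEHOLDER
    hooks:
      - id: black
  - repo: https://github.com/asottile/pyupgrade
    rev: PLACEHOLDER
    hooks:
      - id: pyupgrade
  - repo: https://github.com/PyCQA/bandit
    rev: PLACEHOLDER
    hooks:
      - id: bandit
"""


def list_hooks(config_text: str) -> List[Tuple[str, str]]:
    """Return (repo, hook id) pairs using a naive line scan.

    This only handles the simple layout above; use a YAML library for real files.
    """
    pairs = []
    current_repo = ""
    for line in config_text.splitlines():
        repo_match = re.match(r"\s*-\s*repo:\s*(\S+)", line)
        hook_match = re.match(r"\s*-\s*id:\s*(\S+)", line)
        if repo_match:
            current_repo = repo_match.group(1)
        elif hook_match:
            pairs.append((current_repo, hook_match.group(1)))
    return pairs


for repo, hook_id in list_hooks(CONFIG_TEXT):
    print(f"{hook_id:10} from {repo}")
```

Each entry names where a tool comes from and which of its hooks to run, which is why the file doubles as a readable list of a project’s recommended tools.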





As a project maintainer:






  • Letting me configure the linters and pre-checks I want in one place instead of multiple config files is pretty fantastic and keeps the root directory of my project a lot less full of crap.




  • It virtually eliminates problems where someone uses a tool subtly differently than I do. If you’re not an open source project maintainer who works with random people on the internet you may not realize how much of a hassle it is helping people configure multiple development tools, but let me tell you, it’s a whole lot easier to just tell them to use pre-commit.

    • Endlessly helping people get started and answering the same questions over and over can be surprisingly draining! It’s one of the things we really watch for in Google Summer of Code when trying to make sure our mentors don’t burn out. Anything I can do that makes life easier for contributors and mentors and avoids repetitive conversations has an outsized value in my toolkit.






  • Being able to run exactly the same stuff in our continuous integration/test system means even if my contributors know nothing about it, I still get the benefits of those checks happening before I do my code review.




  • It saves me a lot of time back-and-forth with contributors asking for fixes so it lets me get their code merged faster. A nicer experience for all of us!




  • I can usually configure which files need to be skipped, so it can help us upgrade our code quality slowly. Or I can use it as a nudge to encourage people changing a file to also fix minor issues if I so desire.
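
As a rough illustration of skipping files, here is a sketch of filtering changed files with an exclude pattern. My understanding is that pre-commit treats its exclude setting as a Python regular expression searched against each file path, but treat that as an assumption and check the documentation; the pattern and file names here are hypothetical.

```python
import re
from typing import List

# Hypothetical pattern: skip a legacy directory and generated files.
EXCLUDE = re.compile(r"^legacy/|_generated\.py$")


def files_to_check(changed_files: List[str], exclude: re.Pattern) -> List[str]:
    """Keep only the changed files that the exclude pattern does not match."""
    return [path for path in changed_files if not exclude.search(path)]


changed = ["legacy/old_parser.py", "src/api.py", "src/schema_generated.py"]
print(files_to_check(changed, EXCLUDE))
```

Shrinking the pattern over time, one cleaned-up directory at a time, is one way to do the slow upgrade described above.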





What gets run with pre-commit will obviously depend on the project, but I think it’s probably helpful to give you an idea of what I run. I talked about using black, the Python code formatter, in a previous best practices post. For my work open source project, it’s only one of several code quality linters we use. We also use pyupgrade to help us be forward-compatible with Python syntax, bandit to help us find Python security issues, gitlint to help us provide consistency in commit messages (we use the conventional commits format rules), and mypy to help us slowly add static typing to our code base.
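
To give a feel for what a commit message check does, here is a small sketch that tests whether the first line of a message looks like a conventional commit. It is a simplified stand-in and not gitlint’s actual rule; the list of allowed types is an assumption based on common usage.

```python
import re

# Simplified pattern for the first line of a commit message:
#   type(optional-scope)!: description
# The list of types is an assumption borrowed from common usage.
HEADER = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|style|test)"
    r"(\([\w.-]+\))?"  # optional scope in parentheses
    r"!?"              # optional breaking-change marker
    r": \S.*$"         # colon, space, non-empty description
)


def is_conventional(message: str) -> bool:
    """Check only the first line of the commit message."""
    lines = message.splitlines()
    return bool(lines) and HEADER.match(lines[0]) is not None


for msg in ["fix: handle empty input", "feat(parser)!: drop old format", "Fixed stuff"]:
    print(f"{msg!r}: {is_conventional(msg)}")
```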





Usually before installing a new pre-commit hook, I make sure all files will pass the checks (and disable scanning of files that won’t). Some tools are pretty good at a slow upgrade if you so desire. One such tool for us has been interrogate, which prompts people to add docstrings. I have it set up with a threshold so the files will pass. When pre-commit runs, the output includes a report with red segments if there are missing docstrings for some functions, even when the check passes and you don’t have to fix them. Sometimes that means someone working in that file will go ahead and fix those interrogate warnings while they’re working on their bugs, and that’s incredibly nice.
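
The threshold idea can be sketched with Python’s standard `ast` module: count how many functions and classes have docstrings, and pass as long as the percentage stays at or above a chosen floor. This is a simplified model of a docstring coverage check and not interrogate’s implementation; the sample source and the threshold value are made up.

```python
import ast

SOURCE = '''
def documented():
    """Explain what this does."""
    return 1

def undocumented():
    return 2

class Documented:
    """A class with a docstring."""
'''


def docstring_coverage(source: str) -> float:
    """Return the percentage of functions and classes that have docstrings."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not nodes:
        return 100.0
    documented = sum(1 for node in nodes if ast.get_docstring(node))
    return 100.0 * documented / len(nodes)


THRESHOLD = 50.0  # hypothetical: set at or below current coverage so the check passes
coverage = docstring_coverage(SOURCE)
status = "PASSED" if coverage >= THRESHOLD else "FAILED"
print(f"docstring coverage {coverage:.1f}% (threshold {THRESHOLD:.1f}%): {status}")
```

Because the check passes while still reporting what is missing, you can raise the threshold gradually as people fill in docstrings.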





I’ll probably talk about some of these tools more later on in this best practices in practice series, but that should give you some hints of things you might run in pre-commit if you don’t already have your own list of code quality tools!





Summary





Pre-commit is a useful tool to help maintain code quality (and potentially security!) and it can be used to slowly improve over time.





I only found out about pre-commit because someone else told me and I’m happy to spread the word. I don’t think tools like pre-commit attract evangelists the way some other code-adjacent tools do, and it’s certainly not the sort of thing I learned about when I learned to code, when I got involved in open source initially, or even when I was in university (which was long after I learned to code and got into open source). I’m sure it’s not the only tool in this category, but it’s the one I use and I like it enough that I haven’t felt a need to shop around for alternatives. I don’t know if it’s better for Python than for other languages, but I love it enough that I could see myself contributing to make it work in other environments as needed, or finding similar tools now that I know this is an option.





As a project maintainer, I feel like it helps improve the experience both for new contributors who can use it to help guide them to submit code I’ll be able to merge, and for experienced contributors and mentors who then don’t have to spend as much time helping people get started and dealing with minor code nitpicks during code reviews. As an open source security researcher, I feel like it’s a pretty powerful tool to help improve code quality and security with easy feedback to developers before we even get to the manual code review stage. As a developer, I like that it helps me follow any project’s best practices and gives me feedback so I can fix things before another human even sees my code.





I hope other people will have similar good experiences with pre-commit!




