terriko

This is crossposted from Curiousity.ca, my personal maker blog. If you want to link to this post, please use the original link since the formatting there is usually better.

I’m starting a little mini-series about some of the “best practices” I’ve tried out in my real-life open source software development. These can be specific tools, checklists, workflows, whatever. Some of these have been great, some of them have been not so great, but I’ve learned a lot. I wanted to talk a bit about the usability and assumptions made in various tools and procedures, especially relative to the wider conversations we need to have about open source maintainer burnout, mentoring new contributors, and improving the security and quality of software.

So let’s start with a tool that I love: Black.

Black’s tagline is “the uncompromising Python code formatter” and it pretty much is what it says on the tin: it can be used to automatically format Python code, and it’s reasonably opinionated about how it’s done with very few options to change. It starts with pep8 compliance (that’s the python style guide for those of you don’t need to memorize such things) and takes it further. I’m not going to talk about the design decisions they made but the black style guide is actually an interesting read if you’re into this kind of thing.

I’m probably a bit more excited about style guides than the average person because I spent several years reading and marking student code, including being a teaching assistant for a course on Perl, a language that is famously hard to read. (Though I’ve got to tell you, the first year undergraduates’ Java programs were absolutely worse to read than Perl.) And then in case mounds of beginner code wasn’t enough of a challenge, I also was involved in a fairly well-known open source project (GNU Mailman) with a decade of code to its name even when I joined so I was learning a lot about the experience of integrating code from many contributors into a single code base. Both of these are… kind of exhausting? I was young enough to not be completely set in my ways, but especially with the beginner Java code, it became really clear that debugging was harder when the formatting was adding a layer of obfuscation to the code. I’d have loved to have an autoformatter for Java because so many students could find their bugs easier once I showed them how to fix their indents or braces.

And then I spent years as an open source project maintainer rather than just a contributor, so it was my job to enforce style as part of code reviews. And… I kind of hated that part of it? It’s frustrating to have the same conversation with people over and over about style and be constantly leaving the same code review comments, and then on top of that sometimes people don’t *agree* with the style and want to argue about it, or people can’t be bothered to come back and fix it themselves so I either have to leave a potentially good bug fix on the floor or I have to fix it myself. Formatting code elegantly can be fun once in a while, but doing it over and over and over and over quickly got old for me.

So when I first heard about Black, I knew it was a thing I wanted for my projects.

Now when someone submits a thing to my code base, Black runs alongside the other tests, and they get feedback right away if their code doesn’t meet our coding standards. It hardly any time to run so sometimes people get feedback very fast. Many new contributors even notice failing required test and go do some reading and fix it before I even see it, and for those that don’t fix issues before I get there I get a much easier conversation that amounts to “run black on your files and update the pull request.” I don’t have to explain what they got wrong and why it matters — they don’t even need to understand what happens when the auto-formatter runs. It just cleans things up and we move on with life.

I feel like the workflow might actually be better if Black was run in our continuous integration system and automatically updated the submitted code, but there’s some challenges there around security and permissions that we haven’t gotten around to solving. And honestly, it’s kind of nice to have an easy low-stress “train the new contributors to use the tools we use” or “share a link to the contributors doc” opening conversation, so I haven’t been as motivated as I might be to fix things. I could probably have a bot leave those comments and maybe one of those days we’ll do that, but I’m going to have to look at the code for code review anyhow so I usually just add it in to the code review comments.

The other thing that Black itself calls out in their docs is that by conforming to a standard auto-format, we really reduce the differences between existing code and new code. It’s pretty obvious when the first attempt has a pile of random extra lines and is failing the Black check. We get a number of contributors using different integrated development environments (IDEs) that are pretty opinionated themselves, and it’s been freeing to not to deal with whitespace nonsense in pull requests or have people try to tell me on the glory if their IDE of choice when I ask them to fix it. Some python IDEs actually support Black so sometimes I can just tell them to flip a switch or whatever and then they never have to think about it again either. Win for us all!

So here’s the highlights about why I use Black:

As a contributor:

Black lets me not think about style; it’s easy to fix before I put together a pull request or patch.

It saves me from the often confusing messages you get from other style checkers.

Because I got into the habit of running it before I even run my code or tests, it serves as a quick mistake checkers.

Some of the style choices, like forcing trailing commas in lists, make editing existing code easier and I suspect increase code quality overall because certain types of bug are more obvious.

As a an open source maintainer:

Black lets me not think about style.

It makes basic code quality conversations easier. I used to have a *lot* of conversations about style and people get really passionate about it, but it wasted a lot of time when the end result was usually going to be “conform to our style if you want to contribute to this project”

Fixing bad style is fast, either for the contributor or for me as needed.

It makes code review easier because there aren’t obfuscating style issues.

It allows for very quick feedback for users even if all our maintainers are busy. Since I regularly work with people in other time zones, this can potentially save days of back and forth before code can be used.

It provides a gateway for users to learn about code quality tools. I work with a lot of new contributors through Google Summer of Code and Hacktoberfest, so they may have no existing framework for professional development. But also even a lot of experienced devs haven’t used tools like Black before!

It provides a starting point for mentoring users about pre-commit checks, continuous integration tests, and how to run things locally. We’ve got other starting points but Black is fast and easy and it helps reduce resistance to the harder ones.

It reduces “bike shedding” about style. Bikeshedding can be a real contributor to burnout of both maintainers and contributors, and this reduces one place where I’ve seen it occur regularly.

It decreases the cognitive overhead of reading and maintainin a full code base which includes a bunch of code from different contributors or even from the same contributor years later. If you’ve spent any time with code that’s been around for decades, you know what I’m talking about.

In short: it helps me reduce maintainer burnout for me and my co-maintainers.

So yeah, that’s Black. It improves my experience as an open source maintainer and as a mentor for new contributors. I love it, and maybe you would too? I highly recommend trying it out on your own code and new projects. (and it’s good for existing projects, even big established ones, but choosing to apply it to an existing code base gets into bikeshedding territory so proceed with caution!)

It’s only for Python, but if you have similar auto-formatters for other languages that you love, let me know! I’d love to have some to recommend to my colleagues at work who focus on other languages.

I'm happy to say that...

Mailman 3.0 suite is now in beta!

As many of you know, Mailman's been my open source project of choice for a good many years. It's the most popular open source mailing list manager with millions of users worldwide, and it's been quietly undergoing a complete re-write and re-working for version 3.0 over the past few years. I'm super excited to have it at the point where more people can really start trying it out. We've divided it into several pieces: the core, which sends the mails, the web interface that handles web-based subscriptions and settings, and the new web archiver, plus there's a set of scripts to bundle them all together. (Announcement post with all the links.)

While I've done more work on the web interface and a little on the core, I'm most excited for the world to see the archiver, which is a really huge and beautiful change from the older pipermail. The new archiver is called Hyperkitty, and it's a huge change for Mailman.

You can take a look at hyperkitty live on the fedora mailing list archives if you're curious! I'll bet it'll make you want your other open source lists to convert to Mailman 3 sooner rather than later. Plus, on top of being already cool, it's much easier to work with and extend than the old pipermail, so if you've always wanted to view your lists in some new and cool way, you can dust off your django skills and join the team!

Hyperkitty logo

Do remember that the suite is in beta, so there's still some bugs to fix and probably a few features to add, but we do know that people are running Mailman 3 live on some lists, so it's reasonably safe to use if you want to try it out on some smaller lists. In theory, it can co-exist with Mailman 2, but I admit I haven't tried that out yet. I will be trying it, though: I'm hoping to switch some of my own lists over soon, but probably not for a couple of weeks due to other life commitments.

So yeah, that's what I did at the PyCon sprints this year. Pretty cool, eh?

One of the things that bugs me when I'm doing book reviews is that I prefer it when reviews have a picture of the cover and link to the book of some sort, but I didn't love the output from Amazon's referal link generator, which would have been the easiest solution. I've been doing it manually, but that's a lot of cut and pasting and I kind of abhor doing tasks that are easy to automate.

Thankfully, I'm a coder and a user of greasemonkey, so I have all the skills I need to automate it. Seriously, being able to tweak web pages to suit my own needs is the greatest thing.

In the spirit of sharing, here's the script I'm using to generate the code I wanted for my reviews using the book page on LibraryThing:

// ==UserScript==
// @name        Book review header generator
// @namespace   tko-bookreview
// @description Takes any librarything book page and gives me a nice link to the book with cover and author details
// @include     http://www.librarything.com/work/*
// @version     1
// @grant       none
// ==/UserScript==

// Get all the data we'd like to display at the top of a review
var coverimage = document.getElementById('mainCover').outerHTML;
var title = document.getElementsByTagName('h1')[0].innerHTML;
var author = document.getElementsByTagName('h2')[0].innerHTML;
var librarythinglink = document.URL; 


// Trim down the title and author info
title = title.replace(/ *<span .*<\/span>/, '');

author = author.replace(/href="/, 'href="http://www.librarything.com');
author = author.replace(/<hr>/, '');

// Generate the code for this book
var reviewheader = '<a href="' + librarythinglink + '">' + 
   coverimage + '<br />' +
   '<b>' + title + '</b></a> ' +
   '<em>' + author + '</em>';

// Add code around this for embedding it into the page
var textbox = '<h4>Review Code</h4>' +
	'<textarea name="embedHTML" onFocus="this.select();" rows="5" ' + 
	'style="width: 250px;" wrap="virtual">' + reviewheader + '</textarea>';


// Find a good spot and add it to the page
var insert = document.getElementsByClassName('gap')[0];
insert.outerHTML =  textbox + insert.outerHTML;

Please feel free to consider this open sourced and free for any type of use: alter it to suit your needs as you will!

Edit: Github link, for those so inclined.

I got a really interesting query today that boiled down to, "How much math do you need to write code?"

The short answer to this is, "Not that much" or perhaps "it depends on what you want the code to do." But here's part of what I actually wrote back:

---

To be honest, the level of math required to write code is pretty small. A grade school understanding is often sufficient; there's a reason we can teach 7 year olds to program! Modern programming languages are much less math-oriented: I once spent an afternoon teaching my then 11 year old sister and her friends how to write dynamic database-driven websites, and the only math they used was to add up the scores on the "what animal are you most like?" quizzes they wanted to write.

The math in computer science comes a lot later: for deeper analysis of algorithms and running time, we use algebra and mathematical proofs in an academic setting. But... to tell the truth, relatively few programmers need or use this kind of deeper understanding in their day-to-day jobs. And in my experience teaching students, many people find this stuff easier to learn by doing, so they only really begin to grasp it *after* they have gotten comfortable writing programs.

In short: you probably have all the math skills you need to write code, and if you decide you want to do more hardcore CS later, it'll be easier to learn the math along the way anyhow!

---

There's some nuance there that I didn't really tease out -- the deeper understanding of algorithms and program behaviour is what characterizes the real "science" out of computer science. And maybe the world would be a better place if more programmers did actually use deeper analysis in their day-to-day jobs. But you don't have to be an academic-style computer scientist to write code! Still, it's a very interesting question, given that historically programming actually did require a lot more math, and our perceptions and stereotypes haven't really kept up with the reality of the field.

Perhaps it's time for me to write another presentation? ;)

(For context: my old slideshow about women, computing and math got included in this TechCrunch post about Racism and Meritocracy, so I've been getting a lot of mail, including the one that spawned this post.)

More notes from getting myself set up to write code:

The fink project has a page about upgrading from 10.5 to 10.6 with fink.

It's very easy to get distracted by the nice step-by-step description that takes up most of the page.

The pertinent bit of information is at the top, though:

you will probably find it easier to do a clean install from source instead.

I probably should have just done that first, since the lovely instructions didn't work at all for me. Oh well.

S	M	T	W	T	F	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28

Entries tagged with code

Best practices in practice: Black, the Python code formatter

Mailman 3.0 Suite Beta!

Book review code

How much math do you need to write code?

Upgrading fink from mac os X 10.5 to 10.6

Profile

February 2026

Links

Syndicate

Page Summary

Style Credit

Expand Cut Tags