Fix a Typo and Level Up

2023-01-07

Have you ever been reading something, noticed a typo, and wished you had the power to fix it? There's a latent copyeditor hiding somewhere in the mind of every literate person. Providing that power is part of the promise and process of "open source." By that I mean that the code (or hardware, or prose, or whatever) is available for everyone to see, and anyone can propose modifications directly. One or more gatekeepers then accept or reject each proposal, doing their best to judge whether it is an overall beneficial change. Compare that with a newspaper, where all you can do is write a letter to the editor. Think Wikipedia vs. Microsoft Encarta. Remember Microsoft Encarta? I won't fault you if you don't.

There is an overly simplistic self-interested rhetorical question around this that goes something like: "Why would I give my time and effort to improve something else if I'm not getting paid?" Well, I have bad news for you if that's the way you think: every time you watch, hear, or read an advertisement, you're doing just that. You are not getting paid to help brands improve their presence in your mind and yet you regularly entertain advertisements in some form, be it radio, TV, website, print, or a so-called news article on your smartphone that is really a long-form ad about how something is available for the lowest price ever. So, read on.

The most well-known centralized location for open source software is called GitHub, which was acquired by Microsoft in 2018 for $7.5 billion. Open source volunteers and private enterprises alike use this platform to collaborate on software development. I was perusing some documentation there one day when I saw...a typo! A passage that intended to say that something would be "stripped," as in removed, was accidentally saying that something would be "striped," as in zebra. Furthermore, both of these terms have special meaning within technological lexicons, so the potential for confusion was real.

It took me less than five minutes to fix the typo and submit the proposed change back to the team which manages the open source library that had the "stripped/striped" typo. It was the smallest possible change: adding a single character, the letter "p." The open source maintainers of that library responded and "accepted" my suggested change within 48 hours, so from that moment forward, the typo was fixed for everyone, around the world. This system really works sometimes! However, the benefits in this case went beyond a mild jolt of dopamine.

As I was enjoying the "popcorn" moment of the maintainers accepting my changes, I noticed something I'd never seen before. One of the maintainers said they'd approved it, but then, instead of manually "merging" it (applying the change), the bots took over. Bots are automated, software-based entities that can interact with a system like GitHub in much the same way a human can. I had never seen this particular application of bots. One bot, in fact, called another. A bot called mergify made the cryptic comment bors +r and then another bot named bors seemingly did a bunch of stuff, and then some 20 minutes later, automatically merged my change. What the heck just happened? I wondered to myself.

This led me to put a name to a face, the face of a certain kind of deployment hell to which I bear witness in my software engineering career from time to time. The name of this scourge is Semantic Merge Conflicts. The quick pitch about the bors bot explains exactly how Semantic Merge Conflicts happen and how to mitigate them, but in brief, imagine a growing team of developers working on the same codebase. Two of them start working from the same starting point, which "passes tests." Each then makes a change which also "passes tests" on its own but is incompatible with the other's change. Both are "merged" back into the main codebase, which then inexplicably "fails tests." While each change "passes tests" individually, the codebase ends up in an unhealthy state. bors is one solution to this problem.
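
To make that concrete, here is a toy sketch in Python. Everything in it is hypothetical: the "codebase" is just a dictionary of functions, and names like change_a, change_b, and merge_queue are made up for illustration. It is not how bors is implemented, only a model of the general idea of testing the combined result before merging.

```python
# Toy model (hypothetical): a "codebase" is a dict of named functions, and a
# "change" is a function that returns a modified copy of that dict.

def strip_tags(code, text):
    return text.replace("<b>", "").replace("</b>", "")

def title(code, text):
    # Looks up strip_tags by name at call time, like a real call site would.
    return code["strip_tags"](code, text).upper()

def base():
    return {"strip_tags": strip_tags}

def change_a(code):
    """Developer A renames strip_tags to remove_tags."""
    new = dict(code)
    new["remove_tags"] = new.pop("strip_tags")
    return new

def change_b(code):
    """Developer B adds a feature that calls strip_tags by its old name."""
    new = dict(code)
    new["title"] = title
    return new

def passes_tests(code):
    """The 'test suite': call every function on a sample input."""
    try:
        for fn in code.values():
            fn(code, "<b>hi</b>")
    except KeyError:
        return False
    return True

def merge_queue(main, changes):
    """Merge changes one at a time, testing the combined result first."""
    rejected = []
    for change in changes:
        candidate = change(main)
        if passes_tests(candidate):
            main = candidate
        else:
            rejected.append(change.__name__)
    return main, rejected

if __name__ == "__main__":
    print(passes_tests(change_a(base())))            # A alone
    print(passes_tests(change_b(base())))            # B alone
    print(passes_tests(change_b(change_a(base()))))  # A and B together
    print(merge_queue(base(), [change_a, change_b])[1])
```

Each change passes against the starting point, but the two together do not, because B's new function still looks for a name that A removed. The queue catches this because it tests the would-be result of each merge rather than each change in isolation, so the main codebase never enters the broken state.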

We can't expect teammates to always know exactly what others are coding in real time, so for large and complex projects with multiple contributors, a systemic fix to these kinds of headaches is an unequivocal value add. From now on, I'll be suggesting this to any software development team large enough to benefit. Level up! But how did I discover this? Did I learn it in an expensive Computer Science training program? No, I didn't spend a dime. I simply took it upon myself to fix a tiny typo in some open source technical documentation, and received a significant unexpected lesson. I did more work writing this post than fixing that typo, and yet, as a result, I have now internalized a robust solution to a nuanced problem I never thought I'd know how to solve. So tell me, how does one get paid to take in ads?