2022-07-09

Major version numbers may not be sacred, but backwards compatibility is.

Tom Preston-Werner, the creator of the SemVer version numbering standard, published an article a little while ago titled "Major Version Numbers are Not Sacred". The article is worth a read, but the basic argument (as best I can summarize it) is that we shouldn't be afraid of incrementing the major version number of our SemVer-adhering software projects to indicate breaking changes.

On the one hand, I agree with him: if SemVer is to be meaningful, then we obviously need to increment the major version number any time we release a breaking change. Even if the breaking change isn't a big, sexy, marketable change. Even if it's just a change to a corner of the API that few, if any, people use. You still need to increment the major version number to indicate the backwards compatibility break. If you don't, then you're simply not adhering to SemVer. And there's nothing inherently wrong with not adhering to SemVer, but then you shouldn't claim to be adhering to it.
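The rule described above can be sketched in a few lines of Python. This is only an illustration: the `Version` type and the `next_version` helper are hypothetical names, and the sketch assumes a project that is already at 1.0.0 or later (SemVer allows anything to change during 0.x).

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def next_version(current: Version, *, breaking: bool, new_feature: bool) -> Version:
    """Pick the next version number for a release at or after 1.0.0."""
    if breaking:
        # Any breaking change, however small or obscure, bumps major
        # and resets minor and patch.
        return Version(current.major + 1, 0, 0)
    if new_feature:
        # Backwards-compatible additions bump minor and reset patch.
        return Version(current.major, current.minor + 1, 0)
    # Backwards-compatible bug fixes bump patch only.
    return Version(current.major, current.minor, current.patch + 1)


# A tiny break in a rarely used corner of the API still counts as breaking.
print(next_version(Version(1, 4, 2), breaking=True, new_feature=False))
# Version(major=2, minor=0, patch=0)
```

Note that `breaking` takes priority over everything else: the size or marketability of the change never enters into it.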

On the other hand, he then goes on to argue (if I understand him correctly) that we should therefore be willing to increment the major version number willy-nilly. And I very much disagree with him there. It is specifically because SemVer ties API breakage to the major version number that SemVer-adhering projects should be hesitant about major version bumps. Not because the version number matters, but because backwards compatibility matters.

Of course, if your project is experimental, still in alpha, just for fun, etc., then this obviously doesn't apply. But for serious projects out of alpha/beta that are intended for real use, backwards compatibility matters a lot.

Libraries are foundations.

Compatibility can matter for end-user software as well. But since SemVer is for software libraries, libraries are what I want to talk about here.

Libraries are not a thing unto themselves. The whole point of libraries is to help other people build the actual things, which are actual runnable software. Libraries serve as a foundation for other people to build on. And as a foundation, you want them to be stable.¹

If you're trying to build a house, but the ground keeps shifting underneath you, you're going to have a really bad time. And similarly, if you're building a piece of software but your libraries keep shifting underneath you, you're also going to have a bad time.

Not only that, but it also impacts the quality of the resulting software, because time that could have been spent improving the software itself was instead wasted keeping up with library API changes. And every time you have to update your code to work with new APIs, there's another opportunity for bugs to slip in.

Arguably, people can just lock the version of the libraries they depend on. And many people do. But that also impacts software quality, because then bug fixes in the library don't make it into the software.
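To make the trade-off concrete, here is a minimal sketch comparing an exact pin with a "same major version" range, roughly in the spirit of the caret ranges some package managers offer. The `parse` and `best_match` helpers and the release list are made up for illustration, and the sketch assumes plain `MAJOR.MINOR.PATCH` strings at 1.0.0 or later (no pre-release tags).

```python
from __future__ import annotations


def parse(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def best_match(available: list[str], base: str, *, pinned: bool) -> str | None:
    """Return the newest available release that the requirement allows."""
    base_v = parse(base)
    if pinned:
        # Exact pin: only the named release is acceptable.
        candidates = [v for v in available if parse(v) == base_v]
    else:
        # Compatible range: same major version, and not older than the base.
        candidates = [
            v for v in available
            if parse(v)[0] == base_v[0] and parse(v) >= base_v
        ]
    return max(candidates, key=parse, default=None)


releases = ["1.2.0", "1.2.1", "1.3.0", "2.0.0"]  # hypothetical release history
print(best_match(releases, "1.2.0", pinned=True))   # 1.2.0 (misses the 1.2.1 fix)
print(best_match(releases, "1.2.0", pinned=False))  # 1.3.0 (gets fixes, skips 2.0.0)
```

The compatible range only works out well if the library actually keeps its compatibility promise within a major version, and it only keeps delivering bug fixes for as long as that major version is maintained.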

This is why API stability matters. And I say this as a library maintainer myself.

In fact, I say this primarily as a library maintainer, because it baffles me that some maintainers don't seem to get this. It's like, this is the whole point of what we're doing: enabling others to build cool and/or useful stuff. And that's a holistic thing. It's not just about features and APIs at an instant in time, it's also about how changes happen over time and how that affects the developers who depend on our code.²

Have your cake and eat it too.

It's always hard to know where other people are really coming from, so this is mostly speculation. But my feeling is that at least some of this "move fast and break things" attitude in library maintenance is coming from people's desire to simultaneously:

Have a popular, widely-used project.
Have fun and experiment on said project.

But this is a conflict of interest, and leads to projects that:

Declare themselves stable.
Bump the version to 1.0.³
And have fancy, official-looking websites.

All while not really being API stable, because they just break compatibility and bump to 2.0 six months later (or even faster!), and immediately drop maintenance of 1.x when they do so. And then wash, rinse, repeat.

These projects have all the superficial signs of stability, but little of the substance.

And the thing is, I get it! It's fun to be able to keep iterating without limits. Moreover, making all of the boring (and sometimes frustrating and burdensome) choices that are necessary for API stability is definitely not fun. But that's part of maintaining a serious library project responsibly.

What's in a name (version)?

The thing is, none of this actually has anything to do with version number bumps. At their best, version numbers are a way to communicate. And communication does matter.

But what also matters is actual behavior. And that's why Tom Preston-Werner's blog post rubbed me the wrong way. Regardless of whether a project follows SemVer or not, if they're intending to be something other people can rely on, then they should care about backwards compatibility.

If they follow SemVer, that will then be reflected in their version numbers. But it's not about the version numbers. And it applies to projects that aren't following SemVer as well. Regardless of versioning scheme, backwards compatibility matters.

Having said that...

I've presented things here in a very black-and-white way, which isn't actually reflective of all my thoughts on this issue. Some examples:

The stable/not-stable dichotomy is a simplification, and there are a lot of unique situations that libraries can be in.
I'm definitely not saying breaking releases (2.0, 3.0, etc.) should never happen. I will say, however, that if you're claiming to be stable but you're not measuring the period between breaking releases in years, you're probably doing it wrong. Though I'm sure there are exceptions.
If someone's project accidentally becomes popular and widely depended on, I don't think they should feel any responsibility whatsoever. If people need a stable version, they can fork it. No one should be forced into a position of maintenance; it should be a choice.
Effective communication matters more than anything else. If a supposedly stable project prominently communicates that it makes breaking releases every few months, I might disagree with the stability claim, but it doesn't practically matter since the information needed for potential users to make an informed decision is right there. (But this assumes that the information is prominent in the same places that otherwise give an impression of stability.)⁴

Nevertheless, there is a reason I wrote this post: I've run into quite a few seemingly serious library projects that don't appear to take API breakage seriously. And Tom Preston-Werner's post seemed not only representative of that but actively encouraging of it, which did not sit well with me.

So although the issue isn't actually as simple as I've made it out to be in this post, I think the thrust of this post is still appropriate and perhaps needed. "Move fast and break things" only makes sense when no one is depending on you.

Footnotes

¹ In software, the word "stable" is used in (at least) two different ways: to mean API stability, and to mean crash/bug stability. Both are important for dependable software libraries, but this post is exclusively discussing the former.
² Library maintenance is also about a bunch of other stuff, like good documentation, timely bug fixes, etc. I would argue, however, that most of those things are typically much harder (and much more understandable to fail at) than avoiding API breakage.
³ I've seen some people argue that 1.0 is "just a version number" and doesn't actually indicate API stability. (In fact, this whole post is in response to an article more-or-less arguing that.) And to support this they'll cite the SemVer spec and point out that it doesn't actually require a 1.0 release to indicate anything about stability.

I think that's a specious argument that ignores the significance of 1.0 throughout much of computing history. Version 1.0 typically indicates "readiness" for real use, and for software libraries part of being ready for real use is API stability, as discussed in this post.

(And if your software is at 1.0 but is not ready for real use, it is incumbent upon you to make that very clear in some other way. And this is independent of whether you use SemVer or not.)
⁴ An early draft of this post included a proposal for a standardized "stability tag/badge" that contained at-a-glance information about a project's maintenance policies, like minimum time between breaking releases, maintenance windows for old incompatible versions, etc. If projects posted such badges on their site, readme, etc. then everyone could easily check if a project's maintenance policies matched their stability needs before depending on it.

I still think something like that could be valuable. But I decided I probably wasn't the person to spearhead such an effort. I also suspect it would be difficult to get projects to actually use it.
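For the curious, the kind of information such a badge might carry could be sketched roughly like this. Everything here is hypothetical: the `StabilityPolicy` fields, the `meets_needs` helper, and the example numbers are invented for illustration and don't correspond to any existing standard or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StabilityPolicy:
    """Hypothetical at-a-glance maintenance policy a project might publish."""
    min_months_between_breaking_releases: int
    months_of_maintenance_for_old_majors: int


def meets_needs(
    policy: StabilityPolicy,
    *,
    required_months_between_breaks: int,
    required_maintenance_months: int,
) -> bool:
    """Check whether a project's stated policy satisfies a user's requirements."""
    return (
        policy.min_months_between_breaking_releases >= required_months_between_breaks
        and policy.months_of_maintenance_for_old_majors >= required_maintenance_months
    )


# Made-up example: a project that breaks at most every three years and
# maintains the previous major version for two years afterwards.
policy = StabilityPolicy(
    min_months_between_breaking_releases=36,
    months_of_maintenance_for_old_majors=24,
)
print(meets_needs(policy, required_months_between_breaks=24, required_maintenance_months=12))
# True
```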