Make Breaking Changes Without API Versioning

Lots of REST APIs are versioned: Microsoft Azure Service Management API, AWS EC2 API, Facebook Graph API, Twitter REST API, Strip, etc.. The motivation behind API version is obvious: to make it easier to introduce breaking changes without disrupting existing client code. As Facebook’s API doc says:

“The goal for having versioning is for developers building apps to be able to understand in advance when an API or SDK might change. They help with web development, but are critical with mobile development because a person using your app on their phone may take a long time to upgrade (or may never upgrade).”

However, API versioning sometimes can incur substantial engineering cost. Take Microsoft Azure Service Management API for an example. Currently it has more than 20 versions, among which the first one, 2009-10-01, was released six years ago and is still in use. Customers’ expectation is that as long as they stick with the same version (e.g. 2009-10-10), their client code won’t never need to change. Assume there are 1,000 test cases for the Service Management API, therefore in order to deliver on that expectation, every time when Azure is upgrading the Service Management API, theoretically it has to run the 1,000 test cases for 20+ times, one for each version from 2009-10-10 all the way to 2015-04-01! Having a retirement policy like Facebook’s will help a bit, but even in Facebook’s case, they still have to run all the test cases for 5 times in theory (from v2.0 to v2.4).

The engineering cost problem won’t change too much whether the multiple versions are served by the same piece of code/binary, or separated pieces:

  • You may implement the API to only have the outer layer versioned and most of the core/backend code version-less, hoping to save some test needs. But that will actually increase the chance of accidentally changing the behavior of older API versions since they share the same core/backend code.
  • Making the core/backend code also versioned gives a better isolation between versions, but it has a lot more code paths that need to be covered in testing.
  • Forking a maintaince branch for each version may sound appealing since it mostly eliminates the need to run full blown testing for v2.0, v2.1, v2.2 and v2.3 when you release v2.4, since v2.0, v2.1, v2.2 and v2.3 are in their own branches. But on the other hand, applying bug fix across multiple maintenance branch may not be as trivial as it sounds. Plus, when multiple versions of binaries running side by side, data compatibility and corruption problems become real.

I was a proponent of API versioning until I have seen the cost. Then I went back to where we started: what are the other ways to make it easier to introduce breaking changes without requiring all client code to upgrade?

Policy-driven may not be a bad idea. Let’s try it on a couple real examples of breaking changes:

  1. Azure Service Management API introduced a breaking change in the version 2013-03-01: “The AdminUsername element is now required“. Instead of introducing a new API version, we could add a policy “AdminUsername element must not be empty” and start with “Enforced = false” for every subscription. Once enforced, any API call with empty AdminUsername will fail. Individual subscription can turn on the enforcement by themselves in management portal, or it will be forced to turn on after two years (equivalent to Facebook’s two years version retirement policy). Once it’s turned on, it can’t be turned off. During the two years, there might be some other new features that require the “AdminUsername element must not be empty” policy to be turned on. It’s up to each subscription whether they want to delay the work of changing the code around AdminUsername at the cost of delaying the adoption of other new features, vs. pay the cost of changing code now to get access to other new features sooner.
  2. Facebook Graph API has deprecated “GET /v2.4/{id}/links” in v2.4. As an alternative to introducing a new API version, we could add a policy “Deprecate /{id}/links”. It would be a per App ID level policy and can be viewed/set in the dashboard. App owner will receive reminders when the deadline of turning on the policy is approaching: 12 months, 6 months, 3 months, 4 weeks, 1 week, …

The policy-driven approach will have different characteristic than the API versions when it comes to discover the breaking changes. In the API version way, when the client switch to v2.4 in its development environment, errors will pop up right away in the integration test if any hidden client code is still consuming APIs deprecated in v2.4. In the policy-driven way, the developer would need to use a different App ID, which is in development mode, to turn the policy on and run the integration test. I don’t see fundamental difference between these two ways. They each have pros and cons. Advancing only one number (the API version) may be less work than flipping a number of policies, but policies give the client more granular control and flexibility.

At the end of the day, API versions may still be needed, but only for a major refactoring/redesign of the API, which would only happen once every a couple years. Maybe it’s more appropriate to call it “generations” rather than “versions”. Within the same generation, we can use the policy-driven approach for smaller breaking changes.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s