When I look for bottlenecks and improvement opportunities in engineering velocity, I use a four-bucket model based on how long a task takes:
- Instant. These tasks take from a few seconds to half a minute. Running a small set of unit tests, compiling a sub-folder, or doing a “git pull” for the first time in several days falls into this bucket. While waiting for such tasks to finish, I don’t leave my desk. I catch up on a quick conversation on IM, take a peek at my cellphone, or reply to an email.
- Coffee break. A coffee break task takes a few minutes, such as applying my private bits to a one-box test instance or running a large set of unit tests. Sometimes I go for a coffee or use the restroom while such tasks are running.
- Lunch break. When a task takes longer, such as half an hour to an hour or more, I grab lunch while it’s running. Sometimes I start the task when I leave the office to pick up my boy and check the result when I get home.
- Overnight. Such tasks take quite a few hours, up to about half a day, so we have to run them overnight: we usually start them in the evening, go to sleep, and check the result when we wake up the next morning. If a task is started in the morning, we are probably not going to see the outcome until the evening.
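The buckets above can be sketched as a small classifier. This is only an illustration: the function name `classify` and the exact cut-off values are my assumptions, since the descriptions above give rough ranges rather than hard limits. In particular, the 4-hour boundary between lunch break and overnight is a guess.

```python
from datetime import timedelta

# Upper bounds for each bucket. These cut-offs are assumptions made for
# illustration; the bucket descriptions only give approximate ranges.
BUCKETS = [
    ("instant", timedelta(seconds=30)),
    ("coffee break", timedelta(minutes=30)),
    ("lunch break", timedelta(hours=4)),
    ("overnight", timedelta(hours=24)),
]


def classify(duration: timedelta) -> str:
    """Return the name of the first bucket whose upper bound exceeds duration."""
    for name, upper_bound in BUCKETS:
        if duration < upper_bound:
            return name
    # Anything that takes a day or more falls outside the four buckets.
    return "over the weekend"


if __name__ == "__main__":
    for minutes in (0.2, 10, 45, 720):
        d = timedelta(minutes=minutes)
        print(f"{d} -> {classify(d)}")
```

Keeping the limits in one list makes it easy for a team to adjust them to its own working rhythm.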
Over the years, I have learned a few things from this four-bucket model:
- A task’s duration slowly deteriorates within its bucket without being noticed, until it’s about to fall into the next bucket. For example, the build time of a code base may be 10 minutes in the beginning, which puts it in the coffee break bucket. Over the next several months it gets slower, becoming 15 minutes, 20 minutes, …, as more code is added. Few will notice it, or be serious about it, until the build time gets close to half an hour, at which point it is no longer a coffee break task but a lunch break task. People feel more motivated/obligated to keep a task from falling out of its current bucket than to prevent it from slowly deteriorating within that bucket.
- For maximum effect, when we invest engineering effort in shortening a task’s duration, we should aim to move it into the next shorter bucket. Incremental improvements within the same bucket have less impact on engineering velocity. For example, if an overnight task is shortened from 12 hours to 6 hours, it’s still an overnight task. But if it can be further shortened to 3 hours, that will transform how the team works: the team will be able to run the task multiple times during the day. It will dramatically change the pace of the team.
- Incremental improvements within the same bucket are less likely to last, because of the first observation above. It’s like Sisyphus rolling the stone uphill: unless the stone is rolled over the hill, it will roll back down to where it started. To avoid such regression and frustration, our investment should be sufficient to move the task into the next shorter bucket; otherwise, we should not make the investment and should put the time/money/energy somewhere else.
- There is a big difference between the “Instant” bucket and the next two, the coffee break and lunch break buckets: whether I have to context switch. For tasks in the instant bucket, there is little or no context switch. I don’t leave my desk. I remember what I wanted to do. I’m not multitasking. Once a task gets longer and moves into the coffee break bucket, my productivity goes down a notch. I have to context switch. I have to multitask. We should try really hard to prevent tasks in the “Instant” bucket from getting slower and dropping into the coffee break bucket, to avoid context switching and multitasking.
- Similarly, there is also a big difference between the coffee/lunch break buckets and the overnight bucket. For tasks in the overnight bucket, I do something worse than a context switch: I sleep. It’s like closing the lid of a laptop. It definitely takes much longer and more effort to get the full context back after a night’s sleep than after a lunch break. We should try really hard to prevent any task from slipping into the overnight bucket. It’s about whether the result comes back the same day or not. Same day matters a lot, especially psychologically: in the past, we didn’t really feel the difference between Prime’s two-day shipping and the normal 3-5 day shipping; but once Prime offered same-day shipping, it felt substantially different.
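The first observation, that a task drifts unnoticed until it is about to leave its bucket, suggests flagging the drift earlier. Here is a minimal sketch of that idea, assuming hypothetical bucket limits and a hypothetical `alert_at` fraction; none of these names or numbers come from a real tool.

```python
from datetime import timedelta

# Hypothetical upper bounds per bucket; adjust to your own team's rhythm.
BUCKET_LIMITS = [
    ("instant", timedelta(seconds=30)),
    ("coffee break", timedelta(minutes=30)),
    ("lunch break", timedelta(hours=4)),
    ("overnight", timedelta(hours=24)),
]


def headroom_used(duration: timedelta) -> tuple[str, float]:
    """Return the task's bucket and the fraction of that bucket's limit used."""
    for name, upper_bound in BUCKET_LIMITS:
        if duration < upper_bound:
            # Dividing two timedeltas yields a plain float.
            return name, duration / upper_bound
    return "over the weekend", 1.0


def needs_attention(duration: timedelta, alert_at: float = 0.5) -> bool:
    """Flag a task once it has used alert_at of its bucket's limit."""
    _, used = headroom_used(duration)
    return used >= alert_at


if __name__ == "__main__":
    for minutes in (10, 15, 20):
        d = timedelta(minutes=minutes)
        print(minutes, headroom_used(d), needs_attention(d))
```

With these assumed limits, the 10-minute build from the example would pass quietly, while the 20-minute build would be flagged well before it reaches half an hour.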
Actually, there is a fifth bucket: “over the weekend”. Such tasks take more than a day to run. I didn’t include it in my four-bucket model because if an engineering team ever has one or more critical tasks in the over-the-weekend bucket, the team is seriously sick. It is deep in debt (engineering debt), and it should stop doing anything else and fix that problem first, to at least move the task into the overnight bucket. [1] In a healthy engineering team, all tasks can be done over a lunch break or sooner. Everything is same day. There is no overnight task. [2] That’s the turnaround time required to deliver innovation and customer value in the modern world.
[1] I’m exaggerating to highlight the point.
[2] With reasonable exceptions, such as some long-haul tests. That said, many long-haul tests that I have seen could be replaced by shorter tests given certain testability designs.