You're on an airplane and happen to look out the window and notice a small problem with your engine. You remain calm and sit back knowing that the pilots just got a bit busier. The first thing they reach for is their trusty checklists†.
Every aspect of maintaining and flying airplanes is kept in a large book of checklists. The reality is there is just too much to remember. The cost of forgetting a seemingly minor step is way too high. As such, everything is meticulously documented. † Some checklists are even required to be in memory, such as the one above.
What do checklists do for software developers?
Checklists are critical for any team to perform well and they are an easy step forward to organizing your processes and teams.
Onboarding
As new developers join the company, they need to get up to speed on the standards. Sure, you could sit with them and try to make sure you remember every detail, and they can write that down in their notes, and you can do this over and over again. Or just write it down once and then share the checklist each time. If you're hiring or scaling quickly, the time spent training can be prohibitive. Checklists help offload this time and let new developers be confident they are following the correct processes.
Quality Assurance
Web development is quite complex and hard to get right. Those last little details which make all the difference are often forgotten, or skipped due to time or other constraints. Checklists make sure every quality requirement is written down and checked off before the work ships.
Lower technology barriers to entry
As systems and companies grow, it becomes impossible to know every bit of technology, API, system, etc. Checklists are a good way to distill knowledge and give other devs an on-ramp into a system even if they are not experts. All common tasks and troubleshooting should be distilled into checklists so anyone can work on that system even with little knowledge. This makes your team more versatile since they don't need to sift through documentation and guess which commands work for this system.
Emergencies
You just got an alert that the production environment is down. This is a stressful situation for most people. It's almost impossible to remember every single step of every single system. It's best to drill your developers to turn to the checklist. Now they only need to remember one thing: "go to the checklists". There should be little reason to panic until the end of the checklists is reached (that's an article for a different day).
Processes around checklists
Checklists should be built into your company's culture and processes. They may be highly enforced or, in most cases, loosely referenced, depending on your business and industry.
Highly enforced requirements
Your checklists become your lifeline if you are in a sensitive industry that requires high levels of security and audit. While not as fun for developers, they will need to use and complete their checklists for every action. In extreme cases, other departments might even need to weigh in and approve the checklist. If this is the case, you want to ensure your checklists are easily found and duplicated for each task.
An example of this could be that after adding a database migration, legal needs to ensure we are not collecting any data that is not in our policies. Thus they need to put their "stamp" on the checklist.
These checklists can be dated and saved in a file system for an audit trail. While this is not my personally preferred way of working, it may be required for industries like finance where regulation is high and the company needs to show it is taking steps to safeguard their data and security.
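As a rough illustration of the dated audit trail described above, here is a minimal sketch in Ruby. The function name `archive_checklist`, the audit folder, and the task id format are all hypothetical; a regulated team would likely have its own storage and naming rules.

```ruby
require "fileutils"
require "date"

# Copy a checklist template into an audit folder under a dated, task-specific
# name. The naming scheme (date_task_checklist.md) is an assumption made for
# this sketch, not a standard.
def archive_checklist(template_path, audit_dir, task_id, date: Date.today)
  FileUtils.mkdir_p(audit_dir)
  base = File.basename(template_path, ".md")
  target = File.join(audit_dir, "#{date.iso8601}_#{task_id}_#{base}.md")

  # Refuse to overwrite an existing record so the audit trail stays intact.
  raise ArgumentError, "#{target} already exists" if File.exist?(target)

  FileUtils.cp(template_path, target)
  target
end
```

Each task then gets its own copy to fill in and sign off, while the template stays untouched for the next task.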
Loosely referenced
In most companies, the checklists are referenced on an ad-hoc basis. They are guidelines to follow. As a team lead, I want to ensure that my team is using these checklists to guide their daily tasks. When something goes wrong, it is an opportunity to update the checklist. This takes the responsibility off of one individual and puts it on the team instead. As long as the checklist is followed, there cannot be individual blame. It is the team's fault that the checklist is not complete or correct. After it's fixed, hopefully nothing will go wrong the next time the same scenario is encountered.
Creating checklists
There doesn't need to be a lot of effort put into creating checklists. To start creating checklists, jot down what you are doing while you do it. It's better to write down what you are doing to build your processes instead of going "process shopping" and trying to align your work to an established process. What works for others may not work for you. A good time for this in the early stages of a company is when a new employee is onboarded. As you communicate verbally to them, jot down the steps and hopefully the next person does not require this same communication. I am not advocating that your process should remove all communication. Communication is the most vital skill for your team. The goal should be to optimize your communication. Not having to spell out basic steps like how to commit or merge means you can spend that time on better items like your company culture or mission.
Checklists in action
I've recently added how I typically integrate checklists into my workflow to Olympus Framework, my app starting point. I like to keep documentation directly with my repos. I have seen a separate docs repo quite often. My experience has been that separating the docs leads to more stagnation for little benefit.
As a company grows there might be several document repos or guides. A company could extract the common items like culture and such, leaving the app-specific docs with the corresponding repos.
This docs folder uses Docsify to turn Markdown into a webpage. This gives the nice benefit that you can view the docs inline in, say, GitHub, or serve up the site on a URL to browse with no additional dev work.
The checklists section is on the left and any dev can easily browse and see various tasks that we have checklists for.
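One way to keep that left-hand section in sync with the folders on disk is to generate it. The sketch below assumes the layout `docs/checklists/<name>/README.md` (the same shape as the migration checklist path used later in this post) and that Docsify is configured with `loadSidebar: true` so it reads a `_sidebar.md` file. The helper name `checklist_sidebar` is hypothetical, and this is not necessarily how Olympus Framework builds its sidebar.

```ruby
# Build Docsify sidebar entries for every checklist folder that has a README.md.
# Folder names such as "adding_a_migration" become titles such as
# "Adding a migration".
def checklist_sidebar(docs_dir)
  checklists_dir = File.join(docs_dir, "checklists")
  folders = Dir.children(checklists_dir).sort.select do |name|
    File.file?(File.join(checklists_dir, name, "README.md"))
  end

  entries = folders.map do |name|
    title = name.tr("_", " ").capitalize
    "  - [#{title}](checklists/#{name}/)"
  end

  (["- Checklists"] + entries).join("\n") + "\n"
end
```

The returned string could be written to `docs/_sidebar.md` by hand or from a small rake task, so a new checklist folder shows up in the navigation without anyone remembering to add it.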
Writing them down is only the first step. The next goal is to leverage them as much as possible within your dev process. A checklist like committing code is one that new employees will use for a bit and keep updated. However, a checklist for adding a migration might quickly be forgotten, or newer employees might not even know it exists.
In order to get these front and center I added a bit of code into the Dangerfile. Danger is a great way to help enforce your company processes and dev culture.
# Check migration files
# Collect every file touched by this PR, ignoring the Dangerfile itself
all_files = (git.modified_files + git.added_files).reject { |path| path.include?('Dangerfile') }
# True when any changed file lives under db/migrate/
migration_changed = all_files.grep(%r{db/migrate/}).any?
migration_updated_message = "Migrations have been updated! Look for the checklist below."

if migration_changed
  file = "./docs/checklists/adding_a_migration/README.md"
  contents = File.read(file)
  # Keep only the text that follows the checklist heading
  checklist = contents.split("### Adding migration checklist").last
  # Post the checklist on the PR and flag the change with a warning
  message("A migration has been added, complete the migration checklist:" + checklist)
  warn(migration_updated_message)
end
After you create a PR on GitHub and the checks run, a nice message is added for the developer and reviewers to remember and check each step. Ensuring this checklist gets added to the PR helps keep it up to date. If someone notices a missing step, they can simply update the in-repo documentation and add it to the PR for all future devs.
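The heading-based extraction in the Dangerfile can also be pulled into a small plain-Ruby helper so it can be tested outside of Danger. This is only a sketch: `extract_checklist` is a hypothetical name, and unlike the `split(...).last` call above it returns `nil` when the heading is missing instead of returning the whole file, and it takes the text after the first occurrence of the heading rather than the last.

```ruby
# Return the text that follows `heading` in a checklist document,
# or nil when the heading is absent.
def extract_checklist(contents, heading)
  _before, found, after = contents.partition(heading)
  return nil if found.empty?

  after.strip
end
```

In a Dangerfile, a `nil` result could be turned into a warning that the checklist heading was renamed or removed, rather than silently posting the entire README on the PR.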
Conclusion
Checklists are not going to be bulletproof at all times. If they aren't used, then bugs and errors will quickly creep into your company's dev culture. Even when they are used well, bugs and accidents will occur.
The 737 Max disasters, particularly Lion Air Flight 610, could have been prevented with better checklists. Pilots must have certain checklists in memory, then there is a quick reference card of important but less critical tasks, and finally the full book of all the checklists. Not long before the crash, another pilot encountered the same scenario and found the solution in that third tier. The captain of that flight reported the incident, but the report was never circulated beyond the maintenance team. Had a stronger culture of sharing been in place, hundreds of lives might not have been lost.
You are not performing efficiently if you aren't using checklists to help drive your dev processes. It doesn't matter the language or tooling you use to implement them, but every effective team has them.