David's Blog

My debugging methodology

The other week, a new teammate asked about my approach to resolving bugs. This was in the context of receiving a bug report (e.g. in the form of a ticket or task), so it might be slightly different from debugging in a broader sense. Still, I realized I had never formally written out (or discussed) my debugging steps before, so I thought it would be worth writing them down. Doing so made me articulate exactly what I do at each step; before, I had only a mental model or an intuitive process to follow, not a well-defined one.

This might seem very straightforward, but the more I thought about it, the more it looked like a form of tacit knowledge: you build it up over time but rarely read about it, discuss it, or otherwise think about it.

I listed out some steps from receiving the bug report through to resolution:

1. Read the bug report and understand the incorrect behavior as well as the expected or desired behavior. Check whether there were steps provided for reproducing the bug.

2. Try to reproduce the bug on your own. This could be in your local development environment (ideally) or in a staging/production environment if need be. If there were reproduction steps included with the bug, use those steps. Otherwise, try to recreate the conditions for the bug on your own. If you can't reproduce it (or can't do so reliably), then debugging and fixing it will be difficult or impossible.

3. Once you can reproduce it, you know the bug still exists. The next step is to start determining what code relates to the behavior you were just seeing and to read that code. If it's a small codebase or one you know well, finding and reviewing the code might be straightforward. If it's a particularly large project, then some searching may be required (either locally with tools like git-grep, ripgrep, or editor search tools, or with web-based tools like GitHub code search and livegrep).

4. Once you have a rough idea of the code related to the bug, the next step is trying to understand what behavior is triggering the bug. This could involve reading the code to understand some logic, figuring out why incorrect data is being passed in or produced, writing a test case to trigger the bug in code automatically, or just tweaking code and running it to experiment and see what happens (perhaps while looking at or adding logging or other diagnostics). This process could be quite involved depending on the complexity of the bug!

5. If you've found where the bug is in code and understand it, now you can try to implement a fix! Sometimes this is something like fixing off-by-one logic errors or similar minor mistakes, but other times it can be much more involved. Verify that your changes fix the bug by running through the reproduction steps and seeing that the bug is now gone. Further, and where possible/reasonable, implement a test case to automatically verify the bug is fixed and to prevent regressions. I'd be remiss if I didn't also mention that you should spend time trying to understand why the code was written the way it was in the first place. It could have been an intentional decision, and understanding whether that decision still makes sense is part of (attempting to) resolve the bug. Viewing commit history or old tasks/tickets can often help with determining this context if you are not sure of it.

6. Create one or more commits with your bug fix (and ideally your new test case(s) as well). If the project you are working on has a code review process, follow the workflow for creating a new code review.
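To make the "fix plus regression test" idea from step 5 a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: paginate is a hypothetical function, and the bug (page 1 skipping the first items because of an off-by-one error) is an assumed example, not one from a real project. The tests encode the reproduction steps from the imaginary bug report so the bug can't quietly come back.

```python
import unittest


def paginate(items, page, page_size):
    """Return the items on a 1-indexed page (hypothetical function under test).

    Assumed bug for this sketch: the original code computed
    start = page * page_size, which skipped the first page_size items
    whenever page 1 was requested. The fix converts the 1-indexed page
    number to a 0-indexed offset first.
    """
    start = (page - 1) * page_size
    return items[start:start + page_size]


class PaginateRegressionTest(unittest.TestCase):
    def test_first_page_starts_at_first_item(self):
        # Mirrors the reproduction steps: request page 1, expect the first items.
        self.assertEqual(paginate(list(range(10)), page=1, page_size=3), [0, 1, 2])

    def test_last_partial_page(self):
        # Guards the other boundary, which is easy to break while fixing the first.
        self.assertEqual(paginate(list(range(10)), page=4, page_size=3), [9])
```

Saved as a test module, this could be run with python -m unittest. Ideally a test like this fails against the old code and passes after the fix, which is a good check that it actually exercises the bug.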

If you can get through all of these steps, chances are you've fixed the bug! Following these steps isn't always as straightforward as this short description might sound. Reproducing a bug, finding the relevant code, and figuring out why that code is behaving the way that it does are all worthy of longer posts of their own. Julia Evans has been talking about debugging a lot on Twitter lately, sharing some really good tips for debugging as well as describing why it can be a difficult process. Here's a blog post on the topic that I thought was very good.

Even once you've done some debugging and understand a bug, you might not easily be able to fix the bug! Sometimes there will be extensive refactoring needed to address the problem in a clean way or there could be architectural limitations that make fully addressing the bug difficult. That won't always be the case though! I've found that debugging is a great way to learn about a system, likely because it forces you to read lots of code!