Development and Debugging Principles

Lessons Learned While Debugging

  • When two identical systems are generating different results, verify that (in order of likelihood):
    • The versions of all components are, in fact, identical between the two systems,
    • The config is identical between the two systems, and
    • There is no residual data on one system that was not present on the other.
  • gdb and gcc parse code better than I do. Let software do its job so you can do yours.
  • Regular check-ins of "stable" code on a wip branch allow you to quickly identify which change resulted in a bug that crops up during development. That is, if I create a bug or break the build during development, I only have to look at changes since the last check-in. Revision control systems are designed for this; take advantage of them during development. A good motto is "Commit early, commit often."
  • Whenever information in a ticket is unclear, get the ticket owner to clarify before making changes to code.
  • Always verify changes before committing an ad hoc fix on a customer box.
  • When data comes out one end of a process different from what you're expecting, check the data at each stage of the process to verify that it is as expected, rather than focusing on a function downstream from the real problem.
  • Do not focus on a specific piece of code if you haven't verified that it is the cause of the bug.
  • When debugging a new code base, ask for help once you've done your homework. Come prepared with specific messages and file names.
  • Get more sleep.
  • Spend time learning the available debugging tools.
  • When a problem requires multiple lines of investigation, put minor issues on the back burner to help focus on the main issue.
  • If you are required to work in multiple environments, make sure your tools are installed and ready in all environments before a problem happens.
  • Don't invest time/effort when a quick test can negate the necessity. (This is the software equivalent of "measure twice; cut once". Unlike cutting wood, changes to software can usually be easily undone; what can't be undone is the time it takes to make the change.)
  • When updating a development environment, make sure all aspects are updated to the same version.
  • Logs are your friend. If the program is spitting out log messages, they may guide you to the bug faster than code investigation will.
  • Writing down a map (complete with class names, file names, line numbers and function calls) while tracing through code speeds debugging.
  • When stuck, ask for help. Sometimes a co-worker has either a fresh perspective or a piece of information that can solve the puzzle in seconds.
  • When an issue is taking too long, start a timer, then when the timer goes off, either put off the task or ask for help. It only takes a few minutes to ask someone else who may know the answer; it can take hours experimenting and surfing for answers.
  • When odd errors occur, look at configuration rather than code changes.
  • Look for the cause of a bug in new or untested code before looking in existing and established code unless there is a good reason to start there.
  • If you can reproduce the error, but are having trouble locating the source, try a variation on the test. Multiple varying tests may expose the full behaviour of the bug which makes diagnosis easier. For example, change your inputs, or change the method of a test to see whether the bug appears or not under different conditions. Create a table of inputs and outputs from the black box code you're debugging.
  • Take the time to investigate the functions that are being used in code that you are debugging; names can be misleading. In languages like Perl where there is no strong typing, double-check the types of the values that the functions you are calling expect and return.
  • Moving to another problem may show the answer to the problem you're stuck on. Also, sleeping on a problem helps.
  • Take ownership of code you are debugging. We had a report that took much longer than it needed to because there was a bug that no-one had bothered to track down.
  • Even with a plausible explanation as to why some code needs a work around, investigate the problematic code for possible bugs. Challenge your assumptions before hacking.
  • Debugging requires rest and calm; take all your breaks, breathe, stretch.
  • Anticipate long computer operations (e.g., compiling) and plan to do those over breaks or overnight.
  • Data evidence trumps the customer's word and my own assumptions about the customer's intelligence.
  • Always start with errors in the log files.
  • Identify times when above average support may be required, and ask to have more support time scheduled.
  • Always pass issues off via non-deniable methods (e.g., CC the team lead); otherwise, you are still the issue's owner.
  • When debugging on a live box, leave as few traces as possible so that later investigators, especially other developers, will be able to sort out what was actually done. For example, use less rather than vim so that there is no confusion over which files you touched.
  • If you don't know how to test something, ask someone else if they know how.
  • Don't let errors dominate log files; it makes it much more difficult to debug real problems.
  • Don't mix versions of code and configuration.
  • Redo incorrect work as early as possible even if it will be a pain and even if you don't think it will be a big deal, or at least add it to the list of Assumptions and Risks for the issue. For example, I used the wrong version of the configuration from an issue, and everything seemed to be fine, but it caused issues down the line.
  • Especially when debugging low level code, use the compiler sparingly. When a compile error is encountered, use that error to find similar errors in the new code. (February 23, 2009)
  • Keep track of all hypotheses and next steps during the debugging process. (February 23, 2009)
  • Deleting things in loops is dangerous; prefer built-in list handling like <algorithm> or LINQ. (February 23, 2009)
  • Be careful exposing non-const internal lists; you never know what someone else may do with your data. (February 23, 2009)
  • When working jointly on an issue, discuss testing up front with all other developers to eliminate the duplication of effort. (March 4, 2009)
  • Question everything the user tells you when resolving an issue. If they say the import says X, check that it says X. (April 2, 2009)
  • With all the hypotheses written down, seek to disprove each one. Start with the most likely hypothesis, or if unsure, start with the easiest one to check. (April 2, 2009)
  • Do not sneak unrelated changes into a check-in. Changes need to be approved through proper channels. Changes also need to leave a trail so the why of the change can be ascertained later by another developer; checking in unrelated code significantly blurs that trail. (April 2, 2009)
  • Grasp the existing design, either through inspection or exploratory programming, before making any changes. Trying to code around an existing design will result in bloated and messy code. (June 26, 2009)
  • Apply the principles of Test Driven Design when changing or adding code. Even when no test-harness is in place and there is no time to add one, test each change manually as it is done. (June 26, 2009)
  • "Unused code is like an unused water pipe. The rodents and bugs multiply unseen, and if you ever turn the water on, you get dirty crap coming out.", Ed Moulding (October 6, 2009)
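
The first lesson above (compare versions, then config, then residual data) can be partly automated. The following is only a minimal Python sketch: the snapshot layout, the names diff_settings and compare_systems, and the sample data are all hypothetical, and how the snapshots are collected from each system is left out.

```python
MISSING = "<missing>"  # placeholder reported when a key exists on only one system


def diff_settings(a, b):
    """Return {key: (value_on_a, value_on_b)} for every key whose value differs."""
    return {
        key: (a.get(key, MISSING), b.get(key, MISSING))
        for key in sorted(a.keys() | b.keys())
        if a.get(key, MISSING) != b.get(key, MISSING)
    }


def compare_systems(sys_a, sys_b, checks=("versions", "config", "files")):
    """Run the checks in order of likelihood; return the first one that differs."""
    for check in checks:
        differences = diff_settings(sys_a.get(check, {}), sys_b.get(check, {}))
        if differences:
            return check, differences
    return None, {}


# Hypothetical snapshots of two "identical" boxes.
box_a = {
    "versions": {"app": "4.1", "libfoo": "1.2.0"},
    "config": {"timeout": "30"},
    "files": {"/tmp/cache.db": "present"},
}
box_b = {
    "versions": {"app": "4.1", "libfoo": "1.2.1"},
    "config": {"timeout": "30"},
    "files": {},
}

print(compare_systems(box_a, box_b))
# ('versions', {'libfoo': ('1.2.0', '1.2.1')})
```

Stopping at the first category that differs keeps the report short; once the versions match, rerunning the comparison would surface the leftover cache file under "files".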

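The warning above about deleting things in loops names <algorithm> and LINQ; the same hazard exists in other languages. As an illustration only, here is a small Python sketch (the function names and data are made up) that uses a list comprehension as the analogous built-in:

```python
def drop_negatives_buggy(values):
    """Removes items from the list while iterating over it, which skips elements."""
    for v in values:
        if v < 0:
            values.remove(v)  # shifts later items left; the loop then skips one
    return values


def drop_negatives(values):
    """Builds a new list with a comprehension and leaves the input untouched."""
    return [v for v in values if v >= 0]


data = [1, -2, -3, 4]
print(drop_negatives_buggy(list(data)))  # [1, -3, 4]  (the -3 survives)
print(drop_negatives(data))              # [1, 4]
```

The buggy version fails quietly: no exception is raised, and the result is only wrong when two matching items sit next to each other, which is what makes this kind of bug hard to spot.
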
Excerpt from Zen & the Art of Debugging
(http://zenandtheartofdebugging.com/seizing.htm)
Once found, the bug needs to be fixed. However, it is estimated that half of all bug fixes don't fix the problem, or introduce a new one, or don't fix the problem and introduce a new one.

Questions

  1. Have we verified that the hypothesized bug really explains the failure?
  2. Have we verified that all of the failure is accounted for?
    • Is there a second bug running in full lurk mode?
  3. Have we identified the root cause of the problem?
  4. Is the proposed fix the best possible?
    • Is it the simplest to implement?
    • The least risky?
    • Does it make the code base simpler or more complex?
  5. Have we looked for and eliminated all similar bugs?
    • It is rare for a mistake to be made only once.
  6. Verifying the fix: have we tested that the fix actually worked?
    • For all of the reported problems?
    • Without breaking anything else?
  7. Have we verified that this error can't be made again?
    • And that if it is, it will be caught automatically?
  8. Is there anything in the coding (or perhaps more accurately "bugging") process which is making us vulnerable to bugs?
    • (E.g., vague requirements, insufficient training, lack of sleep,…).

Professor Solomon's Twelve Principles of Finding Lost Objects (adapted to finding bugs)

  1. Don't look for it
    • It is easy to start searching frantically.
    • Stop and develop a plan on where to look.
  2. It's not lost; you are
    • There are no bugs that are impossible to find, only unsystematic searchers.
  3. Remember the three C's
    • Comfort: Don't launch in right away, grab a cup of tea, you may be a while.
    • Calmness: Do some sort of calming exercise to push back nervous thoughts.
    • Confidence: Get all new-agey and tell yourself that you will find the bug.
  4. It's where it's supposed to be
    • Odds are the bug is found in the code that handles the function being performed when the bug was encountered. This is not always the case, but it is a good place to plan to search first.
  5. Domestic drift
  6. You're looking right at it
  7. The camouflage effect
  8. Think back
  9. Look once, look well
  10. The eureka zone
  11. Tail thyself
  12. It wasn't you