Tips for Debugging
This guide focuses on debugging tips specifically for interviews. Generally speaking, debugging has several top priorities: shorten your dev cycle, find a minimal reproducible example, etc. However, many of these priorities go out the window in an interview, because most of the time you're only working with 20-30 lines of code anyway. As a result, running code is incredibly fast, and you already have about as simple an example as you can get.
Given this, let's focus on tips that apply when you already have a super short dev cycle and a minimal reproducible example. Note that these tips apply to any debugging scenario, really; you just need to be extra fast at them in an interview.
Define trivial inputs and outputs
Before talking about how to come up with your own test cases, I should first emphasize: come up with your own test cases, even if the interviewer gives you one. More likely than not, the provided test case is designed to cover a few edge cases, most of which you don't care about yet while you're trying to get a simple version of your code working.
With that in mind, here are two possible first test cases:
- Define a "trivial" case. Define a really silly case that doesn't test the code logic but ensures that everything runs fine in the simplest scenario possible. For example, this could be an input of all zeros, to make sure the outputs are all zeros. At the bare minimum, this ensures your code can run. This may seem silly, but in code with lots of transpositions, contractions, expansions, etc., you may end up with incompatible matrix dimensions. This test can suss that out pretty quickly.
- Define the identity case. Define a test case that has your code produce the "identity". For example, if you're writing up a matrix multiply, use the identity matrix for your weights, and make sure the outputs match the inputs exactly. If you're writing up a sorting algorithm, use a pre-sorted list of numbers. You could call this the "base case".
These should be quick sanity checks you can apply to almost any question. Use these at your discretion, of course. If your code is already running fine, then clearly there's no need to define a test case just to test if your code runs. Each test case you add should clearly test something new, adding information to your debugging process, which we'll discuss further below.
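As a rough sketch of both sanity checks, here is a from-scratch matrix multiply run against an all-zeros case and an identity case. The `matmul` helper and the 2 x 2 inputs are assumptions made up for illustration; swap in whatever function you're actually writing.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    n_inner = len(b)
    # Fail loudly on incompatible dimensions instead of silently truncating.
    assert all(len(row) == n_inner for row in a), "inner dimensions must match"
    n_cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(n_inner)) for j in range(n_cols)]
        for row in a
    ]


x = [[1, 2], [3, 4]]
zeros = [[0, 0], [0, 0]]
identity = [[1, 0], [0, 1]]

# Trivial case: all-zero weights should give all-zero outputs.
trivial_out = matmul(x, zeros)
assert trivial_out == [[0, 0], [0, 0]]

# Identity case: identity weights should return the input unchanged.
identity_out = matmul(x, identity)
assert identity_out == x
```

Neither check says much about the interesting logic, but if either fails, there's no point moving on to harder tests yet.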
Make it friendly for mental math
After you've defined and passed sanity checks, it's time to actually define tests that cover the logic in your code. However, these should be no less simple, and "simple" in this case has a specific meaning. In particular, your inputs and outputs should be friendly for mental math.
So friendly, in fact, that you should be able to run your function on your inputs by hand. There are two easy ways to do this.
- Make your input small integers. In fact, add as many zeros as you can without compromising the test's ability to evaluate your logic. Add ones if you can too. For example, say you're working on Convolutions from scratch. Start by using the filter `[1, 1, 0]`, and you could make the input something simple, like `[1, 2, 3, 4, 5]`.
- Use as few numbers as possible. Make your input a `3 x 1` tensor if you can, for example. Heck, make it a `1 x 1` if you can, to start off. If your code involves complicated dimension logic, you might not even need to pass in any input. Just check the dimensions themselves.
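To make the convolution example concrete, here is a minimal sketch using the filter and input from above. I'm assuming the usual deep-learning convention: no kernel flip (technically cross-correlation), stride 1, and no padding. The `conv1d` name is just a placeholder.

```python
def conv1d(signal, kernel):
    """1D 'valid' convolution, ML-style: no kernel flip, stride 1, no padding."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


signal = [1, 2, 3, 4, 5]
kernel = [1, 1, 0]

# By hand: each output is signal[i] + signal[i + 1], since the last tap is 0.
#   1 + 2 = 3,  2 + 3 = 5,  3 + 4 = 7
conv_out = conv1d(signal, kernel)
assert conv_out == [3, 5, 7]
```

Note that `[1, 1, 0]` is deliberately asymmetric: if your implementation flips the kernel when it shouldn't, you'd get `[5, 7, 9]` instead, so the mismatch is easy to spot by eye.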
As you begin to pass more and more of your tests, you'll eventually want to graduate to fuzz testing if your interviewer gives you more time. However, just like with these simple tests, you'll want a hypothesis for what kind of test cases you expect to catch. Doing anything blindly, even writing tests, is bad in an interview.
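Here is one way a hypothesis-driven fuzz test might look. This is a sketch, assuming you have a trusted reference to compare against (the built-in `sorted` here); `insertion_sort` is a stand-in for your interview code, and the stated hypothesis (duplicates and negatives cause trouble) is just an example.

```python
import random


def insertion_sort(values):
    """A hand-written sort under test (stand-in for your interview code)."""
    result = list(values)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def fuzz(sort_fn, trials=200, seed=0):
    """Return the inputs on which sort_fn disagrees with the built-in sorted()."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    failures = []
    for _ in range(trials):
        # Hypothesis: duplicates and negatives are the likely trouble spots,
        # so draw from a tiny range to force repeated values.
        values = [rng.randint(-3, 3) for _ in range(rng.randint(0, 6))]
        if sort_fn(values) != sorted(values):
            failures.append(values)
    return failures


failures = fuzz(insertion_sort)
assert failures == []
```

The inputs stay short and small on purpose: if a case does fail, it's still friendly enough to trace by hand.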
Use the code as your scratchpad
Even though I said "friendly for mental math", you don't need to do this all in your head. Use the code in front of you for your notes.
- Next to every line of your code, write the value that variable should hold. This will help you tremendously, by making all of your code a scratchpad for your thoughts. It has the doubly beneficial effect of giving the interviewer insight into your thoughts, so they can help you. Even neater, you can then set a breakpoint on those lines to double-check that your mentally computed values are correct.
- Next to every line of your code, write the tensor dimension names. This will help you understand pretty quickly which dimensions are compatible, and which dimensions you need to transpose or permute. If excessive commenting is not your jam, I highly recommend using named tensors of some kind, or adding the dimension names to your variable name, such as `q_bshd` for `b = batch_size`, `s = seq_len`, `h = n_heads`, `d = d_head`. Customize as you see fit.
Taking notes in this way can get tricky for loops and recursive functions, so prepare a system that works for you.
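Here's a small sketch of what both kinds of notes might look like together, using plain nested lists so it runs anywhere. The suffix letters (`s = seq_len`, `d = d_model`, `h = d_hidden`) and the `shape` helper are my own choices for this example, not a fixed convention.

```python
def shape(t):
    """Return the shape of a rectangular nested list (hypothetical helper)."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)


# s = seq_len, d = d_model, h = d_hidden
x_sd = [[1, 0], [0, 1], [1, 1]]      # (s=3, d=2)
w_dh = [[1, 2, 0, 0], [0, 0, 1, 2]]  # (d=2, h=4)

# Contract over d: (s, d) @ (d, h) -> (s, h)
out_sh = [
    [sum(x_sd[s][d] * w_dh[d][h] for d in range(2)) for h in range(4)]
    for s in range(3)
]

# Values worked out by hand, written down before running:
#   row 0 = w_dh[0]           -> [1, 2, 0, 0]
#   row 1 = w_dh[1]           -> [0, 0, 1, 2]
#   row 2 = w_dh[0] + w_dh[1] -> [1, 2, 1, 2]
assert shape(out_sh) == (3, 4)
assert out_sh == [[1, 2, 0, 0], [0, 0, 1, 2], [1, 2, 1, 2]]
```

With the suffixes in the names, a mistake like multiplying `x_sd` by something ending in `_hd` tends to look wrong before you even run it.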
Test one component at a time
If you're lucky, one random test will work on the first go, but that's rarely the case. If your test fails, make sure you're only testing and checking one component of your code at a time. Too many components at once, and you won't know which component failed.
- For example, if you're writing a matrix multiply, check that one vector input works first — this checks that your inner loop is working correctly. Then, pass a matrix to check that your outer loop works correctly.
- If you're writing the feed-forward network in a transformer, check that the intermediate outputs after the first linear match your expectations. If you can't breakpoint, set the other linear layers to return the identity, so only your first linear layer has a tangible effect on the output.
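The feed-forward bullet above could be sketched like this. It's a toy two-layer network on plain lists, assuming a ReLU activation and no biases; `linear`, `relu`, and `feed_forward` are hypothetical names. Setting the second layer to the identity means the output is exactly the intermediate activation of the first layer.

```python
def linear(x, w):
    """Apply weights w (rows = inputs, columns = outputs) to vector x."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def relu(v):
    """Elementwise max(0, value)."""
    return [max(0, a) for a in v]


def feed_forward(x, w1, w2):
    """Toy feed-forward block: linear -> ReLU -> linear (no biases)."""
    return linear(relu(linear(x, w1)), w2)


identity = [[1, 0], [0, 1]]
w1 = [[1, -1], [0, 2]]
x = [1, 2]

# By hand: linear(x, w1) = [1*1 + 2*0, 1*(-1) + 2*2] = [1, 3]; ReLU leaves it as is.
# With w2 = identity, the final output should be exactly that intermediate value.
ff_out = feed_forward(x, w1, identity)
assert ff_out == [1, 3]
```

If this passes, swap the roles: set `w1` to the identity and test the second layer on its own.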
If you're diligent about testing only one component at a time, you'll eventually isolate the problematic code by process of elimination.