For many times, I found that testing a solution could be equally hard as writing a solution, if not harder. Here is an example: how to randomly generate test cases for a HasCycle() function.
HasCycle() is a coding question that I used for many times in the past: write a function to detect whether there is a cycle in a directed graph. Some candidates finished the code pretty quickly, so my follow-up question would be “how would you test it”. For those who manually enumerated test cases quickly and thoroughly, my next question was “how would you randomly generate test cases”.
A wrong answer would be to write the solution again, say HasCycle2(), and see if HasCycle() produces the same results as HasCycle2() does for any randomly generated directed graph. That’s a wrong method because it’s circular proof: using this method requires the proof of HasCycle2() being fully correct, which is the same as to prove HasCycle() being fully correct. Someone argued that he could write HasCycle2() in a different algorithm to make it not a circular proof. No, that would be still a circular proof, just in a disguised form.
There wasn’t a satisfactory answer by any candidate, as far as I remembered. To be honest, I didn’t know the answer in the first place, either. It wasn’t about the answer itself, anyway. It was mainly to figure out how the candidate approaches problems. As long as the reasoning is sound, it would be cool if the candidate told me “no, there is no way, because blah blah blah”. But soon I started to think: to be fair, maybe I should figure that out myself first, to make sure either there is an answer, or the answer is “no, there isn’t an answer”.
I did come up with a way to randomly generate test cases for HasCycle(). Here are the steps:
- Pick a random integer N. Create a N-nodes directed graph with no arcs.
- Pick a pair of nodes X and Y randomly, as long as there isn’t an X->Y arc in the graph.
- Add an X->Y arc to the graph and color it as blue.
- Add the red arcs, too. A red arc is the last arc to form a cycle, in which all other arcs are blue.
- Repeat 2-4 for a random number of times (or until there is no more blue arc can be added)
- On the outcome of (5), just remove all the red arcs and the nodes and the blue arcs will become a graph without cycle. Pass it into HasCycle() and it should return false.
- Or, on the outcome of (5), randomly change the color of one or many red arcs to blue. Remove the remaining red arcs. The nodes and all the blue arcs will become a directed graph with cycle. Pass it into HasCycle() and it should return true.
The crux is how to identify which red arcs to add in step (4). Here is how:
- Get all the nodes that are reachable from Y by blue arcs, including Y itself, and put them into a list from-Y. This can be done by a simple traversal, following the blue arcs draining from Y.
- Similarly, get all the nodes from which X is reachable by blue arcs, including X itself, and put them into a list to-X. This can be done by reverse-traversal on the blue arcs sinking into X.
- For each node i in from-Y and each node j in to-X, create a red arc i->j.
Use an example to illustrate how this works:
First, randomly pick N=11, to generate a 11-nodes directed graph with no arcs:
Add the first blue arc: A and K are randomly picked. A blue arc A->K is added. At the same time, a red arc K->A is added. K->A is red because it will form a cycle together with a blue arc A->K:
Add the second blue arc randomly: B->A. This time, not only A->B should be added as a red arc (because A->B and B->A together will be a cycle), but also K->B should be added ad red arc as well, because B->A, A->K and K->B will be a cycle:
Keep doing this. When G->H is added as blue arc, three red arcs need to be added: H->G, H->A, H->B. Because each of them will form a cycle with other blue arc(s):
Now let’s examine a more general example of how to add red arcs. This time, a blue arc G->F is randomly added:
In this case, from-Y is F, C, E and to-X is G, A, B. So we should add nice arcs: F->G, F->A, F->B, C->G, C->A, C->B, E->G, E->A, E->B:
So the outcome of step (5) is this:
Now, if we want a random graph without cycle, we just need to remove all the red arcs:
If we want a random graph with cycle, we will randomly convert one red arc to blue and remove all the rest red arcs. For example, we convert C->A to blue. That will give us a cycle A->G->F->C->A:
I believe this method doesn’t have any blind spot. Given enough time, it should be able to generate all possible kinds of directed graph, with 0, 1 or multiple cycles.