Hey everyone,
Let me introduce you to a famous problem in mathematics. Suppose you’re looking for someone to marry. In order to find that special someone, you, like most people, go on a series of dates, and then pick the best candidate amongst those whom you’ve dated. Unfortunately, while you have the ability to rank the suitors in retrospect, you have no way to determine if your ultimate choice will have been the best one out of all the available options. For example, if you were to pick the 8th person whom you’ve dated as your lifelong partner, until death do you part, you would be able to rank that person amongst the 7 previous candidates, but you would have no idea if anyone better would have come along had you rejected that 8th person and kept looking instead. Hence, the dowry problem. (Also called The Secretary Problem, The Fussy Suitor Problem, The Marriage Problem, The Sultan’s Dowry Problem, The Gogol Game, and The Best Choice Problem).
More formally, suppose you have 100 rankable suitors whom you will date one at a time, in succession. After each date, you have the choice to either reject or accept the suitor. If you accept the suitor, you cannot date anymore candidates and are stuck with that person for life. If you reject the suitor, you will not have the chance to date that candidate again and will move on to the next suitor. With each additional suitor, you have the ability to rank that suitor unambiguously amongst your previous suitors. The process continues until you either accept a suitor, or reject all 100 suitors and doom yourself to a life of lonely bachelorhood or spinsterhood.
So, how do you maximize your chance of selecting the best candidate? It turns out the solution to the problem is to employ an optimal stopping rule. This means you date X number suitors, rank those suitors, and then choose the next suitor who is at least as good as those X suitors. In this problem the optimal stopping rule is to date 37 candidates, and then choose the next suitor who is at least as good as the best of the 37.
It turns out the problem has an elegant solution when n suitors are involved. Generally, the optimal stopping rule is to date n/e (where e is Euler’s constant) candidates, and then choose the next candidate that is at least as good as the the first n/e candidates. This means you should date about 37% of the candidate pool, and then choose the next person who is at least as good as the best of that 37%. So it works no matter how big n is:
100: date the first 37 candidates
1,000: date the first 368 candidates
10,000: date the first 3,679 candidates
100,000: date the first 36,788 candidates
1,000,000: date the first 367,879 candidates
And so on.
I decided to use VBA to run some simulations on this problem to see if I could confirm what we have seen here. Unfortunately, I was not successful, but after looking more carefully, I think I found the problem had to do with the built-in random number generator, as well as the shuffling algorithm I employed to randomize the order of the suitors. I basically generated an array of 100 integers from 1 to 100, shuffled them, applied the stopping rule from 1 to 100, and repeated the procedure to see if the optimalĀ stopping point (deterimined by the maximum empirical probability of success) matched 37.
After many thousands of iterations I discovered the optimal stopping point centered around 60, and I spent about an hour trying to correct the issue without success. Then I tried counting the proportion of iterations in which the best candidate (integer 100) wound up within the first 33 indices of the array. It turned out that best candidate ended up in the first 33 only 17%-20% of the time, when it ought to be 33%. This would explain why the optimal stopping point for this distribution ended up being further to the right than 100/e. From here I concluded that the shuffling algorithm I used was not shuffling the array uniformly. Here is the code below:
Sub ShuffleArrayInPlace(InArray() As Variant)
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' ShuffleArrayInPlace
' This shuffles InArray to random order, randomized in place.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim N As Long
Dim Temp As Variant
Dim J As Long
Randomize
For N = LBound(InArray) To UBound(InArray)
J = CLng(((UBound(InArray) - N) * Rnd) + N)
If N <> J Then
Temp = InArray(N)
InArray(N) = InArray(J)
InArray(J) = Temp
End If
Next N
End Sub
I obtained the shuffling algorithm from Chip Pearson’s VBA page. I’m willing to give the problem another shot with one that shuffles the array more uniformly, so if anyone can help me out that would be great!