this post isn’t about that kind of simulation… (property of EA Sports)

Simulating Misanthropic Neighbors

Using R & Shiny to solve a FiveThirtyEight Riddle

Daniel McNichol
3 min readDec 18, 2018

--

Back in 2016 I posted my first interactive Shiny App based on a simulation program I wrote in R to try to test my solution to a 538 Riddler puzzle. I’m crossposting that work here now, since LinkedIn is trash.

Misanthropic Neighbors

The 4/22/16 riddle went as follows:

The misanthropes are coming. Suppose there is a row of some number, N, of houses in a new, initially empty development. Misanthropes are moving into the development one at a time and selecting a house at random from those that have nobody in them and nobody living next door. They keep on coming until no acceptable houses remain. At most, one out of two houses will be occupied; at least one out of three houses will be. But what’s the expected fraction of occupied houses as the development gets larger, that is, as N goes to infinity?

My a priori reasoning went:

  • Best case = 1/2 occupied houses
  • Worst case = 1/3 occupied houses
  • Mean case = 5/12 occupied houses

As N grows infinitely large, the fraction of occupied houses will approach the mean case.

There is variability in the best, worst, & middle cases depending on N being odd or even, as well as its size (in fact, odd N’s can do better than 1/2 in edge cases for large N’s, & very often for small N’s). However, these marginal variances should shrink asymptotically to zero as N approaches infinity.

To test this, I created a simulation function in R:

… then whipped that code into an interactive shiny app:

this is a screenshot…don’t fall for it

Full shiny code in this repo: https://github.com/dnlmc/538

Per the simulation, the proportion of occupied houses converges to ~ .432… as N grows larger.

My a priori conjecture was 5/12, or roughly .41666…

As we used to say in my old job: close enough for government work.

It also got me a nice shoutout in the following week’s solution post on 538:

In light of the full analytic solution, my attempt at an a priori estimate was a bit …heuristic & under-developed. But it got within ~ .016 of the correct answer — & didn’t require me to bother with any ‘second-order non-homogeneous recurrence relations with variable coefficients’ — so I’m gonna go ahead & chalk that up as a win.

If you agree (or don’t), follow me & check out my other posts :)


Follow on twitter: @dnlmc
LinkedIn: linkedin.com/in/dnlmc
Github: https://github.com/dnlmc

--

--

Daniel McNichol

Founder & Chief Scientist @ Coεmeta (coemeta.xyz) | formerly Associate Director of Analytics & Decision Science @ the Philadelphia Inquirer