The best way for me to achieve a deep understanding of a theorem is not through lengthy proofs alone, but through practical application and implementation, or, as they said in the Marine Corps, Pract-App. One of the many reasons I love R is how easy it makes writing functions and testing results.

The 2008 financial crisis was the topic of a recent dinner conversation, and a friend brought up CLNs (credit-linked notes) and SIVs (structured investment vehicles). We’ve all heard of the untenable simulation models and institutional products like Moody’s CDOROM. The conversation reminded me of one of the more explainable statistical methods, the Monte Carlo method.

Let’s say a group of 30 diners is selected at random. How would we obtain the probability that at least two of them share a birthday?

Each birthday is represented by an integer in [1, 365], that is, a day numbered 1 through 365 (leap years are ignored). Every day is assumed equally likely, with probability 1/365, or about 0.0027. We randomly select 30 integers from [1, 365] with replacement (putting the numbered ball back in the jar after picking it) and check whether any two diners share a birthday. We record the outcome as a binary value: 1 if at least two diners share a birthday, and 0 if there is no match.
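A single trial of that procedure can be sketched as a small function. This is only an illustration: the name `oneTrial` and its arguments `groupSize` and `days` are my own, not part of the algorithm further down.

[sourcecode language="r"]
# Hypothetical helper: simulate one group of diners and
# return 1 if any birthday repeats, 0 otherwise
oneTrial <- function(groupSize = 30, days = 365) {
  # Draw one birthday per diner, with replacement
  birthdays <- sample(1:days, groupSize, replace = TRUE)
  # duplicated() flags repeated values; any() collapses the flags to one TRUE/FALSE
  as.integer(any(duplicated(birthdays)))
}

oneTrial()  # returns 1 or 0; the result varies from run to run
[/sourcecode]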

We repeat this process over many iterations. The Law of Large Numbers (LLN) tells us that, as the number of iterations grows, the proportion of iterations recorded as 1 converges to the true probability that at least two diners share a birthday.
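For reference, the true probability can also be computed directly under the same assumptions (365 equally likely days, no leap years), by taking the complement of the event that all 30 birthdays differ. The sketch below is a sanity check rather than part of the simulation; the names `k`, `pNoMatch`, and `pExact` are chosen here for illustration.

[sourcecode language="r"]
k <- 30  # number of diners

# Probability that all k birthdays differ:
# (365/365) * (364/365) * ... * (336/365)
pNoMatch <- prod((365 - 0:(k - 1)) / 365)

# Probability that at least two diners share a birthday
pExact <- 1 - pNoMatch
pExact
# [1] 0.7063162

# The stats package that ships with R provides the same calculation
pbirthday(k)
[/sourcecode]

A Monte Carlo estimate from a large number of iterations should land close to this value of roughly 0.706.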

Algorithm:

[sourcecode language="r"]

##############
## Using the Monte Carlo Method to Solve the Birthday Problem
##############

n <- 10000             # Number of iterations
birthdaySet <- 1:365   # The set of birthdays: integers from 1 to 365 inclusive
count <- 0             # Number of iterations in which at least two people share a birthday

for (i in 1:n) {
  # Randomly select 30 numbers with replacement from [1, 365]
  numbers <- sample(birthdaySet, 30, replace = TRUE)

  # If there are any duplicates in the vector of numbers, increase count by 1
  if (any(duplicated(numbers))) {
    count <- count + 1
  }
}

# The proportion of iterations in which at least two people share a birthday
pHat <- count / n
pHat

# The estimated probability from one run (it varies from run to run):
# [1] 0.7074, or 70.74%

[/sourcecode]
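To watch the Law of Large Numbers at work, one option is to keep every 0/1 outcome and track the running proportion as iterations accumulate. The following is a minimal sketch under the same assumptions as above; the seed value and the names `outcomes` and `runningEstimate` are arbitrary choices of mine.

[sourcecode language="r"]
set.seed(42)  # arbitrary seed, used only to make the run reproducible
n <- 10000    # number of iterations

# One 0/1 outcome per iteration: 1 if any of the 30 birthdays repeats
outcomes <- replicate(
  n,
  as.integer(any(duplicated(sample(1:365, 30, replace = TRUE))))
)

# Running proportion of 1s after 1, 2, ..., n iterations
runningEstimate <- cumsum(outcomes) / seq_len(n)

# Estimates after 10, 100, 1000, and 10000 iterations
runningEstimate[c(10, 100, 1000, 10000)]

# Plot the running estimate with the exact probability as a dashed reference line
plot(runningEstimate, type = "l",
     xlab = "Iterations", ylab = "Estimated probability")
abline(h = 1 - prod((365 - 0:29) / 365), lty = 2)
[/sourcecode]

Early estimates tend to swing widely, while later ones should settle near the dashed line.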