# Central Limit Theorem Simulation

Statistics via Simulations

Part 1: Law of Large Numbers Simulation

In this week’s studio, you simulated the Central Limit Theorem, and may have begun working on simulating the Law of Large Numbers (LoLN). Your first task on this homework assignment is to complete this simulation.

Part 2: Approval Polls

As you have probably heard, Donald J. Trump is the president of the United States. To get a sense of how he is faring in office, many people rely on job approval polls, conducted from a national sample of U.S. adults. So an obvious question is: how well do the results from a small sample generalize to the entire population? We will investigate this question, as it pertains to Trump’s approval rating, using data from a recent poll.

In 8/31/2017, Gallup conducted a poll, and found that among 1500 U.S. adults, 34% approve of President Trump’s job performance, and 60% disapprove. To quantify how closely the poll results can be expected to mirror the opinions of the population as a whole, Gallup also reported the poll’s margin of error (MoE): 3 points. Polling agencies don’t generally report their confidence levels, only their margins of error, but with a little trial and error, we can recover a confidence interval from a MoE. In this exercise, you will verify that Gallup used a 95% confidence interval to calculate the MoE of Trump’s approval rating. What that means is, if they repeated this poll multiple times, the proportion of the population that approves of Trump would fall within the range [0.31, 0.37] 95% of the time.

If we assume that “approve” and “disapprove” are the only available answers to the poll, each participant’s response can be viewed as a Bernoulli trial, and we can use a binomial distribution to model their collective responses. By the central limit theorem, the estimates 0.34 (34% approval rating) and 0.6 (60% disapproval rating) are normally distributed, with standard error where p^ is the sample probability of success (either .34 or .6). So the 95%

confidence interval is computed as 1. Practice: Write a couple of lines of R code to determine the 95% confidence interval for a proportion. Do this by first creating variables phat and n, and assigning those variables values (e.g., .34 and 1500). Then input the formulas as shown using variables instead of hard-coded values.
2. We would like to use the code you just wrote to compute a 95% confidence interval both for Trump’s approval rating and his disapproval rating. In other words we want to run your code twice, assuming two different sample proportions. Rather than repeat code, embed your code in a function that takes as input a sample proportion and a sample size, and returns the 95% confidence interval as a vector.

Hint: the last line of your function should be something like c(lower, upper), where lower and upper are variables you define in your function (denoting the lower and upper bounds of the confidence interval the function computes).

Another Hint: As 1500 is the sample size of all polls in this question, feel free to assign n the default value of 1500 in the function header, like this: function(phat, n = 1500).

Use your function to report a confidence interval that estimates the proportion of the population that approves of Trump, and a confidence interval that estimates the proportion of the population that disapproves. Do the confidence intervals overlap? If not, we can conclude (at the 95% confidence level) that Trump’s approval and disapproval ratings are different. If they do overlap, we cannot say with confidence that his approval and disapproval ratings are different.

1. Is it also the case that Trump’s approval and disapproval ratings do not overlap at the 99% confidence level? Generalize your function to take as input a third parameter, namely a confidence level α, and calculate the confidence interval at that level.

Hint: Use the qnorm function.

1. Here are the approval ratings for the past four presidents after their first 10 months in office. Using your function, calculate the 95% confidence intervals for Obama, Bush Jr, Clinton, and Bush Sr. Do you observe any relationship between the confidence intervals around their approval ratings and their re-elections?
 President Approval Disapproval Sample Size Source Obama 47% 52% 1500 source Bush Jr 88% 9% 1500 source (Note that this poll was conducted almost immediately after 9/11.) Clinton 47% 44% 1500 source Bush Sr 58% 22% 1500 source

Part 3: Deflategate

Hypothesis testing is a branch of classical statistics that asks questions of the form “how likely is it that an observed outcome is the result of random chance, or is it more likely attributable to other factors?” We are not going to discuss hypothesis testing in detail; but this problem should nonetheless give you some intuition about how hypothesis testing works.

During the 2015 Super Bowl, the New England Patriots were accused by the Indiana Colts of purposefully supplying footballs with pressure below the minimum amount required, perhaps to make their balls easier to catch. During halftime of the game, two NFL officials measured the pressure of most of the Patriots’ footballs (11 of them) and some of the Colts’ footballs (4 of them). (Halftime ended before all the balls could be inspected.) It was determined that, on average, the Patriots’ balls were lower pressure than the Colts’. Your task in this question is to work with the available data (15 measurements) to try to determine whether this outcome could be attributed to random chance.

They contains 3 variables: Football: the football being measured Blakeman: the first official’s measurement Prioleau: the second official’s measurement

1. The two officials obtained slightly different measurements for each ball due, presumably, to a normal amount of measurement variation. Compute the average of the two measurements for each of the footballs. Confirm our earlier assertion that the Patriots’ footballs had less pressure, on average, than the Colts’. (Hint: Use tidyr to separate the “Football” variable into two variables, and then use dplyr’sgroup_by functionality to calculate the team averages.)
2. The NFL requires its footballs to be inflated to between 12.5 and 13.5 pounds per square inch (psi). The Patriots’ footballs were measured at the start of the game to be inflated to 12.5 psi, while the Colts’ were inflated to 13 psi. Calculate the drop in pressure for each ball, between its measurement at the start of the game and its average measurement at halftime. As above, calculate the average drops. Which team’s average drop was greater?
3. A natural test statistic in this sort of circumstance is the difference between the means, in this case the difference between the two mean drops (i.e., the Patriots’ mean less the Colts’). Calculate the value of this statistic.

If you calculated this statistic as we did (i.e., the Patriots’ mean less the Colts’), the result is positive, which indicates that the Patriots’ average drop was larger than the Colts’. The next obvious question then is: Can this positive difference be explained by chance? Or does it require an alternative explanation?

1. Let’s assume that there is no real difference between the Patriots’ and the Colts’ mean drops. Our next goal is to test this assumption. To do so, we’ll use a simulation!

Use the sample function, which by default samples without replacement, to permute the vector of drops, and then assume the first 11 values correspond to the Patriots’ measurements, and the last 4, to the Colts’. Recalculate the test statistic.

Redo this exercise 10,000 times to calculate 10,000 values of the test statistic. Hint: How can you run the same piece of code 10,000 times without copy-and-pasting it 10,000 times?

1. Create a histogram of these simulated test statistics. Where does the actual value lie? Eyeballing your histogram, does it appear to be a high- or a low-probability outcome? Calculate the empirical probability of the test statistic by counting how many times it falls below one of your simulated test statistics.

This empirical probability tells you the probability of the observed value of the test statistic, given these 15 measurements, and assuming the Patriots’ mean drops are the same as the Colts’.

Solution

BQ1.R

one.coin<- function(k){

sample_data<- sample(0:1,k,replace=TRUE)

sample_data

}

sample1=one.coin(100)

table(sample1)

#Alter num_trials before changing the arguement to loln_fair_coin() function

#For eg,if you want to run 100 simulations, change num_trials to 100 before running the code

num_trials=10000

sample_means<- numeric(num_trials)

loln_fair_coin<- function(num_trials,probability){

for (i in 1:num_trials){

fair_coin_flips<- sample(0:1,prob=c(p=probability,q=1-probability), i, replace = TRUE)

sample_means[i] <- mean(fair_coin_flips)

}

sample_means

}

f10=loln_fair_coin(10,0.5)

f100=loln_fair_coin(100,0.5)

f1000=loln_fair_coin(1000,0.5)

f10000=loln_fair_coin(10000,0.5)

#As we increase the number of simulations, mean moves closer to 0.5

BQ2.R

phat=0.34

n=1500

fn95 <- function(phat, n){

sd=sqrt(((phat*(1-phat))/n))

lower=phat-1.96*sd

upper=phat+1.96*sd

range=c(lower,upper)

range

}

trumpapprove=fn95(0.34,1500)

# 0.316 0.364

trumpdisapprove=fn95(0.60,1500)

# 0.575 0.625

fngen<- function(phat, n,alpha){

z=qnorm((1+alpha)/2)

sd=sqrt(((phat*(1-phat))/n))

lower=phat-z*sd

upper=phat+z*sd

range=c(lower,upper)

range

}

obama=fn95(0.47,1500)

# 0.445 0.495

bushjr=fn95(0.88,1500)

# 0.864 0.896

clinton=fn95(0.47,1500)

# 0.445 0.495

bushsr=fn95(0.58,1500)

# 0.555 0.605

trump=fn95(0.34,1500)

# 0.316 0.364

#We don’t see a clear relationship between approval ratings and their

#reelections because both Clinton and Obama had approval ratings below 50%

#but still got reelected and Bush Jr having ~88% ratings also got reelected

#But Bush Sr with ~58% ratings but lost in the general to Bill in 1992

#Bush Sr’s loss was due to a vareity of factors like recession, Ross Perot etc

BQ3.R

deflategate\$combined=(deflategate[,2]+deflategate[,3])/2

mean(deflategate[12:15,4])- mean(deflategate[1:11,4])

#Pats’ footballs had lower pressure than Colts’

deflategate\$drop=12.5-deflategate\$combined

deflategate\$drop[12:15]=deflategate\$drop[12:15]+0.5

Patsdrop=mean(deflategate[1:11,5])

Coltsdrop=mean(deflategate[12:15,5])

Patsdrop-Coltsdrop

#0.7335527, This shows that Pats drops were greater thank Colts’

#Null H: Pats drops were greater than Colts’ purely due to chance variation

#Alt H: Pats drops were greater than Colts’ not just because of chance

drops=data.frame(Drops=deflategate[,5])

sample_drops=numeric(10000)

for (i in 1:10000){

sample1=sample(drops[,1],15,replace=FALSE)

Patsdrop=mean(sample1[1:11])

Coltsdrop=mean(sample1[12:15])

sample_drops[i]=Patsdrop-Coltsdrop

}

hist(sample_drops)

stddeva=sd(sample_drops)

mu=mean(sample_drops)

table(sample_drops-0.7335>=0)

# Yields only 30 TRUE and 9970 FALSE, Which gives us an empirical probability of

# 0.003, which is way below our P-value. (If we take alpha=0.95, P=0.05)

# Thus we reject the null hypothesis that the drops were just coincidental

# Pressure drops don’t reflect the chance variation.