# Conduct Appropriate Statistical Tests

1) Formulate a research question about a political topic of interest. Identify a clear hypothesis and the corresponding null hypothesis. Identify the independent and dependent variable.

2) Write a paragraph explaining why and how the independent variable causes the dependent variable. Build your answer around at least one academic journal article. Provide the complete citation of the article used. Please use the Chicago in-text citation style.

3) Identify an available dataset that allows you to test the hypothesis formulated in Question 1. Provide the complete (and working) link to the dataset.

4) Test your hypothesis. Conduct all appropriate statistical tests learned in class. Only conduct the tests that are appropriate for your data (that is, do not perform all tests learned). Provide a table and/or a figure with your results. Write a paragraph describing the relationship and the statistical test(s) conducted. Is the hypothesis confirmed by the results? Is the null hypothesis rejected?

## Solution

Formulate a research question about a political topic of interest. Identify a clear hypothesis and the corresponding null hypothesis. Identify the independent and dependent variable.

The research question is whether “Red” states or “Blue” states have higher average incomes, with income measured by per capita GDP. For the purposes of this research, a state is called a Red state if its governor in 2016, the most recent year for which state income data are available, was a Republican, and a Blue state if its governor in 2016 was a Democrat. The null hypothesis is that the population mean of per capita GDP is the same in Red states as in Blue states, and the alternative hypothesis is that the mean of per capita GDP differs between Red and Blue states.

Write a paragraph explaining why and how the independent variable causes the dependent variable. Build your answer around at least one academic journal article. Provide the complete citation of the article used. Please use the Chicago in-text citation style.

It is important to start by noting the direction of causality, which runs from voters’ preferences to state “color” rather than from a party ideology being more or less effective at generating income. Voter preferences came under increased academic scrutiny following an article in the Atlantic Monthly by the journalist David Brooks. Brooks contended that American culture was split into rural (“Red”) and urban (“Blue”) cultures in which people have different relations with their neighbors, use different restaurants and shops (Cracker Barrel, McDonald’s and Walmart versus Thai restaurants) and enjoy different recreations (NASCAR, hunting and snowmobiling versus cross-country skiing and National Public Radio).[1] Moreover, the rural culture tends to vote Republican, or “Red,” whereas the urban culture tends to vote Democrat, or “Blue.” Subsequent efforts by scholars both to quantify differences in culture and to relate them to political preferences have yielded different conclusions, partly because researchers have used different measures. For example, Ansolabehere, Rodden and Snyder (2006) claim that “culture wars” explanations of voting patterns have less explanatory power than variation in economic opinions, but that culture and economic opinions tend to pull in opposite directions, so that richer urban voters tend to vote Democrat as a reflection of cultural values and against their economic interests.[2] In contrast, Glaeser and Ward (2006) show that the correlation between actual economic well-being and voting Republican has been in decline since the mid-1970s and that cultural beliefs remain the primary explanation for differences in voting patterns. They argue that industrial development was fostered by tolerance of immigrants and of different religious views in concentrated population centers. In their view, tolerance of diversity not only led to identification with the Democratic Party from the “culture wars” perspective, but also furthered economic development, which explains an association between state color and income.[3]

Identify an available dataset that allows you to test the hypothesis formulated in Question 1. Provide the complete (and working) link to the dataset.

Both per capita GDP by state and the party of each state's governor are readily available from Wikipedia. Per capita GDP by state is on the page https://en.wikipedia.org/wiki/List_of_U.S._states_by_GDP_per_capita and state governors are on the page https://en.wikipedia.org/wiki/List_of_current_United_States_governors. For the handful of states whose current governor assumed office after 2016, the governor list was supplemented by a Google search.
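The source does not say how the Wikipedia data were brought into Stata. As a minimal sketch, assuming the two tables were merged by hand into a CSV file (the file name `states2016.csv` is hypothetical) whose header row uses the variable names that appear in the do-file at the end of this document, the data could be loaded as follows:

```stata
* Hypothetical file name; the header row is assumed to contain
* State, Governor_2016, Party and GDPpercap2016.
* case(preserve) keeps the capitalization of the variable names,
* which import delimited would otherwise convert to lower case.
import delimited "states2016.csv", varnames(1) case(preserve) clear

* Check that Party is a string and GDPpercap2016 is numeric.
describe

* Count states by party of governor.
tabulate Party
```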

Test your hypothesis. Conduct all appropriate statistical tests learned in class. Only conduct the tests that are appropriate for your data (that is, do not perform all tests learned). Provide a table and/or a figure with your results. Write a paragraph describing the relationship and the statistical test(s) conducted. Is the hypothesis confirmed by the results? Is the null hypothesis rejected?

There are 19 states with Democrat governors, 30 states with Republican governors, and one state, omitted from the analysis, with an Independent governor. Descriptive statistics, including the sample size, mean, standard deviation, minimum, quartiles and maximum, are shown for the overall sample and by political party of the governor in Table 1 below. Not only is the mean higher in Blue states than in Red states, but the minimum, each of the quartiles and the maximum are higher in Blue states as well.

Table 1

Descriptive statistics

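| Party | N | Mean | SD | Min | p25 | p50 | p75 | Max |
|---|---|---|---|---|---|---|---|---|
| Republican | 30 | 52,093.43 | 9,356.54 | 36,029 | 44,518 | 51,142 | 58,028 | 74,564 |
| Democrat | 19 | 57,793.42 | 9,804.61 | 40,071 | 49,780 | 58,327 | 64,454 | 75,360 |
| Overall | 49 | 54,303.63 | 9,839.43 | 36,029 | 46,625 | 53,565 | 60,481 | 75,360 |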

Given the small sample size, especially for Democrat governors, an independent samples t test is appropriate if the population distribution of per capita income by state follows a normal distribution and a nonparametric Wilcoxon Rank Sum test is appropriate otherwise.
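The nonparametric alternative named above is not reported in this solution. For reference, a minimal sketch of how it could be run, assuming the `color` grouping variable created in the do-file at the end of this document:

```stata
* Wilcoxon rank-sum (Mann-Whitney) test of equal distributions of
* per capita GDP in Red and Blue states. The Independent state is
* coded .a (missing) in color and is therefore excluded.
ranksum GDPpercap2016, by(color)
```

This test compares ranks rather than means, so it does not rely on per capita income being normally distributed.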

A histogram of the distribution of per capita income by state is shown below in Figure 1.

Figure 1

Histogram of 2016 per capita income by state

Although the histogram does not look exactly like a normal distribution, neither does it look especially skewed in either direction. The Kolmogorov-Smirnov test can be used to formally test the null hypothesis that per capita income across states follows a normal distribution. The p value for the Kolmogorov-Smirnov statistic is 0.96, implying that the sample provides no evidence that the population distribution of per capita income is not a normal distribution. Per capita income for a state is an average of individual incomes in the state, and since the Central Limit Theorem implies that the sampling distribution of a mean is approximately normal in large samples, the finding that per capita income follows a normal distribution across states is not surprising.[4]
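The do-file at the end of this document types the sample mean and standard deviation into the `ksmirnov` command by hand. A minimal sketch of a variant that takes them from `summarize` instead, assuming the `color` variable from that do-file has already been created (the local macro names `m` and `s` are illustrative):

```stata
* Mean and standard deviation of the 49 states used in the analysis.
quietly summarize GDPpercap2016 if !missing(color)
local m = r(mean)
local s = r(sd)

* One-sample Kolmogorov-Smirnov test against a normal distribution
* with that mean and standard deviation.
ksmirnov GDPpercap2016 = normal((GDPpercap2016 - `m') / `s') if !missing(color)
```

Because this version restricts the test to the 49 states with a Republican or Democrat governor, its p value may differ slightly from the 0.96 reported above.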

Because there is no evidence that the population distribution of per capita income is not a normal distribution, an independent samples t test is appropriate for testing the null hypothesis that state income does not differ by whether the governor of the state is a Democrat or a Republican. Computations for the test are shown in Table 2 below.

Table 2

Two-sample t test with equal variances

| Group | N | Mean | Std. Err. | Std. Dev. | 95% LCL | 95% UCL |
|---|---|---|---|---|---|---|
| Republican | 30 | 52093.43 | 1708.263 | 9356.54 | 48599.64 | 55587.22 |
| Democrat | 19 | 57793.42 | 2249.332 | 9804.61 | 53067.75 | 62519.09 |
| Combined | 49 | 54303.63 | 1405.633 | 9839.43 | 51477.42 | 57129.85 |
| Difference | | -5699.988 | 2794.359 | | -11321.51 | -78.46168 |
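As a check on Table 2, the standard error of the difference and the t statistic can be reproduced by hand from the group summaries using the standard pooled-variance formulas. The subscripts R and D (Republican and Democrat) are labels introduced here for illustration, and the intermediate values are rounded.

$$
s_p^2 = \frac{(n_R - 1)s_R^2 + (n_D - 1)s_D^2}{n_R + n_D - 2}
      = \frac{29\,(9356.54)^2 + 18\,(9804.61)^2}{47}
      \approx (9530.6)^2
$$

$$
SE(\bar{x}_R - \bar{x}_D) = s_p \sqrt{\frac{1}{n_R} + \frac{1}{n_D}}
      \approx 9530.6 \times \sqrt{\frac{1}{30} + \frac{1}{19}}
      \approx 2794.4
$$

$$
t = \frac{\bar{x}_R - \bar{x}_D}{SE(\bar{x}_R - \bar{x}_D)}
  = \frac{52093.43 - 57793.42}{2794.359}
  \approx -2.04, \qquad df = n_R + n_D - 2 = 47
$$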

Assuming equal variances in the two populations, the t statistic for the test is 2.04 in absolute value (the Republican minus Democrat difference is negative) with 47 degrees of freedom, which has a two-sided p value of 0.047. The null hypothesis that per capita income is the same in Red and Blue states is therefore rejected at the 5 percent significance level, and we conclude that per capita income is higher in Blue, or Democrat, states. The research literature does not clearly identify whether the higher income of the Blue states reflects only the predominant cultural preferences of higher income individuals or reflects cultural values as well as economic opinions.
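The equal-variance assumption is not tested in this solution. A minimal sketch of how it could be checked, and how the comparison could be repeated without it, assuming the `color` variable from the do-file at the end of this document:

```stata
* Variance-ratio test of equal standard deviations in the two groups.
sdtest GDPpercap2016, by(color)

* Two-sample t test that does not assume equal variances.
ttest GDPpercap2016, by(color) unequal
```

The two group standard deviations in Table 2 (9356.54 and 9804.61) are close, so the unequal-variance version would be expected to give a similar result. That expectation is not a result reported in the source.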

set more off

/* Label the variables with descriptions */

/* of their contents.                    */

label variable Party "Party of state governor, 2016"

label variable GDPpercap2016 "State GDP per capita, 2016"

label variable Governor_2016 "Name of state Governor, 2016"

label variable State "State name"

/* Generate state color variable:        */

/* Blue if Democrat governor,            */

/* Red if Republican governor.           */

generate color = 1 if Party == "Democrat"

replace color = 0 if Party == "Republican"

replace color = .a if Party == "Independent"

label variable color "Color of state in 2016"

label define COLOR 0 "Republican" 1 "Democrat" .a "Independent"

label values color COLOR

/* Descriptive statistics of GDP per capita */

/* by party and overall.                    */

/* Table 1.                                 */

tabstat GDPpercap2016, statistics( count mean sd min p25 p50 p75 max ) by(color) columns(statistics)

/* Histogram of income per capita across states. */

/* Figure 1.                                     */

histogram GDPpercap2016, width(5000) start(30000) percent fcolor(navy) xlabel(30000(5000)80000, format(%9.0gc))

/* Kolmogorov – Smirnov test of null hypothesis */

/* that per capita income follows a normal      */

/* distribution.                                */

/* Discussion around Figure 1.                  */

ksmirnov  GDPpercap2016 = normal(( GDPpercap2016 -54303.63)/9839.43)

/* Independent samples t test.                  */

/* Table 2.                                     */

ttest GDPpercap2016, by(color)

[1] David Brooks, “One Nation, Slightly Divisible,” Atlantic Monthly, December 2001, 53-65.

[2] Stephen Ansolabehere, Jonathan Rodden and James M. Snyder, Jr., “Purple America,” The Journal of Economic Perspectives, 2006, Vol. 20, No. 2, pp. 97-118.

[3] Edward L. Glaeser and Bryce A. Ward, “Myths and Realities of American Political Geography,” Journal of Economic Perspectives, 2006, Vol. 20, No. 2, pp. 119-144.

[4] For example, imagine that the entire USA draws from a single population distribution of income and that after each person’s income is drawn, they are randomly assigned to a state. Then the per capita incomes for the states are just 50 sample means from 50 different samples, albeit of different sample sizes, from the same distribution and the Central Limit Theorem holds precisely in this situation.