# Descriptive Statistics

Use this data file (Muijs, 2011) to complete the following items/questions. Make sure to include the SPSS output in the word document. The SPSS output does not count in the page limit.

1. Identify one nominal, one ordinal, and one continuous variable.
2. Run a frequency distribution for the variables school is fun and sometimes I think I can’t do anything right. What findings do these two frequency distributions reveal to you?
3. What measure of central tendency would be the most appropriate to use to compare school is fun and sometimes I think I can’t do anything right? Report and interpret this central tendency measure for each variable.
4. What measure of central tendency would be the most appropriate to use for School Grades-Math and School Grades-English? Report and interpret the variable comparison of these central tendency measures.
6. Find, justify and present one variable within the assignment SPSS dataset that is appropriate to construct a pie chart.

## Solution

The data file contains 889 observations on 61 variables of different types (nominal, ordinal, and scale). To explore frequency distributions and summary statistics, only a few suitable variables were selected for the analysis; they are described below:

| Variable | Type |
|---|---|
| Gender (boy or girl) | Nominal |
| School is fun (Disagree strongly to Agree strongly) | Ordinal |
| School Grades-Math or School Grades-English scores | Continuous |

The frequency distribution for Gender shows that 50.1% of the students (445 out of 889) are boys and 49.9% (444 out of 889) are girls. Frequency distributions were also constructed for the two ordinal variables “School is fun” and “sometimes I think I can’t do anything right”; they give the frequency of each response level together with its percentage. Because both variables are ordinal, they can be compared using the mode. The modal response for “School is fun” is ‘agree strongly’, chosen by 38.4% of the students, while the modal response for “sometimes I think I can’t do anything right” is ‘agree’, chosen by 32.7% of the students.
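As a cross-check outside SPSS, the percentages and modes quoted above can be recomputed from the counts in the Appendix frequency tables. The following is a minimal Python sketch, not part of the original SPSS workflow; the helper names `frequency_table` and `mode` are hypothetical, and percentages are taken over all 889 cases (including missing ones) to match the SPSS "Percent" column.

```python
# Counts copied from the SPSS frequency tables in the Appendix.
school_is_fun = {
    "disagree strongly": 141,
    "disagree": 142,
    "agree": 263,
    "agree strongly": 341,
}
cant_do_anything_right = {
    "agree strongly": 113,
    "agree": 291,
    "disagree": 218,
    "disagree strongly": 264,
}


def frequency_table(counts, n_total=889):
    """Return (category, count, percent) rows; percent is out of all
    n_total cases, so missing responses are part of the base."""
    return [(cat, n, round(100 * n / n_total, 1)) for cat, n in counts.items()]


def mode(counts):
    """Return the category with the highest count."""
    return max(counts, key=counts.get)


for name, counts in [
    ("school is fun", school_is_fun),
    ("sometimes I think I can't do anything right", cant_do_anything_right),
]:
    print(name)
    for cat, n, pct in frequency_table(counts):
        print(f"  {cat:<18} {n:>4} {pct:>5}%")
    print("  mode:", mode(counts))
```

The mode is used here because it only requires counting responses per category, which is valid for ordinal data.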

The spread of two continuous variables can be compared using the standard deviation and the coefficient of variation. Descriptive statistics were calculated for the two continuous variables “School grades English” and “School grades Maths”: the mean English score is 78.35 with a standard deviation of 10.42, and the mean Maths score is 75.99 with a standard deviation of 12.21. The coefficient of variation is the standard deviation divided by the mean, expressed as a percentage. Here it is 13.30% (= 10.42/78.35) for the English score and 16.07% (= 12.21/75.99) for the Maths score. The coefficient of variation for English is lower than that for Maths, which indicates that School Grades-English is more consistent (shows less relative variation) than School Grades-Maths.
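The coefficient of variation arithmetic can be reproduced in a few lines. This is an illustrative Python sketch using the means and standard deviations reported in the SPSS Descriptives output; the function name `coefficient_of_variation` is a hypothetical helper, not an SPSS feature.

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return 100 * sd / mean


# Means and standard deviations from the SPSS Descriptives table (N = 575).
cv_english = coefficient_of_variation(mean=78.3472, sd=10.41636)
cv_maths = coefficient_of_variation(mean=75.9874, sd=12.21432)

print(f"English CV: {cv_english:.2f}%")
print(f"Maths CV:   {cv_maths:.2f}%")
print("English is more consistent:", cv_english < cv_maths)
```

Because the coefficient of variation scales the spread by the mean, it allows the two subjects to be compared even though their average scores differ.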

In addition to the descriptive statistics, a pie chart can be constructed for the nominal variable ‘Type of School’. It has four levels (State, Catholic, COE, and Other), so a pie chart is an appropriate graphical representation. The pie chart constructed for this variable is shown below:

From the above pie chart, it can be clearly seen that 69.40% of the schools are ‘State’, 18.90% of the schools are of type ‘Catholic’ and only 11.70% of the schools are of type ‘COE’.
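To illustrate how these percentages map onto a pie chart, each share can be converted into the angle of its slice. The sketch below is a hypothetical Python helper, not the SPSS Chart Builder procedure; it uses only the three percentages quoted above, and the ‘Other’ level is left out because no share is reported for it.

```python
# Shares (in percent) as reported in the text; 'Other' has no reported share.
shares = {"State": 69.4, "Catholic": 18.9, "COE": 11.7}


def slice_angles(shares):
    """Convert category shares into pie-slice angles in degrees.
    Shares are normalised by their total so the angles sum to 360."""
    total = sum(shares.values())
    return {category: 360 * value / total for category, value in shares.items()}


angles = slice_angles(shares)
for category, angle in angles.items():
    print(f"{category:<9} {angle:6.2f} degrees")
```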

## Appendix


    FREQUENCIES VARIABLES=attsc2 self6
      /ORDER=ANALYSIS.

Frequencies

Statistics

| | school is fun | sometimes I think I can’t do anything right |
|---|---|---|
| N Valid | 887 | 886 |
| N Missing | 2 | 3 |

Frequency Table

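| school is fun | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| Valid: disagree strongly | 141 | 15.9 | 15.9 | 15.9 |
| Valid: disagree | 142 | 16.0 | 16.0 | 31.9 |
| Valid: agree | 263 | 29.6 | 29.7 | 61.6 |
| Valid: agree strongly | 341 | 38.4 | 38.4 | 100.0 |
| Valid: Total | 887 | 99.8 | 100.0 | |
| Missing: 9 | 2 | .2 | | |
| Total | 889 | 100.0 | | |

| sometimes I think I can’t do anything right | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| Valid: agree strongly | 113 | 12.7 | 12.8 | 12.8 |
| Valid: agree | 291 | 32.7 | 32.8 | 45.6 |
| Valid: disagree | 218 | 24.5 | 24.6 | 70.2 |
| Valid: disagree strongly | 264 | 29.7 | 29.8 | 100.0 |
| Valid: Total | 886 | 99.7 | 100.0 | |
| Missing: 9 | 3 | .3 | | |
| Total | 889 | 100.0 | | |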

/STATISTICS=MEAN STDDEV MIN MAX.

Descriptives

Descriptive Statistics

| | N | Minimum | Maximum | Mean | Std. Deviation |
|---|---|---|---|---|---|
| school grades English | 575 | 31.00 | 96.60 | 78.3472 | 10.41636 |
| school grades maths | 575 | 30.00 | 98.80 | 75.9874 | 12.21432 |
| Valid N (listwise) | 575 | | | | |

/STATISTICS=MEAN STDDEV VARIANCE MIN MAX.

Descriptives

Descriptive Statistics

| | N | Minimum | Maximum | Mean | Std. Deviation | Variance |
|---|---|---|---|---|---|---|
| school grades English | 575 | 31.00 | 96.60 | 78.3472 | 10.41636 | 108.500 |
| school grades maths | 575 | 30.00 | 98.80 | 75.9874 | 12.21432 | 149.190 |
| Valid N (listwise) | 575 | | | | | |

    * Chart Builder.
    GGRAPH
      /GRAPHDATASET NAME="graphdataset" VARIABLES=schtype COUNT()[name="COUNT"] MISSING=LISTWISE REPORTMISSING=NO
      /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
      SOURCE: s=userSource(id("graphdataset"))
      DATA: schtype=col(source(s), name("schtype"), unit.category())
      DATA: COUNT=col(source(s), name("COUNT"))
      COORD: polar.theta(startAngle(0))
      GUIDE: axis(dim(1), null())
      GUIDE: legend(aesthetic(aesthetic.color.interior), label("type of school"))
      SCALE: linear(dim(1), dataMinimum(), dataMaximum())
      SCALE: cat(aesthetic(aesthetic.color.interior), include("1.00", "2.00", "3.00", "4.00"))
      ELEMENT: interval.stack(position(summary.percent(summary.percent(COUNT, base.all(acrossPanels())))), color.interior(schtype))
    END GPL.

GGraph

    FREQUENCIES VARIABLES=gender
      /ORDER=ANALYSIS.

Frequencies

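Statistics

| | gender |
|---|---|
| N Valid | 889 |
| N Missing | 0 |

| gender | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| Valid: boy | 445 | 50.1 | 50.1 | 50.1 |
| Valid: girl | 444 | 49.9 | 49.9 | 100.0 |
| Valid: Total | 889 | 100.0 | 100.0 | |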