# Principal Component and Factor Analysis


Principal component analysis (PCA) and factor analysis (FA) are two frequently confused topics in statistics, and it is not hard to see why: they are similar in more than one way. Although they can appear to be varieties of the same analysis rather than two distinct methods, there is a fundamental difference between them, and that difference has a large effect on how each analysis is used.

### Similarities between principal component analysis and factor analysis

• Both are data reduction techniques: an analyst can use either one to capture the variance of a larger set of variables in a smaller set.
• You can run both with the same procedure in statistical software, and the output looks much the same.
• Both involve the same steps: extraction, interpretation, rotation, and choosing the number of factors or components.

Despite these similarities, there is a fundamental difference between PCA and FA: factor analysis is a measurement model for a latent variable, while principal component analysis is a linear combination of observed variables.

### Principal Component Analysis

In data reduction, principal component analysis creates one or more index variables from a larger set of measured variables. It does this with a linear combination (essentially a weighted average) of the measured variables. These index variables are called components. PCA aims to do this optimally: the optimal number of components, the optimal choice of measured variables for each component, and the optimal weights.
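A minimal sketch of this idea, using NumPy on a small hypothetical data set: each component is literally a weighted combination of the centered variables, with the weights taken from the eigenvectors of the covariance matrix.

```python
import numpy as np

# Hypothetical data: 5 observations of 3 correlated measured variables.
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.9],
    [2.2, 2.9, 0.8],
    [1.9, 2.2, 1.1],
    [3.1, 3.0, 0.3],
])

# Center the variables, then eigendecompose their covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]           # re-sort by variance, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each component is a weighted combination of the original variables.
components = Xc @ eigvecs
print(eigvals)  # variance captured by each component
```

The first component carries the largest share of the variance, the second the next largest, and so on, which is what makes dropping the trailing components a sensible reduction.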

### Properties of principal components

The components in PCA are derived by solving a particular optimization problem, so they come with built-in properties that are desirable in practice, such as capturing the maximum possible variance. Several other properties of the components can also be derived. Some of them are listed below:

• The eigenvalues give the variance of each component and, relative to their sum, each component's proportion of the total variance of the original variables.
• You can calculate component scores that give the value of each component at each observation.
• You can also calculate component loadings, which describe the correlation between each component and each variable.
• All p components together can reproduce the correlations among the original variables.
• All p components together can reproduce the original data.
• To increase interpretability, the components can be rotated.
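Several of these properties can be verified directly. The sketch below, on a small hypothetical correlation matrix, computes the eigenvalues as proportions of total variance, forms the loadings, and checks that all p components reproduce the correlation matrix exactly.

```python
import numpy as np

# Hypothetical 3x3 correlation matrix among measured variables.
R = np.array([
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.5],
    [0.3, 0.5, 1.0],
])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each eigenvalue's share of the total variance of the original variables.
prop_var = eigvals / eigvals.sum()

# Loadings: correlation between each component and each variable.
loadings = eigvecs * np.sqrt(eigvals)

# All p components together reproduce the correlation matrix.
R_reconstructed = loadings @ loadings.T
print(prop_var)
```

Keeping fewer than p components gives only an approximation of R, which is exactly the trade-off data reduction makes.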

### Factor Analysis

Factor analysis approaches data reduction in a fundamentally different way. FA is a model for measuring a latent variable, one that cannot be measured directly with any single observed variable. Think of intelligence, health, or social anxiety.

For example, it is impossible to measure social anxiety directly. However, we can infer whether it is high or low from responses to items such as "The individual is uncomfortable in large groups" and "The individual gets nervous talking to strangers". People with high social anxiety tend to give similar high responses to these items; likewise, those with low social anxiety tend to give similar low responses.
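This latent-variable idea can be sketched with scikit-learn's `FactorAnalysis`. The simulated "social anxiety" score and the three survey items below are hypothetical: each observed item is the unobservable latent score plus item-specific noise, and the fitted factor recovers the latent variable from the items alone.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n = 500

# Simulated latent "social anxiety" score that we never observe directly.
anxiety = rng.normal(size=n)

# Observed survey items: each reflects the latent score plus item noise.
items = np.column_stack([
    anxiety + 0.5 * rng.normal(size=n),  # "uncomfortable in large groups"
    anxiety + 0.5 * rng.normal(size=n),  # "nervous talking to strangers"
    anxiety + 0.5 * rng.normal(size=n),  # a third hypothetical item
])

# Fit a one-factor model and extract factor scores for each person.
fa = FactorAnalysis(n_components=1, random_state=0)
scores = fa.fit_transform(items)

# The recovered factor tracks the true latent variable closely.
r = abs(np.corrcoef(scores[:, 0], anxiety)[0, 1])
print(round(r, 2))
```

The sign of the factor is arbitrary, which is why the correlation is taken in absolute value; in practice the analyst orients and names the factor during interpretation.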

### Types of factoring

There are several methods for extracting factors from data. Principal component extraction is the most commonly used. Here are the others:

• Common factor analysis

Researchers also favor this method. It extracts the common variance among the variables and puts it into factors, but it does not include the unique variance of each variable.

• Image factoring

Image factoring is based on the correlation matrix, and OLS regression is used to predict the factors.

• Maximum likelihood method

It also works on the correlation matrix but uses maximum likelihood estimation as the factoring method.
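As a sketch of maximum likelihood factoring: scikit-learn's `FactorAnalysis` fits the factor model by maximum likelihood, so the fitted log-likelihood can be compared across models with different numbers of factors. The two-factor data below is simulated for illustration.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n = 400

# Two hypothetical latent factors driving six observed variables.
F = rng.normal(size=(n, 2))
W = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
              [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
X = F @ W.T + 0.3 * rng.normal(size=(n, 6))

# Fit the factor model by maximum likelihood.
fa = FactorAnalysis(n_components=2, random_state=0).fit(X)

# Average per-observation log-likelihood under the fitted model.
print(fa.score(X))
```

Because the fit is likelihood-based, a too-small model (here, one factor for data generated by two) scores a visibly lower log-likelihood, which gives a principled way to compare candidate numbers of factors.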

Other methods include alpha factoring and unweighted least squares; weighted least squares is another regression-based method that can be used for factoring.