Chapter 3 PCA plots

In this chapter, we present Principal Component Analysis (PCA) results for all samples in the current dataset. Principal component analysis is a powerful method for exploratory data analysis because it lets us visualize complex datasets in fewer dimensions. It is therefore also a useful method for dimension reduction, preserving critical information on the similarities and differences between samples. For more information, please check the Wiki page.

The percentage shown for PC1 or PC2 indicates the proportion of variance in the dataset that is explained by that PC. By construction, PC1 explains the largest share of the variance, but for complex or noisy data we also rely on PC2 for better separation.
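As a rough illustration of where these percentages come from, the sketch below computes the proportion of variance explained by each PC from a centered samples-by-features matrix. It is a minimal example using NumPy, not the code used to produce this report; the function name `explained_variance_ratio` and the toy data are assumptions made for illustration.

```python
import numpy as np

def explained_variance_ratio(X):
    """Return the fraction of total variance explained by each PC.

    X is a (samples x features) matrix, e.g. samples x genes.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, largest first
    var = s ** 2                               # proportional to per-PC variance
    return var / var.sum()

# Toy data (made up for illustration): 6 samples, 4 features,
# with a strong group effect on the first feature.
rng = np.random.default_rng(0)
group = np.array([0, 0, 0, 1, 1, 1])
X = rng.normal(size=(6, 4))
X[:, 0] += 5.0 * group

ratio = explained_variance_ratio(X)
for i, r in enumerate(ratio, start=1):
    print(f"PC{i}: {100 * r:.1f}%")
```

The ratios sum to 1, and they are ordered from largest to smallest, so PC1 always carries the largest share.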

By default, the PCA plot displays the first two PCs of the dataset, labels each sample by name, and colors samples by study group. From this plot, we can examine the consistency of samples within the same group and identify potential outliers. The separation of groups on the PCA plot also indicates the differences between them, and this trend should be consistent with the differential expression analysis, i.e., two groups that are farther apart should have more significantly changed genes.

3.1 General PCA plot

In this section, we plot a single PCA figure colored by the default covariate, which is usually the group information.

3.2 PCA plots by covariates

In this section, we display PCA plots colored by different covariates.

The PC1 and PC2 values are the same across all plots; only the coloring differs.
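The idea of shared coordinates with different coloring can be sketched as follows: the PC scores are computed once, and each covariate only contributes a set of color indices. This is a minimal sketch with NumPy under assumed names (`pca_scores`, `color_codes`, and the toy covariates `group` and `batch` are hypothetical), not the report's own plotting code.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples onto the first n_components principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                          # center each feature
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]    # sample coordinates

def color_codes(labels):
    """Map each distinct covariate level to an integer color index."""
    levels = sorted(set(labels))
    return [levels.index(v) for v in labels]

# Toy data and covariates (made up for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 5))
covariates = {
    "group": ["ctrl", "ctrl", "ctrl", "treated", "treated", "treated"],
    "batch": ["b1", "b2", "b1", "b2", "b1", "b2"],
}

scores = pca_scores(X)  # PC1/PC2 coordinates, computed once

# One panel per covariate: same coordinates, different colors.
panels = {
    name: {"xy": scores, "colors": color_codes(values)}
    for name, values in covariates.items()
}
```

Each entry of `panels` can then be passed to a plotting library of choice, using `xy` for the point positions and `colors` to pick the color of each sample.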