TransWikia.com

Data visualization for RNAseq: scaling data for PCA and cluster dendrogram

Bioinformatics Asked on May 20, 2021

I have count data from an RNAseq experiment (2 samples are from normal cells and 3 samples are from cells with a disease), and the data is already normalized by the trimmed mean of M-values (TMM) method.
I want to make some plots: a biplot from a Principal Component Analysis (PCA) and a cluster dendrogram, to see whether the normal and disease samples are well separated (i.e. whether there is a clear difference between them).
Since the data is already TMM-normalized, should I center and scale it before performing the PCA and building the cluster dendrogram?

Thank you!

One Answer

Centering and scaling make the genes comparable: putting the expression levels of all genes on the same scale (e.g. between 0 and 1, or unit variance) ensures that every gene contributes equally to the PCA or distance calculations. TMM does not do this, since it corrects for differences between samples, not for differences in magnitude between genes. Without scaling, such calculations would be dominated by the highly expressed genes, which have the largest variances. I get "better" results with scaling.
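As an illustration of the point about scaling (not code from the original answer), here is a minimal Python sketch using NumPy and SciPy. The expression matrix, the sample labels, and the helper names `zscore_genes` and `pca_scores` are all made up for the example, and the log2 transform before scaling is an assumption that reflects common practice rather than anything stated in the question. It compares PCA on unscaled values with PCA on per-gene z-scores and builds a hierarchical clustering of the samples.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical toy data: rows = 5 samples (2 normal, 3 disease), columns = 4 genes.
# The values stand in for TMM-normalized expression and are invented.
labels = ["normal_1", "normal_2", "disease_1", "disease_2", "disease_3"]
expr = np.array([
    [10000.0,  5.0, 200.0,  50.0],
    [12000.0,  6.0, 210.0,  45.0],
    [ 9000.0, 20.0, 190.0, 150.0],
    [11000.0, 22.0, 205.0, 160.0],
    [10500.0, 19.0, 195.0, 140.0],
])


def zscore_genes(x):
    """Center each gene (column) to mean 0 and scale it to unit variance."""
    centered = x - x.mean(axis=0)
    sd = x.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # avoid division by zero for constant genes
    return centered / sd


def pca_scores(x, n_components=2):
    """PCA via SVD of the column-centered matrix.

    Returns sample scores, explained-variance ratios, and gene loadings.
    """
    centered = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components], vt[:n_components]


# Assumption: log-transform first, then z-score each gene.
log_expr = np.log2(expr + 1.0)
scaled = zscore_genes(log_expr)

# PCA without scaling versus PCA on the scaled matrix.
scores_raw, var_raw, load_raw = pca_scores(expr)
scores_scaled, var_scaled, load_scaled = pca_scores(scaled)

# Hierarchical clustering of samples on the scaled matrix (average linkage).
tree = linkage(scaled, method="average", metric="euclidean")

print("PC1 loadings, unscaled:", np.round(load_raw[0], 3))
print("PC1 loadings, scaled:  ", np.round(load_scaled[0], 3))
print("Explained variance, unscaled:", np.round(var_raw, 3))
print("Explained variance, scaled:  ", np.round(var_scaled, 3))
```

In this toy example the unscaled PC1 is driven almost entirely by the first gene, simply because its values are the largest, whereas after scaling every gene has the same variance going into the PCA. The `tree` array can be passed to `scipy.cluster.hierarchy.dendrogram(tree, labels=labels)` to draw the dendrogram if matplotlib is available.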

Correct answer by haci on May 20, 2021
