RNAseq biological replicates not clustering in PCA plots

Bioinformatics Asked on December 13, 2021

I have RNAseq data from 4 samples with 3 biological replicates per sample. I am currently trying to do the differential expression analysis with DESeq2, but the biological replicates do not cluster together in the PCA plot or the correlation heatmap. This is my first RNAseq analysis, so I am not sure what the best route forward is. I would like to avoid repeating the experiment with new samples if possible!

My pipeline prior to DESeq2 was the following:

FastQC quality check -> Trimmomatic -> Kallisto

I used tximport to convert the Kallisto output files into a format suitable for DESeq2.

PCA plot with rlog transformed data

3 Answers

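You used Kallisto for (pseudo)alignment. Kallisto reports both TPM values and estimated counts; which one are you using? DESeq2 works on count data, so a TPM matrix should not be fed to it directly. If the tximport object went into DESeq2 through DESeqDataSetFromTximport, the two are compatible, but it is worth double-checking that step.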

Also, I agree with the previous answers that your PCA actually looks OK. One possible way to improve it is to restrict the PCA to the top variable genes. For example, you can try the top 3,000, 5,000, 7,000 genes and so on. The idea is that genes which show little variation between samples mostly add noise to the PCA. You can also try coloring the samples in your PCA by other variables, such as batch or sequencing depth, to troubleshoot.
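To make the "top variable genes" idea concrete, here is a minimal sketch in plain Python. The function name `top_variable_genes` and the toy values are made up for illustration; in practice the input would be your rlog-transformed matrix, and DESeq2's plotPCA already exposes the same idea through its ntop argument.

```python
from statistics import variance

def top_variable_genes(expr, n_top):
    """Return the names of the n_top genes with the highest variance across samples.

    expr maps a gene name to its list of transformed values, one per sample.
    (Hypothetical helper, not part of any package.)
    """
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:n_top]

# Toy values, not real data: three genes measured in four samples.
expr = {
    "geneA": [5.0, 5.1, 4.9, 5.0],  # nearly flat, contributes mostly noise
    "geneB": [2.0, 8.0, 2.5, 7.5],  # strongly variable
    "geneC": [6.0, 6.5, 5.5, 6.2],  # mildly variable
}

# Try several cutoffs and see which genes would go into the PCA.
for n_top in (1, 2, 3):
    print(n_top, top_variable_genes(expr, n_top))
```

Rerunning the PCA at a few cutoffs and checking whether the replicate grouping is stable is probably more informative than picking a single number.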

Answered by Phoenix Mu on December 13, 2021

Clustering of replicates looks decent enough to me, so you should be able to push ahead, but I agree the samples are grouping by tissue, which could mask any differences based on sex or genotype.

You might consider the edgeR package for DE analysis here. It allows for flexibility when making complex comparisons while accounting for tissue/batch effects. I've had good luck using it to make comparisons across batch effects in complex experiments like this.

Answered by neonglow on December 13, 2021

PC1 is 81% of the variance?

This PCA plot confirms that different tissues are different. You probably already knew that. I'd make more PCA plots that are tissue specific. That will be more informative than these.
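As a rough sketch of what "tissue specific" means in practice: split the sample columns by tissue first, then run a separate PCA on each subset. The helper `split_by_tissue`, the tissue labels and the numbers below are all invented for illustration; only the splitting step is shown, and each resulting sub-matrix would then go into whatever PCA routine you already use.

```python
from collections import defaultdict

def split_by_tissue(sample_tissue, expr):
    """Split an expression table into one table per tissue.

    sample_tissue: list of tissue labels, one per sample column.
    expr: dict mapping gene name -> list of values in the same sample order.
    (Hypothetical helper, not part of any package.)
    """
    # Collect the column indices belonging to each tissue.
    columns = defaultdict(list)
    for idx, tissue in enumerate(sample_tissue):
        columns[tissue].append(idx)
    # Build a per-tissue table that keeps only that tissue's columns.
    return {
        tissue: {gene: [values[i] for i in idx_list] for gene, values in expr.items()}
        for tissue, idx_list in columns.items()
    }

# Toy layout, not real data: four samples from two made-up tissues.
sample_tissue = ["tissue1", "tissue1", "tissue2", "tissue2"]
expr = {
    "geneA": [1.0, 1.2, 9.0, 9.1],
    "geneB": [3.0, 4.0, 3.1, 4.2],
}

per_tissue = split_by_tissue(sample_tissue, expr)
for tissue, table in per_tissue.items():
    print(tissue, table)
```

Within each subset the tissue difference no longer dominates the first component, so smaller effects have a chance to show up.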

Personally, I'd also not run DESeq2 on all of these samples together, unless the goal of your experiment really is to learn the differences between these two tissues.

Answered by swbarnes2 on December 13, 2021
