How can I take cell number into account to find total RNA expression?

Bioinformatics Asked by user9283 on September 27, 2021

I’m hoping to quantitatively show differences in total RNA expression for a gene in a cluster of interest between different experimental groups. My exported average RNA values for each experimental group in my macrophage cluster are shown below:

00 d Control: 10.43

00 d PLX5622: 14.48

07 d Control: 16.08

07 d PLX5622: 17.12

28 d Control: 15.90

28 d PLX5622: 19.40

However, the violin plot for Cd68 in the macrophage cluster looks like this:

VlnPlot(combine.combined, features = "Cd68", pt.size = 1, idents = "Macrophages", group.by = "orig.ident", split.by = NULL, assay = "RNA")

[Image: violin plot of Cd68 in the macrophage cluster]

The violin plot suggests that the Control groups actually have more total Cd68 RNA than the PLX5622 groups, which isn’t what the average RNA values show. Any code you can share showing me how to get total RNA per group would be an amazing help!

Thanks so much!

2 Answers

To me, it looks like the widest parts of the plots are higher in the treated. I don't see why you are so sure that the violin plots are hugely different from the calculated averages.

Either way, the differences are really small. Maybe not statistically significant.

BTW, in the future, you might want to disguise your treatment and gene of interest when on a public forum.

Answered by swbarnes2 on September 27, 2021

(I currently can't comment as I used up all my rep for a bounty.)

Could you explain how you calculate the average? In your plot the averages are between an expression level of 2 and 3, while you report values over 10. Given the data you show, there is no significant difference and I would guess p-values are above 0.8!

And you mention that the violin plots suggest the control groups have higher expression, but how do you see that? To me they look identical, with a minimal trend towards the PLX5622 groups being higher.
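One possible source of the scale mismatch, offered as an assumption rather than a diagnosis: the violin plot shows log-normalized values, whereas an exported average may have been computed after back-transforming each cell to the linear scale (Seurat's AverageExpression is documented to average in non-log space). A minimal Python sketch with made-up values shows how far apart the two numbers can be:

```python
import math

# Hypothetical log-normalized (log1p-scale) values for one gene in six cells.
log_values = [0.0, 2.0, 2.5, 3.0, 3.5, 4.0]

# Mean taken directly on the log scale, the scale a violin plot displays.
mean_log = sum(log_values) / len(log_values)

# Mean taken after back-transforming every cell with expm1 (linear scale).
linear_values = [math.expm1(v) for v in log_values]
mean_linear = sum(linear_values) / len(linear_values)

print(f"mean on log scale:    {mean_log:.2f}")
print(f"mean on linear scale: {mean_linear:.2f}")
```

If the reported averages come from the linear scale, values above 10 are compatible with a plot whose bulk sits between 2 and 3.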

And to address your question regarding showing the total: No, you would want to show the mean or median, as the total is pointless, given deviating sample sizes.
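To make that concrete, here is a small Python sketch with invented numbers (the group names and values are hypothetical). Both groups have exactly the same per-cell distribution and differ only in cell number:

```python
from statistics import mean, median

# Hypothetical per-cell expression values: identical distribution,
# but the second group has ten times as many cells.
small_group = [2.0, 2.5, 3.0] * 10    # 30 cells
large_group = [2.0, 2.5, 3.0] * 100   # 300 cells

for name, cells in [("small", small_group), ("large", large_group)]:
    print(name,
          "n =", len(cells),
          "total =", sum(cells),
          "mean =", mean(cells),
          "median =", median(cells))
```

The total grows with the number of cells, while the mean and median stay the same, so only the latter two are comparable across groups of different size.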

EDIT: Overall, I have a feeling you uploaded the wrong picture!

EDIT2: I think I got your point now: the difference in the means is caused by a larger number of outliers at the value 0 in your controls. Apart from those zeros, the groups are essentially identical, and this is what can be seen in the violin plots. You should rather try to diminish the effect of these outliers.
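A rough Python sketch of that effect, using invented values (not your data): the expressing cells are identical in both groups, and only the number of zero-valued cells differs. Reporting the mean of non-zero cells next to the fraction of zeros is one way, among others, to separate the two effects.

```python
from statistics import mean, median

# Hypothetical log-scale values. Expressing cells are the same in both groups;
# the control group simply contains more cells with a value of zero.
expressing = [2.0, 2.5, 3.0, 3.5]
control = expressing * 5 + [0.0] * 10   # 20 expressing cells, 10 zeros
treated = expressing * 5 + [0.0] * 2    # 20 expressing cells, 2 zeros

def summarize(values):
    """Return summary statistics with and without the zero-valued cells."""
    nonzero = [v for v in values if v > 0]
    return {
        "mean_all": mean(values),
        "mean_nonzero": mean(nonzero),
        "median_all": median(values),
        "fraction_zero": 1 - len(nonzero) / len(values),
    }

control_summary = summarize(control)
treated_summary = summarize(treated)
print("control:", control_summary)
print("treated:", treated_summary)
```

In this toy case the overall mean is lower in the control group purely because of the extra zeros, while the mean of the expressing cells is the same in both groups.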

Answered by KaPy3141 on September 27, 2021
