TransWikia.com

Probability of all alleles represented in a sample

Biology Asked by Andrés Flores on December 20, 2020

I’m trying to wrap my head around some formulas presented in Chakraborty's 1992 paper, "Sample Size Requirements for Addressing the Population Genetic Issues of Forensic Use of DNA Typing", but I have not been able to.

Specifically, the left-hand side of formula (16) and its relation to formula (13).

$1-\sum\limits_{i=1}^{k}(1-p_{i})^{2n}$ (13)

$[1-(1-p)^{2n}]^{r}\geqslant 1-\alpha$ (16)

Formula 13 gives the probability, for a locus with $k$ segregating alleles whose frequencies are given by the vector $p$, that all alleles are represented in a sample of $n$ individuals, and the left-hand side of formula 16 gives the probability that $r$ alleles are all represented in a sample of size $n$.

First of all, why does the expression inside the summation in (13) give the probability that an allele of frequency $p$ remains unobserved in a sample of size $n$?

I tried to understand this from the Hardy-Weinberg equation but did not have any success.

Second, why is the expression in (16) raised to the $r$-th power?

Which biological concepts am I missing?

One Answer

I'm going to strictly answer the questions, rather than step through the proof, because it involves a lot of formatting that I'm not familiar with. Other folks are welcome to edit this!

Equation 13

This equation assumes diploid genotypes: $n$ individuals carry $2n$ gene copies, hence the $2n$ exponent. For any ploidy above haploid, it's mathematically simpler to determine the probability that an allele is not present. As an example, see this calculation of a triploid Hardy-Weinberg equilibrium equation. Using this simplification,

$P(\text{single allele not present}) = \left[(1 - P(\text{allele present}))^{\text{ploidy}}\right]^{n}$

$= (1 - P(\text{allele present}))^{\text{ploidy} \cdot n}$

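With $k$ segregating alleles, each allele $i$ has its own non-presence probability, $(1-p_{i})^{2n}$. The sum of these bounds the probability that at least one allele is missing, so the probability that every allele is present is (at least) $1 - \sum_{i=1}^{k} P(\text{allele } i \text{ not present})$, which is formula (13).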
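To make the arithmetic concrete, here is a minimal Python sketch of formula (13). The function names and the example frequencies are made up for illustration and do not come from the paper; the only assumption is that each of the `ploidy * n` sampled gene copies is an independent draw.

```python
def prob_allele_missing(p, n, ploidy=2):
    """Probability that an allele of frequency p is absent from n individuals.

    Each of the ploidy * n gene copies misses the allele with probability 1 - p.
    """
    return (1.0 - p) ** (ploidy * n)


def prob_all_alleles_present(freqs, n, ploidy=2):
    """Formula (13): 1 minus the sum of the per-allele absence probabilities."""
    return 1.0 - sum(prob_allele_missing(p, n, ploidy) for p in freqs)


if __name__ == "__main__":
    freqs = [0.5, 0.3, 0.2]  # hypothetical allele frequencies (sum to 1)
    for n in (5, 10, 20):
        print(n, prob_all_alleles_present(freqs, n))
```

The value climbs towards 1 as $n$ grows, and the rarest allele dominates the sum because its absence probability shrinks most slowly.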

Equation 16

In this equation, the author considers $r$ alleles, each with frequency $p$, and describes the probability that all of them are present in the sample. These allele presences are treated as independent of each other, so their probabilities multiply. Since every allele has the same presence probability, $1-(1-p)^{2n}$, the product of $r$ identical factors simplifies to the $r$-th power.
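As a rough illustration of how (16) turns into a sample size, the sketch below evaluates its left-hand side and solves the inequality for $n$. Rearranging gives $n \geq \ln\left(1-(1-\alpha)^{1/r}\right) / \left(2\ln(1-p)\right)$. The function names and the example values of $p$, $r$ and $\alpha$ are hypothetical and are not taken from the paper.

```python
import math


def prob_r_alleles_present(p, n, r):
    """Left-hand side of (16): r alleles of frequency p all seen in n diploids."""
    return (1.0 - (1.0 - p) ** (2 * n)) ** r


def min_sample_size(p, r, alpha):
    """Smallest n (number of diploid individuals) that satisfies (16)."""
    # Rearranged (16): (1 - p)^(2n) <= 1 - (1 - alpha)^(1/r)
    target = 1.0 - (1.0 - alpha) ** (1.0 / r)
    n = max(1, math.ceil(math.log(target) / (2.0 * math.log(1.0 - p))))
    # Guard against floating-point rounding in the closed form.
    while prob_r_alleles_present(p, n, r) < 1.0 - alpha:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical inputs: 5 alleles of frequency 0.05, 95% confidence.
    n = min_sample_size(p=0.05, r=5, alpha=0.05)
    print(n, prob_r_alleles_present(0.05, n, 5))
```

Raising to the $r$-th power makes the requirement stricter: for the same $p$ and $\alpha$, a larger $r$ needs at least as many individuals.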

Correct answer by Punintended on December 20, 2020
