Number of alleles application
Instructions
The application requires the user to upload a .csv file in FSI:Gen format holding allele
frequencies for a number of STR loci. For the overall probabilities only (cf. below),
we assume that these loci are statistically independent with respect to the allelic distribution.
After upload, the frequencies are presented as a barplot so that the user can confirm that the
supplied data were read correctly. Simultaneously, the required probabilities are computed
for the specified number of contributors and \(\theta\)-value. When the user changes these parameters
(using the sliders to the left) or selects/deselects STR loci, the probabilities are recomputed on the fly.
The problem
Locuswise probabilities
The problem we want to solve is computing the probability that \(m\) individuals show \(n\) alleles.
If we let \(N(m)\) denote the number of alleles shown, the quantity of interest is $$P(N(m) = n).$$
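To make the quantity \(P(N(m) = n)\) concrete, the following Python sketch computes it for a single locus under simplifying assumptions: \(\theta = 0\) and the \(2m\) alleles treated as independent draws from the allele frequencies. The function name locus_distribution and the dynamic-programming approach are illustrative choices made here; they are not taken from the application and need not match how it handles a non-zero \(\theta\)-value.

```python
from math import factorial


def locus_distribution(freqs, m):
    """Return [P(N(m)=0), ..., P(N(m)=2m)] for one locus.

    Assumption (illustrative only): theta = 0, so the 2m alleles of the
    m contributors are independent draws from the frequencies in `freqs`.
    """
    draws = 2 * m
    # table[j][d] accumulates prod_a p_a^{c_a} / c_a! over all allele-count
    # configurations seen so far that use j draws and show d distinct alleles.
    table = [[0.0] * (draws + 1) for _ in range(draws + 1)]
    table[0][0] = 1.0
    for p in freqs:
        new = [[0.0] * (draws + 1) for _ in range(draws + 1)]
        for j in range(draws + 1):
            for d in range(j + 1):
                w = table[j][d]
                if w == 0.0:
                    continue
                # Current allele is not observed (count 0).
                new[j][d] += w
                # Current allele is observed c >= 1 times.
                for c in range(1, draws - j + 1):
                    new[j + c][d + 1] += w * p**c / factorial(c)
        table = new
    # The multinomial coefficient contributes the factor (2m)!.
    return [factorial(draws) * table[draws][n] for n in range(draws + 1)]


# Example with made-up frequencies for a three-allele locus and m = 2.
pmf = locus_distribution([0.2, 0.3, 0.5], m=2)
```

With only three alleles at the locus, the entry for \(n = 4\) is zero, and the entries sum to one.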
Cumulative probabilities
We are also interested in \(P(N(m)\le n)\), where \(n\) can be \(2(m-1)\), in which case the mixture could
appear as an \((m-1)\)-person mixture: $$P(N(m) \le n) = \sum_{i=1}^n P(N(m)=i).$$
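As a small illustration of the cumulative sum, the Python sketch below accumulates a locuswise distribution and reads off \(P(N(m) \le 2(m-1))\). The distribution used is a hand-computed example for \(m = 2\) and a hypothetical locus with allele frequencies 0.2, 0.3 and 0.5 under \(\theta = 0\); it is not output from the application.

```python
from itertools import accumulate

m = 2
# pmf[n] = P(N(m) = n) for n = 0, ..., 2m.
# Hand-computed for hypothetical frequencies (0.2, 0.3, 0.5) with theta = 0.
pmf = [0.0, 0.0722, 0.5678, 0.36, 0.0]

# cdf[n] = P(N(m) <= n), the running sum of the pmf.
cdf = list(accumulate(pmf))

# Probability that the m-person mixture shows at most 2(m-1) alleles,
# i.e. could appear as an (m-1)-person mixture at this locus.
p_masked = cdf[2 * (m - 1)]
```

Here p_masked is the sum of the entries for \(n = 1\) and \(n = 2\), about 0.64.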
Overall probabilities
The probabilities are computed for each locus, but due to the independence assumption of the forensic markers we can
aggregate over loci by recursion (convolution): $$P(N(m)^{l+1} = n) = \sum_{i=1}^{2m} P(N(m)^l=n-i)P(N(m)_{l+1}=i),$$
where \(P(N(m)^{l}=n)\) is the probability that \(m\) individuals show \(n\) alleles in total across \(l\) loci and
\(P(N(m)_l=i)\) is the probability of observing \(i\) alleles at the \(l\)th locus.
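The recursion can be sketched in Python as a repeated discrete convolution of the locuswise distributions. The function name convolve_loci and the example distributions are hypothetical and chosen only for illustration; the sketch relies on the independence assumption stated above.

```python
def convolve_loci(locus_pmfs):
    """Combine locuswise distributions into the distribution of the total
    number of alleles across loci, assuming the loci are independent.

    Each element of `locus_pmfs` is a list where entry i is the
    probability of observing i alleles at that locus.
    """
    total = [1.0]  # with zero loci, zero alleles are shown with probability 1
    for pmf in locus_pmfs:
        new = [0.0] * (len(total) + len(pmf) - 1)
        for s, p_s in enumerate(total):      # s alleles over the previous loci
            for i, p_i in enumerate(pmf):    # i alleles at the next locus
                new[s + i] += p_s * p_i
        total = new
    return total


# Two hypothetical loci for m = 1, each showing 1 or 2 alleles with
# probability 0.5.
overall = convolve_loci([[0.0, 0.5, 0.5], [0.0, 0.5, 0.5]])
```

In this example the total number of alleles over the two loci is 2, 3 or 4 with probabilities 0.25, 0.5 and 0.25.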
References
"On the exact distribution of the numbers of alleles in DNA mixtures" (2014).
Tvedebrink T. International Journal of Legal Medicine 128(3): 427-437