5.4 Example: DNA match

Suppose that a high-quality full DNA profile is recovered from a blood stain at a crime scene. This is known as the questioned profile. The profile is analysed and found to contain only one person’s DNA, making it a single donor profile. A suspect is detained and their DNA profile is taken. This is known as the reference profile. The reference profile is found to match the questioned profile. Consider the following competing source-level propositions:

\(H_p\): the suspect is the source (of the questioned profile),
\(H_d\): someone other than the suspect is the source (of the questioned profile).

In this situation the evidence is the DNA match, and we require its probability under each of the above propositions. The LR is given by

\[\text{LR}=\frac{\text{probability of a match assuming the suspect is the source}}{\text{probability of a match assuming someone other than the suspect is the source}}.\]

To obtain the LR, we need values for both of the above conditional probabilities. Consider the numerator first.

The probability of obtaining a match assuming that the suspect is the source is usually set to 1: it is considered certain that a match would be obtained if the suspect were truly the source. This is reasonable, although it is not strictly true. There is always a risk of a false negative or of some other laboratory or technical error, but in practice these risks are typically assumed to be negligible, especially for high-quality single donor full profiles. If the questioned profile is not of high quality, or if some other factor affecting the integrity of the match is present in a given case, then the expert might assign a value less than 1 to the probability of a match assuming the suspect is the source.

Assigning a value to the probability of obtaining a match assuming that someone other than the suspect is the source is more complex. It requires the probability of a match when the questioned profile is compared with the profile of a person, other than the suspect, selected at random from the population to which the suspect belongs. This is known as the random match probability (RMP). The RMP reflects how common the recovered profile is in the relevant population for the case. The more common the profile, the higher the RMP and the lower the LR; that is, the LR is inversely related to the RMP. The logic is as follows: the more common a characteristic is in a population, the worse that characteristic is at discriminating between source-level propositions. More common DNA profiles therefore result in smaller LRs than uncommon profiles.
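
In symbols, writing \(E\) for the observed match and taking the numerator to be 1 as discussed above, the inverse relationship can be sketched as follows. The symbol \(E\) and the numerical values below are illustrative assumptions, not figures from a real case.

\[\text{LR}=\frac{\Pr(E\mid H_p)}{\Pr(E\mid H_d)}=\frac{1}{\text{RMP}}.\]

For instance, a hypothetical RMP of 1 in 1,000 would give an LR of 1,000, whereas a rarer profile with a hypothetical RMP of 1 in 1,000,000 would give an LR of 1,000,000.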

The RMP is calculated using frequency databases of profiles from specific ethnic groups, since ethnic group is a major factor in genetic variation. DNA profiles discriminate strongly between individuals, so the RMP is usually very small. As a result, LRs for competing source-level propositions can be very large for single donor full profiles.
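
To make the calculation concrete, the following is a minimal Python sketch of one common textbook way of turning allele frequencies into an RMP, and the RMP into an LR. It is not necessarily the method used in any particular laboratory. It assumes Hardy-Weinberg proportions within each locus and independence between loci (the so-called product rule). The locus names, allele labels, frequencies and function names are all hypothetical and chosen only for illustration; real casework calculations typically involve further adjustments that are not modelled here.

```python
from math import prod

# Hypothetical allele frequencies per locus (illustrative values only,
# not taken from any real frequency database).
ALLELE_FREQS = {
    "locus_A": {"12": 0.10, "14": 0.20},
    "locus_B": {"9": 0.05},
    "locus_C": {"17": 0.15, "19": 0.25},
}

# Hypothetical questioned profile: the two alleles observed at each locus.
PROFILE = {
    "locus_A": ("12", "14"),  # heterozygous
    "locus_B": ("9", "9"),    # homozygous
    "locus_C": ("17", "19"),  # heterozygous
}


def genotype_frequency(freqs, a, b):
    """Genotype frequency under Hardy-Weinberg: p^2 if homozygous, else 2pq."""
    p, q = freqs[a], freqs[b]
    return p * p if a == b else 2 * p * q


def random_match_probability(profile, allele_freqs):
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    return prod(
        genotype_frequency(allele_freqs[locus], a, b)
        for locus, (a, b) in profile.items()
    )


def likelihood_ratio(rmp, p_match_given_source=1.0):
    """LR = Pr(match | suspect is source) / Pr(match | someone else is source)."""
    return p_match_given_source / rmp


rmp = random_match_probability(PROFILE, ALLELE_FREQS)
lr = likelihood_ratio(rmp)
print(f"RMP = {rmp:.3e}, LR = {lr:.3e}")
```

The `p_match_given_source` argument defaults to 1, reflecting the usual assumption for the numerator. Passing a smaller value shows how the LR shrinks if the expert judges that a match would not be certain even when the suspect is the source.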