Science & Tech Spotlight:
Probabilistic Genotyping Software
GAO-19-707SP: Published: Sep 16, 2019. Publicly Released: Sep 16, 2019.
Faced with minimal or complicated DNA evidence (a sample with DNA from more than one person, for example), investigators can use a forensic tool called probabilistic genotyping software. It estimates how likely it is that the genetic material in the sample is linked to a person of interest.
Users of this software include crime labs, police, and the FBI. However, more work is needed to establish its scientific validity. For example, there are no standards for using and interpreting results, and different software packages may produce different results. Inappropriate use may have implications for constitutional due process protections.
What is it? Probabilistic genotyping software (PGS) is used in criminal investigations to help link a genetic sample — such as a sample from crime-scene evidence — to a person of interest (POI). It facilitates genetic analysis in complicated situations, such as when a sample is partially degraded or contains DNA from more than one person.
How does it work? The usual first step is to gather genetic material from both the evidence and the POI. Both samples are then separately analyzed using a process that examines multiple regions of DNA whose length varies among individuals. Investigators can then create genetic profiles that allow them to distinguish among individuals using this variability.
Next, laboratories compare the genetic profile of the evidence with that of the POI. They often do this with a computer simulation of many different scenarios (fig. 1). PGS provides two probabilities of obtaining the observed evidence profile: one if the POI were a contributor to the sample, and one if the POI were not. Investigators can use the relative values of these two probabilities to establish the strength of the evidence in favor of, or against, the POI.
Figure 1. Genetic profiles consist of “peaks.” The peak heights represent the quantity of DNA fragments, and the peak’s horizontal position corresponds to the length of the DNA fragments. The top graph shows the POI’s DNA profile. Scenario A indicates the possibility that the DNA from the POI (orange) could have been mixed with DNA from one or more other contributors (blue) to generate the evidence sample. Scenario B indicates the possibility that DNA from other contributors (green and red) could have generated this sample, resulting in the same evidence profile.
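The comparison of the two probabilities described above is commonly expressed as a ratio. The following is a minimal sketch of that arithmetic only, not of how any particular PGS package computes its probabilities. The function name and the two input values are hypothetical and chosen purely for illustration.

```python
import math


def likelihood_ratio(p_if_contributor: float, p_if_not_contributor: float) -> float:
    """Return the ratio of the two probabilities of the evidence profile.

    p_if_contributor: probability of the evidence profile if the POI
        contributed to the sample.
    p_if_not_contributor: probability of the same profile if the POI
        did not contribute.
    """
    if not 0.0 <= p_if_contributor <= 1.0:
        raise ValueError("p_if_contributor must be between 0 and 1")
    if not 0.0 < p_if_not_contributor <= 1.0:
        raise ValueError("p_if_not_contributor must be greater than 0 and at most 1")
    return p_if_contributor / p_if_not_contributor


# Hypothetical probabilities, not taken from any real case or software run.
p_if_poi = 2.0e-3
p_if_not_poi = 4.0e-9

lr = likelihood_ratio(p_if_poi, p_if_not_poi)
log10_lr = math.log10(lr)  # ratios are often easier to compare on a log scale

# A ratio above 1 favors the POI being a contributor; below 1 favors the opposite.
print(f"ratio = {lr:,.0f}, log10 = {log10_lr:.2f}")
```

A ratio near 1 means the evidence profile is about equally probable under both scenarios and therefore does little to distinguish between them.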
How mature is it? PGS was available by the late 1990s, yet it is not fully mature. There are several software packages for PGS, some open source, some commercial. About 100 laboratories in the United States reportedly use PGS. PGS analyses are used by law enforcement offices, crime or forensics laboratories, defense attorneys, and law offices at the county, city, state, and federal levels. For example, according to a report by the President’s Council of Advisors on Science and Technology (PCAST), the FBI started using a PGS package called STRmix in 2015.
PCAST stated that, in order to establish the scientific validity of PGS, outside groups need to conduct scientific evaluation studies, in addition to the developers and affiliated laboratories that typically conduct such studies currently. PCAST also recommended publication of study results.
Opportunities
• Usable on a variety of samples. PGS allows for interpretation of genetic material that is degraded, comes from multiple people, or is present at low concentrations, such as when a person only touched a piece of evidence (instead of leaving blood behind, for example).
• Scenario analysis. PGS also could facilitate analysis of a large number of scenarios and may help ensure consistency in laboratory methodology.
Challenges
• False negatives. When a genetic marker is present but at a concentration too low to detect, it may produce a false negative result (fig. 2).
Figure 2. A peak (orange) is below the threshold (dotted line) for recognizing peaks, which may inadvertently exclude the POI during analysis. The rest of the peaks below threshold could represent background “noise” or minute quantities of DNA fragments.
• False positives. Conversely, when contamination or random “noise” gives the appearance of a marker that is not actually present, it can lead to a false match.
• Limited information content. PGS cannot attribute a DNA sample to a particular event. For example, a high likelihood of matching the POI does not mean the POI handled the object at a particular time or during a particular incident.
• Lack of clarity. It can be challenging to present results in a way that is meaningful to a lay audience. For example, if the test shows that the POI match is 500,000 times more likely than a match to a random person, how a non-specialist would interpret this statistic is unclear.
• Lack of consistency. Different software packages may yield different results from the same sample. In some cases, even the same software package can yield varying results, although this may not invalidate the results. One of the causes for lack of consistency is the lack of standards for using and interpreting PGS results.
• Lack of validation. It is challenging to validate PGS for certain scenarios, such as when a sample contains DNA from more than three people, or if the amount or quality of DNA decreases. If outside parties cannot validate the methods or examine how validation was conducted, legal questions could arise. For example, one news report suggested that results from a single PGS were used as the sole physical evidence in a trial that ended in conviction. However, the defense argued that the software company did not make its source code available for examination. Additionally, without validation, one may not know specifically why different methods produce different results.
Figure 3. Lack of clarity, standards, and validation studies may raise legal concerns about the use of PGS results.
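The false-negative case illustrated in figure 2 can be sketched with a fixed detection threshold. This is a simplified illustration under stated assumptions, not the method of any specific software package. The `Peak` type, the peak values, and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    fragment_length: int  # horizontal position of the peak
    height: float         # peak height, in arbitrary units


def call_peaks(peaks: list[Peak], threshold: float) -> list[Peak]:
    """Keep only the peaks whose height reaches the detection threshold."""
    return [p for p in peaks if p.height >= threshold]


# Hypothetical evidence profile: two strong peaks, one weak real peak (14),
# and one very small peak (17) that could be background noise.
evidence = [Peak(12, 850.0), Peak(14, 40.0), Peak(15, 900.0), Peak(17, 12.0)]
THRESHOLD = 50.0  # assumed value for illustration

called = call_peaks(evidence, THRESHOLD)
called_lengths = {p.fragment_length for p in called}

# Hypothetical POI profile with fragments of length 12 and 14.
poi_lengths = {12, 14}
missing = poi_lengths - called_lengths

# The weak peak at length 14 falls below the threshold, so the POI's
# profile appears incomplete even though the fragment was present.
print("called:", sorted(called_lengths), "missing for POI:", sorted(missing))
```

Lowering the threshold would recover the weak peak at 14 but would bring the analysis closer to accepting the noise-level peak at 17, which is the false-positive risk described in the list above.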
Why This Matters
New developments in software to analyze contaminated or partly degraded DNA could greatly facilitate criminal investigations. However, the validity of the analysis and the implications for constitutional due process protections remain unsettled.
Policy Context and Questions
PGS use in forensic analyses is increasing, but PGS results reportedly can be used with only limited confidence under certain circumstances. Some key questions for consideration include:
• In what situations is PGS useful, and when should it be avoided or used with caution?
• What are the gaps in empirical evidence that need to be filled to increase confidence in PGS results for use in criminal or civil trials, and what is the cost and feasibility of addressing these gaps?
• How are federal agencies evaluating and using PGS and what should the federal role be?
• What additional validation work is needed to expand use of forensic PGS?
For more information, contact Timothy M. Persons, Ph.D., Chief Scientist, at (202) 512-6412 or firstname.lastname@example.org.