A Quantitative Assessment of Shoeprint Accidental Patterns with Implications Regarding Similarity, Frequency and Chance Association of Features

Jacqueline A. Speir
West Virginia University

Executive Summary

The National Academy of Sciences (NAS) 2009 report on Strengthening Forensic Science in the United States made several research recommendations related to forensic footwear examinations, including the need for greater clarity concerning the variability of outsole class and individual (randomly acquired) characteristics (RACs), the validity and reliability of current methods and practices, the relative frequency of features, and the appropriate use of statistical standards (NAS, 2009). In response to these recommendations, this project performed foundational research to clarify the empirical frequency and shape distribution of randomly acquired characteristics on outsoles collected from a general population.

To achieve this goal, an outsole database was generated, resulting in summary statistics and frequency estimates on 72,306 randomly acquired characteristics extracted from 1,300 outsoles. The subsequent results are based on a combination of automated and analyst-derived image extraction and processing tools, with the human-dependent step of RAC detection and marking. Given some unavoidable subjective steps in the image processing chain, inter- and intra-analyst variability in RAC marking was assessed using a quality control/assurance program that included the duplicate marking of 5,477 randomly acquired characteristics across 160 shoes (320 RAC maps). The results indicate that RAC detection is the largest variable not easily controlled (even with training), but when RACs are equally detected in repeat analyses, they are marked relatively consistently, with mean polar coordinate localization differences of less than r ± 0.2 mm and θ ± 0.1°, and shape attribution (e.g., isometric, elongated or irregular) agreement nearly 75% of the time.

Post-detection and extraction, each RAC was broadly characterized in terms of its degree of linearity, circularity and triangularity. Using geometric shape classification rules, automated shape attribution was compared to human-perceptual assignments and found to be in agreement between 68% and 95% of the time, across 1,352 comparisons, depending on the complexity of the dataset presented for analysis. Overall, the results indicate limited utility in classifying complex features into prescribed shape classes (such as circles, lines, curves, rectangles, triangles, etc.), and that future work should consider alternative mechanisms (such as shape clustering), as opposed to strict categorization, as a means of grouping randomly acquired characteristics in terms of shape similarity.

Next, outsole size and shape normalization was performed. This step, although not ideal, was deemed unavoidable in order to create sufficient power in the inter-comparison of all 1,300 shoes in the database, regardless of outsole style/shape and size. Post-normalization, each RAC was localized to one of 990 possible spatial bins, each 5 mm × 5 mm in size. Post-localization and binning, estimates of co-occurrence and similarity were possible. This was accomplished by computing the Fourier descriptor of each RAC, and for RACs with positional co-occurrence, pairwise comparisons were performed using five similarity metrics (Euclidean distance (ED), Hausdorff distance (HD), modified cosine similarity (MCS), matched filter (MF), and modified phase only correlation (MPOC)). Variation in similarity score as a function of RAC shape, perimeter and area was computed and is reported, along with receiver operating characteristic and cumulative match characteristic curves that provide insight on the use of numerical metrics to rank-order RACs from different sources. Results indicate superior performance with distance metrics (HD and ED), making Hausdorff distance the best candidate (of those metrics compared) for computing score-based likelihood ratios. More specifically, it was noted that both HD and ED had statistically indistinguishable AUCs (area under the curve) of 0.82, and that both were significantly better than MCS, MF and MPOC. However, alternative metrics, including deep learning, might prove equally or more useful, and additional work is needed to fully appreciate the strengths and weaknesses associated with the use of numerical shape comparisons within the field of forensic footwear examinations.

This resource was prepared by the author(s) using Federal funds provided by the U.S. Department of Justice. Opinions or points of view expressed are those of the author(s) and do not necessarily reflect the official position or policies of the U.S. Department of Justice.
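As a rough illustration of the binning and shape comparison steps described in this section, the following Python sketch assigns a RAC position to a 5 mm × 5 mm bin, computes a simple Fourier descriptor of a closed contour, and scores a pair of contours with Euclidean and Hausdorff distances. It is not the project's implementation. The function names, the grid origin, the number of retained coefficients, the use of coefficient magnitudes (which discards orientation), and the choice to apply Hausdorff distance to raw contour points are all assumptions made for illustration.

```python
import numpy as np

BIN_MM = 5.0  # bin edge length reported in the summary


def spatial_bin(x_mm, y_mm, bin_mm=BIN_MM):
    """(row, col) of the square bin holding a point; a grid origin at (0, 0) is assumed."""
    return int(y_mm // bin_mm), int(x_mm // bin_mm)


def fourier_descriptor(contour, n_coeffs=8):
    """Magnitudes of the lowest positive- and negative-frequency coefficients.

    contour: (N, 2) array of ordered boundary points, with N > 2 * n_coeffs.
    """
    z = contour[:, 0] + 1j * contour[:, 1]  # boundary as a complex signal
    coeffs = np.fft.fft(z) / len(z)
    # coeffs[0] is the centroid; dropping it removes dependence on position.
    low = np.concatenate([coeffs[1:n_coeffs + 1], coeffs[-n_coeffs:]])
    return np.abs(low)


def euclidean_distance(d1, d2):
    """Euclidean distance between two descriptor vectors of equal length."""
    return float(np.linalg.norm(d1 - d2))


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between point sets of shape (N, 2) and (M, 2)."""
    diff = a[:, None, :] - b[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=2))  # all pairwise point distances
    return float(max(dists.min(axis=1).max(), dists.min(axis=0).max()))


# Two synthetic contours standing in for RAC outlines (hypothetical data).
t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])
ellipse = np.column_stack([2.0 * np.cos(t), 0.5 * np.sin(t)])

ed = euclidean_distance(fourier_descriptor(circle), fourier_descriptor(ellipse))
hd = hausdorff_distance(circle, ellipse)
print(spatial_bin(12.0, 7.5), round(ed, 3), round(hd, 3))
```

For both metrics, smaller scores indicate more similar shapes, which is what allows candidate RACs within the same bin to be rank-ordered.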

Equipped with RACs with known positional co-occurrence and shape similarity, three questions related to chance co-occurrence were asked. First, what was the empirical frequency of finding a pair of RACs with positional similarity anywhere on an outsole within this dataset? Second, what was the empirical frequency of selecting two shoes at random and finding shape similarity at a specific location? Lastly, what kind of numerical/quantitative similarity is expressed by RACs with positional co-occurrence?

With regard to positional co-occurrence anywhere on an outsole, the empirical frequency is extremely high (1 in 4 for elongated features, to 1 in 10 for isometric features). However, when positional co-occurrence in a specific location is queried, median results range from 1 in 2,080 for elongated features, to 1 in 9,279 for irregularly shaped features. In addition, the worst-case scenario (greatest chance association) was found to be 1 in 281 for elongated features, while the best-case scenario (lowest chance association) was found to be 1 in 844,350 or better (this value is limited by the size of the database).

However, RACs with positional co-occurrence (and even identical shape categorizations) are not necessarily geometrically similar (e.g., two linear elements could vary in orientation, length, thickness, curvature, etc.). Thus, the mathematical similarity of RACs with coincidental positional and shape similarity was computed based on 6,993 known match comparisons and 3,239,114 known non-match comparisons. The results indicate that 13% of known non-matches have likelihood ratios (LR) greater than 1.0, but even for known non-match RACs with some degree of numerical similarity, very few were found to be visually indistinguishable. In fact, to assess the possibility of numerical versus visual confusion, a subset of 19,800 of the most similar RACs from different sources (1,000 shoes) were visually compared. More specifically, all pairwise comparisons for RACs from a subset of 1,000 shoes were compared and ranked (for a total of 2,022,595 known non-matches), and the five most similar RACs per spatial bin were examined to determine visual differentiability. Of the almost 20,000 visual comparisons that were performed, all but 25 pairs were deemed distinguishable based on RAC geometry (orientation, size, shape, complexity, etc.), with an associated probability of confusion on the order of 1.2E-05 (or 0.001% of the time, assuming an effective ranking). Moreover, when the 25 indistinguishable RAC pairs were further characterized, all were found to be differentiable based on shoe class characteristics (differences in make/model, size and/or degree of wear).
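The reported probability of confusion appears consistent with a simple ratio of the 25 visually indistinguishable pairs to the 2,022,595 ranked known non-match comparisons. The short Python check below reproduces that arithmetic. Treating the figure as this ratio is an inference from the numbers in this summary, not a formula stated by the author, and the variable names are illustrative.

```python
# Counts taken from the summary text.
indistinguishable_pairs = 25
known_non_match_comparisons = 2_022_595

# Assumed definition: indistinguishable pairs over all ranked known non-match comparisons.
p_confusion = indistinguishable_pairs / known_non_match_comparisons

print(f"{p_confusion:.1e}")          # scientific notation
print(f"{p_confusion * 100:.4f}%")   # same value as a percentage
```

Under this assumption the ratio is roughly 1.2E-05. It relies on the ranking being effective, meaning that pairs ranked below the five most similar per bin are taken to be distinguishable without visual review.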

In conclusion, there is evidence to assert that RACs possess a high degree of forensic discrimination potential. However, the widespread and general applicability of any associated probability and chance association computed based on this study must be considered within the confines that bound the research dataset and methodology. More specifically, all results are a function of the nature of the footwear population studied, which was predominantly athletic gear (86%), men’s wear (72%), and of sizes 9 through 11 (53%). Moreover, all shoes have been inter-compared without regard for class characteristics, which required the less-than-ideal step of normalization as a function of outsole size and shape. As such, the probability of confusion, reported to be on the order of 1.2E-05, must be interpreted within the confines of this footwear population, and with full understanding of the nature of the data analyses that led to these conclusions.


  • Executive Summary
    • Research Overview
    • Review of Selected Works
    • Research Methodology & Results
      • Data Acquisition
      • Pre-processing of High Quality Prints
      • Registration
      • Segmentation
      • Processing
      • Outsole Size & Shape Normalization
      • Shape Descriptor
    • Similarity Assessment
      • RAC Loss
      • Comparison
      • Individual RAC Similarity
      • Similarity as a Function of RAC Shape
      • Similarity as a Function of RAC Size
    • Differentiating between KM and KNM Crime Scene RACs using Similarity Metrics
      • RAC Map Correlation
    • Chance Co-occurrence
      • Use of Similarity in the Web Application
    • Impact, Outcomes, Evaluation & Dissemination
    • Added Value
    • Publications & Abstracts
    • Meetings, Presentations & Invited Talks
      • Limitations
    • Appendices
      • Bibliography
      • Author Response to Reviewer Comments
