Sex estimation is a fundamental step in forensic identification protocols and has traditionally relied on morphoscopic assessment of the pelvis. However, the growing use of machine learning approaches and the need for population-specific validation call for a comprehensive evaluation of alternative methods. This study provides the first direct comparison of established morphoscopic methods (Phenice 1969; Bruzek 2002) against a multivariable long bone linear discriminant analysis (LDA) model in a sample of 333 documented individuals from the CAL Milano Cemetery Skeletal Collection. Missing values in the metric data were imputed prior to analysis. The morphoscopic methods achieved high accuracy (Phenice: 96.9%; Bruzek: 97.7%) but could not be applied to every individual because of preservation limitations (exclusion rates of 32.4% and 2.4%, respectively). The long bone LDA model performed comparably, with threshold-dependent accuracy ranging from 95.2% (0.50 threshold) to 98.1% (0.88 threshold), while remaining applicable to all specimens. Crucially, disagreement analysis revealed method-specific error patterns with minimal overlap in the misclassified individuals, indicating that the methods provide complementary rather than redundant diagnostic signals. These findings validate long bones as a reliable alternative for sex estimation in fragmentary remains and establish population-specific accuracy benchmarks for contemporary Italian forensic applications. The threshold-adjustable probabilistic framework offers operational flexibility in balancing classification certainty against sample coverage.
Knecht et al. (Sat,) studied this question.
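The trade-off between classification certainty and sample coverage described in the abstract can be illustrated with a minimal sketch. It assumes that the threshold is applied to the larger of the two posterior probabilities and that individuals falling below it are left unclassified; the abstract does not spell out this rule, so it is an assumption. The function names and the toy posterior values are hypothetical, are not taken from the study, and do not reproduce its results.

```python
from typing import List, Optional, Tuple


def classify_with_threshold(p_male: float, threshold: float = 0.5) -> Optional[str]:
    """Return 'M' or 'F' if the larger posterior reaches the threshold, else None.

    Assumption: p_male is a posterior probability of male sex, such as one
    produced by an LDA model; p_female is taken as its complement.
    """
    p_female = 1.0 - p_male
    if max(p_male, p_female) < threshold:
        return None  # too uncertain at this threshold: leave unclassified
    return "M" if p_male >= p_female else "F"


def accuracy_and_coverage(
    posteriors: List[float], true_sex: List[str], threshold: float
) -> Tuple[float, float]:
    """Accuracy among classified individuals and the fraction classified."""
    calls = [classify_with_threshold(p, threshold) for p in posteriors]
    decided = [(c, t) for c, t in zip(calls, true_sex) if c is not None]
    coverage = len(decided) / len(calls)
    if not decided:
        return float("nan"), coverage
    accuracy = sum(c == t for c, t in decided) / len(decided)
    return accuracy, coverage


# Hypothetical toy data, for illustration only (not from the study).
toy_posteriors = [0.97, 0.91, 0.62, 0.45, 0.08, 0.30, 0.55, 0.02]
toy_true_sex = ["M", "M", "M", "M", "F", "F", "F", "F"]

for thr in (0.50, 0.88):
    acc, cov = accuracy_and_coverage(toy_posteriors, toy_true_sex, thr)
    print(f"threshold={thr:.2f}  accuracy={acc:.2f}  coverage={cov:.2f}")
```

In this toy example, raising the threshold removes the borderline cases, so accuracy among the classified individuals rises while fewer individuals receive a classification. This is the kind of balance the adjustable threshold is meant to control.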