December 1, 2025
Open Access

Unsupervised discovery of standard model structures via dimensionality reduction

Key Points

UMAP produced the clearest clustering of particle families among the three methods tested, using only intrinsic particle properties.
The study applied dimensionality reduction to a dataset of 31 fundamental and composite particles described by mass, charge, spin, and lifetime.
Principal component analysis could not resolve the non-linear relationships between particle categories, whereas UMAP produced distinct clusters.
These findings may support further advances in particle classification using unsupervised algorithms.

Abstract

Objective: To determine the efficacy of unsupervised machine learning algorithms in autonomously reconstructing the established classification structures of the Standard Model of particle physics. This study investigates whether dimensionality reduction techniques can identify these structures using only a dataset of intrinsic particle properties.

Design: A dataset of 31 fundamental and composite particles was curated, detailing their physical properties (mass, charge, spin, and lifetime). Three prominent dimensionality reduction algorithms were applied to project the high-dimensional property space into a two-dimensional visualization: Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP).

Results: The UMAP algorithm demonstrated superior performance, generating well-defined, distinct clusters that accurately correspond to the established particle families. In contrast, PCA was unable to resolve the non-linear relationships between categories, while t-SNE, despite identifying local groupings, failed to preserve the global structure between clusters. The success of UMAP confirms that the intrinsic physical properties of particles contain sufficient information for an advanced unsupervised algorithm to rediscover their fundamental classifications.
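The projection step described under Design can be illustrated with a minimal sketch. It is not the authors' pipeline: the five particles and their rounded textbook values stand in for the 31-particle dataset, and the log scaling, the placeholder lifetime for stable particles, and the names `PARTICLES`, `build_features`, and `pca_2d` are all assumptions made for illustration. Only the PCA baseline is shown, implemented with NumPy's SVD.

```python
import numpy as np

# Assumed placeholder lifetime (seconds) for particles treated as stable.
STABLE = 1e30

# (name, mass in MeV/c^2, charge in units of e, spin, mean lifetime in s)
# Rounded textbook values used for illustration; not the paper's dataset.
PARTICLES = [
    ("electron", 0.511, -1, 0.5, STABLE),
    ("muon", 105.66, -1, 0.5, 2.197e-6),
    ("pi+", 139.57, 1, 0.0, 2.603e-8),
    ("proton", 938.27, 1, 0.5, STABLE),
    ("neutron", 939.57, 0, 0.5, 878.4),
]


def build_features(particles):
    """Return a standardized feature matrix (one row per particle).

    Mass and lifetime span many orders of magnitude, so they are
    log-scaled here. This preprocessing is an assumption, not a step
    stated in the abstract.
    """
    rows = [[np.log10(m), q, s, np.log10(t)] for _, m, q, s, t in particles]
    X = np.array(rows, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - X.mean(axis=0)) / std


def pca_2d(X):
    """Project rows of X onto the first two principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T


features = build_features(PARTICLES)
embedding = pca_2d(features)

for (name, *_), (x, y) in zip(PARTICLES, embedding):
    print(f"{name:>8s}  PC1={x:+.3f}  PC2={y:+.3f}")
```

The same standardized matrix could in principle be passed to `sklearn.manifold.TSNE` or `umap.UMAP` through their `fit_transform` methods to reproduce the other two projections. The hyperparameters for those methods are not given in the abstract, so any values chosen would be further assumptions.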
