Driving behavior primitives serve as fundamental building blocks for modeling and semantically interpreting time-series driving behavior. Extracting behavior primitives is challenging due to the high dimensionality and complex interdependencies among behavioral variables, as well as the rich temporal dynamics of real-world driving maneuvers. This paper proposes an unsupervised two-stage framework that optimizes time-series segmentation and segment clustering to yield interpretable and context-aware behavior primitives. First, a Hierarchical Bayesian Model-based Agglomerative Sequence Segmentation (H-BMASS) method is introduced that decouples longitudinal and lateral driving behaviors and performs hierarchical segmentation. This design mitigates under-segmentation by ensuring that change points reflect genuine behavioral transitions. Second, to cluster driving segments of varying durations into a finite set of primitive types, an Integrating Numerical and Trend Discretization Latent Dirichlet Allocation (INT-LDA) model is developed. The model combines variables’ temporal trend discretization with numerical discretization to create symbolic representations of driving data, thereby preserving the essential time dependency of driving behavior and improving segment clustering accuracy. Evaluated on naturalistic driving data collected from a high-fidelity simulator, the proposed framework identifies five distinct behavior primitives with clear physical interpretations. The resulting primitives provide a compact, semantically rich representation of driving behavior, facilitating driver modeling, decision prediction, and scenario-based testing for autonomous vehicles.
Zhang et al. studied this question.
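To make the symbolic representation behind INT-LDA concrete, the following minimal sketch shows one way a numerical discretization and a trend discretization could be combined into "words" for a single driving segment. It is an illustration only, not the authors' implementation: the function names, the bin edges, the trend tolerance, and the sample speed values are all assumptions, since the abstract does not specify the discretization rules.

```python
from collections import Counter


def numeric_symbol(value, edges):
    """Return the index of the bin containing value.

    edges is an ascending list of bin boundaries (hypothetical values).
    """
    return sum(value >= edge for edge in edges)


def trend_symbol(delta, tol):
    """Map a change between consecutive samples to a trend label.

    "U" = increasing, "D" = decreasing, "S" = steady within +/- tol.
    The labels and the tolerance are assumptions for illustration.
    """
    if delta > tol:
        return "U"
    if delta < -tol:
        return "D"
    return "S"


def segment_to_words(values, edges, tol):
    """Turn one segment of a single variable into symbolic words.

    Each word joins the numeric bin of the current sample with the
    trend relative to the previous sample, so the word keeps both the
    magnitude and the direction of change.
    """
    words = []
    for prev, curr in zip(values, values[1:]):
        words.append(f"{numeric_symbol(curr, edges)}{trend_symbol(curr - prev, tol)}")
    return words


def bag_of_words(words):
    """Count word occurrences, giving a document-style representation."""
    return Counter(words)


# Hypothetical speed samples (m/s) from one segment.
speed = [10.0, 10.2, 12.5, 15.1, 15.0, 12.0]
words = segment_to_words(speed, edges=[11.0, 14.0], tol=0.5)
print(words)                # ['0S', '1U', '2U', '2S', '1D']
print(bag_of_words(words))
```

Under this sketch, each segment becomes a variable-length "document" of words, and the resulting word counts are the kind of input a Latent Dirichlet Allocation model could cluster into a finite set of primitive types, regardless of segment duration.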