DBSCAN is widely used to identify structured regions in unlabeled data, but its performance depends critically on the choice of the neighborhood radius ε. Traditional heuristics for estimating ε often become unreliable in high-dimensional or varying-density settings because they rely on local geometric criteria, which can fail under smooth transitions or topological ambiguity. This work presents a three-level perspective on DBSCAN hyperparameter selection. At the algorithmic level, ε controls neighborhood connectivity and structural transitions in clustering. At the modeling level, the ordered k-distance signal is approximated through a surrogate dynamical estimation framework inspired by a mass–spring–damper system. At the causal level, the resulting estimator is interpreted through interventions on its internal threshold-selection mechanism. The proposed method models the variation of ε with ordinary differential equations defined on the ordered k-distance signal, so that structural transitions in density organization can be analyzed through a surrogate dynamical representation. System identification is performed with L-BFGS-B optimization on the smoothed k-distance curve, and the system dynamics are integrated with the fourth-order Runge–Kutta method. The resulting estimator identifies transition regions that are structurally informative for ε selection in DBSCAN. To analyze the estimator at the intervention level, Pearl’s do-calculus is used to compute the Average Causal Effect (ACE). The method was evaluated on synthetic benchmarks and on the Covtype dataset, including scenarios with multi-density overlap and feature spaces up to ℝ¹⁰. The resulting ACE values, +0.9352, +0.5148, and +0.9246, indicate that the proposed estimator improves intervention-based ε selection relative to the geometric baseline across the evaluated datasets.
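As a rough illustration of the integration step named above, the following sketch advances a generic mass–spring–damper system, m·x'' + c·x' + k·x = F(t), with the classical fourth-order Runge–Kutta scheme. The function names (`msd_rhs`, `rk4_step`, `simulate`), the parameter values, and the forcing term are hypothetical choices made for this example; how the forcing is tied to the ordered k-distance signal is not specified here and is left as a placeholder.

```python
def msd_rhs(t, state, m, c, k, force):
    """First-order form of m*x'' + c*x' + k*x = force(t); state = (x, v)."""
    x, v = state
    return (v, (force(t) - c * v - k * x) / m)


def rk4_step(f, t, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(t, state)
    k2 = f(t + h / 2, tuple(s + h / 2 * d for s, d in zip(state, k1)))
    k3 = f(t + h / 2, tuple(s + h / 2 * d for s, d in zip(state, k2)))
    k4 = f(t + h, tuple(s + h * d for s, d in zip(state, k3)))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * g + d)
        for s, a, b, g, d in zip(state, k1, k2, k3, k4)
    )


def simulate(m, c, k, force, state0, t_end, n_steps):
    """Integrate from t = 0 to t_end; returns the list of (x, v) states."""
    h = t_end / n_steps
    state = state0
    trajectory = [state]
    for i in range(n_steps):
        t = i * h  # recompute t from the step index to avoid drift
        state = rk4_step(
            lambda tt, s: msd_rhs(tt, s, m, c, k, force), t, state, h
        )
        trajectory.append(state)
    return trajectory


# Assumed sanity check: an undamped, unforced oscillator with x(0) = 1,
# v(0) = 0 has the closed-form solution x(t) = cos(t).
trajectory = simulate(
    m=1.0, c=0.0, k=1.0, force=lambda t: 0.0,
    state0=(1.0, 0.0), t_end=1.0, n_steps=100,
)
```

In an identification loop, a simulator of this kind would be called repeatedly by the optimizer with candidate values of m, c, and k, and the simulated response compared against the smoothed k-distance curve; the loss function for that comparison is not stated here and is not sketched.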
The method's practical computational cost is dominated by nearest-neighbor search, behaving approximately as O(N log N) under favorable indexing conditions and degrading toward O(N²) in high-dimensional or weak-pruning regimes.
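To make the cost statement concrete, here is a minimal sketch of how an ordered k-distance signal can be computed. It uses a brute-force neighbor search, which corresponds to the O(N²) regime; an indexed search (for example a KD-tree) would replace the inner loop in the favorable case. The function name `k_distance_curve`, the ascending sort order, and the toy points are assumptions made for this example.

```python
import math


def k_distance_curve(points, k):
    """Ascending-sorted distance from each point to its k-th nearest neighbor.

    Brute force: every point is compared with every other point, so the
    cost grows quadratically with the number of points.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < len(points)")
    k_distances = []
    for i, p in enumerate(points):
        # Distances from p to all other points (the point itself is excluded).
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        k_distances.append(dists[k - 1])
    return sorted(k_distances)


# Toy data (hypothetical): four corners of a unit square plus one far outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
curve = k_distance_curve(points, k=2)
```

On this toy input the curve is flat for the four clustered points and jumps sharply at the outlier; a transition of this kind is what ε estimators look for in the ordered signal.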
Garcia-Sanchez et al. (Wed,) studied this question.