Using big indirect data (BID) is a relatively inexpensive and effective way to predict unknown design parameters at a target site. This study analysed Benchmark Problem #1 in Otake et al. (2025), which provides four real-world BID datasets and one validation dataset. Values of the undrained shear strength (s_u) were predicted using Bayesian Markov chain Monte Carlo simulation. The borehole data were first divided into two groups based on the spatial trend of soil properties, and each borehole was then divided into layers using a clustering method. Outliers were detected and removed to ensure reasonable estimates of the random field parameters. The autocorrelation of s_u and the cross-correlation between s_u and other parameters were then used to generate posterior samples of s_u at the target points. The results show that the Local-BID-V/4 dataset yields the highest prediction accuracy. In contrast, the Cluster-BID/4 dataset underestimates the uncertainty of s_u because of limited data, whereas the Global-BID/4 dataset overestimates it because it lacks data similar to the target dataset. These results highlight the importance of selecting relevant and similar BID for accurate site characterisation.
Qi et al. (Sun,) studied this question.
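The role that the autocorrelation of s_u plays in the prediction step can be illustrated with a minimal sketch. It is not the procedure used in the study: it replaces the Markov chain Monte Carlo sampler with closed-form Gaussian conditioning, ignores the cross-correlation with other parameters, and assumes a stationary field with a single-exponential autocorrelation model. The function names, the mean, the standard deviation and the scale of fluctuation `sof` are all hypothetical choices made for illustration.

```python
import numpy as np


def exp_autocorr(z1, z2, sof):
    """Single-exponential autocorrelation between two sets of depths.

    sof is the scale of fluctuation (an assumed, not estimated, value).
    """
    return np.exp(-2.0 * np.abs(z1[:, None] - z2[None, :]) / sof)


def posterior_samples(z_obs, su_obs, z_tgt, mean, sd, sof, n=1000, seed=0):
    """Draw samples of s_u at target depths, conditioned on observed s_u.

    Assumes a stationary Gaussian random field with known mean and
    standard deviation; a small jitter keeps the matrices well conditioned.
    """
    rng = np.random.default_rng(seed)
    k_oo = sd**2 * exp_autocorr(z_obs, z_obs, sof) + 1e-9 * np.eye(len(z_obs))
    k_to = sd**2 * exp_autocorr(z_tgt, z_obs, sof)
    k_tt = sd**2 * exp_autocorr(z_tgt, z_tgt, sof)

    # Weights K_to K_oo^{-1}, obtained with a linear solve instead of an inverse.
    w = np.linalg.solve(k_oo, k_to.T).T

    mu = mean + w @ (su_obs - mean)   # conditional mean
    cov = k_tt - w @ k_to.T           # conditional covariance
    cov = 0.5 * (cov + cov.T) + 1e-9 * np.eye(len(z_tgt))  # symmetrise

    samples = rng.multivariate_normal(mu, cov, size=n)
    return samples, mu, cov
```

In this sketch, a target depth that coincides with an observation recovers the observed value with almost no uncertainty, while a target far from all observations reverts to the prior mean and variance. That behaviour mirrors the dependence of prediction uncertainty on how much nearby, similar data is available.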