Entity Resolution (ER) is a challenging data integration task that is typically addressed through the Filtering-Verification framework. Filtering reduces the quadratic search space in an unsupervised manner, relying on heuristics, whereas verification performs the actual matching, usually through a machine learning or deep learning approach. Numerous solutions have been proposed for each step, but the literature typically examines the two steps as orthogonal tasks, and analyzing their combined performance is non-trivial due to technical and methodological challenges. We facilitate the benchmarking of state-of-the-art verification algorithms under realistic settings by applying them to the candidate pairs that established filtering approaches generate from popular real-world datasets. To democratize this benchmarking, we developed SMBench, an open-source, hands-off Web application that allows users to run a wide range of experiments through an intuitive user interface requiring no coding or ER expertise. SMBench is publicly available at https://smbench.kbs.uni-hannover.de, and its code is released at https://github.com/erbench/erbench. We describe its frontend and backend, elaborating on the technologies used for their implementation as well as on the state-of-the-art ER methods they support. Using SMBench, we perform an extensive experimental analysis that combines 3 filtering methods with 7 verification approaches and applies them to 9 datasets. The experimental results yield insights into the relative effectiveness, time efficiency, and memory efficiency of the considered methods.
Astappiev et al. studied this question.
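To make the Filtering-Verification framework concrete, the following is a minimal sketch of the two steps on a toy dataset. It is not SMBench's implementation: token blocking stands in for the filtering step, and a Jaccard-similarity threshold stands in for the learned matcher that verification would normally use. The function names, the threshold of 0.5, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def tokens(text):
    """Lowercased whitespace tokens (an illustrative tokenization choice)."""
    return set(text.lower().split())


def filter_candidates(records):
    """Filtering via token blocking: records that share at least one token
    become a candidate pair, so most of the quadratic space is never compared."""
    blocks = defaultdict(set)
    for rid, text in records.items():
        for tok in tokens(text):
            blocks[tok].add(rid)
    candidates = set()
    for ids in blocks.values():
        # Sorting makes each pair canonical, so duplicates collapse in the set.
        candidates.update(combinations(sorted(ids), 2))
    return candidates


def verify(records, candidates, threshold=0.5):
    """Verification: keep candidate pairs whose Jaccard similarity reaches the
    threshold. A real matcher would be a trained ML or DL model instead."""
    matches = set()
    for a, b in candidates:
        ta, tb = tokens(records[a]), tokens(records[b])
        if len(ta & tb) / len(ta | tb) >= threshold:
            matches.add((a, b))
    return matches


# Hypothetical product records; ids 1-2 and 3-4 describe the same entities.
records = {
    1: "Apple iPhone 13 128GB",
    2: "iphone 13 apple 128gb black",
    3: "Samsung Galaxy S21",
    4: "Galaxy S21 by Samsung",
    5: "Apple MacBook Air 13",
}

candidates = filter_candidates(records)
matches = verify(records, candidates)
print(f"{len(candidates)} candidate pairs out of 10 possible, {len(matches)} matches")
```

On these five records, filtering passes only the pairs that share a token to verification, and verification then rejects the pairs that merely share generic tokens such as a brand name. The quality of the final matches therefore depends on both steps, which is why their combined performance is worth benchmarking.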