Efficient clustering of high-dimensional data sets with application to reference matching | Synapse