One of the most commonly used clustering algorithms in the pharmaceutical industry worldwide is Jarvis−Patrick (J−P) (Jarvis, R. A. IEEE Trans. Comput. 1973, C-22, 1025−1034). The implementation of J−P in Daylight software, which uses Daylight's fingerprints and the Tanimoto similarity index, can handle sets of 100 k molecules in a few hours. However, the J−P algorithm has several problems that make it difficult to cluster large data sets consistently and in a timely manner. The clusters produced depend strongly on the choice of the two parameters needed to run J−P, so the method tends to produce clusters that are either very large and heterogeneous or homogeneous but too small. In either case, J−P always requires time-consuming manual tuning. This paper describes an algorithm that identifies dense clusters, consistently and automatically, in which the similarity within each cluster reflects the Tanimoto value used for the clustering and, more importantly, in which the cluster centroid is similar, at least at the given Tanimoto value, to every other molecule in the cluster. Throughout this paper, the term similarity refers to the overall similarity between two given molecules, as defined by Daylight's fingerprints and the Tanimoto similarity index.
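The abstract does not list the steps of the algorithm, so the following is only a minimal sketch of one way to obtain the stated property, namely that every member of a cluster is similar to the cluster centroid at or above the chosen Tanimoto value. It assumes fingerprints are represented as plain Python sets of on-bit positions rather than Daylight's fingerprints, and the names `tanimoto` and `centroid_clusters` are hypothetical, chosen for illustration.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto index of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        return 0.0  # assumption: two empty fingerprints are treated as dissimilar
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def centroid_clusters(fps: List[Set[int]], threshold: float) -> List[List[int]]:
    """Group molecule indices so that each member is within `threshold`
    of the first index in its cluster (the centroid).

    Illustrative sketch only; not taken from the paper's text.
    """
    n = len(fps)
    # Neighbour list for each molecule: everything at or above the threshold.
    neighbours = [
        [j for j in range(n) if tanimoto(fps[i], fps[j]) >= threshold]
        for i in range(n)
    ]
    # Consider molecules with the most neighbours first (ties broken by index),
    # so that dense regions are turned into clusters before sparse ones.
    order = sorted(range(n), key=lambda i: (-len(neighbours[i]), i))
    assigned = [False] * n
    clusters: List[List[int]] = []
    for i in order:
        if assigned[i]:
            continue
        # The centroid comes first, followed by its still-unassigned neighbours.
        members = [i] + [j for j in neighbours[i] if j != i and not assigned[j]]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters


if __name__ == "__main__":
    toy = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5}, {10, 11}, {10, 11, 12}]
    print(centroid_clusters(toy, 0.7))
```

In this sketch the only parameter is the Tanimoto threshold, and molecules with no sufficiently similar unassigned neighbour end up as single-member clusters. The pairwise neighbour search is quadratic in the number of molecules, which is acceptable for a toy example but would need a more careful implementation for sets of 100 k molecules.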
Darko Butina
Journal of Chemical Information and Computer Sciences
University of Hertfordshire
www.synapsesocial.com/papers/69da2535387cf70698686462 — DOI: https://doi.org/10.1021/ci9803381