Strings are ubiquitous in computer systems and hence string processing has attracted extensive research effort from computer scientists in diverse areas. One of the most important problems in string processing is to efficiently evaluate the similarity between two strings based on a specified similarity measure. String similarity search is a fundamental problem in information retrieval, database cleaning, biological sequence analysis, and more. While a large number of dissimilarity measures on strings have been proposed, edit distance is the most popular choice in a wide spectrum of applications. Existing indexing techniques for similarity search queries based on edit distance, e.g., approximate selection and join queries, rely mostly on n-gram signatures coupled with inverted list structures. These techniques are tailored for specific query types only, and their performance remains unsatisfactory especially in scenarios with strict memory constraints or frequent data updates. In this paper
Zhenjie Zhang
Nankai University
Marios Hadjieleftheriou
AT&T (United States)
Beng Chin Ooi
Ningbo University
National University of Singapore
AT&T (United States)
Zhang et al. studied this question.
synapsesocial.com/papers/6a1be1abd54006be995f2a15 — DOI: https://doi.org/10.1145/1807167.1807266