Abstract: In interactive query formulation systems for SPARQL, suggested query extensions may result in queries with no answers, so-called unproductive queries. To avoid these, systems must detect and eliminate unproductive extensions. A straightforward approach is to check whether each extended query yields results, but this can be highly inefficient, especially for complex queries involving multiple costly joins. One possible solution is to construct an index that allows efficient checks. However, since queries can be arbitrarily large and complex, it is infeasible to build a complete index within finite space. A practical alternative is to index only the most common query patterns. This produces a smaller, finite index but may introduce inaccuracies in detecting unproductive extensions. The key challenge is balancing precision (the proportion of unproductive extensions correctly identified) against cost (the storage required for the index). In this article, we study this trade-off and present methods to construct indices with low cost while retaining high precision, given a dataset and a representative query log. The problem is difficult because the space of possible indices is extremely large, making exhaustive search infeasible. Moreover, computing precision and cost for even a single index can be expensive. We therefore propose four heuristic-based search strategies to explore the index space efficiently, together with approximate cost and precision estimators. We evaluate these methods analytically and experimentally, using a benchmark based on Wikidata. Results show that our approaches identify non-trivial index configurations with high precision and low cost, making them practical for eliminating unproductive extensions in interactive systems.
Klungre et al. studied this question.
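To make the trade-off described in the abstract concrete, the following is a minimal sketch and not the authors' method. It assumes a made-up toy dataset and a hypothetical index that stores only (class, property) pairs co-occurring on the same subject. It compares the naive check (evaluate every extended query, much like a SPARQL ASK) with the index lookup, then computes precision as the share of truly unproductive extensions that the index detects, and cost as the number of index entries. All names (`TRIPLES`, `has_answers`, `build_index`, `precision`) are illustrative assumptions.

```python
# Toy RDF-like dataset of (subject, predicate, object) triples. Entirely made up.
TRIPLES = {
    ("alice", "type", "Person"),
    ("alice", "worksAt", "acme"),
    ("bob", "type", "Person"),
    ("bob", "bornIn", "oslo"),
    ("acme", "type", "Company"),
    ("acme", "locatedIn", "oslo"),
}


def match(pattern, triple, binding):
    """Return `binding` extended so that `pattern` matches `triple`, or None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):              # variable: bind it or compare with its binding
            if new.setdefault(p, t) != t:
                return None
        elif p != t:                       # constant: must be equal
            return None
    return new


def has_answers(query):
    """Naive check: evaluate the whole conjunctive query by nested-loop joins."""
    bindings = [{}]
    for pattern in query:
        extended = []
        for b in bindings:
            for t in TRIPLES:
                m = match(pattern, t, b)
                if m is not None:
                    extended.append(m)
        bindings = extended
        if not bindings:
            return False
    return True


def build_index():
    """Hypothetical small index: (class, property) pairs seen on one subject."""
    classes = {}
    for s, p, o in TRIPLES:
        if p == "type":
            classes.setdefault(s, set()).add(o)
    return {(c, p) for s, p, o in TRIPLES if p != "type"
            for c in classes.get(s, ())}


def precision(index, base_query, cls, candidate_props):
    """Share of truly unproductive extensions that the index flags as unproductive."""
    unproductive = detected = 0
    for prop in candidate_props:
        if not has_answers(base_query + [("?x", prop, "?y")]):
            unproductive += 1
            if (cls, prop) not in index:   # index lookup instead of a join
                detected += 1
    return detected / unproductive if unproductive else 1.0


index = build_index()
base = [("?x", "type", "Person"), ("?x", "worksAt", "?c")]
p = precision(index, base, "Person", ["bornIn", "locatedIn", "worksAt"])
cost = len(index)                          # storage cost, counted in index entries
print(p, cost)
```

In this toy setting the index correctly rejects `locatedIn` for persons, but it cannot see that no person has both `worksAt` and `bornIn`, because it only stores two-pattern combinations. Indexing larger patterns would catch that case at the price of more entries, which is the precision-versus-cost balance the abstract refers to.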