One of the main challenges that data cleaning systems face is to automatically identify and repair data errors in a dependable manner. Though data dependencies (a.k.a. integrity constraints) have been widely studied to capture errors in data, automated and dependable data repairing on these errors has remained a notoriously hard problem. In this work, we introduce an automated approach for dependably repairing data errors, based on a novel class of fixing rules. A fixing rule contains an evidence pattern, a set of negative patterns, and a fact value. The heart of fixing rules is deterministic: given a tuple, the evidence pattern and the negative patterns of a fixing rule are combined to precisely capture which attribute is wrong, and the fact indicates how to correct this error. We study several fundamental problems associated with fixing rules, and establish their complexity. We develop efficient algorithms to check whether a set of fixing rules is consistent, and discuss approaches to resolve inconsistent fixing rules. We also devise efficient algorithms for repairing data errors using fixing rules. We experimentally demonstrate that our techniques outperform other automated algorithms in terms of the accuracy of repairing data errors, using both real-life and synthetic data.
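To make the three parts of a fixing rule concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the evidence pattern is a set of attribute-value pairs, that each rule corrects a single target attribute, and that a rule fires only when the evidence matches and the target's current value is one of the negative patterns. The names `FixingRule`, `applies`, and `repair`, and the country/capital values, are hypothetical and chosen only for illustration. The single-pass loop is a simplification and does not reflect the consistency checking or the repair algorithms developed in the paper.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass
class FixingRule:
    """Illustrative fixing rule (field names are assumptions)."""
    evidence: Dict[str, str]    # attribute -> value that must match exactly
    target: str                 # the attribute this rule may correct
    negatives: FrozenSet[str]   # known-wrong values for the target attribute
    fact: str                   # value written when the rule fires


def applies(rule: FixingRule, row: Dict[str, str]) -> bool:
    """True if the row matches the evidence and holds a known-wrong value."""
    evidence_ok = all(row.get(attr) == val for attr, val in rule.evidence.items())
    return evidence_ok and row.get(rule.target) in rule.negatives


def repair(row: Dict[str, str], rules: List[FixingRule]) -> Dict[str, str]:
    """Return a copy of the row with every applicable rule applied once, in order."""
    fixed = dict(row)
    for rule in rules:
        if applies(rule, fixed):
            fixed[rule.target] = rule.fact
    return fixed


# Hypothetical rule: if country is "China" and capital is a known-wrong value,
# set capital to "Beijing".
rule = FixingRule(
    evidence={"country": "China"},
    target="capital",
    negatives=frozenset({"Shanghai", "Hongkong"}),
    fact="Beijing",
)

repaired = repair({"country": "China", "capital": "Shanghai"}, [rule])
untouched = repair({"country": "China", "capital": "Nanjing"}, [rule])
```

In this sketch a value that is neither the fact nor one of the negative patterns is left alone, which illustrates the conservative behavior the abstract describes: a rule changes a value only when it can identify precisely which attribute is wrong.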
Wang et al. studied this question.