Locally Repairable Codes (LRCs) have become the dominant design in wide-stripe erasure-coded storage systems because of their locality and low repair bandwidth. In such systems, the repair degree, defined as the number of helper nodes contacted during data recovery, is a key performance metric. However, as stripe width increases, the probability of multiple simultaneous node failures grows, which significantly raises the repair degree in traditional LRCs. To address this challenge, we propose a new family of codes called TFR-LRCs (Locally Repairable Codes for balancing fault tolerance and repair efficiency). TFR-LRCs introduce flexible design choices that trade off fault tolerance against repair degree: they can reduce the repair degree at the cost of slightly higher storage overhead, or enhance fault tolerance at the cost of a slightly higher repair degree. We design a matrix-based construction to generate TFR-LRCs and evaluate their performance through extensive simulations. The results show that, under multiple-failure scenarios, TFR-LRCs reduce the repair degree by up to 35% compared with conventional LRCs while preserving the original LRC structure. Moreover, under identical code parameters, TFR-LRCs achieve improved fault tolerance, tolerating up to g+2 failures versus g+1 in conventional LRCs, with minimal additional cost. Notably, in maintenance mode, where entire racks may become temporarily unavailable, TFR-LRCs recover substantially more efficiently than existing LRC schemes, making them a practical choice for real-world deployments.
Yan et al. studied this question.