Los puntos clave no están disponibles para este artículo en este momento.
The suffix array is arguably one of the most important data structures in sequence analysis and consequently there is a multitude of suffix sorting algorithms. However, to this date the GSACA algorithm introduced in 2015 is the only known non-recursive linear-time suffix array construction algorithm (SACA). Despite its interesting theoretical properties, there has been little effort in improving GSACA ’s non-competitive real-world performance. There is a super-linear algorithm DSH , which relies on the same sorting principle and is faster than DivSufSort , the fastest SACA for over a decade. The purpose of this article is twofold: We analyse the sorting principle used in GSACA and DSH and exploit its properties to give an optimised linear-time algorithm, and we show that it can be very elegantly used to compute both the original extended Burrows-Wheeler transform ( eBWT ) and a bijective version of the Burrows-Wheeler transform ( BBWT ) in linear time. We call the algorithm “generic,” since it can be used to compute the regular suffix array and the variants used for the BBWT and eBWT . Our suffix array construction algorithm is not only significantly faster than GSACA but also outperforms DivSufSort and DSH . Our BBWT -algorithm is faster than or competitive with all other tested BBWT construction implementations on large or repetitive data, and our eBWT -algorithm is faster than all other programs on data that is not extremely repetitive.
Olbrich et al. (Thu,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: