The alpaca (Vicugna pacos) genome is challenging to assemble because pervasive, complex repetitive sequences account for approximately 16% of the genome. These repeats have caused assembly errors, gaps, and collapsed sequences that reduced both the quality and completeness of previous alpaca genome assemblies. Here we used PacBio HiFi long reads, Hi-C chromatin conformation capture, optical genome mapping, and manual curation to construct VicPac4, a 2.6 Gb alpaca genome assembly of which 2.1 Gb is assigned to the 36 alpaca autosomes and the X chromosome, each represented by a single scaffold. While all chromosomes improved in size and contiguity, the X chromosome showed the greatest improvement, which allowed demarcation of the 6.2 Mb pseudoautosomal region. VicPac4 incorporates several previously unresolved repetitive regions, including telomeres in 30 chromosomes, nucleolus organizer regions (NORs) in two chromosomes, and 15 tentative centromeres. Notably, we identified a novel tandemly repeated satellite (SAT) exclusive to South American camelids (SAC). The SAC-SAT, with a 267 bp repeat motif, constitutes 2.42% of the alpaca genome and colocalizes with NORs in all SAC species. Because most NORs remained unassigned in the assembly, their numbers and chromosomal locations in camelids were studied by FISH and Oligo-FISH, which revealed extensive variation across chromosomes, individuals, and species. Resolving NOR positional variation is essential to understanding rDNA-associated diseases such as minute chromosome syndrome, which causes infertility in female alpacas. Until telomere-to-telomere resources are developed, VicPac4 stands as the most complete and accurate reference among South American camelids, offering a powerful resource to capture genetic variation in the species and to advance the genomics of alpaca biology and populations.
Mendoza et al. studied this question.