The sequence of the autosomes and two X chromosomes of a human cell line derived from female tissue has now been completed. It includes the 8% of the genome that was missing from the original draft released in 2001.
The complete human genome sequence, spanning all 3.055 billion base pairs, has been revealed by the Telomere-to-Telomere (T2T) Consortium. This represents the largest improvement to the human reference genome since its initial release in 2001 by Celera Genomics and the International Human Genome Sequencing Consortium. That sequence covered most of the euchromatic regions while either leaving out the heterochromatic regions or representing them erroneously. These regions, which make up 8% of the human genome, have finally been revealed. The new T2T-CHM13 reference (Nurk et al., 2021) includes complete sequences for all 22 autosomes and the X chromosome. It also corrects numerous errors and adds about 200 million bp of new sequence containing 2,226 gene copies, 115 of which are predicted to be protein coding.
The current GRCh38.p13 reference genome is the result of two major updates to the 2001 draft sequence, one in 2013 and the other in 2019. However, it still contains 151 million base pairs of unknown sequence distributed throughout the genome, including pericentromeric and subtelomeric regions, duplications, gene arrays and ribosomal DNA (rDNA) arrays, all of which are necessary for fundamental cellular processes. The new reference is named T2T-CHM13 because it comes from sequencing DNA from the CHM13 (complete hydatidiform mole) cell line and was produced by the T2T Consortium. A hydatidiform mole arises from an abnormally fertilized egg or an overgrowth of placental tissue, in which a woman appears to be pregnant (false pregnancy); hence the sequence represents only the autosomes and X chromosomes. Multiple sequencing technologies were used, including PacBio, Oxford Nanopore, and Illumina sequencing at 100x and 70x coverage, to name a few. These advances in sequencing technology made it possible to sequence the remaining 8% mentioned above.
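The only limitation of the T2T-CHM13 sequence is the lack of a Y chromosome. Sequencing of the Y chromosome is currently under way, using DNA from the HG002 cell line, which has a 46,XY karyotype (23 pairs). The sequence will then be assembled using the same methods developed for the homozygous CHM13 genome.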
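The availability of T2T-CHM13 as a new reference genome represents a major breakthrough. It will help in understanding the role of heterochromatic regions and give a more detailed picture of their effect on cellular processes. Until sequencing of the Y chromosome is complete, T2T-CHM13 will serve as the reference genome for future studies of cellular processes and function.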
***
References
- Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, et al. The complete sequence of a human genome. bioRxiv 2021.05.26.445798; DOI: https://doi.org/10.1101/2021.05.26.445798
***