The Telomere to Telomere (T2T) project, centered at UCSC, is focused on developing the technology and techniques to routinely sequence and assemble complete human genomes. It is co-led by Karen Miga of UCSC (who did graduate work on this subject at CWRU) and Adam Phillippy of NHGRI.
The T2T project has been instrumental in working with leading vendors in the field to co-develop the technology needed to construct a truly complete, exact reconstruction of a human genome sample: as the name states, telomere to telomere. Up to this point, NGS technology was used to study many samples from many populations, but analysis was limited to mapping reads to a common, linear model of the human genome, one built mostly from a small subset of Western Europeans. The 3rd generation sequencing technology refined here, along with pangenome graph construction techniques, allows both (a) that final 5% of the genome to be accurately recreated for each sample, and (b) a true de novo construction of the whole sample's genome. Patriline study has benefited hugely, as only a little over 50% of the yDNA could be determined (analyzed) using NGS techniques.
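To make the "telomere to telomere" idea concrete, here is a minimal Python sketch that tests whether an assembled sequence carries telomeric repeats at both ends. It relies on the canonical human telomere repeat TTAGGG, which reads as CCCTAA at the start of a forward-strand sequence. The function name `has_telomeres` and the `min_repeats` and `window` thresholds are hypothetical choices for illustration; real telomeres run to thousands of bases, and this is not the project's own validation method.

```python
import re

def has_telomeres(contig, min_repeats=3, window=60):
    """Return (start_ok, end_ok) for a contig given on the forward strand.

    min_repeats and window are illustrative thresholds, not project values.
    """
    seq = contig.upper()
    # On the forward strand the starting telomere reads as CCCTAA repeats
    # and the ending telomere reads as TTAGGG repeats.
    start_pat = re.compile("(?:CCCTAA){%d,}" % min_repeats)
    end_pat = re.compile("(?:TTAGGG){%d,}" % min_repeats)
    start_ok = start_pat.search(seq[:window]) is not None
    end_ok = end_pat.search(seq[-window:]) is not None
    return start_ok, end_ok

# Toy sequences only; a real chromosome is tens of millions of bases long.
complete = "CCCTAA" * 4 + "ACGTACGTAC" + "TTAGGG" * 4
missing_start = "ACGTACGTAC" + "TTAGGG" * 4

print(has_telomeres(complete))
print(has_telomeres(missing_start))
```

A sequence that passes at both ends is only a candidate for a complete chromosome; gaps or misassemblies in the interior would need separate checks.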
The key technologies used to develop the models, which the HPRC effort now expands on, are:
- Experimental paired-end NGS sequencing from Illumina with 250 base-pair reads (very high reliability)
- Circular Consensus Sequencing (CCS) technology from PacBio's HiFi series, giving highly reliable reads of 20-30 kb
- Ultra-long, lower-reliability reads from Oxford Nanopore Technologies, 100 kb to 1 Mb in length
- Assembly techniques to merge these to create the consensus view of the complete, accurate DNA strand
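The idea of merging reads into a consensus can be illustrated with a toy Python sketch. It assumes the reads are already aligned to the same coordinates and padded with `-` where a read gives no coverage, then takes a per-column majority vote. The function name `column_consensus` and the sample reads are invented for illustration; real assemblers solve the much harder problems of finding overlaps, resolving repeats, and weighting each technology by its error profile.

```python
from collections import Counter

def column_consensus(aligned_reads, gap="-"):
    """Majority-vote consensus over reads pre-aligned to equal length.

    Assumption: a gap symbol means "this read does not cover the position".
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to the same length")
    consensus = []
    for column in zip(*aligned_reads):
        # Keep only the bases from reads that cover this column
        bases = [base for base in column if base != gap]
        if not bases:
            continue  # no read covers this column, so nothing is emitted
        consensus.append(Counter(bases).most_common(1)[0][0])
    return "".join(consensus)

# Three toy reads; the second carries a single-base error at position 4.
reads = [
    "ACGTTAGC--",
    "ACGATAGCTA",
    "--GTTAGCTA",
]
print(column_consensus(reads))  # prints ACGTTAGCTA
```

The error in the second read is outvoted by the other two, which is the same intuition behind combining several passes or several technologies over one stretch of DNA.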
T2T is also used as a generic term for the first reference developed under the project. T2T v2.0 was released in January 2022 as a complete 24-sequence model (22 autosomes plus X and Y) of the DNA in a human cell. It is derived almost entirely from the well-studied CHM13 cell line, with a last-minute "slap-on" of the Y chromosome from the HG002 cell line. Since that initial release, well over 100 cell lines have gained full T2T models, which are being captured in a developing pangenome graph.
Having developed the initial tools and techniques, the T2T project has now merged with the Human Pangenome Project (HPP) to form the Human Pangenome Reference Consortium (HPRC).