Interest in and reporting on the (Human) Pangenome have grown considerably since the T2T project refined a method for generating true, de novo assembled reference genomes from a given DNA sample. The NIH paper linked below is a great overview, and we borrow its main graphic to display here as well. See also our glossary entry on the Human Pangenome Project.
Until now, the human reference genome has been based primarily on a few similar human samples, modified over time to represent as original (old) an example as possible, rather than on a single, modern, living person. It is known, however, that the genome holds far more diversity than a single reference can represent. Just the creation of the first two complete human genomes, each assembled from a single sample, has led to many more discoveries of likely disease-causing variants. The first 47 Y chromosome samples alone revealed lengths spanning from 45 to 80 megabases, as opposed to around 60 megabases in the current reference.
The human pangenome attempts to capture all this diversity in graph form, essentially by splitting the currently linear map of a single chromosome into segments. Many of these segments will be common to all samples, but there are also many areas of great diversity. To clarify, the current human reference genome does not incorporate as much diversity as the colors in the graphic may indicate.
Current short-read sequencing technology does not really allow de novo assembly into the original DNA strands of the sample. Even so, having the diversity of the pangenome available helps capture more closely what the sample likely is, and thus helps find small, recent variations between people. The first true yDNA chromosome map, made from the HG002 sample, already shows a large section of DNA that does not exist in the current reference genome. SNPs in that area, for people who map to it, can be used to help create a more refined phylogenetic tree.
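To make the graph idea more concrete, here is a minimal Python sketch of a chromosome split into segments, with each genome stored as a path through those segments. Everything in it is an assumption made for illustration: the segment names, the tiny sequences, the sample labels, and the helper functions `spell`, `shared_segments`, and `novel_vs_reference` are all invented, and real pangenome graphs are built with dedicated tools and formats rather than plain dictionaries.

```python
# Toy pangenome graph. All names and sequences below are hypothetical,
# chosen only to illustrate segments and paths; they are not real data.

# Each segment is a stretch of sequence that one or more genomes contain.
segments = {
    "s1": "ACGTAC",     # common to every genome
    "s2": "G",          # one version of a single-base difference
    "s3": "T",          # the alternative version of that base
    "s4": "TTGACA",     # common to every genome
    "s5": "CCCGGGAAA",  # extra DNA missing from the linear reference
    "s6": "GAT",        # common to every genome
}

# Each genome is a path: an ordered walk through the segments.
paths = {
    "reference": ["s1", "s2", "s4", "s6"],
    "sample_A": ["s1", "s3", "s4", "s6"],
    "sample_B": ["s1", "s2", "s4", "s5", "s6"],
}


def spell(path):
    """Rebuild the linear sequence of a genome from its path."""
    return "".join(segments[seg_id] for seg_id in path)


def shared_segments(all_paths):
    """Return the segment IDs that every genome passes through."""
    return set.intersection(*(set(p) for p in all_paths.values()))


def novel_vs_reference(name, reference="reference"):
    """List, in path order, segments of one genome absent from the reference path."""
    ref = set(paths[reference])
    return [seg_id for seg_id in paths[name] if seg_id not in ref]


if __name__ == "__main__":
    for name, path in paths.items():
        print(name, len(spell(path)), spell(path))
    print("shared by all:", sorted(shared_segments(paths)))
    print("sample_B only, not in reference:", novel_vs_reference("sample_B"))
```

In this toy, `sample_B` carries a segment that the linear reference lacks, loosely mirroring the HG002 observation above: reads from that region have nowhere to map on the single reference, but they do have a place in the graph.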
External Resources
- NIH Overview Paper on the Human Pangenome significance