We mention the term gene here, but it is, for the most part, unimportant to genetic genealogy. Genes are the (known) active regions of the DNA, also known as the coding regions. Most of our DNA has no known function; this remainder is called non-coding or inter-gene DNA, and historically "junk" DNA. A DNA strand therefore consists of genes and the non-coding (inter-gene) regions between them. Genes themselves are more complex than this suggests: each contains introns and exons (non-coding and coding segments), among other features. SNPs may occur in any region of the DNA, inside or outside genes. None of this structure needs to be understood to follow genetic genealogy. Suffice it to say that some areas of the DNA strand are more important than others, and the SNP and STR markers that are tested are scattered in and among them all. The idea of inherited traits predates the understanding of DNA by hundreds if not thousands of years, and the term genealogy, which shares its Greek root with gene, is likewise far older than that understanding. Hence, it behooves us to mention it in this glossary.
More changes (markers) are known to occur in the junk regions than in the gene regions. Genes make up less than 1.2% of the DNA, so this is what chance alone would predict. In addition, a major change within a gene can prevent the cell or organism from surviving and reproducing, so such changes are less likely to be passed on. SNPs and STRs can occur in either gene or non-coding regions, although most STRs appear to lie in non-coding regions.