
Genetic Genealogy Matching

So you decided to test your DNA, ordered a test, and maybe even got your results back. Now what? Most people do not realize that Genetic Genealogy Testing is just the start of a long, drawn-out process. Genetic genealogy matching is where the real work begins. But that is not a negative. As mentioned elsewhere on this site, the real benefit of analyzing your DNA comes from matching it against a database of other testers. You then work with those matches to collaborate and discover who your common, shared ancestors are. Genetic genealogy matching is the second, long-term, and difficult part of genetic genealogy. You start quickly with Genetic Genealogy Testing but then spend a lifetime on Genetic Genealogy Matching. Only test companies that also support a match database are true genetic genealogy companies. But because the genetic genealogy market is not enough to sustain a company on its own, testing and matching are often offered as separate and distinct services.

Genetic Genealogy is a resource that keeps giving up more of its hidden information the more work you put into it. Genetic Genealogy Testing is like being taken to a rock-faced mountainside. Your matches are like glimpses of gold showing on the surface rock. You now have to work on each one to see if it leads to a gold vein buried deep in a mountain of rock. You gently chip away deeper and deeper into the mountain to follow the vein and continue to uncover more information. It takes a long time, possibly never ending, to follow that gold vein into that mountain. And likely there will be many starts in a direction that dead-ends. Sometimes the vein is barely visible; other times a chip opens up a large cache of easily extracted gold that is connected to other veins you are working on (that is, information about your ancestors, or the "mother lode"). So you are a miner. You have to dig. Be mentally prepared to think of it this way. It is a long, hard process but usually pays off in the end. (Note that for some, there are no glimmers of gold after testing; that is, no real matches. Just as panning is fraught with dry spells if you are not near a gold source, so it can be for DNA Matching. In those cases, you often develop a process to solicit other possible testers from your tree to try to develop a match pool.) Note: This analogy was invented by our co-founder, Randy Harr, who has lived in California / Silicon Valley for 30 years, among the gold / silicon rush fever it entails.

It is not uncommon these days for Autosomal tests to return over a hundred matches to distant relatives; maybe thousands to tens of thousands or more for some on AncestryDNA. Multiply that by 5 or 10, as most people start getting others tested to better triangulate and figure out their results, and the process quickly becomes unmanageable. So to continue to mine the value from your new genealogical resource, you need to work out a systematic way to analyze your results. The analysis differs for different types of tests and results. It also varies over time: as others test and match to you, as you are able to associate more matches with ancestors in your past and so make sense of those far-flung, distant matches, or as you test deeper with advanced tools to get a better signature of your DNA. And the technology for analysis is rapidly evolving.

A key thought to get ingrained is that genetic genealogy test results are only helpful when compared to others AND analyzed along with your own genealogical body of work and that of your matches. A corollary, and likely the more important idea, is that genetic genealogy is only helpful if you have other testers' DNA AND genealogical results to compare with. The less genealogical work you have done, the harder it will be to analyze and use your results. Kind of a Catch-22, we know, but that is the real issue. But if you do not have that genealogical work, all hope is not lost. Just think of the case of adoptees! In cases of adoption or other unknown biological parents, you must just work that much harder at comparing, sharing and investigating deeply with all the potential leads. Even if you are not adopted, be prepared to help someone who is to figure out your common past if you match with them. Once regular "paper" records are exhausted on your Brick Wall, genetic genealogy does offer hope of solving the problem. But often not without much work on many fronts.

Genetic genealogy testing produces actual, raw data results. In basic form, these are the SNP genetic markers we have described elsewhere. But most people do not work with this data directly. Instead, most test companies provide a Matches list that compares you with other testers in their database. Some only include testers from their own company. Others allow results from other test companies to be transferred in and compared as well. Understanding the raw data, the test process, genetic inheritance and the matching techniques is something to acquire as you gain experience. The key place to start is the Matches list, which includes some sort of ranking based on a measure of the match strength you have with others.

Because the techniques are so specific to the type of testing, we really cannot discuss matching except in the context of the type of test. So, as on the genetic genealogy testing page, we discuss match analysis separately for each major type of test. Most tests return tested SNP values that then need to be transformed into something else. For yDNA testing, STR values are often tested as well, or initially on their own. For Autosomal SNP testing, there are hundreds of thousands of tested values across the 22 autosomes and xDNA. Far fewer yDNA and mtDNA SNPs are tested, if they are tested at all, and so they are handled differently. Because so many SNP values are tested, users do not look at which individual SNPs match, but at where large areas of a chromosome match another tester across hundreds to thousands of SNPs. This process of building up matching areas is called matching segment creation and is a key, underlying principle of autosomal match analysis. Simply put, match lists are formed by summing up the matching segment lengths to create a total match strength.
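
As a rough illustration of matching segment creation, the sketch below walks two testers' results for one chromosome, groups consecutive half-identical SNPs into segments, and sums the segment lengths into a total match strength. It is only a toy under stated assumptions: the data layout (a list of position-in-cM and genotype pairs, aligned SNP for SNP), the function names and the tiny thresholds are all invented here for illustration. Real match tools use their own formats, far larger thresholds, and allowances for test errors.

```python
from typing import List, Tuple

# Assumed layout: one entry per tested SNP, (position in cM, unphased genotype).
Snp = Tuple[float, str]

def half_identical(g1: str, g2: str) -> bool:
    """True when two unphased genotypes share at least one allele."""
    return bool(set(g1) & set(g2))

def matching_segments(a: List[Snp], b: List[Snp],
                      min_snps: int = 3, min_cm: float = 1.0) -> List[Tuple[float, float]]:
    """Build (start_cM, end_cM) segments from unbroken runs of half-identical SNPs.

    Assumes both lists cover the same SNPs in the same order.
    The default thresholds are toy values chosen for the example only.
    """
    segments = []
    run: List[float] = []  # cM positions of the current unbroken run
    for (pos, ga), (_, gb) in zip(a, b):
        if half_identical(ga, gb):
            run.append(pos)
            continue
        # The run is broken: keep it only if it is long enough.
        if len(run) >= min_snps and run[-1] - run[0] >= min_cm:
            segments.append((run[0], run[-1]))
        run = []
    if len(run) >= min_snps and run[-1] - run[0] >= min_cm:
        segments.append((run[0], run[-1]))
    return segments

def total_match_strength(segments: List[Tuple[float, float]]) -> float:
    """Sum the segment lengths (in cM) into one match-strength number."""
    return sum(end - start for start, end in segments)

# Made-up genotypes for two testers on one chromosome.
tester_a = [(0.0, "AA"), (1.0, "AG"), (2.0, "GG"), (3.0, "CC"),
            (4.0, "CT"), (5.0, "TT"), (6.0, "AA")]
tester_b = [(0.0, "AG"), (1.0, "GG"), (2.0, "AG"), (3.0, "TT"),
            (4.0, "TT"), (5.0, "CT"), (6.0, "AG")]

segments = matching_segments(tester_a, tester_b)
print(segments)                        # [(0.0, 2.0), (4.0, 6.0)]
print(total_match_strength(segments))  # 4.0
```

The mismatch at the 3.0 cM position (CC against TT) is what splits the comparison into two segments; the total is simply the sum of their lengths.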

''NOTE: For as crucial and important a topic as this is, this page is mostly a stub that needs lots of work. Likely each matching type will be fleshed out into separate sub-pages.''

1.1.1. Autosomal SNP (atDNA SNP)

Most of the test companies try to provide a simplified, high-level view of the results. To really understand, track and verify them (which you must do), you need to develop a deeper understanding and the ability to look at the RAW results and their implications. Key issues are what is returned, how matching is done on what is returned, and what hiccups can occur or can be verified not to have occurred.

Autosomal Match Analysis (work-in-progress)

Review the Consanguinity page first before reading any particular section below.
(A) Work to verify known, close relationships with people you know or discovered have tested (400cM and greater)
    1. Verifying Parent / Child relationship in a nuclear family (distance 1)
      - Parents share 50% of their DNA with each child. So the half-identical match tool will report (roughly) a full match (of around 3,600 cM and a largest segment length of >220 cM)
      - Note: There is no half-parent or half-child relationship. A biological parent is either a parent or not.
    2. Verifying Sibling relationships (distance 1)
      - Full-siblings share 50% of their DNA (on average), but half-identical match tools only report it at about 40%.
      - You can only determine the real percentage shared between two full-siblings with half-identical match tools by comparing their phased results and then summing the results. You need at least one parent's result to phase the children.
      1. Determine Half- versus Full-Sibling (distance 2 vs 1)
        - GEDMatch, in its 1:1 comparison tool, has a graphic mode that shows Full-identical match areas. This is a quick, visual way to differentiate between Half- and Full-Siblings.
    3. Verifying Uncle / Aunt / Niece / Nephew relationship (distance 2)
    4. Verifying Grandparent / Grandchild relationship (distance 2)
    5. Verifying first cousin relationship (distance 3) and first cousin once removed (distance 4)
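
To make the half-identical versus full-identical distinction in the sibling items above concrete, here is a small sketch that labels each SNP comparison and then looks for unbroken runs of full-identical SNPs. The function names, the list-of-genotypes layout and the toy run length are assumptions for illustration; this is not how GEDMatch or any other tool is implemented. Note that a single identical SNP proves nothing on its own. It is long runs that suggest a full-identical region, which is what separates full siblings from half-siblings.

```python
from typing import List, Tuple

def classify_snp(g1: str, g2: str) -> str:
    """Label one SNP comparison between two unphased genotypes.

    'full' = both alleles agree, 'half' = at least one allele is shared,
    'none' = no allele in common.
    """
    if sorted(g1) == sorted(g2):
        return "full"
    if set(g1) & set(g2):
        return "half"
    return "none"

def full_identical_runs(a: List[str], b: List[str],
                        min_snps: int = 3) -> List[Tuple[int, int]]:
    """Return (first_index, last_index) for each unbroken run of 'full' SNPs.

    min_snps is a toy threshold; a real comparison would need far longer runs.
    """
    n = min(len(a), len(b))
    runs = []
    start = None  # index where the current run began, if any
    for i in range(n):
        if classify_snp(a[i], b[i]) == "full":
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_snps:
                runs.append((start, i - 1))
            start = None
    if start is not None and n - start >= min_snps:
        runs.append((start, n - 1))
    return runs

# Made-up genotypes for two testers at seven consecutive SNPs.
tester_1 = ["AG", "CC", "TT", "AG", "GG", "CT", "AA"]
tester_2 = ["GA", "CC", "TT", "AG", "AG", "TT", "GG"]

print([classify_snp(x, y) for x, y in zip(tester_1, tester_2)])
# ['full', 'full', 'full', 'full', 'half', 'half', 'none']
print(full_identical_runs(tester_1, tester_2))  # [(0, 3)]
```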

(B) Determining the possible relationship(s) of unknown, newly matched more distant relationships (roughly 40 to 300cM)
    1. Not including closer matches than distance 5 (exclude possible 1C2R and similar; anything that could be closer than 2C)
      - Done from knowing just the total match strength and/or largest match segment length (genealogical relationship unknown).
      - Discussing and finding common surnames, geographic locations and similar in the background (GEDCOM / Tree analysis)
      - Looking at "in common with" matches between the two who match to find possible family groupings indicating a common ancestor
    2. Verifying second cousin (distance 5), second cousin once removed (distance 6) and more distant relationships (down to roughly 40cM), including third cousin and third cousin once removed
      - Matches at this distance are often best solved by using match clustering first
      - Match strength is a rough indicator of relatedness but not very accurate (you often have to lower the clustering tool's maximum to 150cM to remove 1C2R matches from clustering)

(C) Look into deeper analysis techniques to pull out more information (less than 40cM matches)
    - Matches at this distance really require segment triangulation, or may fall out as part of match clustering if closer second cousins from the same line are found
    - Match strength really has no bearing on the real relationship. Even though a very weak match, the match could even be a 2nd cousin or closer.
    1. Chromosome segment clustering
    2. What more can I tell with the X matching?
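
The outline above sorts matches into three working tiers by total match strength. The sketch below is one way to express that triage. The cut-offs (400cM, 300cM, 40cM) come from the outline; the 300 to 400cM band is not assigned to either tier there, so it is flagged as a boundary case. The expected_cm helper rests on an assumption that is not stated on this page: that average sharing halves with each step of distance, starting from the roughly 3,600 cM parent / child figure. Real matches vary widely around such averages, so treat the numbers as orientation only.

```python
def analysis_tier(total_cm: float) -> str:
    """Map a total match strength (cM) to tier (A), (B) or (C) of the outline."""
    if total_cm >= 400:
        return "A"    # verify known, close relationships
    if total_cm > 300:
        return "A/B"  # gap between the outline's two ranges; judge case by case
    if total_cm >= 40:
        return "B"    # unknown, more distant relationships
    return "C"        # needs deeper analysis techniques

def expected_cm(distance: int, parent_child_cm: float = 3600.0) -> float:
    """Average expected total cM at a given distance.

    ASSUMPTION: sharing halves with every extra step of distance, anchored
    at about 3,600 cM for distance 1 (parent / child).
    """
    if distance < 1:
        raise ValueError("distance must be 1 or greater")
    return parent_child_cm / 2 ** (distance - 1)

for d in range(1, 7):
    cm = expected_cm(d)
    print(d, cm, analysis_tier(cm))
# 1 3600.0 A
# 2 1800.0 A
# 3 900.0 A
# 4 450.0 A
# 5 225.0 B
# 6 112.5 B
```

Under that halving assumption, distances 1 through 4 land in tier (A) and distance 5 onward in tier (B), which lines up with how the outline groups first cousins once removed and second cousins.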


Further Reading

1 Jim Bartlett's Intro to Managing your Autosomal Results posted on Kitty Cooper's Blog
2 ISOGG Wiki Fully Identical Region explanation
3 Analysis of how deep (many generations back) Autosomal segments may exist and how large. Note that this is independent of how reliable the test tool analysis may work at determining a matching segment. Two separate issues.
4 X Chromosome inheritance analysis

1.1.2. yDNA STR

The genetic distance summary given on the FamilyTreeDNA Matches page is a very high-level, first-cut idea of what it means to match. It is important to understand which values differ between two kits, by how much they differ, and whether there are SNP results to back up the apparent STR matching. STR values change much more frequently than SNPs, and they can just as easily change back as change further away in value. So you need the more stable SNP results, which tend not to revert, to verify that the STR match is truly a match and not just a Convergence of STR values.
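
Looking past the single genetic distance number means listing which markers differ and by how much. The sketch below does that for two kits and then adds up the differences with a plain stepwise count (each one-repeat difference counts as one step). The stepwise count is an assumption made for illustration; FamilyTreeDNA's own calculation may treat some markers differently, and multi-copy markers are left out entirely. The marker values are made up.

```python
from typing import Dict, List, Tuple

def str_differences(kit_a: Dict[str, int],
                    kit_b: Dict[str, int]) -> List[Tuple[str, int, int]]:
    """List (marker, value_a, value_b) for every shared marker that differs."""
    shared = sorted(set(kit_a) & set(kit_b))
    return [(m, kit_a[m], kit_b[m]) for m in shared if kit_a[m] != kit_b[m]]

def stepwise_distance(kit_a: Dict[str, int], kit_b: Dict[str, int]) -> int:
    """Sum of absolute repeat-count differences over the shared markers.

    ASSUMPTION: a simple stepwise count, not any test company's exact method.
    """
    return sum(abs(a - b) for _, a, b in str_differences(kit_a, kit_b))

# Hypothetical repeat counts for two kits on four single-copy markers.
kit_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
kit_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 9}

print(str_differences(kit_1, kit_2))
# [('DYS390', 24, 25), ('DYS391', 11, 9)]
print(stepwise_distance(kit_1, kit_2))  # 3
```

Two kits with the same total distance can differ in very different ways (three markers off by one, or one marker off by three), which is why the per-marker list is worth reading alongside the total. Neither view can rule out Convergence; that still takes SNP results.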
 

Further Reading

1 NIH STRbase

1.1.3. yDNA SNP


Further Reading


1.1.4. NextGen Sequencing (NGS)

More specifically, yDNA Sequencing (in 2016)
yFull
yTree for those in R1b-P312 and below

Further Reading


1.1.5. Mitochondria SNP (mtDNA SNP)

 

Further Reading