
Finding Deeper Y Haplogroups

Many people have done microarray (aka autosomal) testing with the major players, and maybe a basic yDNA STR test with FamilyTreeDNA or similar. But everyone hears about the phylogenetic tree of haplogroups and wants to know where they are placed. The best way to get placed accurately is an NGS test like BigY. If you are not ready for that plunge, there is often more you can extract from the tests you have already done to get you further. Remember, haplogroups are arranged in a phylogenetic tree ordered by age, from the first human to the current time. So the deeper in the tree (closer to the current time) you can be placed, the more it helps your genealogical search. Haplogroups are determined from SNPs that are measured; not imputed, predicted or the like.

Microarray Test SNPs

Ancestry and 23andMe both test yDNA SNPs. Not many, but enough to get you to a major, single-letter branch if not deeper, and often deeper than the STR-predicted haplogroups of FTDNA. As reported in our genetic genealogy testing section, Ancestry tests a little over 1,000 SNPs and 23andMe a bit over 3,000. If you cannot find the haplogroup they report, or are not sure it is the most accurate interpretation, you can do the analysis yourself. If you do not know how to read the RAW data files (just text CSV spreadsheet files), then run the results through a tool like Cladefinder or MorleyDNA. See the Measuring with different services section on the B10DNA page for examples of using the MorleyDNA tool, different test services and the like. Or read up on the Microarray File Formats to work out your haplogroup on your own.
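For those who want to look inside a RAW data file themselves, here is a minimal Python sketch of pulling out just the Y chromosome calls. It assumes a 23andMe-style layout: tab-separated columns of rsid, chromosome, position and genotype, with comment lines starting with "#". Other services lay out their files differently (see the Microarray File Formats page), so the function name y_snp_calls, the y_labels parameter and the no-call markers are our assumptions for illustration. The sample rows are made up, not real results.

```python
import csv
import io

def y_snp_calls(raw_text, y_labels=("Y",)):
    """Return {rsid: genotype} for Y-chromosome rows that have an actual call.

    Assumes a 23andMe-style layout (rsid, chromosome, position, genotype).
    Adjust y_labels if your export names the Y chromosome differently.
    """
    calls = {}
    # Drop blank lines and '#' comment lines (this includes the header line).
    lines = (ln for ln in io.StringIO(raw_text)
             if ln.strip() and not ln.startswith("#"))
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 4:
            continue  # not a data row in the assumed layout
        rsid, chrom, genotype = row[0], row[1], row[3]
        # Skip no-calls; the markers listed here are an assumption.
        if chrom in y_labels and genotype not in ("--", "-", ""):
            calls[rsid] = genotype
    return calls

# Made-up rows for illustration only.
sample = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "i0000002\tY\t2000\tC\n"
    "rs0000003\tY\t3000\t--\n"
    "rs0000004\tMT\t4000\tT\n"
)
print(y_snp_calls(sample))
```

The resulting list of called Y SNPs still has to be compared against a haplogroup tree to find which are derived (positive), which is the part that tools like Cladefinder and MorleyDNA do for you.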

Here is a simple example, from a case we helped with recently, down a rare haplogroup. Notice the difference between a Build37 and a Build38 measurement, and thus the importance of using a Build38 BAM from an NGS test. Notice also how the tree used, and the SNPs defined in it, affect the depth of the measured haplogroup.
Tree path: A0-T > A1 > A1b > A1b1 > A-M32 > A-Y20629 > A-YP4735 > A-M13 > A-YP4751 >  ... (continued below)
| |
Dante Build 37 (yLeaf v2.2) ++++++++ |
23andMe (Cladefinder) +++++

... > A-YP4740 > A-PF1069 > A-Y30506 > A-V3663 > A-FGC38299 > A-V2667
| |
Dante Build 37 (Cladefinder) +++++++ |
Dante Build 38 (Cladefinder) +++++++
Example of measuring the haplogroup with various test services

So even though the same NGS result, with all its SNPs, went into both tools, yLeaf could not place the sample any deeper because the ISOGG tree it relies on is not defined any deeper.
Note: To ease the display on certain devices, the path has been split in the middle and continued below.

Better STR Predicted Haplogroups

Deeper Haplogroup from STR Match List

SNP Haplogroup From STR Match List
FTDNA is very conservative in their STR predicted haplogroup. In a group's results table, you can recognize the predicted values as they are in red (versus green for measured ones). You will see the same predicted result for a y12 test as a y111 one. But with 37 or more markers, one can often get a better, deeper predicted haplogroup.

Try using the NevGen tool as a start. For some of the clades in R and I, it can get pretty deep in the current tree. We have had good success in this project with that tool, both in verifying that it comes close to BigY measured results in the refined sub-clades it works on, and in seeing it give a strong prediction with good confidence on y67 values from project members. NevGen scraped the publicly displayed projects for STR values and BigY haplogroups. It then performs a best-fit analysis of the supplied STR values to see which haplogroup in the scraped data is the closest. A confidence value and possibly multiple haplogroups are returned.
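To give a feel for what a best-fit analysis of STR values means, here is a deliberately simple Python sketch. It is not NevGen's actual method, which we have not seen; it swaps in a plain nearest-match ranking using the step-count difference between marker values. The function names, the "Clade-A" and "Clade-B" labels and every repeat count below are made up for illustration.

```python
def genetic_distance(a, b):
    """Sum of absolute repeat-count differences over markers both profiles share."""
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)

def rank_haplogroups(query, reference):
    """Rank haplogroups by the distance of their closest reference profile."""
    best = {}
    for haplogroup, profile in reference:
        d = genetic_distance(query, profile)
        if haplogroup not in best or d < best[haplogroup]:
            best[haplogroup] = d
    # Smallest distance first; ties broken by name so the order is stable.
    return sorted(best.items(), key=lambda item: (item[1], item[0]))

# Hypothetical reference set: (measured haplogroup, STR values). Values are made up.
reference = [
    ("Clade-A", {"DYS393": 13, "DYS390": 24, "DYS19": 14}),
    ("Clade-A", {"DYS393": 13, "DYS390": 23, "DYS19": 14}),
    ("Clade-B", {"DYS393": 12, "DYS390": 22, "DYS19": 15}),
]
query = {"DYS393": 13, "DYS390": 23, "DYS19": 15}

print(rank_haplogroups(query, reference))
# [('Clade-A', 1), ('Clade-B', 2)]
```

A real predictor works from far more markers and reference testers, and turns the comparison into a confidence value rather than a raw distance. The sketch only shows why more markers (y37 and above) give the comparison more to work with.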

Otherwise, do a manual analysis from your match list. Look at your STR match list for 37 markers and above. Using the "advanced matches" feature and sorting on the haplogroup is a good way to get multiple match lists merged for this analysis. If someone is on your match list, you can be fairly sure they are near your likely haplogroup. Look for others in the list that appear to have tested haplogroups. That is, haplogroups different from the predicted one everyone else seems to have. Start searching for these haplogroups in the public FTDNA tree or maybe yFull. (Note: yFull tends to name some of the haplogroup blocks differently, so the FTDNA match list haplogroups may not appear directly as yFull haplogroups.)

Often, you will find many, if not most, of the haplogroups mentioned are on a common path down to a few leaf haplogroups measured using BigY or similar. Also on that path will be the predicted STR haplogroup that FTDNA provides. The others could be reporting other haplogroups if they tested using an SNP Pack, an individual SNP test (common in the 2000s, before BigY and SNP Packs), or National Geographic Genographic. If different sub-branches or leaf nodes in the tree are represented on the match list, that simply means these are likely nearer-term (recently formed) haplogroups. So look for the common ancestor of these multiple leaf nodes; that is a good predictor of where you will lie (within that clade).

See the chart here, where we did just that to help someone in an online forum unaffiliated with this surname project. (They are an OSM with 58 y111 matches. The thought, at first, is that the haplogroups are all over the map. Charting them as shown here showed the matches were all likely within a recent haplogroup which has a few leaf branches just below it.) If the match lists are large (often the case in OSM), then this method provides a similar result to NevGen's more automated analysis above.
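The common-ancestor step of this manual analysis can be sketched in a few lines of Python. The sketch assumes you have already looked up the tree path (root first) for each haplogroup seen on the match list. It drops any haplogroup that is simply an ancestor of another one on the list, such as the shallow STR-predicted value, and then keeps the part of the path the remaining leaf haplogroups share. The function names and the node names ("Root", "X1" and so on) are hypothetical placeholders, not real haplogroups.

```python
def leaf_paths(paths):
    """Keep only paths that are not a prefix (an ancestor) of another path."""
    unique = [list(p) for p in {tuple(p) for p in paths}]
    return [p for p in unique
            if not any(q != p and q[:len(p)] == p for q in unique)]

def common_ancestor_path(paths):
    """Return the leading part of the path shared by all the given paths."""
    common = []
    for nodes in zip(*paths):
        if all(n == nodes[0] for n in nodes):
            common.append(nodes[0])
        else:
            break
    return common

# Hypothetical paths for haplogroups reported on a match list.
match_list_paths = [
    ["Root", "X1", "X2"],                       # STR-predicted, shallow
    ["Root", "X1", "X2", "X3", "X3a"],          # deep tester
    ["Root", "X1", "X2", "X3", "X3b", "X3b1"],  # deep tester
    ["Root", "X1", "X2", "X3"],                 # single SNP tester
]

leaves = leaf_paths(match_list_paths)
print(common_ancestor_path(leaves)[-1])
# X3
```

Here the likely placement is within "X3": both deep testers descend from it, and the shallower results on the list sit on the path leading to it.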

For mapping the various, apparently disparate haplogroups into the tree, you need a tree that gives you the breadcrumbs, or path, down to the various nodes you are viewing. The FTDNA trees, both public and private, do not have this (yet; except for the new BigY block tree available to BigY testers only). yFull and yTree do have the breadcrumbs though. ISOGG does not have breadcrumbs either, but its paths are sometimes a little easier to follow because it still uses the old, long-form YCC nomenclature for haplogroup names.
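If the tree you are using only shows each haplogroup's immediate parent, you can rebuild the breadcrumbs yourself. Here is a minimal Python sketch, assuming you have noted the parent of each node in a simple lookup table. The function name breadcrumbs and the parent_of table are ours for illustration; the entries are copied from the first few steps of the example tree path earlier on this page.

```python
def breadcrumbs(node, parent_of):
    """Walk from a node up to the root and return the path as 'root > ... > node'."""
    path = [node]
    seen = {node}
    while path[-1] in parent_of:
        parent = parent_of[path[-1]]
        if parent in seen:
            # Guard against a mistyped table that loops back on itself.
            raise ValueError("cycle in parent table at " + parent)
        path.append(parent)
        seen.add(parent)
    return " > ".join(reversed(path))

# Child -> parent, taken from the start of the example path on this page.
parent_of = {
    "A1": "A0-T",
    "A1b": "A1",
    "A1b1": "A1b",
    "A-M32": "A1b1",
}

print(breadcrumbs("A-M32", parent_of))
# A0-T > A1 > A1b > A1b1 > A-M32
```

With a path like this for each haplogroup on a match list, it becomes much easier to see where they branch apart.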