Sunday, December 28, 2014

When Is a Match Not a Match?

A few days ago I was exploring my Family Finder matches at FTDNA. When I do this I first select "Show Full View," which is just above the icon for my first match:

This allows me to see "Longest Block" and other items of interest.

I then located the person with whom I shared the longest block but had yet to find a relationship on paper. That person shares a block of 41 cMs with me on Chromosome 10 and is predicted by FTDNA to be my 2nd to 4th cousin. I then looked for others who matched the two of us. There were several who overlap part or all of this 41 cM area of Chromosome 10. Ten others overlap that shared block in amounts varying from 9.9 cMs to 29 cMs. The various relationships and common ancestors shared among the individuals in this cluster will take some time for us to try to sort out.
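For readers who download their segment data, the search for others who overlap a given block can be sketched in a few lines of Python. This is only an illustration: the `Segment` type, the names, and every start and end position below are invented placeholders, not my actual data, and the overlap is measured in base pairs because converting to cMs would require a genetic map that is not included here.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    name: str        # label for the match (hypothetical)
    chromosome: int  # chromosome number
    start: int       # start position in base pairs (invented)
    end: int         # end position in base pairs (invented)


def overlap_bp(a: Segment, b: Segment) -> int:
    """Return the number of base pairs two segments share, or 0 if none."""
    if a.chromosome != b.chromosome:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


# Placeholder for the longest shared block on Chromosome 10.
reference = Segment("longest_block", 10, 60_000_000, 95_000_000)

# Placeholder segments for three other matches.
others = [
    Segment("match_A", 10, 70_000_000, 90_000_000),  # inside the block
    Segment("match_B", 10, 20_000_000, 40_000_000),  # same chromosome, no overlap
    Segment("match_C", 9, 60_000_000, 95_000_000),   # different chromosome
]

overlapping = [s.name for s in others if overlap_bp(reference, s) > 0]
print(overlapping)
```

Overlap alone only nominates candidates. Whether the overlapping people actually match each other still has to be checked separately.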

All of these individuals are on my dad's side of the family. In addition to matching me, they also match my paternal first cousin. Also four of them (including the one with the 41 cM match) match each of the others. All of them match at least five others in this group.  

Not all apparent matches are real.

One of the individuals in the above grouping seemed to have an exactly identical match. Both were predicted to be 5th cousins to distant cousins. In the Chromosome Browser their segments looked like this:

Upon closer examination the exactness apparently seen in the Chromosome Browser was confirmed:


But wait. Let's not get ahead of ourselves. It is easy to get hypnotized by the statistical precision of DNA lab reports. DNA doesn't lie and lab errors are rare. However, we must be careful to interpret the results correctly. 

My first clue that something was amiss came when I loaded these individuals into Family Finder's Matrix tool. The individual represented by the orange bar in the Chromosome Browser view of Chromosome 10 above matched me, my paternal first cousin, and seven of the others when I compared them in the Matrix tool. The apparently identical blue bar represented an individual who matched only me. Later I discovered that this second individual also matched my maternal first cousin. In other words, the first (orange) matched a segment of DNA that I had inherited from my father. The second (blue) matched a segment of identical length and location that I had inherited from my mother. We carry two copies of each chromosome, one from each parent, and that pairing must be respected. The two apparently exact and identical matches with me turned out not to be matches with each other.
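The reasoning above can be written out as a small Python sketch. It assumes you have copied the Matrix tool results into a dictionary by hand. The labels ("orange", "blue", "match_1" and so on) and the function names are hypothetical stand-ins, not anything FTDNA provides.

```python
# Hypothetical Matrix-style data: each key is a match, and each value is the
# set of people that match also matches. All labels are placeholders.
matrix = {
    "orange": {"me", "paternal_cousin", "match_1", "match_2"},
    "blue": {"me", "maternal_cousin"},
}


def assign_side(match, matrix,
                paternal_ref="paternal_cousin",
                maternal_ref="maternal_cousin"):
    """Guess which parent's chromosome a match sits on, using known cousins."""
    partners = matrix.get(match, set())
    on_paternal = paternal_ref in partners
    on_maternal = maternal_ref in partners
    if on_paternal and not on_maternal:
        return "paternal"
    if on_maternal and not on_paternal:
        return "maternal"
    return "undetermined"


def match_each_other(a, b, matrix):
    """True only if the matrix records a match between a and b."""
    return b in matrix.get(a, set()) or a in matrix.get(b, set())


print(assign_side("orange", matrix))                # paternal
print(assign_side("blue", matrix))                  # maternal
print(match_each_other("orange", "blue", matrix))   # False
```

Two people can share the same stretch of a chromosome with you and still land on opposite sides, which is why the last check matters more than the identical bars in the Chromosome Browser.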

Sunday, December 14, 2014

Got BIG Y test results? Now what?

A few thousand if not several thousand men have or soon will have BIG Y test results. Of these the R1b-L21 haplogroup project has 800 all by itself. Funding the test is only the first major hurdle. Next comes the formidable task of incorporating the information into your family history.

Making sense of all the SNPs that have been discovered in the last year is overwhelming to many of us. That SNP Tsunami wave train is not a single event but a series that will be washing newly discovered SNPs ashore for the foreseeable future as more men are tested.

Hopefully you will have some SNP Superheroes in your haplogroup like the ones from whom I have benefited in L21. Without their mentoring I would still be struggling to stay afloat and would have very little understanding of the information newly liberated from my yDNA.

As I discussed in a previous post, you will find your BIG Y results in the Other Results section of your My DNA report at FTDNA. Part of that information is shown below:

Before we go further with our analysis I'd like to share with you an important caveat from Ray Banks who is the guru of the Z253 subclade of L21. My deceased father-in-law belongs to that subclade. Ray says:
"Big Y results are like slices of Swiss Cheese - full of holes and inconsistencies. It is only by putting together all of the slices that you get the full picture."
The data in the various columns in my report above are examples of Ray's slices of Swiss Cheese. As is typical of the BIG Y reports I have seen, the men in this listing share about twenty-five thousand SNPs (right column). That is interesting but so far I've not found that particularly relevant to my research. 

However, the order of the matches is relevant. You will note that they are ranked by the values in the Known SNP Difference column. Based on this column one could assume that the man listed first is my closest genealogical match. Not so. He is my second closest match in this group. My closest genealogical match is the 6th man listed. He is a known 6th cousin once removed. Remember Ray's Swiss Cheese!

When a much more comprehensive amount of the evidence from my BIG Y results was analyzed by SNP Superhero Alex Williamson, he appears to have arrived at the correct conclusion about our relative relationships  and arranged our SNP branching in the correct sequence. Alex is the creator of The Big Tree of BIG Y results for those of us who have tested positive for R-P312 -- a parent SNP of L21. He found 5 SNPs that I shared with my known cousin after we parted company with the man listed at the top of my list above. The three of us share about twenty BIG Y novel SNPs that have not yet been found in other BIG Y results. We are hopeful that this situation will branch further when the results of three other men, thought to be somewhat distantly related, are posted in February. 

Analysis for Novices

Most of us, including Dr. D, have not begun to master the wizardry demonstrated on our behalf daily by Alex, Ray, Mike Walsh and many of their associates. However, I would like to share one trick that even novices can feel free to try at home as long as you remember the "Swiss Cheese" caveat.

Open your Big Y - Results and enter the Matching tab  

Next open the drop down menu under Shared Novel Variants. In the example below I have scrolled down the list until I came to the point where the matches start narrowing down from a few hundred to a few. The long number gives the position on the Y chromosome of that particular SNP -- in this case 19201991. This just happens to be the location of the SNP that defines my subclade S1026. Note that 12 other men have tested positive for this SNP and are also members of this subclade.

Slide the scroll bar to the bottom of the list in order to find those likely to be your closest cousins. In the example above the SNPs followed by "(2)" would show two other men if the entire screen were displayed. For privacy reasons I have not shown their names or the buttons to display their email addresses. If you move the scroll bar completely to the bottom of the list, you may have a single individual who should be your closest match. However, remember Ray's Swiss Cheese! 
"Big Y results are like slices of Swiss Cheese - full of holes and inconsistencies. It is only by putting together all of the slices that you get the full picture."
Occasionally you may have a SNP that is, or appears to be, totally random and that matches one or a few men in some totally separate and distinct haplogroup. That is when you need to combine several slices of cheese to get the full picture. If you look at a half dozen or more SNPs, a true pattern should emerge. Happy snipping!
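If you prefer to tally the slices rather than eyeball them, the idea can be sketched in Python. This is a rough illustration, not an FTDNA feature. It assumes you have typed the Shared Novel Variants list into a dictionary yourself. Apart from 19201991, every position below is made up, as are the names and the `rank_closest` helper.

```python
from collections import Counter

# Hypothetical data: Y chromosome position -> men who share that novel variant.
shared_novel_variants = {
    19201991: {f"Man{i:02d}" for i in range(1, 13)},  # subclade-level SNP, 12 men
    1000001: {"Man01", "Man02"},   # invented position
    1000002: {"Man01", "Man02"},   # invented position
    1000003: {"Man01"},            # invented position
    1000004: {"Man01"},            # invented position
    1000005: {"Man07"},            # invented position, possibly a random hit
}


def rank_closest(shared, max_sharers=3):
    """Count how often each man appears among the rarely shared SNPs."""
    tally = Counter()
    for men in shared.values():
        # Widely shared SNPs define the subclade, not close cousins; skip them.
        if len(men) <= max_sharers:
            tally.update(men)
    return tally.most_common()


for man, count in rank_closest(shared_novel_variants):
    print(man, count)
```

A man who turns up on only one rare SNP, like the last entry here, may be one of those random hits. The man who keeps turning up across several SNPs is the pattern worth following, subject as always to the Swiss Cheese caveat.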

I think I'll go make a grilled cheese sandwich -- Swiss of course.

Friday, December 12, 2014

The Long Journey of your Genome: Part 2

Many of us wonder what path our ancestors traveled through prehistory to the time that pieces of their journey were recorded in various forms of the written word. Those of us who have European female ancestry can use a full mitochondrial test to tell us from which of the Seven Daughters of Eve we descended through our direct maternal lines. However, we must not lose sight of the fact that we may have descended from several of the seven daughters described by Bryan Sykes or even from sisters of the Eve hypothesized in his book. For example my maternal grandmother in a direct umbilical line descended from Helena but my paternal grandmother descended in a parallel line from Ursula. My daughter and son descended from Helena by a very different "umbilical cord" line. Through my daughter-in-law my Dowell grandchildren picked up a second line from Ursula and a line from Katrine through their maternal grandfather. 

Connecting these ancient SNP defined lines with our documented genealogies has been more problematic. Some of us have been able to make haplogroup connections that are meaningful to our genealogical research; but most of us have not. Full mitochondrial databases are still very small compared to both yDNA and atDNA databases so matches are not as common. Also, as I discussed in Part 1 of this series, the amount of information recorded in your mitochondria is minuscule compared to that contained in your chromosomes.

Beginning to read your Big Y Results - Results 

Much of the information that is reported to those of us who have taken the BIG Y test is unintelligible to most of us -- at least at first. FTDNA does not report our BIG Y results in the yDNA section of our My DNA page. Rather, it is in the Other Results section. This is the first indicator that BIG Y results have not yet been integrated with the rest of your yDNA reports. This is most important to remember when you try to understand the place of your own SNPs within FTDNA's haplotree. No SNPs have been added to the Y-DNA Haplotree since the inception of BIG Y testing a year ago. 

Only SNPs that had been discovered by FTDNA or GENO2 prior to November 2013 are included in FTDNA's current tree. Even some of the SNPs for which you may have confirmed results from individual tests at FTDNA are not reflected on their current tree. These also may not be included in their listing of your confirmed results on your opening My DNA page. For example, in 2012 I took an individual SNP test at FTDNA for a SNP named DF13 and was found to be positive. DF13 was then and is now known to be below L21. However, I am still being shown to have a terminal SNP of L21 on my FTDNA report. More recently BIG Y has discovered about thirty more SNPs below DF13. 

There is no way FTDNA could have included those thirty SNPs in their tree yet. This is a different kind of exploration. The BIG Y is a voyage into the unknown inner space of our yDNA. However, DF13 was known and I had been tested for it more than a year before BIG Y blasted off and more than a year before the last update of FTDNA's current tree. This is not a criticism of FTDNA's tree as much as it is a caveat warning you not to read too much into it. Probably less than one-tenth of the SNPs on our Y chromosomes, about which we know today, were known at the time FTDNA was putting the current tree together. It is going to be a monumental effort to update it.

I think I'll stop now before continuing soon with some hints on how you can begin to interpret your BIG Y results. That is really what I started to do in Part 1 before I decided I needed to give some background first.

Wednesday, December 10, 2014

The Long Journey of your Genome: Part 1

Your genome had already been on a long journey before your parents got together to conceive the unique you. If the current estimates of our best scientists are to be believed, the human portion of that journey could have taken more than 300,000 years. For our genomes to have survived the extreme climate changes, wars, famines and disease is a miracle equal to those surrounding the creation of our species and our universe.

Most of our genealogies focus on identifying and chronicling the lives of those who were the carriers of our genomes during the last few hundred years of this incredible journey. That is likely only one tenth of one percent of the journey of our species down to who we are today. Documenting even this tiny part of the journey of our genomes can be a very formidable challenge.  
Success in this endeavor is more likely if we follow some principles that became well established in the 20th century. First and foremost among them I described in my Crash Course in Genealogy as:

Rule #1. Start with what you know (yourself) and build back to what you don't know --- step-by-step. Don't skip steps!!
That's still a great rule that 21st century genealogists violate only at great peril. However, now the more intrepid of us can turn this rule on its head and attempt what in my just released NextGen Genealogy: The DNA Connection I call reverse genealogy. Basically this involves starting at the beginning with mtEve or yAdam and tracing our SNP flows down toward the present. We are able to do this because of two kinds of "celibate DNA" or DNA that is not recombined between the contribution of the mother and the contribution of the father when an embryo is conceived. As a result this DNA is passed relatively intact from generation to generation to generation. 

We have been able to trace our "umbilical lines" of descent for the last few years if we have tested all 16,569 locations along our mtDNA. Since all of us inherited mtDNA from our mothers, all of us can trace our "umbilical line" down to the present. Conceptually this was made easier in 2012 when Doron Behar and colleagues published a new approach to reporting mtDNA results that uses mtEve as a starting point and reports the mutations that occur as we fast forward down through the millennia to the present. 

Although 16,569 locations seem to be a large number, they do not provide the nuanced distinctions offered by the more than fifty million locations of our other celibate DNA -- our yDNA. Until recently most males were limited to looking at the number of Short Tandem Repeats (STRs) located at 111 distinct locations along our yDNA. Now it is possible to look for Single Nucleotide Polymorphisms (SNPs) along more than ten million locations with the BIG Y test. Other tests on the market now offer to test even more locations -- perhaps half again as many. 

These NextGen tests offer the possibility of tracing our ancestors' paths down from prehistory into genealogical times -- the era for which we can hope to find written records about our ancestors. I will discuss some of my early experiences with BIG Y test results in Part 2 of this discussion and how you can begin to investigate your own results if you have taken the test.

Sunday, December 7, 2014

NextGen Genealogy: The DNA Connection

The more I learn about the publishing industry, the more confused I get.

As some of you know the official release date for my new book, NextGen Genealogy: The DNA Connection, was November 30th.  

If you pre-ordered it from the publisher ABC-Clio, you probably already have the physical version in hand. Amazon started allowing Kindle downloads last Sunday, but as of this morning does not yet have the paper copy in stock. Barnes and Noble has been offering the Nook version all week. 

Next comes the matter of price. On the publisher's site the price has consistently been $40 for the physical book. If you email me at InfoDoc [AT], I can send you a discount code from the publisher that will give you a 20% discount. That will make the price $32 before shipping. 

Amazon originally offered the paper copy for $40 but has recently lowered that to $38. With Amazon's pre-order guarantee, any of you who have placed orders there should get it at that price and you may get free shipping if you have a Prime account.

Over on the e-book side, Amazon has been offering the Kindle version for $35.99. Barnes & Noble started out offering the Nook version for about $23 but quickly changed its price to the current $30. Although the publisher also offers an e-book version, you probably should not be interested unless you are a library that plans to offer the book to your patrons from your own server.

Then comes the description of the book. Some of you may know that the information about books that you see online generally was created before the book was written. It was created by the publisher soon after the contract was signed. In the current example, all three online sources mistakenly agree that the book has 136 pages. The actual physical book I have in hand has 173 pages including the index. The lower number was the publisher's guess, used as a placeholder before the manuscript was received. Such information takes on a life of its own.

Another example of information taking on a life of its own is the co-authorship of CeCe Moore. CeCe was originally contracted to participate in this book project. Based on that the publisher originally created a cover that included her name. Initial information including her name was included in the publisher's catalogs and was sent to others including Amazon and Barnes & Noble. Then CeCe's career as a DNA consultant for television programs and her independent consulting work rocketed at a pace that forced her to withdraw from this book project at a very early stage. A new version of the cover was created and the publisher no longer mentions her on its site. I was able to get Amazon to remove reference to her authorship. However, the Kindle division appears to operate in a different universe than does the print division of Amazon. The short version of this saga is that you will still see CeCe mentioned by Kindle and Nook.

Thank you for indulging me in this rant. Perhaps this will make you a more informed consumer as you contemplate purchasing this and other books.