An Overview and Discussion of Various DNA Mutation Rates and DNA Haplotype Mutation Rates.
Do the YSTR Haplotypes in some Y Chromosome Male Lines Mutate Faster Than in Other Male Lines?

By: Charles F. Kerchner, Jr., P.E. (Retired)
Written:  7 Jan 2005
Last Edit/Update:  17 Apr 2008
Copyright ©2005-2008
All Rights Reserved

Add Your Calculated Surname Project Y-STR Mutation Rate Data to the Surname Project Y-STR Mutation Rate Log

Kerchner's DNA Testing and Genetic Genealogy Info and Resources Page

See my online Genetic Genealogy Glossary for help with esoteric terms

Kerchner's Genetic Genealogy DNA Testing Dictionary

 


 

DNA Mutation Rates


Mutation Rate: The rate at which a genetic marker mutates or changes over time, i.e., the number of mutations per generation (transmission event), expressed as a decimal value or a percentage. For example: A typical mutation rate quoted in early (circa 2001/2002) Y chromosome STR (Y-STR) TMRCA calculations and analysis is one per 500 generations (transmission events). That would be an average mutation rate (Y-STR Genetic Clock Mutation/Tick Rate) of .002 or 0.2%. Some commercial DNA testing labs are using an average Y-STR mutation rate of .003 or 0.3%. And a 2004 study by FamilyTreeDNA indicates that the average mutation rate for all Y-STR markers for the male population as a whole may be twice as fast as the historical standard rate, i.e., .004 or 0.4% instead of .002 or 0.2%.

Does the Y-STR Mutation Rate Genetic Clock Tick at the Same Average Rate for All Markers and for All Male Lines?


Studies indicate that the Y-STR mutation rate varies for different markers. Recent studies also indicate that the average Y-STR rate for all markers used, and for all male lines tested, could be twice as fast as previously surmised, and for some panels of markers the average rate could be even higher. Thus, for example, the average Y-STR mutation rate could be once per 250 generations, which is .004 or 0.4% for the entire male population, instead of .002 or 0.2%. In addition to the overall average mutation rate for all males lumped together being subject to debate, anecdotal evidence reported over the past few years by Genetic Genealogists indicates that the calculated average Y-STR mutation rate varies from one Y chromosome male line to another. The average mutation rate of one surname project male line is very different from that of another surname project male line in some cases. One could see a significantly higher average mutation rate in one surname male line and a significantly lower average mutation rate in another. Some family male lines appear to have a very stable Y chromosome. Other male lines may have one that mutates far more than the average estimated for the total population of all males, i.e., significantly more than the frequently used Y-STR mutation rate of .002 per generation. When averaging together these surname project group male line averages, the overall average may still approximate the overall averages seen in the general studies of Y-STR mutation rates. It would be logical to expect that. But there may be dramatic differences in the average mutation rate in one family male line surname project compared to another family male line surname project. Explanations have been postulated as to why one male line Y chromosome would be mutating faster than the overall average of other projects and the general male population as a whole.
Some have suggested the fundamental mutation rate may be the same but that there is a Y-STR copy repair mechanism to fix mutations during the Y-DNA replication/copying process which may work better in some male Y chromosome family lines than it does in other family male lines resulting in net differences being observed in average mutation rates from one male line surname project to the other. Therefore it is speculated that the male lines which have the higher average mutation rates have a less effective Y-STR copy repair system. If so, that would help explain the marked differences observed in the average Y-STR mutation rate sometimes found when comparing one male line surname project average mutation rate to another male line surname project's average mutation rate. It should be noted that the Y-STR average mutation rate is also currently the subject of much debate. But in my opinion, when it comes to Y-STR mutation rates, it is definitely not one size shoe that fits all male lines.

Mutation Rate Term Applies to Various Types of DNA Markers
(but as will be discussed later, the mutation rate varies greatly for these different types of DNA markers)


A mutation rate can be defined and estimated for a Single Nucleotide Polymorphism marker (SNP), a single Short Tandem Repeat (STR) at a DNA Y-chromosome Segment (DYS) marker location, i.e., a Y-STR marker, and/or for a Haplotype or set of several markers. A Y-DNA Haplotype is a set of numbers, i.e., the numeric allele values of a set of STRs (typical commercial tests look at a set or panel of 12, 25, 26, 37, 43, or 67 DYS STR markers) which are located in the vast non-coding "junk DNA" region of the Y chromosome. The haplotype term is also used to describe the set of SNP mutations in the case of Mitochondrial DNA (mtDNA) testing in the Hyper Variable Regions (HVRs) which are located in the small non-coding region of the mtDNA molecule. Note: The commercial mtDNA test is basically a giant multi-nucleotide SNP test checking 1050 nucleotide locations if both HVR1 (540 nucleotides) and HVR2 (510 nucleotides) are tested. Your actual mtDNA haplotype would be all 1050 alleles (DNA letters, i.e., A, T, G, or C for each nucleotide location). But the standard convention for reporting the results of an mtDNA test is to only report the mutations/differences as compared to the mtDNA Cambridge Reference Sequence (CRS), which was the first mtDNA molecule fully sequenced. Even the mutation rate for the entire Y chromosome molecule located in the cell nucleus can be estimated, although typing the entire molecule is not economically practical at this time because of its large size (about 58,000,000 nucleotides).

What are some of the observed or estimated mutation rates for various types of DNA and DNA test haplotypes? See the following discussion of commonly used mutation rates and their implications to genetic genealogists.


 

Estimated DNA Marker Mutation Rates for the Various Types of DNA


Reference Y-STR Y chromosome DYS marker mutation/slippage rate:
2 x 10^-3 or .002 per marker transmission event (birth of a new generation) is a commonly used value. Source for this historically quoted Y-STR average mutation rate is from FamilyTreeDNA circa 2001. This Y-STR mutation or repeat count slippage rate is currently the subject of much debate.

Reference Mt-DNA molecule (maternal line non-nuclear DNA - located inside cells but outside the nucleus) nucleotide (SNP) mutation rate:
3.0 x 10^-5 or .00003 per nucleotide transmission event (birth of a new generation) is a commonly used value for D Loop HVR regions.
The 0.00003 one significant digit average mutation rate I'm using in this overview is based on the work by Parsons, A High Observed Substitution Rate in the Human Mitochondrial DNA Control Region, Nature Genetics, Vol. 15, April 1997, p.363-367. That report determined a rate of about 2.9 x 10^-5.

Reference Y-DNA nuclear molecule (paternal line nuclear DNA - located inside the cell nucleus) nucleotide (SNP) mutation rate:
2 x 10^-8 or 0.00000002 per nucleotide transmission event (birth of a new generation) is a commonly used value. The 0.00000002 one significant digit average mutation rate I’m using in this overview comes from a presentation given in February 2003 to the Department of Biological Sciences at George Washington University by Dr. Peter M. Vallone of the NIST.  Vallone, Development of Multiplexed Assays for Evaluating SNP and STR Forensic Markers, National Institute of Standards and Technology (NIST), 28 Feb 2003, slide 17.


 

Number of Markers or Potential Markers in the Various Types of DNA


Number of Y-STR Y chromosome DYS markers in Y-DNA, i.e., the Y chromosome:
Surmised to be hundreds available, although currently 67 or fewer are offered in a single test by a single commercial Y-DNA testing company. In the not too distant future commercial tests for typing over 100 Y-STRs may be offered by a single company.

Number of nucleotides in the Mt-DNA molecule (non-nuclear DNA, i.e., located outside the cell nucleus):
A frequently quoted number from the Cambridge Reference Sequence (CRS) for the mtDNA molecule is 16,569 nucleotides of which 1050 nucleotides are currently commercially tested in the non-coding section of the mtDNA molecule containing two hyper variable regions 1 and 2 which are the regions tested in the typical commercial mtDNA test. Commercial tests are now available to type the whole mtDNA molecule.

Number of nucleotides in the Y-DNA molecule, i.e., the Y chromosome (part of the nuclear DNA, i.e., located inside the cell nucleus):
About 58,000,000. Does anyone know the exact count from the first whole Y chromosome sequenced? Does anyone know the exact count of the number of nucleotides in the large non-recombining Y-DNA (NRY) portion of the Y chromosome? In the future a commercial test may become available to economically sequence and type the whole Y chromosome.


 

DNA Haplotype Mutation Rates
(a haplotype is a set of DNA markers used together as a panel in a DNA test)

 

Typical Average Y-STR (paternal line) Haplotype Mutation Rates (ystrHMR)


(Note: The first example calculations are assuming the historical .002 rate as the underlying Y-STR average mutation rate is the correct overall rate which more recent studies (Kerchner 2005-2007) now indicate it is not. Yet many newbies believe that is the rate to use. Others have argued at various times that .002 is still the correct rate and all the other studies indicating otherwise are statistical aberrations. The below calculations are for example purposes only to define my various sized haplotypes ystrHMR calculations. For the different haplotype sizes a weighted average Y-STR mutation rate for each size Y-STR panel or haplotype should be used. I have provided some examples of those from my own Kerchner Surname Project.)

First let me explain some math simplification assumptions I used to make things easier to calculate and understand.  I did this by reducing the true exponential calculations to a simple arithmetic model for the examples given later in this report. Here is my explanation of the simplification.

Given that mu = the assumed Y-STR marker mutation rate and M = number of markers in the haplotype then:

Probability (new haplotype) = 1 – probability (no mutations at any marker in the haplotype)

Probability (no mutations at any marker in the haplotype) = (1-mu)^M

Probability of new haplotype = 1 – (1-mu)^M

For a specific example, if we use M = 12 markers for an example haplotype size and a mu = 0.002 (an often quoted historical average Y-STR marker mutation rate), we get a probability of a new haplotype with each transmission event (birth) equal to:  1 – (1-.002)^12 = .0237

This whole process can be simplified for very small mutation rates, i.e., mu much, much smaller than one, and for a relatively small number of markers, i.e., M less than 100, by using a simpler approximation which gives very close to the same answer as the more complex equation.  That simplified assumption is:  Probability new haplotype =  mu*M.  For a 12 marker haplotype and a mutation rate of .002 this simplification yields .002*12 = 0.024 per transmission event, which is very close to the actual true probability of .0237. Using this simplification allows one to use simple arithmetic in my examples below to demonstrate the value of knowing the average mutation rate for your male line.  Thus for the below simplified examples I have chosen to use this simplification to calculate the life expectancy of a haplotype, i.e., an estimate of how many generations a haplotype can remain unchanged given various assumed or determined average marker mutation rates. However, if you are familiar with using exponents, and have a pocket calculator which can do the exponent calculations using the precise formulas described above, then I of course recommend that you use the precise equations to get a more precise answer for your male line’s average haplotype life expectancy, i.e., how many generations the YSTR haplotype of a given size would likely on average survive unchanged in your male line, without at least one marker allele value changing and thus creating a new and slightly mutated haplotype.
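The comparison between the precise equation and the mu*M simplification can be checked with a short script. This is a minimal sketch, not part of the original calculations: the function names are hypothetical, and the "generations unchanged" figure assumes the expected waiting time is the reciprocal of the per-generation probability.

```python
def p_new_haplotype_exact(mu, m):
    """Exact probability that at least one of m markers mutates in one transmission event."""
    return 1.0 - (1.0 - mu) ** m


def p_new_haplotype_approx(mu, m):
    """Simplified estimate mu*M, reasonable only while mu*M is much smaller than one."""
    return mu * m


if __name__ == "__main__":
    mu = 0.002  # historical average Y-STR marker rate quoted in the text
    for m in (12, 25, 37, 67):
        exact = p_new_haplotype_exact(mu, m)
        approx = p_new_haplotype_approx(mu, m)
        # Expected generations unchanged is taken as 1 / probability (an assumption).
        print(f"M={m:2d}  exact={exact:.4f}  approx={approx:.4f}  "
              f"generations unchanged: exact={1 / exact:.1f}, approx={1 / approx:.1f}")
```

For 12 markers the two values agree to about three decimal places; the gap widens as the number of markers grows.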


Reference Y-STR (12) haplotype mutation rate (ystrHMR12) calculations:
.002 x 12 DYS STR markers = .024 per transmission event (birth of new generation). (1/.024)=41.7. A new mutation can happen at any time but a 12 marker haplotype using the .002 historical rate indicates that it can typically survive unchanged since the generation of the prior mutation event for several dozen generations (transmission events). Thus random matches are common with people of different surnames because of a shared common ancestor who probably predates the adoption of surnames and written family records.

 

For my Kerchner Surname Project the 12 marker average haplotype mutation rate for ten people YDNA12 tested is .0044.
.0044 x 12 DYS STR markers = .0528 per transmission event (birth of new Kerchner generation). (1/.0528)=18.9. Thus the longevity of the 12 marker Kerchner haplotype on average can typically survive unchanged about 18.9 generations (transmission events). And since this is a time frame which predates the adoption of surnames in some areas of Europe, I would expect to see random matches. And I do have a few random matches with people of different surnames in the FTDNA database for the Kerchner 12 marker haplotype.

Reference Y-STR (25) haplotype mutation rate (ystrHMR25) calculations:
.002 x 25 DYS STR markers = .050 per transmission event (birth of new generation). (1/.05)=20. A new mutation can happen at any time but a 25 marker haplotype using the .002 historical rate indicates it can typically survive unchanged since the generation of the prior mutation event for about 20 generations (transmission events). Random matches with people of different surnames will be markedly reduced with a 25 marker test.

For my Kerchner Surname Project the 25 marker average haplotype mutation rate for ten people YDNA25 tested is .0042.
.0042 x 25 DYS STR markers = .105 per transmission event (birth of new Kerchner generation). (1/.105)=9.5.  Thus the 25 marker Kerchner haplotype on average can typically survive unchanged about 9.5 generations (transmission events). This is well within the time frame of when surnames were adopted so one would expect to see few if any random matches with different surnames. And in my Kerchner project I don't have any random matches at 25 markers to other surnames in the FTDNA database.

Reference Y-STR (26) haplotype mutation rate (ystrHMR26) calculations:
.002 x 26 DYS STR markers = .052 per transmission event (birth of new generation). (1/.052)=19.2. A new mutation can happen at any time but a 26 marker haplotype using the .002 historical rate indicates it can typically survive unchanged since the generation of the prior mutation event for about 19 generations (transmission events). Random matches with people of different surnames will be markedly reduced with a 26 marker test.

At present I have not calculated the 26 marker ystrHMR26 rate for my Kerchner project. Only three of my project participants were tested at multiple labs. I don't have the necessary data for the 26 marker test.

Reference Y-STR (37) haplotype mutation rate (ystrHMR37) calculations:
.002 x 37 DYS STR markers = .074 per transmission event (birth of new generation). (1/.074)=13.5. A new mutation can happen at any time but a 37 marker haplotype using the .002 historical rate indicates it can typically survive unchanged since the generation of the prior mutation event for a bit more than a dozen generations (transmission events). Random matches will be minimal, if any. The resolving power of a 37 marker test places the most likely time to recent common ancestor definitely in a time frame of genealogical interest and a time frame when many male lines had already adopted their surnames and written birth records started to be maintained. If you share the same or similar surname and match closely with a 37 marker test you probably share a genealogically relevant common male ancestor even if not known via the traditional evidence.

For my Kerchner Surname Project the 37 marker average haplotype mutation rate for ten people YDNA37 tested is .0057.
.0057 x 37 DYS STR markers = .2109 per transmission event (birth of new Kerchner generation). (1/.2109)=4.7. Thus the 37 marker Kerchner haplotype on average can typically survive unchanged about 4.7 generations (transmission events). This is well, well within the time frame of when surnames were adopted and well, well within the time frame when the American colonies were settled, so one would not expect to see any random matches with different surnames. And in my Kerchner project I don't see any random matches at 37 markers to people with other surnames in the FTDNA database.

Reference Y-STR (43) haplotype mutation rate (ystrHMR43) calculations:
.002 x 43 DYS STR markers = .086 per transmission event (birth of new generation). (1/.086)=11.6. A new mutation can happen at any time but a 43 marker haplotype using the .002 historical rate indicates it can typically survive unchanged since the generation of the prior mutation event for a bit less than a dozen generations (transmission events). Random matches will be minimal, if any. The resolving power of a 43 marker test places the most likely time to recent common ancestor definitely in a time frame of genealogical interest and a time frame when many male lines had already adopted their surnames and written birth records started to be maintained. If you share the same or similar surname and match closely with a 43 marker test you probably share a genealogically relevant common male ancestor even if not known via the traditional evidence.

At present I have not calculated the 43 marker ystrHMR43 rate for my Kerchner project. Only three of my project participants were tested at multiple labs. I don't have the necessary data for the 43 marker test.

Reference Y-STR (67) haplotype mutation rate (ystrHMR67) calculations:
.002 x 67 DYS STR markers = .134 per transmission event (birth of new generation). (1/.134)=7.5. A new mutation can happen at any time but a 67 marker haplotype using the .002 historical rate indicates it can typically survive unchanged since the generation of the prior mutation event for a bit more than seven generations (transmission events). Random matches will be minimal, if any. The resolving power of a 67 marker test places the most likely time to recent common ancestor definitely in a time frame of genealogical interest and a time frame when many male lines had already adopted their surnames and written birth records started to be maintained. It is also a time frame when the American colonies were being settled. If you share the same or similar surname and match closely with a 67 marker test you probably share a genealogically relevant common male ancestor even if not known via the traditional evidence.

For my Kerchner Surname Project the 67 marker average haplotype mutation rate for the seven people YDNA67 marker tested is .0043.
.0043 x 67 DYS STR markers = .288 per transmission event (birth of new Kerchner generation). (1/.288)=3.5. Thus the 67 marker Kerchner haplotype on average can typically survive unchanged about 3.5 generations (transmission events). This is well, well within the time frame of when surnames were adopted and well, well within the time frame when the American colonies were settled, so one would not expect to see any random matches with different surnames. And in my Kerchner project I don't see any random matches at 67 markers to people with other surnames in the FTDNA database. An anecdotal comment: In my Kerchner family project I match my brother exactly at 67 markers. But I have a genetic distance (GD) of 1 with my second cousin (thus 6 transmission events of separation), i.e., we have a mutational difference at one DYS marker in the upper 30 markers of the 67 marker panel. This observation correlates with what is expected to be observed in our male line given the higher average mutation rate consistently being observed in our male line as more people and markers have been tested. See the Excel table for the details: http://www.kerchner.com/kerchner67mkrs.htm
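The panel-by-panel arithmetic above can be reproduced in one pass. The sketch below uses only the rates quoted in this section (.002 as the reference rate and the Kerchner project rates for the 12, 25, 37 and 67 marker panels) together with the mu*M simplification; the function and variable names are hypothetical.

```python
REFERENCE_RATE = 0.002  # historical average Y-STR marker rate
# Kerchner Surname Project average rates quoted in the text, keyed by panel size.
KERCHNER_RATES = {12: 0.0044, 25: 0.0042, 37: 0.0057, 67: 0.0043}


def haplotype_rate(mu, markers):
    """Haplotype mutation rate per transmission event using the mu*M simplification."""
    return mu * markers


def haplotype_longevity(mu, markers):
    """Average number of generations the haplotype is expected to stay unchanged."""
    return 1.0 / haplotype_rate(mu, markers)


if __name__ == "__main__":
    for markers in (12, 25, 26, 37, 43, 67):
        line = (f"{markers:2d} markers: reference "
                f"{haplotype_longevity(REFERENCE_RATE, markers):4.1f} generations")
        if markers in KERCHNER_RATES:
            project = haplotype_longevity(KERCHNER_RATES[markers], markers)
            line += f", Kerchner project {project:4.1f} generations"
        print(line)
```

Swapping in the precise 1 – (1-mu)^M formula for `haplotype_rate` gives somewhat longer longevities for the larger panels, which is the trade-off accepted by the simplification.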

Note: There is much debate at present as to whether the underlying Y-STR .002 average mutation rate used in my above haplotype mutation rate examples is correct. It is probably not. And that historical average rate should certainly not be used to calculate the various larger haplotype average mutation rates. My examples above demonstrate the large differences in results which are obtained by using different overall averages and project specific rates. A recent study by FamilyTreeDNA.com indicates the average Y-STR marker mutation rate may be more like .004. Relative Genetics has been at times using an average Y-STR marker mutation rate of about .003. If the underlying mutation rate on average is significantly higher than .002, then the expected longevity of the various haplotypes listed above on average will be proportionately much shorter. Thus the parent/original/ancestral haplotype of these mutated haplotypes would likely be fewer generations back in time. Higher Y-STR average mutation rates are what some surname project administrators have been reporting, i.e., average Y-STR mutation rates in the male line under study which are substantially higher than .002, and thus more mutations seen in a time frame of relevant genealogical interest for the ancestral haplotype of a previously known common male ancestor. For example, in my Kerchner Surname Project the average mutation rate is .0044 for the 12 marker haplotype, .0042 for the 25 marker haplotype, .0057 for the 37 marker haplotype, and .0043 for the 67 marker haplotype for ten related participants in three independent descendant branches from the common male ancestor who was known by prior traditional genealogical research and family history.

Also, notice how the average mutation rate jumps up when adding the 3rd panel (going up to 37 markers) for the FTDNA haplotypes. In my opinion, the 0.002 historically quoted mutation rate originally used for the Y-STR mutation rates for the overall male population, and still widely mentioned in discussions, was based on only a few markers which were relatively slow moving compared to some of the more recently offered panels of markers used in genetic genealogy tests. Thus the originally used average mutation rate was based on a few slower markers and does not hold true as other new panels of markers were added. Trying to stick by the historical .002 average Y-STR marker mutation rate when evaluating all haplotype mutation rates, with those haplotypes containing many new fast moving markers, is the cause of much confusion and anxiety in some family surname projects, in my opinion. And then again, in some other family surname projects the mutations are zero at 12 markers, zero at 25 markers, and zero at 37 markers. So it is clear that the average mutation rate varies from male family line to male family line. From the Kerchner Ancestral 37 marker haplotype there were eight unique mutations observed with ten people tested, including a parallel one which occurred separately in two of the independent descendant branches. A faster than average Y-STR mutation rate is good for genealogists in those families who have it, not bad. It helps us sort out the various descendant branches by providing branch tags. And we also need to remember that for male lines where the underlying Y-STR mutation rate is substantially higher or lower than the historically quoted average rate, it will dramatically affect the estimates of time to most recent common ancestor for those surname projects who do not know the most recent common ancestor.

Therefore with significant variations in Y-STR average mutation rates from male line to male line, in my opinion, it is becoming increasingly important for surname project administrators to try to estimate the average Y-STR marker mutation rate and Y-STR haplotype mutation rates for the male line they are studying.

Someone once asked me: How do you do this if you don't know the MRCA ancestor for all your surname project members, but your gut instinct, the traditional evidence, and the similar haplotypes and surnames tell you that they are all probably related in the last several hundred years? Well, if you don't know the ultimate common male ancestor and the ancestral haplotype, you cannot do it directly. My suggestion for estimating the Y-STR average mutation rate for males in a family surname project, where the ultimate most recent common ancestral haplotype for a few surmised to be related sub-branches/groups has not been deduced, is to calculate the average mutation rate for each independent branch, cluster, or group of the participants in hand so far for whom you do know their respective more recent MRCAs. Let me clearly preface this, and repeat it, by saying I am talking about surname projects with clusters of participants who are known to be related to each other and who share the same or similar surname with the other clusters in the project, who by prior traditional research were thought to be related, but where the common male ancestor for the whole project is not known, and where the Genetic Distance between the members of one cluster compared to another cluster is close enough that the combination of sharing the same surname, the traditional evidence, and now the new genetic distance evidence leads one to conclude these clusters do share a common male ancestor in a time frame of genealogical interest. For want of a better name I'll call it the "Sum of the Parts Method". What I am suggesting is to deduce the ancestral haplotype for each surmised sub-branch to calculate an estimated mutation rate for each sub-branch. Use the known common male ancestor for each independent branch/cluster as a reference to calculate the mutation rate for each cluster.
Sum up all the unique mutations and unique transmission events in each cluster/branch and use the combined information to calculate an estimated Y-STR average mutation rate for your family surname project. While this will probably not yield the ultimate answer, it will provide an estimate nonetheless. Then use that Y-STR average mutation rate to estimate the Y-STR haplotype longevity as I did in my above examples, instead of using the commonly used average mutation rate of .002. I think this will get some surname project administrators an answer closer to reality for how far back in time the common ancestor for their family could be, as compared to the one size fits all value for the Y-STR average marker mutation rate. I think this estimating approach would work especially well with large projects with many members and mutations in the clusters/branches. But again, essential to this is that traditional evidence exists (and the new genetic evidence is reinforcing) that these clusters do share a common male ancestor in a time frame of genealogical interest, i.e., the last 500 years. For those of you with perfect haplotype matches and zero mutations, well, the estimated solution is indeterminate via this method. You will have to continue to use the standard methodology offered by the testing companies, which uses the overall Y-STR average mutation rate for the overall male population rather than a surname specific male line mutation rate. Hopefully this page provides useful information to surname project administrators new to genetic genealogy who are looking at mutation rates and haplotype mutation rates for their surname project(s).
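The "Sum of the Parts Method" can be sketched in a few lines. The cluster counts below are hypothetical, not project data. The formula itself (total unique mutations divided by total unique transmission events times the number of markers tested) is an assumption about how the per-marker rate is derived, chosen to match the "per marker transmission event" definition used earlier on this page. It also assumes everyone was tested on the same size panel.

```python
# Hypothetical clusters: (unique mutations observed, unique transmission events)
# counted down from each cluster's known common male ancestor.
EXAMPLE_CLUSTERS = [(3, 18), (2, 25), (4, 21)]
EXAMPLE_MARKERS = 37  # assumed panel size, the same for every participant


def sum_of_parts_rate(clusters, markers):
    """Pooled average mutation rate per marker per transmission event."""
    total_mutations = sum(mutations for mutations, _ in clusters)
    total_events = sum(events for _, events in clusters)
    if total_mutations == 0:
        # Zero observed mutations: the estimate is indeterminate, as noted in the text.
        raise ValueError("no mutations observed; rate cannot be estimated this way")
    return total_mutations / (total_events * markers)


if __name__ == "__main__":
    rate = sum_of_parts_rate(EXAMPLE_CLUSTERS, EXAMPLE_MARKERS)
    print(f"Estimated average marker mutation rate: {rate:.4f}")
    print(f"Estimated haplotype longevity: {1 / (rate * EXAMPLE_MARKERS):.1f} generations")
```

Pooling the counts before dividing weights each cluster by how many transmission events it contributes, so a small cluster with one chance mutation does not dominate the estimate.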

Another point that is worthy of noting in regards to Y-STR haplotypes is this. Given that the Y-STR marker average mutation rate varies from Y-STR marker to Y-STR marker, the average Y-STR haplotype mutation rate will be dependent on which markers are included in the set of markers which make up the Y-STR haplotype. For example: one could choose a panel of 12 Y-STR markers from the 48 or so Y-STR markers available commercially and put together a 12 marker Y-STR test and haplotype which would mutate significantly faster on average than the first panel of 12 markers offered by FamilyTreeDNA. Or likewise one could put together a Y-STR marker haplotype set of 12 markers which mutated on average slower than the 12 marker panel offered by FamilyTreeDNA. Also, the differences in which markers were used in the panels offered by the various early commercial testing companies could be part of the reason why there were differences in the average overall marker Y-STR mutation rates being reported by different companies, early on in this new industry, i.e., FamilyTreeDNA using .002 and Relative Genetics using .003 for Y-STR average marker mutation rate. Of the two early 25 and 26 marker panels, only about half of the markers in each panel overlapped, i.e., were the same markers. Before the modern 37 marker and 43 marker tests were offered one had to get tested at multiple labs to get Y-STR marker data for more than 26 Y-STR markers. One or more of the companies could have deduced the underlying average marker mutation rate they often cited from the set of markers they were using, which differed from company to company. The point is when comparing Y-STR haplotype mutation rates not only is the haplotype's Y-STR marker count relevant but also which markers are used. 
While I used the commonly used .002 average Y-STR marker mutation rate in my example Y-STR Haplotype Mutation Rate (HMR) calculations, the average marker mutation rate itself will be dependent on the markers used in the haplotype too. This can all get very complicated very quickly as one digs deeper and deeper into the whole concept of haplotype mutation rates. My reference example Y-STR haplotype mutation rates above are simply that, examples. The examples were designed to show the Genetic Genealogist how increasing the number of Y-STR markers tested helps them deduce whether the close but non-exact match is relevant in a time frame of genealogical interest. For more on why see this Zip+4 analogy report I wrote: YSTR37 Extended Haplotype and Zip+4 Analogy
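The point that panel composition matters as much as marker count can be illustrated with two made-up 12 marker panels. The per-marker rates below are hypothetical placeholders, not measured values for any real DYS marker, and the function names are for illustration only.

```python
import math

# Hypothetical per-marker mutation rates (NOT measured values).
SLOW_PANEL = [0.0005] * 12                  # twelve slow markers
MIXED_PANEL = [0.0005] * 6 + [0.006] * 6    # six slow and six fast markers


def panel_average_rate(rates):
    """Average marker mutation rate for the specific markers in the panel."""
    return sum(rates) / len(rates)


def panel_haplotype_rate(rates):
    """Exact probability that at least one marker in the panel mutates per transmission event."""
    return 1.0 - math.prod(1.0 - mu for mu in rates)


if __name__ == "__main__":
    for name, panel in (("slow", SLOW_PANEL), ("mixed", MIXED_PANEL)):
        average = panel_average_rate(panel)
        haplotype = panel_haplotype_rate(panel)
        print(f"{name:5s} panel: average marker rate {average:.5f}, "
              f"haplotype rate {haplotype:.4f}, "
              f"about {1 / haplotype:.0f} generations unchanged")
```

Both panels have 12 markers, yet the mixed panel changes several times faster, which is why an average rate derived from one company's panel need not carry over to another's.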

Typical Average Mt-DNA (maternal line) Haplotype Mutation Rate (mtsnpHMR)


Reference Mt-DNA (maternal line) haplotype (HVR1 Region) mutation rate (mtsnpHMR540):
.000030 x 540 nucleotides (markers) in HVR1 region tested = .0162 per transmission event (birth of a new generation). Thus an HVR1 mtDNA haplotype can easily survive unchanged for about 62 generations (about 1550 years). This is about 4.6 times slower than the typically used commercial Y-STR paternal line haplotypes mutate. Thus the common female ancestor of two people who randomly match exactly for the HVR1 region mtDNA haplotype usually long predates a time frame of relevant genealogical interest, i.e., long before the last 500 years. Therefore using mtDNA in a random search for matching maternal lines will lead to many wild goose chases. But mtDNA can be used to verify pre-existing traditional genealogy evidence that two maternal lines are related, i.e., the direct maternal line of descent of two families is thought to be from two women who are surmised to be sisters from prior paper trail or oral history evidence. But be prepared for numerous random matches with lines not recently related with mtDNA (maternal line) testing.

Reference Mt-DNA (maternal line) haplotype (HVR1 and HVR2 regions) mutation rate (mtsnpHMR1050):
.000030 x 1050 nucleotides (markers) in HVR1 and HVR2 regions tested = .0315 per transmission event (birth of a new generation). Thus a combined (HVR1+HVR2) mtDNA haplotype can easily survive unchanged for about 32 generations (about 800 years). That is, it mutates about 2.4 times more slowly than the typically used commercial Y-STR paternal line haplotypes. Thus the common female ancestor of two people who randomly match exactly for the combined HVR1+HVR2 mtDNA haplotype usually long predates a time frame of relevant genealogical interest, i.e., long before the last 500 years. Therefore using mtDNA in a random search for matching maternal lines will lead to many wild goose chases. But mtDNA can be used to verify pre-existing traditional genealogy evidence that two maternal lines are related, e.g., when the direct maternal lines of descent of two families are thought, from prior paper trail or oral history evidence, to go back to two women who are surmised to be sisters. But be prepared for numerous random matches with lines not recently related when doing mtDNA (maternal line) testing.
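
The two mtDNA reference calculations above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic. The 25 years per generation is my assumption, inferred from the text equating about 62 generations with about 1550 years, and the function names are made up for this example. The exact form "1 – (1-mu)^M" is included for comparison; for these small marker counts it differs only slightly from the simplified product.

```python
MU_MTDNA = 0.000030        # per nucleotide per transmission event (from the text)
YEARS_PER_GENERATION = 25  # assumption: implied by 62 generations being about 1550 years

def simplified_hmr(mu, m):
    """Simplified haplotype mutation rate: per-marker rate times marker count."""
    return mu * m

def exact_hmr(mu, m):
    """Exact form: probability of a new haplotype = 1 - (1 - mu)^M."""
    return 1.0 - (1.0 - mu) ** m

def summarize(label, m):
    """Print and return the rate, the generations unchanged, and the years unchanged."""
    rate = simplified_hmr(MU_MTDNA, m)
    generations = 1.0 / rate
    years = round(generations) * YEARS_PER_GENERATION
    print(f"{label}: simplified = {rate:.4f}, exact = {exact_hmr(MU_MTDNA, m):.4f}, "
          f"about {generations:.0f} generations (about {years} years) unchanged")
    return rate, generations, years

hvr1 = summarize("HVR1 (540 nucleotides)", 540)
hvr1_hvr2 = summarize("HVR1+HVR2 (1050 nucleotides)", 1050)
```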

Average Y-DNA Nuclear (paternal line) Whole Molecule Mutation Rate


Reference Y chromosome nuclear (paternal line) whole molecule mutation rate (ysnpHMR60meg):
For this calculation M is not small but is instead quite large, i.e., on the order of 58,000,000.  Thus if we use the simplification method I described above for the smaller marker counts, we get a large error in the result.  Using the simplification method above we would get a Y chromosome (haplotype) mutation rate of .00000002 x 58,000,000 nucleotides (markers) in the molecule, if completely sequenced and tested, equal to about 1.2 per Y chromosome transmission event (birth of a new generation), which cannot be a valid probability since it exceeds 1.  The actual result, using the equation described earlier in this paper of “probability of new haplotype = 1 – (1-mu)^M”, is a Y chromosome (haplotype) mutation rate of 0.687. Thus if we could economically sequence the whole non-recombining region of the Y chromosome molecule we would expect to see a SNP change roughly once in every generation. Because of the sheer number of nucleotides in the Y chromosome molecule's NRY region it is not likely to survive much more than one generation unchanged. Thus a Y chromosome would statistically survive about 1.45 generations without change based on the reference Y-DNA nucleotide mutation rate of 2x10^-8.  And rounding things off to an integer, as a rule of thumb, I generally state in casual conversations that we can expect about one new SNP per generation.  But finding that new SNP would be a costly challenge with today's technology (as of 2005). The cost of sequencing the entire Y chromosome molecule at present puts it totally out of the question for amateur genealogists. But think of it. In the not-so-distant future we may be able to economically sequence and type the entire Y chromosome for about $1000.  Then we will be able to find our own unique family private Y-SNPs. But for now we can only economically do SNP tests for known nucleotide mutation locations. The Y-SNP markers now used are mutations which occurred long ago and which are found in large groups of males.
These Y-SNP markers are used to sort human Y chromosomes into the various Haplogroups we hear so much about in the anthropological use of DNA testing. The most recent common male ancestor of two people who share the same Y-SNP test Haplogroup lived many thousands, even tens of thousands, of years ago. Thus Y-SNP testing, while interesting, is not of much use for traditional genealogical purposes, other than being able to tell you in what large geographic area or continent of the world your ancient male Y chromosome ancestor is thought to have originally lived. Males who share the same 37, 43, or 67 Y-STR marker haplotype are very likely fairly recently related via their direct male line. Males who share the same SNP test haplogroup are in general not closely related and share that haplogroup assignment with millions and millions of other males. Traditional genealogists should only do a SNP test if they are interested in the ancient anthropological origin of their Y chromosome, if they wish to learn the broad geographical area of the modern world where that haplogroup is found in high frequency today, or if they wish to confirm with certainty a haplogroup assignment predicted from one’s Y-STR haplotype pattern when that prediction is somewhat ambiguous.
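
The whole-chromosome calculation in this section, where the simplified product breaks down because M is so large, can be checked numerically. The following Python sketch uses the reference rate of 2x10^-8 and the 58,000,000 nucleotide count from the text. The variable names are mine. The log1p/expm1 form is simply a numerically careful way of evaluating "1 – (1-mu)^M" when mu is tiny; it is a choice I made for this example, not something the original calculation specifies.

```python
import math

MU_YSNP = 2e-8              # per nucleotide per transmission event (from the text)
M_NUCLEOTIDES = 58_000_000  # approximate nucleotide count used in the text

# Simplified method: mu x M. Exceeds 1 here, so it cannot be a probability.
simplified = MU_YSNP * M_NUCLEOTIDES

# Exact method: 1 - (1 - mu)^M, evaluated as -expm1(M * log1p(-mu)).
exact = -math.expm1(M_NUCLEOTIDES * math.log1p(-MU_YSNP))

# Average number of transmission events a Y chromosome survives unchanged.
generations_unchanged = 1.0 / exact

print(f"simplified = {simplified:.2f}")
print(f"exact      = {exact:.3f}")
print(f"survives about {generations_unchanged:.2f} generations unchanged")
```

The gap between the two results shows why the simplification is only safe when mu x M is much smaller than 1, as it is for the 12 to 67 marker Y-STR panels and the mtDNA regions.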


 

Penn State (2008, January 29). Genome-wide Study Sheds Light on Factors that Contribute to DNA Mutations. Science Daily, January 29, 2008.

 


 

 


 


Copyright ©2005-2023
Charles F. Kerchner, Jr., P.E.

All Rights Reserved
Created - 07 Jan 2005
Updated - 26 Jun 2023