
Testing statistics and country-level bias update


 


Hi folks,


Around November each year I like to go through the haplotree to see how much different branches and different countries have grown over the previous year. This helps us understand what the country-level and haplogroup-level biases are, how they change over time, and where we might need to focus our efforts to get the right people tested. The other reason for analysing these statistics is that they feed back into the question of origins: where are people of particular haplogroups more or less likely to come from, and hence where are specific haplogroups more likely to originate? But that's a question that will take me more time to answer than I have right now! Instead, see earlier messages like #5759. Thanks particularly to Ewenn who, late last year, gave me some code that speeds up this process hugely.


Here, I'll be comparing two time periods: Nov 2022 to Nov 2023, and Nov 2019 to Nov 2023.


In general, this year continues the trend of previous years. The British Isles bias continues to get slightly worse, but this is more than counterbalanced by the extra information that new testers are bringing in.


Globally, the size of the FTDNA haplotree has increased by 6.6% in the last year, which is a slower rate than the average of the last four years (8.5%). The haplotree contains information from a variety of sources, so we can't use this figure to infer how fast FTDNA's customer base is growing.
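The four-year average above is quoted without saying whether it is a compound annual rate or a simple mean of the yearly rates. As a rough sketch, assuming it is a compound rate (an assumption on my part, and the function names are purely illustrative), annual and multi-year growth relate like this:

```python
def total_growth(annual_rate, years):
    """Total fractional growth after compounding annual_rate for `years` years."""
    return (1 + annual_rate) ** years - 1


def annualised_rate(start_size, end_size, years):
    """Compound annual growth rate between two snapshot sizes."""
    return (end_size / start_size) ** (1 / years) - 1


# Assumption: the 8.5% four-year average is a compound annual rate.
four_year_total = total_growth(0.085, 4)
print(f"{four_year_total:.1%}")  # 38.6%
```

If the 8.5% were instead a simple mean of four yearly rates, the implied total growth would differ slightly, so treat this as indicative only.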


This increase has come disproportionately from people who cannot trace their origins back to Europe. The size of the European testing population has grown by only 6.1%. The British Isles bias now stands as follows: testers from the British Isles make up 45% of European testers in the database, despite the British Isles making up only 8.6% of the comparable modern population, so we over-sample the British Isles by a factor of about 5.2 relative to their share of Europe's population. It should be noted, however, that modern populations are not always indicative of historical populations. If we instead use population estimates from 1800, around when the typical tester's earliest-known ancestor lived, we find a British Isles bias of 5.9 instead. Some specific country or region-level modern/historical biases are:

England 2.3/1.6

Scotland 0.39/0.57

Wales 1.59/1.41

N.I. 0.71/1.6

Ireland 0.25/0.84

France 11.7/24.0

Germany 5.5/6.4

Netherlands 8.3/4.9

Poland 6.8/6.0

Czechia 6.5/11.5

Austria 9.5/14.1

Denmark 4.8/3.5

Scandinavia+Finland 1.45/1.16

European former USSR 15.0/16.1

Balkans+Turkey 20.9/13.9

Mediterranean 10.8/12.9
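For anyone wanting to check the headline British Isles figure: the formula isn't spelled out above, but the simple ratio of sample share to population share reproduces the quoted 5.2. A minimal sketch (the function names are illustrative, and I make no claim about the convention used for the per-country figures in the list):

```python
def share_ratio(sample_share, population_share):
    """Over-sampling factor: share of testers divided by share of population."""
    return sample_share / population_share


def odds_ratio(sample_share, population_share):
    """Alternative convention: region versus everyone else, in testers and in population."""
    sample_odds = sample_share / (1 - sample_share)
    population_odds = population_share / (1 - population_share)
    return sample_odds / population_odds


# Figures quoted above: 45% of European testers, 8.6% of the modern population.
print(round(share_ratio(0.45, 0.086), 1))  # 5.2
print(round(odds_ratio(0.45, 0.086), 1))   # 8.7
```

The second function shows what a strict "British Isles versus everywhere else in Europe" comparison would give with the same two inputs, which is why the choice of convention matters when comparing figures.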


The further down we go in the tree, the faster the increase in testing becomes. This is because we exclude the customers and studies that have only undertaken limited testing, and are increasingly left with only BigY testers. Growth further down the tree is therefore a better estimator of the speed at which branches relevant to us are increasing.


The R-U106 testing population has increased by about 11.5% globally, by 10.9% in people with known European ancestry, and by 10.1% in the British Isles. These are very similar rates to those over the last four years, so the growth is fairly constant. These rates are faster than the growth in R-P312 (7.9% globally), but this may reflect the historical depth of testing rather than any inherent behaviour in R-U106! R-U106 testers make up 8.55% of the haplotree at FTDNA. We've seen above-average increases in Northern Ireland and among the former Eastern Bloc countries, particularly the Czech Republic, but also Poland. We've seen below-average increases in Scotland, Belgium, Norway, Finland and Russia - the latter largely because FTDNA now offer sub-populations within Russia rather than because of geopolitics.


We can step down further to R-Z2265, which represents almost the whole of R-U106 but ignores the ~1240 testers who have only tested as far as R-U106 with single-SNP or SNP-pack tests, i.e. it consists mostly of BigY testers. There are 19692 Z2265+ entries globally (an increase of 13.2%) and 8306 in Europe (an increase of 12.1%). This rate is slightly above the average for the last four years (12.4%/11.3%). Given the increase in database size, that shows that the rate of BigY testing is still proportionally increasing. At this rate, the database size doubles about every six years, meaning that an average tester will have to wait about six years to receive a match closer to them and a new haplogroup designation (obviously not true for people who have purposefully tested close relatives).
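The "about six years" figure follows from the standard doubling-time formula for compound growth, ln(2) / ln(1 + rate). A quick sketch using the rates quoted above (the function names are mine, and steady compound growth is an assumption):

```python
import math


def doubling_time(annual_rate):
    """Years for a quantity to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


def previous_count(current_count, annual_rate):
    """Count one year earlier implied by the current count and the growth rate."""
    return current_count / (1 + annual_rate)


# Rates quoted above for R-Z2265: this year globally, four-year global average, this year in Europe.
for label, rate in [("global, this year", 0.132),
                    ("global, four-year average", 0.124),
                    ("Europe, this year", 0.121)]:
    print(f"{label}: {doubling_time(rate):.1f} years")

# Implied global Z2265+ count a year ago (approximate, since 13.2% is rounded).
print(round(previous_count(19692, 0.132)))
```

All three rates give a doubling time of between roughly five and a half and just over six years, consistent with the estimate above.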


Some haplogroups have grown faster than others. The reasons behind this haven't always been clear! For example, testing in R-Z18 and particularly R-L257 has been growing much faster this year (13.1%, 14.2%) than the average over the previous four years (10.2%, 9.6%). Testing in R-Z156 and particularly R-DF96 has slowed (from 13.9% to 10.8% and 12.6% to 10.3%). R-DF98 continues to out-perform other haplogroups in terms of deep testing (22.0% to 17.4%). R-L47 is below average (9.4% to 10.1%) but R-Z9 has seen increased testing (13.9% to 16.2%). R-FGC910 in particular has grown by a sizeable 20.7% on the year, and R-Z343 by 15.8%. Hopefully many of you will have seen these increases among your matches.


Cheers,


Iain.
