Re: Understanding Big Y Matches outside my Haplogroup
Hi Mark,
A quick and hasty reply...
The lack of matches is due to the distance of your closest relationship with other testers. Apart from your cousin, your closest relationship is at R-Y8604, 24 SNPs plus your private SNP before your relationship with your cousin, i.e., about 1800 years old, while R-S5245 is about 1900 years old and separated by only one extra SNP.
The matching criteria for Y-STRs are set at around 1000 years ago, so we would expect that you would probably not match anyone else on Y-STRs. BigY, on the other hand, nominally matches people up to about 1500 years ago: the limits are set so that matches must have a maximum of 30 SNPs that separate them in their non-matching variants. Normally, this is the same as the number of SNPs that separate them on the haplotree, but it depends on whether every SNP in the non-matching variants list is sufficiently reliable to be placed on the haplotree, and it depends on whether every SNP is called in both tests.
With there being 26 SNPs that separate you from your R-Y8604 common ancestor, we would expect everyone else in R-Y8604 and any upstream haplogroup (like R-S5245) to be more than 30 SNPs distant from you, which is why you don't match anyone else at R-Y8604. However, the exception to this would be people not tested for all those 26 SNPs. This includes people who have taken the earlier BigY-500 test, which only covers about 2/3 of the modern BigY-700. These BigY-500 tests of your matches likely tested only some of the 24 SNPs that form R-FT201177, perhaps one or two of your private variants, and may only have a few SNPs and private variants of their own that separate them from the R-S5245 ancestor. I've had a look at these three matches: one isn't in the project, one is in the project but has provided us with minimum access, and one definitely has only taken a BigY-500 test.
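The threshold reasoning above can be sketched in a few lines of Python. This is only an illustration, not FTDNA's actual matching algorithm: the function names are made up, the 10 SNPs on the other tester's side are a hypothetical figure, and the model crudely assumes that each separating SNP is called in both tests with the same probability (about 2/3 when one test is a BigY-500).

```python
BIGY_MATCH_LIMIT = 30  # maximum non-matching variants, as quoted in the message


def counted_variants(mine: int, theirs: int, shared_coverage: float = 1.0) -> float:
    """Expected number of separating SNPs that both tests actually call.

    Crude assumption: every SNP has the same chance (shared_coverage)
    of being called in both tests.
    """
    return (mine + theirs) * shared_coverage


def is_bigy_match(mine: int, theirs: int, shared_coverage: float = 1.0) -> bool:
    """True if the counted non-matching variants stay within the limit."""
    return counted_variants(mine, theirs, shared_coverage) <= BIGY_MATCH_LIMIT


# 26 SNPs on Mark's side (from the message); 10 on the other side is hypothetical.
print(is_bigy_match(26, 10))                       # both testers took BigY-700
print(is_bigy_match(26, 10, shared_coverage=2/3))  # other tester took BigY-500
```

Under these assumptions the same pair of testers falls outside the limit when both have full coverage, but inside it when one test only sees about two thirds of the SNPs.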
So you are probably exactly right when you say these matches to you only took the original BigY and thus have fewer non-matching variants: if they upgraded to the modern BigY-700, then they wouldn't be matches to you any more. Best wishes, Iain. |
Re: Understanding Big Y Matches outside my Haplogroup
If your 3 “matches” are STR matches it is because STR matches are fickle due to parallel, back, and rapid mutations. STR matches always include some degree of uncertainty.
On Sun, Nov 19, 2023 at 10:06 AM Mark Winz wrote:
When I took the Y-37 test in 2016 I had no matches at 37 STR or above. Upgraded to 111 markers, no joy. In 2019 a remote cousin appeared at 111 (6 steps) so we took the Big Y 700 test. We match on that test with 6 non-matching variants. The test established our Haplogroup R-FT201177, a little German enclave, in mostly British R-Y8604 under R-S5245. My guess is that there are at least 10 Big Y tests in R-Y8604 but I don't match any of them. |
Understanding Big Y Matches outside my Haplogroup
When I took the Y-37 test in 2016 I had no matches at 37 STR or above. Upgraded to 111 markers, no joy. In 2019 a remote cousin appeared at 111 (6 steps) so we took the Big Y 700 test. We match on that test with 6 non-matching variants. The test established our Haplogroup R-FT201177, a little German enclave, in mostly British R-Y8604 under R-S5245. My guess is that there are at least 10 Big Y tests in R-Y8604 but I don't match any of them.
However, I do match 3 guys in other subclades under R-S5245. Why them? Did they take the original Big Y and so have fewer non-matching variants to count for the map algorithm?
R-U106 > R-Z2265 > R-BY30097 > R-FTT8 > R-Z381 > R-Z301 > R-L48 > R-Z9 > R-Z30 > R-Z27 > R-Z345 > R-Z2 > R-Z7 > R-Z31 > R-Z8 > R-Z1 > R-Z346 > R-DF101 > R-S1726 > R-DF102 > R-FGC12975 > R-S5245 > R-Y8604 > R-FT201177
Mark |
Re: Testing statistics and country-level bias update
Hi Iain, folks,
Thank you for this inventory and your observations, which are as interesting as they are relevant.
I would like to add some maps which, I hope, illustrate your remarks graphically, as well as some of the biases encountered in FTDNA's databases.
Below are two sets of "European" maps relating to R-U106.
1- The first series uses data from FTDNA's Discover tool, by country (updated on 11.18.2023).
2- The second series uses data from FTDNA's SNP Map tool, by region (updated on 11.14.2023 / Y-DNA Haplotree from 11.10.2023). These data correspond to the geo-location coordinates of the EKA. Only testers who have provided the location of their EKA in Europe, or in one of the neighboring countries (regions) displayed on the map, are considered.
For each country / region, two numbers are given: the first is the number of FTDNA R-U106+ testers, and the second is the number of FTDNA testers of all haplogroups combined (e.g. South West England: 322 R-U106+ testers out of a total of 1489). Finally, a percentage is computed.
For each of these two series of maps, you will find:
1- A frequency map of R-U106 (and all of its subclades) by country / region (for countries not subdivided into regions, such as the Republic of Ireland, this time I only use data from SNP Map).
2- A map that I call "distribution". It displays the percentage represented by the number of R-U106+ testers associated with a country/region, out of all the testers represented on the map. I don't apply any correction to it (e.g. South West England: 322 / 4339 = 7.4%).
3- A so-called "corrected distribution" map. To the previous map, I apply a correction factor taking into account the sampling rate of the population of each country/region. The populations considered here are modern populations. The best would actually be to use historical populations, for example from the beginning of the 19th century, or the second half of the 18th century. The lower the sampling rate of the population of a region/country, the greater the margin of error on the extrapolated result. This gives rise to some aberrations such as La Rioja in Spain, the Faroe Islands, certain regions of the Russian Volga, etc. We can also notice that the sampling rate for different regions of the same country is not constant…
[Maps: Discover (Countries)]
[Maps: SNP Map (Regions)]
It could be interesting to follow the evolution of these data/maps over time. Consequently, if this provides interesting and complementary insight, I could update them around November 2024 for comparison.
Cheers,
Ewenn |
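The two uncorrected percentages described in Ewenn's message can be reproduced from the South West England figures he quotes. The sketch below is only a check of that arithmetic; the helper name is made up, and the corrected-distribution map is not reproduced because the message does not give the correction factor.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Figures quoted in the message for South West England.
u106_testers = 322   # R-U106+ testers in the region
all_testers = 1489   # testers of all haplogroups in the region
map_total = 4339     # denominator quoted for the "distribution" map

frequency = percentage(u106_testers, all_testers)   # frequency map value
distribution = percentage(u106_testers, map_total)  # distribution map value
print(frequency, distribution)
```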
Re: The Importance of Colmar 239
Hello Iain,
Your post was so interesting that I took the liberty of posting it to the England GB EIJ project, which is always very active. Already some have asked about re-posting it elsewhere, to which I responded "I rely on Iain previously having said he was happy to have his words posted elsewhere". So it will be interesting to see the comments that flow. There is one comment suggesting that not taking waterways into account is yet another bias.
Kind regards
John |
Re: The Importance of Colmar 239
Thank you Iain for this insightful opinion.
Indeed, FTDNA does not seem to have completely succeeded in eliminating prediction errors due to bias in its databases. An illustrative example of these prediction biases, of which you are no doubt aware, is the branch R-S775 > R-L745 > R-FGC34909 (> R-S781, downstream of R-P312 >> R-L21), well known because of the House of Stuart. The Stuarts, as well as the FitzAlans, descend from the Breton knight Alan fitz Flaad (d. ~1120), seneschal of Dol-de-Bretagne. These two lineages were established in the United Kingdom in the 12th century. Their most distant known ancestor (MDKA) was Alain, dapifer sacrae ecclesiae Dolensis archiepiscopi Dolensis, alive in the 11th century. Therefore, upstream of R-L745, Globetrekker should pass through Brittany (France) for at least a century or more (between R-S775 and R-L745, almost 2500 years have passed!), yet this is not the case... Globetrekker indicates that R-S552 (a descendant of R-L21) would have crossed the Channel around 2600 BCE, and that the entire lineage from R-S552 to R-S781 would have remained in the United Kingdom... As a certain number of Bretons originally came from Great Britain, there is a significant possibility that this lineage actually came from Great Britain, then migrated to Brittany around the 6th century, before returning to settle on the other side of the Channel from the 12th century.
Cheers,
Ewenn |
Re: The Importance of Colmar 239
Thanks, Iain. I have already sent feedback to FTDNA concerning the Globetrekker issue, because it is so unrealistic in its placing anything before S3997 on the coast of Britain. Personally, I think that even S3997 was in Britain as late as 200 BCE - but that's just a hunch.
Ed |
Re: The Importance of Colmar 239
Louise Walsh Throop
If I have the geology of the English Channel correctly, the Channel was formed ca 6000 years before present, so an ancestor of Colmar239 could have crossed the channel area before 6000 ybp or taken a boat across the channel after the 6000-year ago event, and then descendants move south in France into Gaul.......etc. Louise
On Wednesday, November 15, 2023 at 12:00:48 PM PST, ejsteele56@... <ejsteele56@...> wrote:
Analysis of the DNA of ancient remains of a man believed to have been associated with the La Tene culture in Haut-Rhin, France, and known only as Colmar 239, reveals that he was descended from the common ancestor of haplogroup R-A10645. Furthermore, according to the Discover Haplogroups Report, only 42 FTDNA testers currently share that common ancestor of A10645 with Colmar 239.
|
Re: The Importance of Colmar 239
Thank you Iain! This is reminiscent of several discussions and conversations in my neck of the aDNA woods over the last several years. We are fortunate to have these analyzed remains, but they have as many limitations as modern-day testers, or more, in regard to haplogroup or even subclade geographical origins. That is difficult for many to fully embrace, but nonetheless is our reality.
Susan Hedeen
On Nov 16, 2023, at 12:00 PM, Iain via groups.io <gubbins@...> wrote:
|
Re: The Importance of Colmar 239
Hi Ed,
I don't know that there's been a lot of talk specifically about Colmar 239, but if you're basing your origin estimates off Globetrekker, then you're in for a bad time. Globetrekker effectively brings Family Tree DNA up-to-date with the efforts that Rob Spencer, Hunter Provyn and others have been making for a few years now, but no-one really comes close to an accurate solution because the biases in the data still haven't been correctly accounted for, and the ancient DNA (in my opinion) hasn't been given the right weighting in the calculations.
The problem with assigning origins based on ancient DNA is the same as assigning origins based on modern testers: how much does one sample really tell you? The key here is contemporaneity. A sample like Plotiště nad Labem 1, who lived within a few generations of the R-U106 MRCA, says a lot about where R-U106 formed. Colmar 239, living 1000 years or so after the R-A10645 MRCA, says much less about where R-A10645 formed.
We also have to be careful with our nomenclature: an ancient DNA sample may not be tested for all the SNPs in a haplogroup, and therefore may descend from an intermediate haplogroup that is now either untested or extinct; similarly, the sample may be positive for downstream sub-clades, but either not positive for all SNPs, or not tested for any.
If we assume a front speed of between one and a few km/year for human migrations (https://www.pnas.org/doi/10.1073/pnas.1920051117), we can presume that R-U106 was founded within a few hundred km of where Plotiště nad Labem 1 was buried. This puts the origin of R-U106 in the vicinity of Bohemia, although it doesn't allow us to definitively narrow it down to a specific country. If we do the same for Colmar 239, we can say that R-A10645 was founded somewhere within one to a few thousand km of the La Tene culture, i.e., somewhere within Europe. That's not very constraining.
In both cases, we can say that at least some of R-U106 passed through the Bohemian part of the Corded Ware Culture, and that some of R-A10645 passed through the La Tene culture. In all likelihood, proportionally more of R-U106 passed through the Bohemian CWC because there had been much less time for R-U106 to diverge into different regions and cultures by that point. So, not only is Plotiště nad Labem 1 more constraining geographically regarding the origin of R-U106 than Colmar 239 is of R-A10645, but it also says more about which cultures our early R-U106 ancestors belonged to at the time. By contrast, by 600 BC, R-A10645 could have been spread to many different cultures in many different places, and we only sample one.
To put some numbers on this, imagine a population that diffuses out at a constant rate over time. It might cover 1000 square km after 100 years, 4000 square km after 200 years, 9000 square km after 300 years, and 2.5 million square km (1/4 of Europe) after 5000 years, etc. Imagine a tester in a 5000-year-old haplogroup like R-U106 today. We can draw a circle around him of 2.5 million square km and say that there's a 68% chance that the origin of R-U106 lies within that circle. If we have ancient DNA that's from 1000 years after R-U106 split, then we can draw a circle of 100,000 square km around the burial, thus that ancient DNA is 25 times more precise, and thus "worth" 25 modern testers today. For Plotiště nad Labem 1, we are dealing with a sample probably about 100 years or so after R-U106 split, thus the circle is 2500 times smaller, and that ancient DNA is "worth" 2500 modern testers in terms of defining origins. Of course, populations move and spread and haven't expanded at a constant rate, and there are all sorts of other confounding factors, so, actually, each ancient DNA sample is "worth" much more than this, because it strips out all of these systematic trends.
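The arithmetic in this toy model can be reproduced with a short Python sketch. It assumes, as the paragraph does, that the radius of the spread grows linearly with time, so the area grows with the square of the elapsed time; the function names are illustrative only.

```python
def spread_area_km2(years: float, area_after_100_years: float = 1000.0) -> float:
    """Area covered after `years`, assuming the radius grows linearly with time."""
    return area_after_100_years * (years / 100.0) ** 2


def worth_in_modern_testers(sample_years_after_split: float,
                            haplogroup_age_years: float = 5000.0) -> float:
    """Ratio of the modern tester's circle to the ancient sample's circle."""
    return spread_area_km2(haplogroup_age_years) / spread_area_km2(sample_years_after_split)


for years in (100, 200, 300, 1000, 5000):
    print(years, spread_area_km2(years))

print(worth_in_modern_testers(1000))  # sample from 1000 years after the split
print(worth_in_modern_testers(100))   # sample from 100 years after the split
```

The ratio between the two "worth" values is what drives the comparison between the two ancient samples.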
But the comparative "worth" of Plotiště nad Labem 1 is still more than 100 times that of Colmar 239 by this argument, and Colmar 239 on its own probably doesn't say much more about the origins of R-A10645 than the existing 42 FTDNA testers already did.
The take-home point here, then, is that ancient DNA needs to be assigned to a contemporary haplogroup (not a much more ancient one) before it says anything about our ancestors, and it is only samples really close to the TMRCA of the haplogroup that can be so incredibly defining in terms of origins.
Cheers,
Iain. |
The Importance of Colmar 239
Analysis of the DNA of ancient remains of a man believed to have been associated with the La Tene culture in Haut-Rhin, France, and known only as Colmar 239, reveals that he was descended from the common ancestor of haplogroup R-A10645. Furthermore, according to the Discover Haplogroups Report, only 42 FTDNA testers currently share that common ancestor of A10645 with Colmar 239.
|
Re: Best-guess origins of the major R-U106 haplogroups: methods and results
#origins
Hi Roy,
A new post on the FTDNA blog deals with precisely this:
Cheers,
Ewenn
On Mon, 13 Nov 2023 at 15:36, Roy <apeiron@...> wrote:
|
Re: Best-guess origins of the major R-U106 haplogroups: methods and results
#origins
Thanks for the clarification, Iain. I thought you were referring to a split into three major haplogroups of the Germanic speaking people. Makes perfect sense now.
Thanks again for all your work on U106 and for sharing your findings with us! Best Regards, Randy |
Re: Best-guess origins of the major R-U106 haplogroups: methods and results
#origins
Iain can now add Indo-European linguistics to his portfolio of expertise. However, one small point might cause confusion: North Germanic languages only reached the Black Sea through the trading and raiding of mainly Swedish Vikings, who were known as Varangians. Members of these groups formed the state of Kievan Rus and ultimately became Slavic speakers. Centuries before, however, Greuthingi-Goths had penetrated into the Crimean Peninsula. Contrary to what is stated in the Wikipedia article, Gothic is normally classified as an East Germanic language. I expect that when Iain wrote Black Sea he intended Baltic Sea. It is interesting to note that Crimean Gothic survived to the late 18th century in some areas.
-Roy
On 11/13/23 04:34, Iain via groups.io wrote:
|
Re: Best-guess origins of the major R-U106 haplogroups: methods and results
#origins
Hi Randy,
There are plenty of people who will give you a more authoritative answer on this, however I'll give it my best shot.
The Germanic language family and, by extension, the Germanic people, can be split into three branches: Northern, Western and Eastern. The Northern Germanic group roughly coincides with the earliest Germanic people around the shores of the Black Sea and Scandinavia, namely the Norwegian, Swedish, Danish and descendant peoples in the Faeroes, Iceland and Greenland, and parts of Finland. The Western Germanic group comprises modern Germans, but also the countries west of them, including the Dutch, Flemish and English languages. The Eastern Germanic group comprises language families in eastern Europe, which are now extinct but survive in Gothic texts. The Western Germanic group is often sub-divided.
How far back we can place the origins of a distinct Germanic people is debatable. Most references I've seen typically put this definition around 700-500 BC, and the culture didn't fragment into these later branches until the Germanic peoples started expanding, both south-west into Celtic central Europe as the Germanic tribes the Romans knew, and eastwards into what ultimately became the Gothic migrations down towards the Black Sea and into Rome. These migrations and expansions were a gradual process across many centuries, and subsequent migrations (e.g. within Prussia) have erased many differences that there once were.
Part of what we can do with genetic genealogy is use the TMRCA estimates that we create to merge with what is known from an archaeological understanding of this period of history, and help chart the migrations of people that we see evidenced in the archaeology. As you may anticipate, this isn't a straightforward process.
Cheers,
Iain. |
Re: Best-guess origins of the major R-U106 haplogroups: methods and results
#origins
"R-U106>Z381>L48>Z9>Z2>Z7>Z8>Z1>Z346>Z343>CTS5601
Likely MRCA date range: 1100-100 BC
Likely origin: Scandinavia, likely Sweden
Culture: proto- or early Germanic (if early), north Germanic if later
Narrative evidence: R-CTS5601 shows much stronger Scottish (and, to a lesser degree, Irish) themes than R-FGC11784, or much of the rest of R-Z8 as a whole. It remains strong in England, though only typically so compared to R-Z381 as a whole. In continental Europe, it is strong in the Netherlands, and moderately present in Germany and France. In Scandinavia, it is very strong in Sweden and Finland, and present in Norway. Sporadic returns are seen in eastern Germanic branches, but this is very much focussed towards north Germanic areas, with some western Germanic migrations happening later. Consequently, it appears that this predates the major split of the Germanic peoples into their three traditional branches, but has latterly become most associated with the northern Germanic groups." |
Re: Testing statistics and country-level bias update
Hi Mike,
For R-FGC910, see R-Z7 and R-FGC902 on this post: /g/R1b-U106/message/5759
This period of history is very difficult to make predictions about. It's too far back in history for historical sources or modern distributions to accurately reflect origins without complex interpretation, and it's too far forward in history for the known origins of R-U106 to be instructive. Apart from some stand-out cases, we need better modelling in this period before we can be very confident in our results.
Cheers,
Iain. |
Re: Testing statistics and country-level bias update
Hi Iain,
This data is really interesting. I run a small project for the "Gleave" surname, which is quite rare and mainly present in Lancashire and Cheshire. We are downstream of R-FGC910 and, after spending a few years alone on [...], a new Big Y match with the surname "Marchant" joined me on a new haplogroup, R-FTA81892. Are there any indications of the geographical origin of R-FGC910? An analysis of the different branches reveals diverse locations such as Sweden, England, Scotland, Germany. The Globetrekker tool has R-FGC902 moving swiftly back to the continent following R-Z2's arrival, but I would guess this is not accurate.
All the best,
Mike Gleave
On Saturday, 11 November 2023 at 16:10:18 CET, Iain via groups.io <gubbins@...> wrote:
Hi folks, About November time I like to go through the haplotree, looking to see how much different branches and different countries have grown over the previous year. This helps us understand both what the country-level and haplogroup-level biases are, how they change over time, and where we might need to focus our efforts to help get the right people tested. The other reason for analysing these statistics is that it feeds back into the question of origins: where are people of particular haplogroups more or less likely to come from, hence where are specific haplogroups more likely to originate? But that's a question that's going to take me more time to answer than I have right now! Instead, see earlier messages like #5759. Thanks particularly to Ewenn who, late last year, gave me some code to speed up this process hugely. Here, I'll be comparing two time periods: Nov 2022 - Nov 2023, and Nov 2019 to Nov 2023. In general, this year continues the trend of previous years. The British Isles bias continues to get slightly worse, but this is more than counterbalanced by the extra information that new testers are bringing in. Globally, the size of the FTDNA haplotree has increased by 6.6% in the last year, which is a slower rate than the average of the last four years (8.5%). The haplotree contains information from a variety of sources, so we can't infer information about how fast FTDNA's customer base is increasing from this information. This increase has been disproportionately from people who cannot trace their origins back to Europe. The size of the European testing population has grown by only 6.1%. The British Isles bias now stands as follows: testers from the British Isles now make up 45% of European testers in the database, despite the British Isles making up only 8.6% of the comparative modern population, so we over-sample the British Isles by a factor of about 5.2 compared to the rest of Europe. 
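The over-sampling factor quoted above is the ratio of two shares: the British Isles' share of European testers divided by its share of the modern European population. A minimal sketch of that calculation, using only the two percentages given in the message (the function name is illustrative):

```python
def sampling_bias(share_of_testers: float, share_of_population: float) -> float:
    """How many times over-represented a region is relative to its population share."""
    return share_of_testers / share_of_population


# Shares quoted in the message: 45% of European testers, 8.6% of the population.
british_isles_bias = sampling_bias(0.45, 0.086)
print(f"{british_isles_bias:.1f}")
```

A value above 1 means the region is over-sampled and a value below 1 means it is under-sampled, relative to proportional representation.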
It should be noted, however, that modern populations are not always indicative of historical populations. If we instead use population estimates from 1800, when the typical person's earliest-known ancestor lived, we find a British Isles bias of 5.9 instead. Some specific country or region-level modern/historical biases are:
England 2.3/1.6
Scotland 0.39/0.57
Wales 1.59/1.41
N.I. 0.71/1.6
Ireland 0.25/0.84
France 11.7/24.0
Germany 5.5/6.4
Netherlands 8.3/4.9
Poland 6.8/6.0
Czechia 6.5/11.5
Austria 9.5/14.1
Denmark 4.8/3.5
Scandinavia+Finland 1.45/1.16
European former USSR 15.0/16.1
Balkans+Turkey 20.9/13.9
Mediterranean 10.8/12.9
The further down we go in the tree, the faster the increase in testing becomes. This is because we get rid of the customers and studies that have only undertaken limited testing, and are increasingly left with only BigY testers. This is therefore a better estimator of the speed at which branches relevant to us are increasing.
The R-U106 testing population has increased by about 11.5% globally, by 10.9% in people with known European ancestry, and by 10.1% in the British Isles. These are very similar rates to those over the last four years, so the growth is fairly constant. These rates are faster than the growth in R-P312 (7.9% globally), but this may reflect the historical depth of testing rather than any inherent behaviour in R-U106! R-U106 testers make up 8.55% of the haplotree at FTDNA. We've seen above-average increases in Northern Ireland and among the former Eastern Bloc countries, particularly the Czech Republic, but also Poland. We've seen below-average increases in Scotland, Belgium, Norway, Finland and Russia - the latter largely because FTDNA now offer sub-populations within Russia rather than because of geopolitics.
We can step down further to R-Z2265, which represents almost the whole of R-U106, but ignores the ~1240 testers that have only tested as far as R-U106 with single-SNP or SNP-pack tests, leaving mostly BigY testers.
There are 19692 Z2265+ entries globally (an increase of 13.2%) and 8306 in Europe (an increase of 12.1%). This rate is slightly above the average for the last four years (12.4%/11.3%). Given the increase in database size, that shows that the rate of BigY testing is still proportionally increasing. At this rate, the database size doubles about every six years meaning that an average tester will have to wait about six years to receive a match closer to them and a new haplogroup designation (obviously not true for people who have purposefully tested close relatives). Different haplogroups have grown faster than others. The reasons behind this haven't always been clear! For example, testing in R-Z18 and particularly R-L257 have been growing much faster this year (13.1%, 14.2%) than their average in the previous four years (10.2%, 9.6%). Testing in R-Z156 and particularly R-DF96 have slowed (from 13.9% to 10.8% and 12.6% to 10.3%). R-DF98 continues to out-perform other haplogroups in terms of deep testing (22.0% to 17.4%). R-L47 is below average (9.4% to 10.1%) but R-Z9 have increased testing (13.9% to 16.2%). R-FGC910 in particular has increased in size by a sizeable 20.7% on the year, and R-Z343 by 15.8%. Hopefully many of you will have seen these increases among your matches. Cheers, Iain. |
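The "doubles about every six years" remark in Iain's message follows from compound growth at the quoted annual rates. A minimal sketch, assuming the growth rate stays constant from year to year (the function name is illustrative):

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant annual growth rate (0.132 = 13.2%)."""
    return math.log(2) / math.log(1 + annual_growth)


# Global R-Z2265 growth rates quoted in the message.
for label, rate in [("last year", 0.132), ("four-year average", 0.124)]:
    print(f"{label}: {doubling_time_years(rate):.1f} years")
```

Both rates give a doubling time a little under six years, consistent with the figure in the message.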
Re: Projects for Sub-groups of U106
On Nov 11, 2023, at 9:52 AM, Tiger Mike <mwwdna@...> wrote:
|