Request for ancient DNA analysis


 

Hi folks,

I know that some of you have experience dealing with extracting calls from ancient DNA, and possibly more time than me to perform this kind of analysis.

There are a lot of early ancient DNA samples that are typed as R-Z18 in , namely:

CGG107465
CGG106705
CGG106708
CGG106706
CGG105923
NEO752
CGG106744
CGG100212
NEO946

These samples may be descended from the R-Z18 MRCA, or they may represent earlier "pre-R-Z18" individuals that are positive for some of the R-Z18 SNPs and negative for others. They may also belong to sub-clades that have not been tested for recently. However, the earlier samples among these (CGG107465 in particular) could have a bearing on the TMRCA of R-Z18 and its sub-clades, if we can establish that many of the R-Z18 SNPs are positive in these burials.

What I'd really like is for someone to go through these burials and determine which SNPs are positive, from R-U106 down to the early branches of the R-Z18 tree. If someone's up for that, we may be able to make more concrete statements about R-Z18 and its expansion.

Cheers,

Iain.


 

These old Z18 samples are from a pre-print




 

preprint, and as far as I am aware, the raw data has not yet been made available.

Ray



 

For what it's worth, I left a comment on the preprint asking if raw data was available.

mike


 
Hi Iain, all,

Good news: the for these samples are finally available on the European Nucleotide Archive website (many thanks to Jeremy Langton - Preprint v2: ).

I took the opportunity to update my database regarding the SNPs identified as belonging to the R-U106 branch of the Y-DNA Haplotree (Discover updated on April 12, 2025; I haven't updated other data, such as the path to R-U106, which dates from November 2022, or the YFull data, which remains at version 10.08).

So I attempted an analysis of the requested ancient DNAs (plus some others).

  • ?:

Consistent path to R-L151.

R-P312- (2 reads)

R-U106+ (1 quality positive U106 read: BQ37, mapQ60)

R-Z18+:

  • Z18+ 2 positive reads (BQ37, mapQ60)

  • Z16+ 1 positive read (BQ37, mapQ60)

  • Z370+ 4 positive reads (BQ37, mapQ60)

  • No read for YSC0000054, Z368, Z369, Z371, Z8183

  • Z14 (INDEL) status is not checked in my analysis.

Accordingly, CGG107465 can be either R-Z18, or possibly PRE-R-Z18.

  • ?:

Consistent path to R-L151.

R-P312- (1 read)

R-U106+ (1 quality positive U106 read: BQ37, mapQ60)

R-Z381+ (2 reads)

R-S9891:

  • S9891+ 1 positive read (BQ37, mapQ60)

  • FGC13963+ 1 positive read (BQ37, mapQ60)

  • Negative for FGC13950, FGC13961, FT78563, FT79678, PH1367, S24073

  • No read for A14230, BY41245, FGC13942, FGC74400, FT171810, FT77978, FT80239, S10250, S17205, S20758.

Consequently, CGG106838 has a non-zero probability of being PRE-R-S9891. An additional argument is provided by the analysis of sample which would be, according to the study (see ), a first-degree relative of CGG106838 (a father-son kinship):

Consistent path to R-L151. R-P312- (1 read)

R-FGC13959+:

  • FGC13959+ 1 positive read (BQ37, mapQ60)

R-S9891:

  • S9891+ 1 positive read (BQ37, mapQ60)

  • S17205+ 2 reads (BQ11, mapQ 60 / BQ37, mapQ60)
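The difference between a "quality" read (BQ37, mapQ60) and a weaker one (BQ11) can be made explicit with a small filter. This is only an illustrative sketch: the thresholds `MIN_BQ` and `MIN_MAPQ` and the names `Read`, `is_quality` and `count_quality_reads` are assumptions, not the pipeline actually used for this analysis. The read values are the ones listed just above for the relative of CGG106838.

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    snp: str
    base_quality: int     # BQ of the base covering the SNP position
    mapping_quality: int  # mapQ of the read


# Hypothetical thresholds, chosen only for illustration.
MIN_BQ = 30
MIN_MAPQ = 30


def is_quality(read: Read) -> bool:
    """A read counts as 'quality' only if both scores pass their threshold."""
    return read.base_quality >= MIN_BQ and read.mapping_quality >= MIN_MAPQ


def count_quality_reads(reads):
    """Return {snp: (quality_reads, total_reads)}."""
    counts = defaultdict(lambda: [0, 0])
    for read in reads:
        counts[read.snp][1] += 1
        if is_quality(read):
            counts[read.snp][0] += 1
    return {snp: (good, total) for snp, (good, total) in counts.items()}


# Positive reads reported above for the relative of CGG106838.
reads = [
    Read("FGC13959", 37, 60),
    Read("S9891", 37, 60),
    Read("S17205", 11, 60),
    Read("S17205", 37, 60),
]

summary = count_quality_reads(reads)
for snp, (good, total) in summary.items():
    print(f"{snp}: {good} quality read(s) out of {total}")
```

With these assumed thresholds, S17205 keeps one quality read out of two, so the call still stands but rests on a single read.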

In summary:

CGG106838 and CGG106770 would be (combined)

  • positive for the 3 SNPs S9891, S17205, and FGC13963

  • negative for 6 SNPs: FGC13950, FGC13961, FT78563, FT79678, PH1367, S24073

  • Status unknown for the other 9 SNPs.
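The combination of two related samples can be sketched as a simple merge of their per-SNP calls. This is a minimal sketch, assuming the R-S9891 block consists of exactly the 18 SNPs named in the CGG106838 lists above; `BLOCK_SNPS`, `make_calls` and `merge_calls` are hypothetical names introduced for illustration.

```python
from collections import Counter

# The 18 SNPs named above for the R-S9891 block (assumed to be the full block).
BLOCK_SNPS = [
    "S9891", "FGC13963",
    "FGC13950", "FGC13961", "FT78563", "FT79678", "PH1367", "S24073",
    "A14230", "BY41245", "FGC13942", "FGC74400", "FT171810", "FT77978",
    "FT80239", "S10250", "S17205", "S20758",
]


def make_calls(positive=(), negative=()):
    """Build a {snp: '+' or '-'} dict; SNPs without reads are simply absent."""
    calls = {snp: "+" for snp in positive}
    calls.update({snp: "-" for snp in negative})
    return calls


def merge_calls(*samples):
    """Combine calls across samples: '+', '-', '?' (no read) or 'conflict'."""
    merged = {}
    for snp in BLOCK_SNPS:
        seen = {sample[snp] for sample in samples if snp in sample}
        if not seen:
            merged[snp] = "?"
        elif len(seen) == 1:
            merged[snp] = seen.pop()
        else:
            merged[snp] = "conflict"
    return merged


cgg106838 = make_calls(
    positive=["S9891", "FGC13963"],
    negative=["FGC13950", "FGC13961", "FT78563", "FT79678", "PH1367", "S24073"],
)
cgg106770 = make_calls(positive=["S9891", "S17205"])

merged = merge_calls(cgg106838, cgg106770)
counts = Counter(merged.values())
print(counts["+"], "positive,", counts["-"], "negative,", counts["?"], "unknown")
```

Flagging disagreements as `conflict` instead of picking a winner is a deliberate choice here: for a father-son pair, a contradictory call would point to a bad read or a private mutation and deserves a manual look.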

  • ?:

    These two samples would relate to a single individual, according to the study (see ).

Consistent path to R-L151.

R-P312- (1 read)

R-U106+ 4 positive U106 reads

R-Z2265+ 1 positive read

R-Z18+

  • Z18+ 2 positive reads (BQ37, mapQ60)

  • Z16+ 2 positive reads

  • Z369+ 1 positive read

  • Z370+ 3 positive reads

  • Z371+ 5 positive reads

R-CTS12023:

  • CTS3624+ 3 quality positive reads (BQ37, mapQ60)

  • Negative for 14 SNPs of this block

  • Unknown status for the last 11 SNPs.

CGG106705 may be PRE-R-CTS12023.

  • (with his supposed brother CGG106707):

Consistent path to R-L151.

R-P312- (3 reads)

R-U106: no read

R-Z18:

  • Z18+ 4 positive reads

  • Z368+ 2 positive reads

  • Z370+ 1 positive quality read (BQ37, mapQ60)

R-CTS12023:

  • A19698+ 1 positive quality read (BQ37, mapQ60)

  • CTS3624+ 2 positive reads

  • 1 mixed read (DF95)

  • Negative for 10 SNPs of this block

  • Unknown status for 13 SNPs.

CGG106708 (and CGG106707) might be PRE-R-CTS12023.

  • ?:

Consistent path to R-L151.

R-P312- (3 negative P312 reads, but 1 positive read for CTS12684 at the start position of the DNA segment)

R-U106+ (1 positive quality read)

R-Z18+

  • Z18+ 2 positive reads

  • Z16+ 2 positive reads

  • Z369+ 1 positive read

  • Z370+ 2 positive reads

  • Z371+ 1 positive read

  • R-FGC5817+ 1 positive FGC5817 read

R-BY66533:

  • BY54993+ 1 positive quality read

  • BY55557+ 5 positive reads

  • Negative for 27 SNPs of this block

  • Unknown status for 7 SNPs

CGG105923 might be PRE-R-BY66533.

  • ?:

The status of this sample is less clear (see the analysis file). The path to R-L151 ends at R-P310 (with 1 positive read for YSC0000082, but mapQ0).

R-P312- (1 negative read for CTS12684)

Nothing between R-P310 and R-U106.

Only one positive read for Z371 (R-Z18).

1 read positive for ZP156 (R-ZP156, but mapQ0, located at the last nucleotide of the DNA segment).

CGG106744 might therefore possibly be R-ZP156 (or PRE-R-ZP156).

  • ?:

Consistent path to R-L151.

R-P312- (4 reads)

R-U106+ 3 positive U106 reads

R-Z18:

  • Z16+ 1 positive read

  • Z369+ 3 positive reads

  • Z370+ 1 positive read

  • Z371+ 2 positive reads

NEO752 may be R-Z18 (or PRE-R-Z18). FTDNA comes to the same conclusion (Mades? 752).

  • NEO946:

FTDNA classifies NEO946 as R-L151 (Hove ? 946).

  • ?:

Consistent path to R-L151.

Unknown status for R-P312.

R-U106+ 1 read

R-Z17:

  • Z17+ 1 positive quality read

R-S5970:

  • ZP160+ 1 positive quality read

CGG106724 might possibly be R-S5970+ (or PRE-R-S5970+).

Obviously, these results should be taken with a grain of salt. FTDNA would be in a position to compare these ancient samples with the modern testers in their database, as well as with one another.

Cheers,

Ewenn


 

Thanks Ewenn,

This is really helpful, as usual. It doesn't change my main takeaway from this dataset, which is that the initial R-Z18 thrust appears to have been part of the Bell Beaker group's movement into Denmark around 2300 BC. However, what it does do (via the R-CTS12023 and R-BY66533 reads) is firmly establish the TMRCA of the modern definition of R-Z18 as being before CGG106708 (2125-1947 BC). This dramatically limits the younger end of FTDNA's 2931-1727 BC estimate, and it also constrains my most recent estimate of 2623-1897 BC. I'll have to recompute some numbers...!
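As a rough illustration of how a dated burial limits the younger end of a TMRCA range, the constraint can be written as a hard cutoff. This is a deliberately simplistic sketch and not the method behind either estimate: a proper recomputation would combine probability distributions. It assumes the TMRCA must be at least as old as the youngest possible date of the burial, and `clip_younger_bound` is a hypothetical helper.

```python
from typing import Tuple

# (older bound, younger bound), both in years BC, so larger numbers are older.
DateRange = Tuple[int, int]


def clip_younger_bound(estimate: DateRange, burial: DateRange) -> DateRange:
    """Truncate a TMRCA range so that it cannot be younger than the burial.

    Assumption: the youngest possible burial date is used as the cutoff,
    which is the most conservative choice.
    """
    older, younger = estimate
    _, burial_younger = burial
    if older < burial_younger:
        raise ValueError("the whole estimate is younger than the burial")
    return (older, max(younger, burial_younger))


cgg106708 = (2125, 1947)

for label, estimate in [("FTDNA", (2931, 1727)), ("Iain", (2623, 1897))]:
    older, younger = clip_younger_bound(estimate, cgg106708)
    print(f"{label}: {estimate[0]}-{estimate[1]} BC becomes {older}-{younger} BC")
```

Under this assumption the FTDNA range loses about 220 years at its younger end and the 2623-1897 BC range loses 50.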

Cheers,

Iain.


 

Thanks Ewenn!

For my R-CTS12023/R-DF95 peeps, it looks like Family Tree DNA's block tree has recognized CTS3624 and A19698. Last I looked, they have broken up the long string of CTS12023 SNPs, pulling CTS3624 and A19698 out as our first branch away from the base of R-Z18.

That is good progress because that big unbroken block represented a lot of time and some big unknowns. Putting a pin in a place and time before the migration period is excellent.

mike


 

Hi Mike,

You're welcome.

Thanks for this feedback on FTDNA's modification of the R-CTS12023 block. I checked if the same was true for R-BY66533 (R-Z18>R-FGC5817>). This block was also recently modified with the formation of a new intermediate haplogroup upstream, consisting of BY54993, BY55557, and two new SNPs beginning with FTH...

I deduce that FTDNA most likely analyzed the FASTQ files associated with this study. We'll see in future Discover updates (which will probably include verified haplogroups for the other samples).

Ewenn