Monday, May 4, 2015

Ethnicity Chromosome Mapping & Determining "Ethnicity" of shared DNA segments between related individuals

Source: http://www.differencebetween.info/difference-between-ethnicity-and-culture
Savvy genealogists use autosomal DNA tests to explore their genetic ancestry, along with genetic tools (chromosome paintings and browsers, triangulation) that help them learn more about their relationships with genetic relatives by exploring the specific DNA markers or segments they share. According to ISOGG, commonly used methods for this are the "complementary" techniques known as chromosome mapping ("determining which DNA segments came from which ancestor") and triangulation ("comparing matching DNA segments to determine which ancestor donated which particular segment"). However, we also know that our genetic ancestry (colloquially known as "ethnicity") is more than just shared DNA segments. We often descend from numerous ancestries and thus have inherited DNA contributions from multiple biogeographical populations, often challenging our preconceived perceptions and assumptions about our genetic inheritance. For example, "white" Americans with small amounts of "Sub Saharan African" DNA are likely to share only "European" admixture with their "black" American genetic relatives. People with multiple ancestries (e.g. Latinos, South Africans) or similar ancestries (e.g. Bulgarians) may share ethnic components contrary to what they might expect (e.g. Latinos sharing "Ashkenazi Jewish" DNA instead of Native American DNA, or Bulgarians sharing "South Asian" DNA segments due to Romani introgression). What's more, adoptees and those with an unknown parent may not know anything about their ancestry. Therefore it's reasonable to assume that knowing the "ethnicity" of shared DNA segments between related individuals is an important consideration when doing genealogical research. Simply put, ethnicity matters!
But how do we determine the "ETHNICITY" of these shared DNA segments? In this deep dive my objective is to discuss an underutilized (it's not new) method -- I'll coin this process Ethnicity Chromosome Mapping* (ECM) -- that can be used in conjunction with chromosome browsing, mapping and triangulation to determine the ethnicity of shared DNA segments as outlined here:

SECTION I.  Instructions (4 steps) for using ECM to find "Ethnicity" of shared DNA segments
-- STEP 1. Identify location and size of potential shared DNA segments using chromosome browsing and mapping tools (CBaMt)
-- STEP 2. Identify potential "ethnicity" of shared DNA segments using CBaMt
-- STEP 3. Find “START POINTS” AND “END POINTS” of shared DNA segments using CBaMt
-- STEP 4. Confirm “ethnicity” of shared DNA segments using Gedmatch's "Paint the Difference Between 2 Kits, 1 chromosome" tool
SECTION II. ECM Pitfalls and Technical Notes
-- (a) What happens when the "ethnicity" of shared IBD DNA segment does NOT match?
-- (b) Technical notes about Chromosome Paintings
-- (c) Technical notes about Gedmatch's "Paint the Differences..." tool

WHAT YOU WILL NEED:
Your results and access to 23andMe, FamilyTreeDNA, AncestryDNA and the third-party site Gedmatch.com. For optimal results, ECM works best under these conditions: 
*Disclaimer: The ECM methods presented herein are experimental. The term "ethnicity" has no legal meaning and is subject to each DNA company's interpretation, so be careful about drawing conclusions. The term "Ethnicity Chromosome Mapping" is of my own invention; it has not been endorsed by genetic genealogy organizations, nor has it been adopted into the genetic genealogy lexicon. You may also contact me with any comments or questions here: KingGenomebyTLDixon@gmail.com.
SECTION I. 
Instructions (4 steps) for using ECM to find "ETHNICITY" of shared DNA Segments 
STEP 1. Identifying location and size of potential shared DNA segments 
(a) Firstly, I would like to introduce you to my sibling JR; my cousin ID; and my new genetic matches CF and sibling RF. I've invited them to help me demonstrate ECM, and you'll see them again as we go along. To initiate ECM whenever you get a genetic match (after taking an autosomal DNA test) or a newly tested relative's results come in, you need to learn some specifics about the shared DNA segments. These include SIZE (aka Genetic Distance, the length of the DNA segment in centimorgans); LOCATION on the chromosomes (aka Chromosome mapping; see Kitty Cooper's tool here); and IN COMMON WITH, meaning relatives who share the same DNA segments [aka Triangulation; see various methods @ Kitty Cooper's chromosome mapper tool, Kelly Wheaton's Lesson 11, & Blaine Bettinger's Visual Phasing]. For ECM these tasks are easily achieved using your DNA company's in-house tools and third-party sites with chromosome browsing, chromosome mapping and triangulation capabilities, such as:
  • 23andMe's Family Inheritance: Advanced (FIA) & DNA Relatives Triangulation tool (note: Countries of Ancestry tool is now defunct);
  • FTDNA's Family Finder - Chromosome Browser;
  • Gedmatch.com One to One tool (for AncestryDNA, 23andMe, FTDNA) 
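The "location" and "in common with" ideas above boil down to interval overlap. Here is a minimal sketch of that logic, not any company's actual method. The `Segment` type and function names are my own for illustration. Only the chromosome #8 range 118000000 to 123000000 comes from this blog; the other numbers are made up.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # Start point, in base pairs
    end: int    # End point, in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the stretch two segments have in common, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulate(me_vs_a: Segment, me_vs_b: Segment, a_vs_b: Segment) -> Optional[Segment]:
    """A region triangulates only if all three pairwise matches cover it."""
    first = overlap(me_vs_a, me_vs_b)
    return overlap(first, a_vs_b) if first else None


# Illustrative values only (hypothetical except the 118M-123M range).
me_vs_cf = Segment("8", 118_000_000, 123_000_000)
me_vs_jr = Segment("8", 1_000_000, 140_000_000)
jr_vs_cf = Segment("8", 118_500_000, 123_000_000)

print(triangulate(me_vs_cf, me_vs_jr, jr_vs_cf))
```

Overlap on paper is only a hint. The relatives still have to match each other on that stretch, which is why the third comparison is part of the check.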
For this first step, I will now re-introduce my close cousin ID. We share a recent common ancestor with Amerindian heritage and a signature Native American maternal haplogroup assignment. I wanted to know whether cousin ID and I inherited any "Native American" DNA from our recent common ancestor, and whether we share any of these "Native American" DNA segments. I know very little and am hoping this sort of information can help feed my lusty abandon to discover the source of our Native American heritage (which is relatively distant based on our current ethnicity admixture percentages). So let's see where cousin ID and I share DNA segments using the aforementioned tools. ...
23andme.com FIA tool comparing Me and ID. With FIA tool, you can compare your "genome" with up to five of your DNA relatives (names are interchangeable). FIA displays the "size" and "location" of the shared DNA segment, which can be viewed as a PLOT (shown below) or a TABLE (discussed in Step 2):

The illustration (below) is from the same FIA tool-Plot View as shown above but full display [showing all chromosome bars] is cropped from larger image. This shows DNA segments or HIRs shared between me and ID on chromosome #10. We share three separate DNA segments, but for this blog only the second (from left) and third segments will be considered:

23andMe CoA tool (below) is another way to discover where DNA segments may be shared with your genetic relatives. The CoA tool is driven by your genetic relatives filling out an Ancestry Survey listing where their four grandparents were born. Your matches are then plotted on vertical chromosome display showing the "size" (in centimorgans) of shared DNA segment and known nationality of grandparents, which in turn may help you make inferences about ethnicity of the DNA shared with them. 
I will now re-introduce my new relative CF. I learned that CF's maternal haplogroup is M32c, an "Austronesian" marker found frequently in populations from Madagascar. Since I recently discovered that my mother's paternal side has ancestry from this unique biogeographical region, I'm very interested in genetic relatives with whom I could potentially share DNA. Here is my CoA showing where my new cousin CF shares a DNA segment with me on chromosome #8:


FTDNA Family Finder - Chromosome Browser (below) is another tool that allows you to see where you share DNA segments with up to five genetic matches at a time. Below I highlight the DNA segments shared between me and a third cousin (not discussed in this blog). As you can see, we share four IBD DNA segments on chromosomes #2, #7, #21, and #23 (X chromosome):

Step 2. Identifying potential ethnicity of shared DNA segments.
 The second step (usually used co-extensively with Step 1) involves identifying the potential ethnicities of these shared DNA segments between you and a genetic match. This is best achieved with a CHROMOSOME PAINTING, which is a visual display tool used by DNA companies to show your ethnic components plotted on bars representing your 23 chromosome pairs (usually excluding Y-chromosome in males), with ethnicity categories represented by different colors. [Please be sure to read "Section II, Part b. Technical Notes...."]. Chromosome paintings are provided by 23andme.com ("Ancestry Composition-Chromosome View"); Gedmatch.com ("Chromosome Painting - Reduced Size"); Tribecode.com (Ethnicity Estimate -Chromosome View); and Dr. Doug McDonald's former Biogeographical Analysis program. But what about AncestryDNA or FTDNA? No fretting because you can upload your raw DNA data from AncestryDNA and FTDNA to Gedmatch.com and then use the tool "Chromosome Painting--Reduced Size" to provide them. Here are some examples of CHROMOSOME PAINTINGS:
My 23andme.com Ancestry Composition-Chromosome View (below):
For AncestryDNA, here are instructions for obtaining a Chromosome Painting from Gedmatch.com (follow the instructions on the screenshots).
On your Gedmatch.com homepage (below), click on the link "Admixture (heritage)":

Second (below), select project calculator (and for this blog we will use DODECAD). Next check circle "Chromosome Painting - Reduced Size" and then press "continue":
Once you're on the next page (below), enter your kit number. IMPORTANT: Please select "World9" (which is Dodecad World9). Then press continue. 
After you've made the proper selection, your  Chromosome Painting will be produced. Here (below) is the painting for my AncestryDNA results:

  
At this point you should make it a habit to familiarize yourself with your chromosome painting(s), especially where the different DNA segments are located and which "ethnicity" is assigned to each of them. Then, when you get new DNA matches, you may be able to make preliminary predictions about the location of shared DNA segments and the ethnicity assigned to them. For example, cousin ID and I share Native American, European and African on chromosome #10 (but remember that only the 2nd and 3rd segments are discussed in this blog). When I initially surveyed our respective chromosome paintings, I also noticed we appeared to have identical Native American and European segments in the same location. (Here's a fun trick: at 23andMe, if you toggle between your chromosome painting and your genetic relative's, alleged shared DNA segments appear to remain in the same place on each painting. The fixation can be due to the genotyping chip.) Before we go to Step 3, I'm going to show you how to make a preliminary determination about the "ethnicity" assigned to shared DNA segments:

On my 23andMe Ancestry Composition-CHROMOSOME VIEW (below), notice on chromosome #10 (bottom bar) about half-way from left, there is a long Native American segment (red), followed by small European one (blue), then another smaller NA segment and finally another European segment; I believe we share the long NA segment, the adjacent European segment as well as the European segment on the far right:
Here (below) is a close-up of my chromosome #10:
Now here is ID's 23andme chromosome painting (below). Notice on ID's chromosome #10 (bottom bar) that ID also has (from left) a long Native American segment (red); then a smaller European segment (blue), followed by a small Sub Saharan African (purple) segment, and finally a European segment. It appears that we share the long Native American segment, the adjacent smaller European one, and the European segment on the far right:
Here (below) is a close up of ID's chromosome #10:
For comparison, here is my Gedmatch.com chromosome painting; notice the similar locations of Amerindian (red) and Atlantic-Baltic (yellow) segments on chromosome #10:

And this is ID's Gedmatch.com chromosome painting; notice that the Amerindian (red) and Atlantic-Baltic (yellow) segments on chromosome #10 appear to be located in the same area as mine:

I will now re-introduce my sibling JR, my new genetic match CF and her sibling RF. We're all predicted to share a DNA segment on chromosome #8. You should also know that CF and sibling RF have a Madagascar-specific maternal haplogroup assignment, M32c. Since I've discovered my maternal branch has ancestry from this unique population (known as Malagasy, whose ethnicity admixture is roughly 50% Southeast African Bantu + 50% SOUTHEAST ASIAN), it is possible we might share this heritage with CF and sibling RF (not shown). Here is 23andMe's FIA showing the location of the shared DNA segment:
Looking at my 23andMe (below) chromosome painting, you can see there is a Southeast Asian segment (yellow) on chromosome #8. If this potential Southeast Asian segment is shared with me, my sibling JR and CF then it is possible this segment comes from a common Malagasy ancestor:
Notice on my sibling JR's 23andme chromosome painting (below), the Southeast Asian segment (yellow) appears to be in the same location as mine:

Finally notice on CF's 23andMe chromosome painting (below), there is also a Southeast Asian segment (yellow) on her chromosome #8, which appears to be in the same general location as me and my sibling JR. (Of course there is a possibility the shared DNA segment is "Sub Saharan African" too):
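The eyeball comparison in Step 2 can be sketched as a small overlap search. The sketch assumes you have transcribed one bar of each painting by hand into (start, end, label) tuples. All positions below are hypothetical stand-ins for chromosome #10, and `same_label_regions` is a name of my own. It only flags candidates; it does not prove the DNA is shared.

```python
from typing import List, Tuple

Painted = Tuple[int, int, str]  # (start_bp, end_bp, ancestry label)


def same_label_regions(mine: List[Painted], theirs: List[Painted]) -> List[Painted]:
    """Find stretches where both paintings show the same label."""
    hits = []
    for s1, e1, label1 in mine:
        for s2, e2, label2 in theirs:
            start, end = max(s1, s2), min(e1, e2)
            if label1 == label2 and start < end:
                hits.append((start, end, label1))
    return sorted(hits)


# Hypothetical transcriptions of two chromosome #10 bars.
mine = [
    (70_000_000, 105_000_000, "Native American"),
    (105_000_000, 112_000_000, "European"),
    (112_000_000, 118_000_000, "Native American"),
    (118_000_000, 135_000_000, "European"),
]
theirs = [
    (72_000_000, 104_000_000, "Native American"),
    (104_000_000, 111_000_000, "European"),
    (111_000_000, 117_000_000, "Sub Saharan African"),
    (117_000_000, 135_000_000, "European"),
]

for start, end, label in same_label_regions(mine, theirs):
    print(label, start, end)
```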

STEP 3. FINDING START POINTS AND END POINTS OF SHARED IBD DNA SEGMENTS
Of course you can't just willy-nilly assume that you share an ethnic component just because the DNA segment appears to be in the same location. You also have to verify the location of the shared DNA segment between you and your genetic matches. In this Step 3, I'm going to show you how to do that. It turns out you can use chromosome browsing and mapping tools to help find the location and "size" of shared DNA segments -- the boundaries are known as Start points/locations and End points/locations. This is easily achieved using 23andMe's FIA-TABLE VIEW; FTDNA's Family Finder - Chromosome Browser (we discussed these earlier) as well as Gedmatch's "One to One" comparison tool. (NOTE: Although not covered in this blog, you can also find segment location information by downloading your CSV files from 23andme and FTDNA). 
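If you do download a segment file, pulling out Start points and End points takes only a few lines. This is a sketch under loud assumptions: the column names (`match`, `chromosome`, `start`, `end`, `cm`, `snps`) are hypothetical, and real exports from each company use their own headers, so adjust accordingly. The cM and SNP values in the sample are invented; only the start/end positions echo the segments discussed in this blog.

```python
import csv
import io

# Hypothetical export layout; real files use company-specific headers.
SAMPLE_CSV = """match,chromosome,start,end,cm,snps
ID,10,125000000,132000000,9.0,1500
CF,8,118000000,123000000,7.5,1250
"""


def segments_on(csv_text, match, chromosome):
    """Return (start, end) pairs for one match on one chromosome."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (int(row["start"]), int(row["end"]))
        for row in rows
        if row["match"] == match and row["chromosome"] == chromosome
    ]


print(segments_on(SAMPLE_CSV, "CF", "8"))
```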
So let's go back to me and my cousin ID. Here is my 23andme.com FIA-Table View (below) showing the Start points and End points for the DNA segments we share on chromosome #10. As before, we only focus on the second and third matching segments. The second segment (the proposed matching Native American+European segment) runs from approximately 73000000 to 110000000, and the third segment (the European one) from approximately 125000000 to 132000000:


Here (below) is FTDNA's Family Finder - chromosome browser showing another of my cousin's two shared DNA segments with me. Notice on chromosome #7 there is a box showing the Start location and End location of the shared DNA segment. (You hover your cursor over the DNA segment and the box will open):

Next here is my 23andMe FIA- TABLE VIEW (below) showing new cousin CF compared to me and my sibling JR. [NOTE: 23andMe reports the shared segment as 5cM-6cM but Gedmatch shows 7cM to 8cM. Since there are more than 1200 SNPs (well above 700 SNP threshold) for the segment AND  it is shared with my sibling JR and new cousins CF and RF (not shown), I'm confident the shared DNA segment is Identical by Descent (and NOT Identical by State)]:
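The rule of thumb in the note above can be written as a tiny check. The 700 SNP threshold is the one mentioned in this blog; the 7 cM default is my assumption for illustration, so change it to whatever threshold your tool uses. `looks_ibd` is a hypothetical helper, not a feature of any site.

```python
def looks_ibd(cm: float, snps: int, min_cm: float = 7.0, min_snps: int = 700) -> bool:
    """Crude screen: long enough in cM AND backed by enough SNPs."""
    return cm >= min_cm and snps >= min_snps


# The same segment can pass or fail depending on which tool's cM figure you use.
print(looks_ibd(7.5, 1200))  # a Gedmatch-style reading of 7-8 cM
print(looks_ibd(5.5, 1200))  # a 23andMe-style reading of 5-6 cM
```

A segment that fails the size screen under one tool's measurement is not automatically false. In my case the extra evidence was that my sibling and both new cousins share the segment as well.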


Another great way to find the Start points and End points of shared DNA segments is by using Gedmatch.com's "One-to-One" tool. Here are the instructions below:
First on your Gedmatch.com home page (below), click on link "One-to-one compare":
On the next page (below), enter your kit number in the first field and your genetic match's kit number in the second field. Then select the circle "Yes" ... [NOTE: In the graph below, I say to put "1cM" in the "minimum segment cM size to be included" field, but PLEASE LEAVE THIS FIELD BLANK (at the default value) to ensure that you're comparing yourself to Identical by Descent genetic relatives.] ... Finally press the "Submit" button:
Here (below) is the Gedmatch One-to-one (cropped) for me and ID:

Here (below) is the Gedmatch One-to-one (cropped) for me and my sibling JR:

Here (below) is the Gedmatch One-to-one (cropped) for CF and sibling RF:

Here (below) is the Gedmatch's One-to-one for me and new cousin CF:

Here (below) is Gedmatch One-to-one showing me compared to CF's sibling RF:

Here (below) is the Gedmatch One-to-one showing my sibling JR compared to new genetic cousin CF:

Here (below) is the Gedmatch One-to-one showing my sibling JR compared to new genetic cousin RF:
As you can see, I now have the Start Points and End Points for my respective shared DNA segments. [Again, did you notice that Gedmatch's One-to-one tool showed larger DNA segments (7cM/8cM) than 23andme's FIA (5cM-6cM)?] Now we're ready for the final Step 4 -- CONFIRMATION OF ETHNICITY!

Step 4. Confirm ETHNICITY of shared DNA segments using Gedmatch's "Paint the differences between 2 kits, 1 chromosome" tool
Finally, to complete the ECM process, you have to confirm the "ethnicity" of shared DNA segments as bounded by their Start points and End points. The best way to achieve this is by using Gedmatch's excellent tool, "Paint the differences between 2 kits, 1 chromosome." [Please read "Section II, Part c. Technical Notes..."]. I recommend using DODECAD WORLD9 since you can easily distinguish the colors representing the ethnic components on this tool's display. Once you enter your information, three bars will appear -- one for you, one for your match and one showing the ethnic components shared between you and your genetic match. This tool also shows the Start points and End points, which are located across the bottom of each chromosome bar. YOU ONLY SHARE THE DNA/ETHNIC COMPONENTS LOCATED BETWEEN THE APPROXIMATE START POINT AND END POINT OF SHARED DNA SEGMENTS. (NOTE: the start and end points are represented by "M" followed by numbers at the bottom of the three bars, sort of like a ruler. Also, admixture on the third bar between start points and end points that is NOT shared is indicated by the color black.) Here are instructions for using Gedmatch's "Paint the differences between 2 kits, 1 chromosome" tool:
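The rule in capitals above amounts to clipping each painting to the shared segment's Start point and End point and then comparing labels. The sketch below assumes hand-transcribed (start, end, label) tuples. The two paintings are hypothetical, the function names are mine, and only the 118000000 to 123000000 window comes from this blog. Gedmatch does this comparison for you visually.

```python
def labels_between(painting, start, end):
    """Return {label: base pairs painted} inside the window [start, end]."""
    totals = {}
    for seg_start, seg_end, label in painting:
        lo, hi = max(seg_start, start), min(seg_end, end)
        if lo < hi:
            totals[label] = totals.get(label, 0) + (hi - lo)
    return totals


def shared_labels(painting1, painting2, start, end):
    """Labels that BOTH paintings show somewhere inside the window."""
    first = labels_between(painting1, start, end)
    second = labels_between(painting2, start, end)
    return sorted(set(first) & set(second))


# Hypothetical chromosome #8 paintings for two matches.
mine = [(100_000_000, 117_000_000, "Amerindian"), (117_000_000, 125_000_000, "East Asian")]
match = [(110_000_000, 119_000_000, "South Asian"), (119_000_000, 124_000_000, "East Asian")]

print(labels_between(match, 118_000_000, 123_000_000))
print(shared_labels(mine, match, 118_000_000, 123_000_000))
```

An empty result from `shared_labels` is the situation described in Section II, Part a: the segment is shared but the labels disagree.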
First on your Gedmatch homepage (below) click on the link "Admixture (heritage)":
Second, select project "DODECAD", click the circle "Paint the differences between 2 kits, 1 chromosome," and then click on "Submit"
This (below) is how your Gedmatch page will look if you've made the right selections:
Once you're on the next page (below): 
(a) enter your kit number in "Enter Kit1" field;
(b) enter your genetic match's kit number in "Enter Kit2" field; 
(c) select calculator model; IMPORTANT: CHOOSE "WORLD9"
(d) enter the specific chromosome you share with your genetic match (for me and ID it is chromosome #10; and for me, my sibling JR vs. new cousin CF and sibling RF it is chromosome #8); and
(e) click "Submit"
This is how the page looks once you've made the correct selections. 

Here is what the Gedmatch "Paint the differences between 2 kits, 1 chromosome" display shows:
^^^^ Dodecad World9 Admixture legend
So let's go back to me and my cousin ID. Recall from my 23andme.com FIA-Table View the Start points and End points for the DNA segments we share on chromosome #10 (and as before, we only focus on chromosome #10's second and third matching segments). The second segment (the proposed matching Native American+European segment) runs from Start point 73000000 to End point 110000000. As you can see, the shared "ethnicity" is predicted to be both "Amerindian" (red) and "Atlantic-Baltic" (yellow), as seen in the image below:
Here (below) is continuation of previous image showing the third segment shared with me and my cousin ID on chromosome #10 @ Start point 125000000 to End point 132000000. The shared DNA component is "Atlantic-Baltic" (yellow) and "Southern Europe" (light green). However we don't share a segment that is Native American for me and African for him; the fact that we share African there is purely coincidental:

Next I turn to the shared DNA segment between me, my sibling JR and new cousins CF and RF on chromosome #8, where I suspected the ethnicity of the segment is "Southeast Asian" and possibly indicative of Malagasy ancestry. I will compare me and my sibling JR first (below). If you refer back to the chromosome browser tools mentioned earlier, you see that my sibling JR and I share most of chromosome #8. Notice we share both Amerindian (red) and East Asian (orange) here. However, if we apply the Start point (118000000) and End point (123000000) of the segment also shared with CF and RF, then we specifically share an East Asian (orange) DNA segment at that location:

Next is the Gedmatch "Paint the difference..." comparison (below) between new genetic cousin CF and her sibling RF on chromosome #8. If you refer back to the chromosome browser tools from earlier, CF and RF also share most of chromosome #8. However, if we apply the Start point (118000000) and End point (123000000) of the segment they also share with me and my sibling JR, then they specifically share an East Asian (orange) DNA segment at that location (Note: RF's "South Asian" (blue) is most likely East Asian related):


Now let's do siblings vs. cousins comparison of the shared DNA segment on chromosome #8.  
Gedmatch "Paint the difference..." comparison (below) between me (TL Dixon) and new cousin CF on chromosome #8. If we apply the Start point (118000000) and End point (123000000) of the segment also shared with my sibling JR and CF's sibling RF, then we specifically share an East Asian (orange) DNA segment at that location:

Gedmatch "Paint the difference..." comparison (below) between my sibling JR and new cousin CF on chromosome #8. If we apply the Start point (118000000) and End point (123000000) of the segment also shared with me and CF's sibling RF, then they specifically share an East Asian (orange) DNA segment at that location:

This is the Gedmatch "Paint the difference..." comparison (below) between me (TL Dixon) and new cousin RF on chromosome #8. If we apply the Start point (118000000) and End point (123000000) of the segment also shared with my sibling JR and RF's sibling CF, then we specifically share an East Asian (orange) DNA segment at that location:

Gedmatch "Paint the difference..." comparison (below) between my sibling JR and new cousin RF on chromosome #8. If we apply the Start point (118000000) and End point (123000000) of the segment also shared with me and RF's sibling CF, then they specifically share an East Asian (orange) DNA segment at that location:

Thus, as you can see, I've demonstrated how to use ETHNICITY CHROMOSOME MAPPING to both determine and confirm that (a) me and my known cousin ID share a Native American segment and two European ones on chromosome #10, which allows me to narrow down which branch of the family it comes from; and (b) me, my sibling JR, new relative CF and sibling RF all share (South)East Asian on chromosome #8, which may be indication of a shared Malagasy ancestor. 
SECTION II. 
ECM PITFALLS & TECHNICAL NOTES
(a) What happens when the ethnicity of a shared IBD DNA segment does not match?
First, "ethnicity" classification can be controversial and rather arbitrary because it's largely based upon social constructions created by society, population geneticists and DNA companies "assigning" ethnicity admixture results via autosomal DNA tests, the latter of which we will discuss here. I will highlight a major pitfall when using ECM to infer the ethnicity of a DNA segment shared between two genetically related individuals, one that seems to particularly impact people with multi-ethnic backgrounds, where racial lines tend to blur. Using ECM, I was comparing my cousin with his new genetic relative when I discovered that the ethnicity of their shared DNA segment did NOT seem to match! Is it possible for the algorithm to assign you one ethnicity and your match another one on a shared DNA segment? Turns out, YES!  
DNA companies' ethnicity admixture tests are of their own invention. As such, definitions of ethnicity may vary based upon each DNA company's proprietary formulas and algorithms for defining reference populations and assigning ethnicity. Basically, DNA companies create reference populations and assign biogeographical, ethnic or nationality labels to them. Then your DNA is compared to those reference populations. The labeling can be somewhat of a misnomer because the reference populations assigned an ethnicity label may be admixed with other populations. For example, a reference sample named "North African" may show a wide range of ethnic contributions from the Middle East, South Europe and Africa proper. This also means that when DNA companies add more reference populations or decide to re-assign ethnicity labels, you may end up with a lot of confusing, convoluted or conflicting admixture results. Also, the way most DNA companies aggregate and assign your ethnic components can be problematic, too. In other words, the algorithm will make a gamble call (which may be wrong) if your markers at that location cluster closely with two different populations. For more information on how the most popular personal genome companies assign reference populations and DNA, please see 23andMe's Aggregating and Assigning, FTDNA's White Paper & Methodology, and AncestryDNA's White Paper.
Here is an example (below) with my cousin Douglas, an African-American, and his match Christopher, who is mostly of Euro descent. Using 23andMe's Family Inheritance: Advanced (FIA) tool, we see that they share an IBD DNA segment of 14 cM on the X chromosome. (Note: on the X chromosome, segments shared between two males are fully identical because males have only one X chromosome; if two X chromosomes were involved, as with a female, the segment would be a half-identical region, or HIR):
Next I will post Douglas and Christopher's 23andMe Ancestry Composition. I will post the results in all three estimate modes: Speculative Estimate (50% confidence), in which a predicted assignment has up to a 50% chance of being WRONG; Standard Estimate (75% confidence); and Conservative Estimate (90% confidence).
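To see what a confidence threshold does in general, here is a toy sketch. It is NOT 23andMe's algorithm. The probabilities are invented, `call_window` is a hypothetical function, and the "Unassigned" fallback is a simplification I chose for illustration.

```python
def call_window(probabilities, threshold):
    """Keep the best-scoring label only if it clears the threshold."""
    label, p = max(probabilities.items(), key=lambda item: item[1])
    return label if p >= threshold else "Unassigned"


# Invented numbers: a window whose markers cluster closely with two populations.
close_call = {"Sub Saharan African": 0.55, "European": 0.45}

for name, threshold in [("Speculative", 0.50), ("Standard", 0.75), ("Conservative", 0.90)]:
    print(name, "->", call_window(close_call, threshold))
```

A close call like this one only gets a label in the most permissive mode. That is not what happened with Douglas and Christopher, whose conflicting labels held in all three modes, so their mismatch needs another explanation.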
Here (below) is Douglas's 23andMe Ancestry Composition (2013 version) showing that his X-chromosome (#23) is all Sub Saharan African in Speculative Estimate:
Maybe Christopher has Sub Saharan African on his X-chromosome? Let's take a look. This (below) is Christopher's 23andMe Ancestry Composition-Chromosome painting. As you can see his X-chromosome is predicted to be all European in Speculative Estimate:
Here (below) is Douglas's 23andMe Ancestry Composition (2013 version) showing that his X-chromosome (#23) is all Sub Saharan African in Standard Estimate:

This (below) is Christopher's 23andMe Ancestry Composition-Chromosome painting (2013 version). As you can see his X-chromosome is predicted to be all European in Standard Estimate:

Here (below) is Douglas's 23andMe Ancestry Composition (2013 version) showing that his X-chromosome (#23) is all Sub Saharan African in Conservative Estimate:
This (below) is Christopher's 23andMe Ancestry Composition-Chromosome painting (2013 version). As you can see his X-chromosome is predicted to be all European in Conservative Estimate:
Since I couldn't figure out what was going on besides a genotyping mistake, I decided to contact 23andMe to see what they would say about this. Here is 23andMe's response:

Hello TL Dixon,
My apologies for the long delay in responding. My understanding is that it is possible for Ancestry Composition to misassign small portions of ancestry. It is likely that for one of them, the segment was assumed based on neighboring SNPs on the X. Ancestry Composition analyzes your genome in small sections, where each section is classified as originating from one of the reference subregions, regions, or continents. A small percentage of a particular ancestry assignment that is inconsistent with your knowledge of your ancestry or inconsistent with a relative match at the same spot is potentially an misassignment among the thousands of windows analyzed by Ancestry Composition.
We hope you found this information helpful, and we additionally recommend reviewing the science behind Ancestry Composition at https://www.23andme.com/ancestry_composition_guide/
Best regards,
[redacted]

After I received the response from 23andMe, I asked them if they were going to "fix" the "mis-assignment" since the ethnicity prediction as displayed at that location was not the same. However, 23andMe indicated that the shared segment was real, or identical by descent, but the ethnic assignment was different because of the way the algorithm aggregated and assigned ethnicity to that particular segment. In other words, something in Doug's admixture at that location caused the algorithm to assign African, and something in Christopher's admixture at that location caused the algorithm to assign him European. As you can see, ECM is not very helpful in this situation. Since both have African and European admixture, I can't say, for instance, that Christopher's matching segment must be African or that Doug's must be European. 
With 23andMe's last Ancestry Composition update, I noticed the issue with Douglas and Christopher was resolved. As you can see, Douglas now has a EUROPEAN segment on his X-chromosome where he shares with Christopher:

Meanwhile, in Christopher's Ancestry Composition the European assignment on the X-chromosome remained the same:

(b) Technical Notes About Chromosome Paintings:
  • A chromosome painting is a display only; your genotyped results look nothing like it. Remember that only a fraction of your genome is genotyped, and the process involves testing ancestry informative markers (aka SNPs) at fixed locations [on a microarray chip] on each chromosome. 
  •  In other instances the segment at particular location could be displayed as European on your chromosome painting but as Native American on your genetic match's chromosome painting even though you both share the same DNA sequence there; 
  • Your chromosome painting will show admixture from both parents -- on the top and bottom of a single-bar display (e.g. Gedmatch) OR on both bars of a double-bar display (e.g. 23andMe) -- EXCEPT when results are phased between parent and child (then the mother's contributions are exclusively on top and the father's on the bottom) or in situations where both parents come from very different backgrounds (e.g. one Japanese and one Nigerian).
  • If you've two different chromosome paintings (from same company or from separate sources) OR when you compare your chromosome painting to a genetic match, please be aware that matching DNA segments (and assigned ethnicities thereof) may not be displayed in the same bar for each one:
This is my cropped chromosome painting (below) showing the location of two "Southeast Asian" (yellow) segments on my chromosome #3 (see circled segments). Notice how one Asian segment is on the top bar and the other is on the bottom bar:


And here is my chromosome painting (below) from my second 23andMe test. Notice now the Southeast Asian segments (yellow) are both on the top bar:

(c) Technical Notes about Gedmatch's "Paint the differences between 2 kits, 1 chromosome" tool:
  • Gedmatch's chromosome paintings and "Paint the differences..." tool are generally more granular than those offered by the main DNA companies when it comes to the ethnic components being displayed. For example, your Gedmatch SINGLE-BAR chromosome painting may show three or four ethnic components stacked on top of each other in the same location, while 23andMe's DOUBLE-BAR chromosome painting may show one solid ethnic category for that particular location (per bar). This means you then have to sort out which of those ethnic component(s) are shared at that specific location, which may not be so clear-cut. However, you may be able to identify the ethnicity of the shared DNA segment by observing and eliminating the ethnic components surrounding it. For example, if you and your genetic match are both majority European with a small amount of African and it appears you both share an African segment, then the "Paint the differences..." tool's third bar may show both European and African at that location. If the African component is not blacked out on the third bar (black means no DNA is shared), then chances are you and your match share that "African" segment.   
  • The Start point and End point numbers are approximate, and your results can vary widely among chromosome browsing/mapping tools, so pay attention to where the DNA segment is located (on your chromosome browsing tools). This in turn may prevent you from looking at the wrong location on the "Paint the differences..." tool.
  • Also pay attention to how low or high the Start/End point numbers are because they may represent very different locations (also influenced by each chromosome pair being a different length and SNP density). For example 13200000 (132 followed by five zeros) is between 10M and 20M, while Start/End point number 132000000 (132 followed by six zeros) is between 130M and 140M. On a shorter chromosome that does not reach 130M, a Start/End point beginning with 132 is therefore likely to fall between 10M and 20M.
  • Start points/End points as reported by chromosome browsers (23andMe, FTDNA and Gedmatch's One-to-one tool) may NOT match up with the M numbers located at the bottom of each bar of Gedmatch's "Paint the differences..." tool. As well, your shared DNA segment as painted may not line up exactly with its reported Start point/End point. This is because most chromosome browsing tools "measure" segments in centiMorgans, while Gedmatch's "Paint the differences..." tool still uses megabases.
  • Another problem is that Gedmatch.com is on a different genomic build than the major DNA companies. (A build, aka Reference Human Genome, is a digital nucleic acid sequence database representing an example of a human's set of genes, made from several people's genomes.) This may cause problems with the way Gedmatch interprets the location, size and reporting of DNA segments. For example a shared DNA segment detected by 23andMe may be of a different size, or worse, not detected at all, on Gedmatch. According to ISOGG, "23andMe use Build 37. Family Tree DNA uses Build 37 for matching but Build 36 for segment boundaries in the Chromosome Browser. Raw data files are provided in both formats. Build 37 filled in quite a few gaps, and the number of base pairs in each of the chromosomes was longer in Build 37 as compared to Build 36. Consequently the cM totals per chromosome are lower for Family Finder than they are for 23andMe. GedMatch use Build 36. The latest version of the Human Reference Genome, Build 38, was released in December 2013."
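
If counting zeros in a Start/End point gets confusing, the arithmetic can be checked with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and the 10M tick spacing is an assumption taken from the "between 10M and 20M" examples above, not a documented Gedmatch setting. It also does nothing about genomic build differences; it simply turns a base-pair number into the "M" number you would look for on the ruler.

```python
from typing import Tuple


def to_megabases(position_bp: int) -> float:
    """Convert a Start/End point given in base pairs to megabases (the "M" numbers)."""
    return position_bp / 1_000_000


def ruler_interval(position_bp: int, tick_mb: int = 10) -> Tuple[int, int]:
    """Return the pair of ruler ticks (in M) that a position falls between.

    tick_mb is the assumed spacing of the ruler ticks (10M here).
    """
    megabases = to_megabases(position_bp)
    lower = int(megabases // tick_mb) * tick_mb
    return lower, lower + tick_mb


# Two Start/End points that both begin with "132" but sit in very different places.
for position in (13_200_000, 132_000_000):
    low, high = ruler_interval(position)
    print(f"{position:,} bp is {to_megabases(position):.1f}M, between {low}M and {high}M")
```

Typing the number with digit separators (as in `132_000_000`) is a simple way to avoid miscounting zeros before you go looking for the segment on the painting.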
####

14 comments:

  1. Thanks TL for it. I haven't paid much attention to the chromosomes of matches my grandparents and dad are sharing with but I'll start checking it out sometime

  2. You're very welcome Anthony. Please let me know what you discover.

  3. Thanks! I didn't know anything about how to do comparisons or what "speculative", "standard", and "conservative" comparisons represent. That being said, if the "conservative" view is the most accurate, what does the 13.6% unassigned in my 23andMe ancestry composition represent? Is it possible to use this as an indicator of connections between me and my DNA relatives?

    Replies
    1. Hi CRJ,
      It's not that the Conservative is more accurate per se, but rather that the ethnicity estimates can be predicted as being from a specific [reference] population with more confidence. Usually this is admixture which can be predicted with 90% confidence. The unassigned admixture is then admixture that can't be called or predicted with any confidence above 90% --- this is common in admixed individuals when, say, there are several European components (British, Iberian, Italian) in the same location. However if you lower the threshold to 70% (Standard) then more admixture can be assigned specifically since one or two of those components may reach 70% but not 90%. ...

  4. I was just in Barnes & Noble this morning looking for something like DNA for Dummies. There was nothing even close. I don't even remember how I got to this site, but appreciate so much your sharing with others. This site looks even better than a book.

  5. Very good. I found when I had got the contrast I was looking for using world9 I could get a more sustained contrast using EurogenesK13 because from the first one (as outlined by you) I could see that it was a European who was most coincident looking. The EurogenesK13 paint showed clearly that it was an Eastern European.

  7. Hi Dixon
    before all this you should have your genome sequenced; isn't it?
    how do I do it? thanks

  8. Hi Dixon
    before I do all this I should have my genome sequenced; Isn't it?
    how do I do it?
    thanks

  9. Forgive me for being totally ignorant of the process. I tested with FTDNA and AncestryDNA. I also uploaded to GEDCOM. I reached out to one of my relative matches and they informed me that I needed to do HVR2 or Full-sequence testing in order to be able to locate a common ancestor. My question is, could I still go thru the processes you outlined above? Thanks in advance.

    Replies
    1. Hi Shelley,
      Your relative match told you wrongly that you would have to take a full mito sequence test to locate a common ancestor. Mitochondrial DNA testing is completely separate and is used to determine the direct matrilineal line of your mother and her foremothers. Therefore you can't use it for this process (Ethnicity Chromosome Mapping). The mitochondrial DNA does not have an "ethnicity" but it is used to determine your maternal haplogroup assignment and genetic relatives only on the mitochondria. Most of those relatives are distant and difficult to trace. And FamilyTreeDNA is the only company that finds matches on your mitochondria. You can't use Gedmatch for these types of matches either.

  10. Hi. I came across this very interesting article when looking for possible answers to two questions I have, the most relevant in this case being: how is it possible for two people to have a DNA match but no shared ethnicity? I think the answer is somewhere in your article but I’m not quite seeing it.
    Case in point: EGR and IR share 31cM in 4 segments, the largest being of about 12cM, yet they have no shared ethnicity. EGR also has several other matches of a similar nature. How can that be? It is highly unlikely that EGR has any connection to the places where these matches are from or have their ethnic roots.

    My other question, though not necessarily relevant here, is this: how is it possible to have a higher dna match with someone than you have with one of their parents?
