Ancestral Estimates By

Genotyping, Whole Genome Sequencing, and Chromosome Painting

A Work In Progress by Herbert Holeman, PhD.

Draft Workpaper Posted.

For discussion purposes, my DNA tests and findings are used as examples.

• Estimates of my ancestry at the continental level are dependable, but estimates for regions within a continent are not.

• Ancestry estimates for me differ among Genotyping, Whole Genome Sequencing, and Chromosome Painting.

• The level of statistical confidence influences my ancestry estimate.


Estimates of my ancestry between genotyping and whole genome sequencing tests are reported in Table 4.

Continental Ancestry

Companies such as 23andMe, MyHeritage, and Ancestry typically use genotyping (microarray-based atDNA testing). These companies read around 700,000 positions in my genetic sequence. Genotyping is relatively inexpensive because it takes snapshots at just a few hundred thousand locations across my genome, about 0.02% of my DNA (some argue that 23andMe, AncestryDNA, and MyHeritage test less than 0.1% of the genome). The companies then use imputation to infer the SNPs they did not read, but the process is not perfect and can lead to errors. MyHeritage explains imputation this way: "Think of imputing DNA as reading a sentence with some of the letters missing — there’s a good chance that you can infer the missing letters from context."

In contrast, Nebula Genomics uses whole genome sequencing (WGS). Simply put, WGS is a genetic test designed to decode 100% of your DNA. As Nebula puts it, WGS includes the sequencing of all genes (coding regions), regulatory genomic regions, the Y chromosome (for males), and mitochondrial DNA. In my case, Nebula sequenced my genome at 0.4x coverage, which corresponds to about 1.3 billion positions and yields roughly one thousand times more data than tests using microarray-based genotyping.
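The figures quoted above can be checked with a little arithmetic. The sketch below is illustrative only and assumes a genome length of roughly 3.2 billion base pairs, a value that does not appear in this workpaper. It also includes an idealized Poisson estimate of how much of the genome is touched by at least one read at a given coverage; under that simplified model, 0.4x coverage would reach somewhat less than 40% of positions, because some reads overlap.

```python
import math

# Assumption: approximate human genome length in base pairs (not from the workpaper).
GENOME_BP = 3.2e9


def fraction_of_genome(positions: float, genome_bp: float = GENOME_BP) -> float:
    """Share of genome positions that a test reads directly."""
    return positions / genome_bp


def bases_at_coverage(coverage: float, genome_bp: float = GENOME_BP) -> float:
    """Total bases sequenced at a given average coverage (depth)."""
    return coverage * genome_bp


def fraction_covered_poisson(coverage: float) -> float:
    """Expected share of positions hit by at least one read,
    assuming reads land uniformly at random (idealized Poisson model)."""
    return 1.0 - math.exp(-coverage)


array_share = fraction_of_genome(700_000)       # genotyping array
low_pass_bases = bases_at_coverage(0.4)         # 0.4x sequencing
low_pass_share = fraction_covered_poisson(0.4)  # idealized estimate

print(f"Genotyping reads about {array_share:.2%} of the genome")
print(f"0.4x coverage yields about {low_pass_bases / 1e9:.2f} billion bases")
print(f"Idealized share covered at least once: {low_pass_share:.0%}")
```

Under these assumptions the 0.02% and 1.3 billion figures in the text both fall out of the same genome length.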

Imputation. Both testing strategies rely on imputation, a process of educated guessing, in their ancestry estimates. 23andMe, Ancestry, and MyHeritage use imputation to expand their test results, inferring values for the positions they have not tested from the positions they have. Nebula Genomics' 0.4x coverage is likewise a partial read of the genome (about 40%), which is supplemented by imputation to infer the sequence in the gaps that make up the remaining 60%.
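To make the "missing letters" idea concrete, here is a toy sketch of imputation. It is not any company's actual method: the reference panel, the `?` marker for an unread position, and the `impute` function are all hypothetical, and real imputation tools use far larger panels and statistical models. The sketch fills each unread position with the most common letter among the reference sequences that agree with every position that was read.

```python
from collections import Counter

# Hypothetical reference panel: each string is one known sequence
# across the same five SNP positions.
REFERENCE_PANEL = [
    "ACGTA",
    "ACGTA",
    "ACCTA",
    "TCGGA",
    "TCGGC",
]


def impute(observed: str, panel: list[str], missing: str = "?") -> str:
    """Fill unread positions using reference sequences that match
    the observed sequence at every position that was read."""
    matches = [
        ref for ref in panel
        if all(o == missing or o == r for o, r in zip(observed, ref))
    ]
    if not matches:
        return observed  # no compatible reference, so leave the gaps

    filled = []
    for i, allele in enumerate(observed):
        if allele == missing:
            # Take the most frequent allele at this position among the matches.
            counts = Counter(ref[i] for ref in matches)
            filled.append(counts.most_common(1)[0][0])
        else:
            filled.append(allele)
    return "".join(filled)


print(impute("AC?TA", REFERENCE_PANEL))
```

In this example two of the three compatible reference sequences carry `G` at the unread position, so `G` is chosen. The third carries `C`, which shows how an imputed value can be wrong.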

Chromosome painting is yet another tool. It is a means by which atDNA testing provides a way of looking at ancestry, as shown in Figure 39. It displays the 46 chromosomes passed between generations in the form of 23 pairs. My interest is in the first 22 pairs, the autosomes, because they represent ancestries I match. Each of these autosome pairs appears as one of the colored horizontal lines.

Figure 39. My Chromosomes Painted

The multi-colored chromosome painting reveals the genetic mixing of my ancestry from different populations. The populations appear in the horizontal lines, color-coded by descent. Long, unbroken stretches of color are evidence of recent ancestry, while short segments suggest ancestry from many generations ago. More recent sources of ancestry leave segments on more chromosomes, and those segments are longer than the ones inherited from ancestors of many generations ago. A significant limitation is that companies base their chromosome painting results on different reference databases, so chromosome paintings may not be consistent from one company to another.
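The link between segment length and recency can be illustrated with a back-of-the-envelope rule. The sketch below is a common simplification, not a method described by any of the testing companies: it assumes roughly one crossover per 100 centimorgans (cM) per generation, so an unbroken segment inherited from a source g generations back averages about 100/g cM. The function name is hypothetical.

```python
def expected_segment_cm(generations: int) -> float:
    """Rough average length, in centimorgans, of an unbroken ancestry
    segment from a source `generations` back, assuming about one
    crossover per 100 cM per generation (a simplification)."""
    if generations < 1:
        raise ValueError("generations must be at least 1")
    return 100.0 / generations


for g in (2, 5, 10, 20):
    print(f"{g:>2} generations ago: about {expected_segment_cm(g):.0f} cM per segment")
```

Under this assumption, ancestry from a few generations back shows up as long blocks of color, while ancestry from twenty generations back is chopped into short pieces.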

Methodology differences are apparent in estimates of my ancestry at the continental level. My ancestry reported by all three technologies is depicted in Table 5.

Continental Ancestry

Differences in regional ancestry estimates within a continent are even more apparent, which limits their usefulness. Estes warns, "Within continents, like Europe, Asia and Africa, there has been a lot of population movement and intermixing over time making the term 'ethnicity' almost meaningless." Dr. Rutherford's view is similar. "By inference, we are to assume that significant proportions of our deep family came from those places. But to say that you are 20 percent Irish, 4 percent Native American or 12 percent Scandinavian is fun, trivial and has very little scientific meaning."

An example is the set of estimates of my Iberian ancestry from genotyping, whole genome sequencing, and chromosome painting, shown in Table 6.


Statistical confidence levels influence ancestry estimates. A confidence level refers to the percentage of all possible samples whose interval estimate would be expected to include the actual population value. A confidence level of 90% is a common choice in social research and is expressed as a 90% CI (confidence interval). A 90% CI is a range of values constructed so that, across repeated samples, about 90% of such ranges would contain the true mean of the studied population. The chromosome painting values reported in Figure 39 and Table 5 are the company's speculative estimates, calculated at a confidence level of 50%.
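A short worked example may help show why a higher confidence level gives a wider, less specific range. The sketch below computes a normal-approximation confidence interval for a mean using Python's standard library. The sample values are hypothetical, the function name is my own, and nothing here implies that a testing company computes its confidence settings this way.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(values, level=0.90):
    """Normal-approximation confidence interval for the mean.
    (For small samples a t-distribution would be more appropriate.)"""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    # z is the two-sided critical value, about 1.645 for a 90% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    center = mean(values)
    half_width = z * stdev(values) / n ** 0.5
    return center - half_width, center + half_width


# Hypothetical percentages from repeated estimates of one ancestry component.
sample = [12.1, 10.4, 11.8, 9.7, 13.0, 11.2, 10.9, 12.5]

low50, high50 = confidence_interval(sample, 0.50)
low90, high90 = confidence_interval(sample, 0.90)
print(f"50% CI: {low50:.2f} to {high50:.2f}")
print(f"90% CI: {low90:.2f} to {high90:.2f}")
```

The 90% interval is wider than the 50% interval around the same center: more certainty is bought at the cost of a less precise statement.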

Depending on the company's level of confidence, its estimate of ancestry can vary. Table 7 is a chromosome painting prediction of my ancestral origin by 23andMe. It shows the speculative confidence level (50%), which is the default; users can select other confidence levels, including 90%.
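One way to picture the effect of the confidence setting is to treat it as a threshold: a segment keeps a specific regional label only when the probability of that label meets the threshold, and otherwise falls back to a broader label. The sketch below is a hypothetical illustration of that idea, not 23andMe's actual algorithm. The segment lengths and probabilities are invented, and `summarize` is a name I chose for the example.

```python
# Hypothetical segments: length in cM and invented probabilities per region.
SEGMENTS = [
    {"length_cm": 40.0, "probs": {"Iberian": 0.92, "Italian": 0.05, "French & German": 0.03}},
    {"length_cm": 35.0, "probs": {"Iberian": 0.60, "Italian": 0.30, "French & German": 0.10}},
    {"length_cm": 25.0, "probs": {"Iberian": 0.45, "Italian": 0.40, "French & German": 0.15}},
]


def summarize(segments, threshold, fallback="Broadly European"):
    """Percent of total length per label. A segment keeps its most likely
    regional label only if that probability meets the threshold."""
    total = sum(seg["length_cm"] for seg in segments)
    shares = {}
    for seg in segments:
        label, prob = max(seg["probs"].items(), key=lambda item: item[1])
        if prob < threshold:
            label = fallback  # not confident enough for a regional label
        shares[label] = shares.get(label, 0.0) + 100.0 * seg["length_cm"] / total
    return shares


print("50% threshold:", summarize(SEGMENTS, 0.50))
print("90% threshold:", summarize(SEGMENTS, 0.90))
```

With these invented numbers, the Iberian share drops from 75% at the 50% threshold to 40% at the 90% threshold, with the remainder reported under the broader label.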

More in-depth answers about my genetic ancestors living far back in prehistory will surface in later workpapers on mtDNA and Y-DNA testing. These tests reach far back in matrilineal and patrilineal genetic prehistory.

Copyright © 2018 Herbert P. Holeman, Ph.D. All Rights Reserved.