Trace Your Roots with DNA Summary

by Cheri Mello

Note: This was originally summarized for the Azores-DNA list. You may see references to Portuguese ancestry. Also, because Family Tree DNA was chosen to be the testing company for the Azores project, there are more references to that company than the others.

I bought this book and thought as I read each chapter I would summarize it here. If you are like me, you might have only a vague understanding of genetic genealogy. If so, I hope this works for you and gives you a better understanding of it all.

Trace Your Roots with DNA

by Megan Smolenyak Smolenyak and Ann Turner

(yes, her surname is doubled like that).

Introduction: Welcome to the World of Genetic Genealogy (pp. ix - xvi)

I really didn't get much out of this. It was saying how technology has grown so much and has changed the way we do genealogy (web sites; census images online). They created a word from genetics and genealogy that they call genetealogy (ge-neh-TEE-ol-o-gee). There's Y-DNA (traces your direct paternal line) and mtDNA (em-tee-DNA) that traces your direct maternal line.

Part 1: The Fundamentals

Chapter 1: If You're New to Genealogy (pp. 3 - 17)

Again, I didn't get much out of this (but if you are new, you might want to see if your public library has this book). They said do your homework and check home sources; don't believe everything you hear or read (even if everyone knows Grandma was the most honest person anyone ever met); and capture what you learn (audio tape or video tape Grandma's story, or write it down using forms or software). Note on this from me (Cheri): I'm doing a project on my dad's American lines (not his Portuguese ones). I'm calling people on the phone and writing 2 or 3 pages of notes as I talk to them. If I don't immediately type it into the genealogy program, I find that I start to forget bits of the conversation within a couple of days. Back to the book..... Chart your course by using pedigrees (shows how it follows your dad's line or your mom's line); depend on descendancy charts (these are the genetic ones where your dad is a square and your mom is a circle, and their sons are shown as squares and daughters as circles); don't skip generations (although they use the term "reverse genealogy," which is tracing your family forward, and which may need to be done if you are seeking a DNA donor); it's not all on the internet (you still need to use libraries and local sources); and surround and conquer (if you can't find out what you want on a particular ancestor, work the collateral lines: siblings, cousins, etc.).

Chapter 2: Genetic Essentials (pp. 19-32)

Classical Genetics

Classifying and counting traits is called Mendelian genetics. Mendel (working with his peas) began to see patterns in the way the traits were passed on. He noticed some were recessive and some dominant. A recessive trait appears only when the dominant one is absent. Dominant doesn't mean the superior version. One version of a trait is inherited from each parent; these alternate versions are called alleles. As Mendel studied his peas and figured out the patterns, he started creating different combinations (height, color, etc.) and found that the traits were passed on independently of each other. This is called the law of independent assortment. Today, scientists know some characteristics are linked together and some have more than 2 alleles (blood type has 3 alleles: A, B, and O). Some are codominant (AB blood type) and some are sex-linked.

Types Galore - Blood Types, Phenotypes, and Genotypes

Classical genetics is limited to traits that can be observed or measured (called phenotypes). Then they explained blood types. Example (mine, Cheri): My dad is type B blood, but I am type A. Since I inherited one allele from each parent, I must have received an A from my mother and an O from my father. Since B is dominant over O, my father's phenotype is B (it is what is observed or appeared in his blood type test), but his genotype must be BO.
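The blood type reasoning above can be checked with a small sketch. This is only an illustration: the function names are made up, and the mother's genotype of AO is an assumption (the summary only says she passed on an A).

```python
from itertools import product

def phenotype(genotype):
    """Blood type observed for a two-letter ABO genotype such as 'BO'.

    A and B are each dominant over O, and codominant with each other.
    """
    alleles = set(genotype)
    if alleles == {"A", "B"}:
        return "AB"
    if "A" in alleles:
        return "A"
    if "B" in alleles:
        return "B"
    return "O"

def possible_children(parent1, parent2):
    """Set of child blood types: one allele is taken from each parent."""
    return {phenotype(a + b) for a, b in product(parent1, parent2)}

# Hypothetical mother genotype "AO"; the father is tried as BO and as BB.
print(phenotype("BO"))
print(possible_children("BO", "AO"))
print(possible_children("BB", "AO"))
```

With a BO father a type A child is possible, while a BB father could not have one, which is why the father's genotype has to be BO.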

Molecular Genetics

There are "testing companies" for the genealogical market. They don't deal with any traits at all. (Family Tree DNA, or FTDNA for short, is the company that was chosen for the Azores DNA project.)

DNA is the "Molecule of Life"

DNA = DeoxyriboNucleic Acid. For example, it instructs blood cells to make hemoglobin to deliver oxygen to your cells.

The Basics of Bases

DNA is a long, skinny molecule that strings together simple units called bases or nucleotides. The bases are adenine, guanine, thymine, and cytosine, which are abbreviated as A, G, T, and C. Basically, these 4 codes make up a very detailed owner's manual for humans. The Human Genome Project established the order of all the DNA bases. The Human Genome Project is a scientific project to map all the markers on


The DNA Copy Machine

DNA has a double helix shape. Adenine pairs with thymine (A - T), and cytosine pairs with guanine (C - G). These are called complementary pairs. The cell makes a copy before it divides in 2, and the bases (A, T, C, and G) line up with the help of DNA polymerase. Scientists have learned to mimic the action. This is called Polymerase Chain Reaction (PCR). This helps scientists read DNA from very small samples.
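The pairing rule above (A with T, C with G) is simple enough to sketch in a few lines. The function name and the sample strand are made up for illustration.

```python
# Pairing rule from the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the bases that would line up opposite the given strand."""
    return "".join(PAIRS[base] for base in strand)

print(complement("GATTACA"))  # CTAATGT
```

Taking the complement twice gives back the original strand, which is what lets the cell rebuild a full copy from either half.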

Chromosomes are Packages of DNA

DNA is packaged (along with a few proteins) into individual structures called chromosomes. The chromosomes are inside the nucleus. The whole set of chromosomes is called the genome. All of the chromosomes come in pairs, except X & Y (I thought X & Y made a pair??). The first 22 pairs (or 44 chromosomes) are called autosomes. The last or 23rd pair (numbers 45 & 46) is called the sex chromosomes. Two X's make a female, an X Y combo makes a male.

If You're a Man

You get the X chromosome from your mom and your Y from your dad. Your dad got his Y from his dad. And he got his Y from his dad, etc. This conveniently follows the surname line in many cultures (but not Portuguese!) The X in males comes from the mother, but she may have passed on the X from either her father or her mother (or a patchwork called recombination). 

If You're a Woman

There is more representation from more ancestors. (There's a chart here in the book). By the time the level of your greats are reached, much of the DNA has become scrambled. Because of shuffling and recombination, you can't tell which gene came from whom. So women can't be in surname projects (except by getting male relatives to participate). But there's another type of DNA that comes down the female line called mitochondrial DNA and it's in Chapter 4. 

Genes are Stretches of DNA with a Mission

Gene = the fundamental unit of heredity. DNA is a recipe, blueprint, or parts list. There are 1,000 genes on an X chromosome, but only 27 on the Y (a-ha! Is this what is the problem with men? Just kidding). The X chromosome covers a broad range of structure and function. The Y chromosome is like a switch: if it appears, then the baby will be male. 

One Man's Junk is Another Man's Treasure

There are long stretches of DNA between genes with no known function. 95% or more does not code for anything. It's non-coding or junk DNA. It does not have any effect on personal traits or medical conditions (as it doesn't code). It just records your ancestral history. Mutations occur freely, preserving evidence. It is used to study the migrations of ancient people. The testing companies (like Family Tree DNA or FTDNA, for short) use markers from this non-coding or junk portion of DNA. More closely related people will have more similarities than more distantly related people. 

Marking up the Map

Locus (Latin) means to mark a position (like on a DNA molecule). Plural is loci. A genetic marker allows scientists to flag a particular locus and study it more. Genetic differences are called polymorphisms.


Mutations = change, which can be good, bad, or indifferent. Since genealogical markers use junk DNA, the changes are indifferent. Changes can be caused by agents (mutagens), such as radiation, viruses, or uncommon chemicals. These are induced mutations and are rare (x-raying your teeth does not count). Most mutations are spontaneous - like DNA made an "oops!" when it doesn't replicate correctly. DNA makes a copying error about once in every 50 million bases. Without mutations, there'd be no comparisons to figure out. Somatic mutations occur in the body and can't be passed down. Germ line mutations happen when the sperm and egg form, which makes them inheritable and lets them track the connections between generations.

=====================END OF PART 1===========================

Part II: Testing Options Explained

Chapter 3: Male Bonding: Y Chromosome (pp.35-38)

Note: Don't skip this because you are female.

The Source of All Y

Theoretically, the Y chromosome could be traced from every living man back to one man. Going back in time, the father of the father of the father, becomes a smaller and smaller number. It will all converge (coalescence) on one grandfather.


The branches lead to the "Most Recent Common Ancestor" (MRCA). He is nicknamed Y-Adam. He's not the first man, since he had to get his Y chromosome from his father. Y-Adam had his peers, was born in Africa, probably between 60,000 - 100,000 years ago and had at least 2 sons. They left Africa and migrated all over the world.

Practically Perfect

If the Y chromosome were duplicated perfectly, then every man in the world would have an identical Y chromosome. But it's tweaked a little bit sometimes when passed down. More and more sequencing of the Y chromosome allows scientists to pinpoint differences. Brothers might have a slight difference in their Y's and first cousins would have more of a difference. There is an optimal mutation rate that preserves Y for maybe 10-20 generations back. Markers with this rate are the Short Tandem Repeats (STRs). It's like a stutter. A short pattern (2-5 of the bases) is repeated a certain number of times: GATA could be GATAGATAGATAGATAGATA. Sometimes the enzyme that copies the DNA loses its place and there will be only 4 repeats or 6 repeats. This copying "oops" happens about 2 times in 1,000 (1/500 or 0.2%) on average.
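To make the "stutter" concrete, here is a minimal sketch that counts how many times a pattern repeats back to back. The function name and the flanking bases are made up; real lab software is far more involved.

```python
def count_repeats(sequence, motif):
    """Longest run of back-to-back copies of motif found in sequence."""
    best = 0
    size = len(motif)
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping forward one motif at a time while it still matches.
        while sequence[pos:pos + size] == motif:
            run += 1
            pos += size
        best = max(best, run)
    return best

father = "CC" + "GATA" * 5 + "TT"   # 5 repeats, as in the example above
son = "CC" + "GATA" * 4 + "TT"      # the copying "oops" dropped one repeat
print(count_repeats(father, "GATA"), count_repeats(son, "GATA"))
```

The number a testing company reports for an STR marker is this repeat count, so the father here would show a 5 and the son a 4.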

What's the Catch?

Mutations are random. The Y chromosome may be preserved for hundreds of generations. Two brothers can have different markers, as no one knows when the "oops" will happen. So, more than one test marker needs to be used: 10, 20, 30. Brothers would match on most markers. The law of averages says that two men could have a common ancestor within a certain time frame. The Y chromosome won't show WHO the ancestor is, only that you share a common ancestor at some point in time.
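A rough sketch of why a mismatch between close relatives is not surprising: using the 0.2% per-marker rate mentioned earlier, and assuming (my simplification, not the book's) that every marker mutates independently at that same rate, the chance of at least one "oops" grows with the number of markers and generations.

```python
def prob_any_mutation(rate, markers, generations=1):
    """Chance that at least one marker changes over the given generations.

    Assumes every marker mutates independently at the same rate.
    """
    return 1 - (1 - rate) ** (markers * generations)

one_step = prob_any_mutation(0.002, markers=30)
ten_steps = prob_any_mutation(0.002, markers=30, generations=10)
print(f"{one_step:.1%} after 1 generation, {ten_steps:.1%} after 10")
```

Under these assumptions a 30-marker test has roughly a 6% chance of showing a difference between a father and son, and close to a coin flip over 10 generations.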

Define the Purpose

The report is a bunch of numbers. More and more people are testing and the DNA databases have more and more samples. Most people like the surname projects (Congrats to those who have already submitted their DNA to the Azores project, as surname projects won't work for us, since the Portuguese could take either surname).

Inset: Do You Really Want to Know?

How would you feel if the result wasn't what you were expecting?


The Y chromosome is not linked to the surname (adoption, infidelity, illegitimacy (pai incognito for us Portuguese), aliases, and name changes would all cause this). 85% carry the surname down the lines.

Y - Chromosome Objectives

One doesn't have to do the straight paternal line. If you want to do a cousin line, then you will want a cousin who is associated with that surname. Most surname projects want to validate the written paper trail, prove a connection, or find a possible connection to somewhere else. More can be learned about the deep ancestry of an area by studying multiple surnames (like the Azores project).

Do You Have a Match?

Most people hope for a match, tracing their roots to a common ancestor. It can establish that you really are related. Of course, the opposite exists. If you are trying to eliminate a certain branch, then you hope for a mismatch. Testing answers questions or moves you closer to an answer.

Inset: What is a Match?

Each marker has a name (such as DYS390) and a result (such as 24, which is the number of repeats). Different companies put the markers in different orders, so if you are comparing FTDNA with Relative Genetics, make sure you are matching up the markers as well. Haplotype is the complete set of results on whatever markers were tested.


Abe: 14-12-24-11-13-13

Bob: 14-12-23-11-13-13

Carl: 14-12-24-11-13-13

Dan: 14-14-22-10-12-13

Abe and Carl are perfect matches and related; Bob may be closely related; Dan isn't related.
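The comparison of the four men can be sketched as follows. The two scoring functions are my own illustration (counting differing markers, and adding up how many repeats apart the values are); the book's inset only says who matches.

```python
haplotypes = {
    "Abe":  [14, 12, 24, 11, 13, 13],
    "Bob":  [14, 12, 23, 11, 13, 13],
    "Carl": [14, 12, 24, 11, 13, 13],
    "Dan":  [14, 14, 22, 10, 12, 13],
}

def mismatches(a, b):
    """Number of markers on which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def step_distance(a, b):
    """Total number of repeats separating two haplotypes."""
    return sum(abs(x - y) for x, y in zip(a, b))

for name in ("Bob", "Carl", "Dan"):
    other = haplotypes[name]
    print(name, mismatches(haplotypes["Abe"], other),
          step_distance(haplotypes["Abe"], other))
```

Compared with Abe, Carl differs on 0 markers, Bob on 1, and Dan on 4 (6 repeats in total), which lines up with the verdicts above. Both lists must have the markers in the same order for this to mean anything.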

Very Common

Common names use DNA to help narrow down their range. DNA could differentiate branches. They can find distant cousins or rule out some branches (and thus save research time!)

Or Very Rare

Having a rare surname may lead one to think that they are all related somewhere. DNA can prove that (or disprove it). This is where the DNA of the neighbors could help out (like in a geographical project like the Azores!)

Why Two Samples?

One test result has no one to compare to. So at least 2 people need to participate. The more distantly related the 2 people are, the deeper the chain of evidence. If the 2 people disagree on 1 marker then there was a mutation, but we don't know which person had the mutation. So a third person would have to submit their DNA. But the mutation will mark one branch of the descendancy chart. If the 2 people are quite different, then one has to rule out a non-paternity event (that is, if they thought they were related in the first place).

Uncertain Paternity

Tales of adoption or illegitimacy fall in this category. If the family story says that all the Howrys, Hauris, Haurys, and Howreys were descended from the same ancestor, but great-grandpa was really a Hamilton, then DNA can show you whether to continue looking under the Howrey name or go look under Hamilton.

Geographic Origins

This is more challenging than others. Similar names are needed, and a large sample of participants. Then you can tell which groups are related and which aren't (this was actually a long story on the Irish Glennons and how he went from county to county tracing those with similar names). In our case, some Portuguese have Miller instead of Mello, Marshall instead of Machado, or King instead of Rei. DNA testing would let those Millers know which Mellos they should go looking at in their research.

Name Changes

The name may be confusing as it evolves over time (or changed upon entering another country). Y-DNA's fastest growing purpose is to substantiate the name change or uncover unknown ones.

Mystery Matches

If your DNA matches someone out of the blue, then there could be a non-paternity, sound-alike names (Laymon & Lehman), Anglicized names, or patronymics. Also, the Most Recent Common Ancestor might be pretty far back.

When the Paper Trail is Confusing

We deal with the name, date, and place. When there are 2 people who you think might be 1 person, then DNA can be used to determine this.

Why Y?

Surname projects are the most common form of DNA testing. The Y chromosome goes straight down the paternal line. The mutation rate is just about right to trace. It can't prove descendancy from a certain ancestor, only that 2 people are related. DNA is another form of evidence to reinforce the traditional paper trail.

Chapter 4: Maternal Legacy: Mitochondrial DNA (pp. 59-74) 

Mitochondrial DNA is in some ways parallel to the Y chromosome, except it tracks the straight maternal line.

A Different Kind of DNA

This DNA is in the cytoplasm of a cell. The mitochondria break down food molecules. If more energy is needed, the mitochondria split in two.

Passing the Torch to the Next Generation

The egg has as many as 100,000 mitochondria. They all come from the egg, which comes from the mother's egg, which comes from her mother's egg, etc. So it's the bottom line of the pedigree chart.

Mitochondrial Eve

Theoretically, everyone could trace his or her ancestry back to one woman, Mitochondrial Eve. She was born somewhere in Africa, about 120,000 - 200,000 years ago, long before Y-Adam. She wasn't the first or the only woman of the time period, but everyone goes back to her. The differences we see today in mitochondria are due to mutations.

References Supplied Upon Request

The results are a long string of letters, between 340 - 1,100 bases. Everyone is compared with a single reference point, called the Anderson or the Cambridge Reference Sequence (CRS), and only the differences are reported. There is a very small number of these differences or polymorphisms. The total mitochondrial molecule is 16,569 bases long and fits on 3 pages. The CRS is not the same as Mitochondrial Eve, or even close. It is one of the more common patterns in Europe.

Everyone's a Little Bit Hyper

The mitochondrial DNA makes a circular molecule. They started "base 1" at the center of the "D-loop." The D is for displacement; the D-loop is also called the control region. Usually it is called the Hypervariable Region, or HVR. The D-loop is a spacer. Mutations cause no harm. Polymorphisms (differences) show up within the D-loop. The mutation rate is very low.

Test Reports

Scientists sequence several hundred bases in the Hypervariable Region. Some companies list the actual sequence and some point out where you differ from the CRS. The report may list the mutations, but it is more correct to say "differences" or "polymorphisms." If you see 16224C 16311C, then that means you have base C at positions 16224 and 16311. (If you want to see the entire sequence, go to The complete set of polymorphisms is your haplotype. The mtDNA polymorphisms come as a package deal, not a mixture from different ancestors.
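The report notation above (a position followed by the base you have there) can be read with a short sketch. The function names are made up, and this handles only simple substitutions like 16224C, not extra or missing bases.

```python
import re

def parse_difference(code):
    """Split a substitution such as '16224C' into (position, base)."""
    match = re.fullmatch(r"(\d+)([ACGT])", code)
    if match is None:
        raise ValueError(f"not a simple substitution: {code!r}")
    return int(match.group(1)), match.group(2)

def parse_report(report):
    """Turn a space-separated report into a {position: base} dictionary."""
    return dict(parse_difference(code) for code in report.split())

print(parse_report("16224C 16311C"))  # {16224: 'C', 16311: 'C'}
```

Two people have the same haplotype on the tested region when their dictionaries of differences from the CRS are identical.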

Compared with Whom?

There is a common maternal ancestor back in time. It could be recent or thousands of years ago. It has been found in the Iceman. With mtDNA testing, the mitochondrial DNA changes more slowly than the Y-chromosome so the time frame broadens. It is harder to get a match with mtDNA. Comparisons between 2 specific people can show if there is a connection.

Proof of Principle

Three objectives: For genealogists to study their maternal ancestry, for population geneticists to gain insight into the structure of early colonial populations, and for geneticists to determine the mutation rate of mtDNA.

Just a Coincidence?

Scientists have found 460 unique haplotypes between positions 16024 and 16365. If more bases were included, the diversity would increase further.

Mailing List Launch

There's one at:

Mistaken Identity

This was a story example. Could Nancy Scott's maiden name have been Kirkland instead of Perkins? They needed more proof, found a sister and tested her descendant.

Were They Sisters?

Another story example. Two Kellys in the same place. They found direct maternal descendants of both women and had them tested. They had identical HVR1 mutations. The chance of both mutations matching by accident is very, very small.

Inset: HVR1 and HVR2

The Hypervariable Region covers about 1100 bases. Base 16569 is next to base 1 (it's a circular molecule). The highest numbers, in the 16,000 range, were studied first and called HVR1. HVR2 covers the lower end numbers. The notation 315.1C means there's an extra C (cytosine) compared to the CRS. Deletions are listed with a minus sign: 523-524-. The CRS has a rare value at 263G. Some testing companies only test for HVR1 and some test for both, and some offer a choice of what you want to be tested on. If you have a common haplotype with HVR1, then you will want to test for HVR2 so there won't be so many coincidental matches.

Identifying the Father by Identifying the Mother

Can't determine the parents because it might be 1 of 2 brothers? So if you use mtDNA you can figure it out.

Worth the Effort

An ancestor with many wives and many kids...mtDNA can tell you which kid goes with which mother. MtDNA's uses aren't as obvious. 

The Broader Picture

DNA tests can place your ancestors in a global framework.

Chapter 5: Around the World: Geographic Origins (pp. 75-100)

Where your ancestors journeyed...left tracks on the mtDNA and Y chromosome. The markers' mutation rate helps sort out close vs. distant relations. But the mutation rate is too high to preserve signals from ancient times. A marker can mutate down one in a generation and then increase back one in another generation. So it is hard to determine the ancestral haplotype back then. Some mutations are rare and are carried down to the present generation.

One of a Kind

Unique Event Polymorphisms (UEPs): the mutation rate is so low that the mutation can be treated as a one-time event. For mtDNA, the small set of mutations is just a slight change of meaning. For Y-chromosome, the slow mutations occur in junk DNA. One common type is Single Nucleotide Polymorphism (SNP pronounced "snip"). It's where one base is replaced with another (A for C). There is 1 chance in 50,000,000 that a change will occur at any locus between one generation and the next.

Who - What - When - Where?

If your father had a mutation on his Y-chromosome, or your mother on her mtDNA, then you are the only person in the world with this UEP. If this change happened in ancient times, then there would be a large number of people with the UEP. The UEP is found in the highest concentration near the point of origin. Who is a specific, but unnamed person; What is a UEP; Where is determined by large scale studies; When is pooling data from many people.

Which Came First?

If you have markers that test positive at the locations, a pattern can be established to show which one is older than which (the main marker had a mutation creating a new marker, etc. There's a chart here). When the chronological record is combined with the geographical hotspots for each pattern, a map of the male migration routes forms.

DNA "Groupies"

Haplotype: describes 1 individual. Haplogroup: a cluster of people sharing the same UEP. Their ancestry converges on the founding father or mother who first had the mutation. There are many distinct haplotypes within the haplogroup.

"Hi - I'm a J. What's Your Haplogroup?"

There's an outline for the descendants of Y-Adam that shows the branching order. It's the full phylogenetic tree:

Is a Best Guess Good Enough?

Knowledge of your haplogroup isn't necessary for genealogical projects. Haplogroups can establish broad geographic categories, such as European or African. If there's a family story of Native American ancestry, the haplotype numbers can tell you which way to look. The haplotype also narrows down the range of SNP tests you'd have to take. The founding father of the haplogroup had a certain haplotype; in his descendants a marker may increase or decrease, but the values cluster around a central tendency. If you compare your haplotype with the most common values in a haplogroup, the closer it is to that model, the more certain you can be about your haplogroup. See www.ybase.org.

Inset: Sample Y-DNA Haplogroup Descriptions

E3a: African

I: (I1 and I1a): northwestern Europe (Vikings)

J2: northern part of the Fertile Crescent, spread through central Asia, the Mediterranean, and south into India. It's in Jewish populations (so for those with the Sephardic possibility, you can now have scientific proof).

R1a: Eurasian Steppes north of the Black and Caspian Seas. Central and western Asia and India and Slavic populations of E.Europe.

R1b: Most common haplogroup in Europe (of course!). This haplogroup contains the Atlantic modal haplotype.

Q3: strictly assoc. with the Native American population.

In Search of Deep Roots

Why the formal test for the haplogroup, when R1b is the most common European one? Because there are many rare and unique haplotypes (are you paying attention to the endings of -group and -type?) and if you have no close matches then you have a rare haplotype or you are underrepresented in the databases at this time. You can search your markers at:

Out of Africa

Where in Africa did the family originate? The Y-DNA test for African-Americans is called PatriClan. Three out of every 10 don't find a match. African Ancestry uses a UEP called "Y Alu Polymorphism" or YAP. Once the basic haplogroup is found, the African Ancestry company checks the database for Short Tandem Repeats (STRs) for more specific haplotypes. This database has 9000 samples from 60 populations in West and Central Africa. There's also an mtDNA version of this test.

MtDNA Haplogroups: The Daughters of Eve

Bryan Sykes' "The Seven Daughters of Eve" says 95% of Europeans can trace their ancestry back to just 7 women. These 7 haplogroups are: H, J, K, T, U, V, and X. (Sykes gave them names, such as H is Helena). mtDNA uses this alphanumeric system, starting with the capital letters above. Then it's subdivided like U5a1a. The mtDNA and the Y haplogroups do not correlate with each other. There's a map of "Eve's" markers and a URL:

Testing the Slow Markers on mtDNA

To uncover the small differences, there's an enzyme that chops the DNA into smaller pieces. This enzyme cuts only at certain sites on the DNA. If there's a mutation where the enzyme would cut, then the enzyme can't cut the DNA there. These differences are the Restriction Fragment Length Polymorphisms (RFLPs, pronounced riff-lipps). These RFLPs make up formal definitions for the mtDNA haplogroups. It's really good for Native Americans. Now, a newer method: HVR mutations. They correlated the HVR results with RFLP testing and found that the mtDNA haplogroups make a better guess for the haplogroup (because it mutates slower) than the Y haplogroups. If you want your deep maternal ancestry, then test only HVR1. If you add HVR2, then you increase your haplotype diversity. This is good if you need to confirm a connection between 2 people who match on the HVR1 test. You can look up your haplotype at:
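The cutting idea can be sketched like this: an enzyme cuts wherever it finds its site, and a single changed base at a site leaves one longer piece instead of two shorter ones. The DNA strings are made up, the site GAATTC is just a convenient example, and cutting right in front of the site is a simplification of where a real enzyme cuts.

```python
SITE = "GAATTC"  # example recognition site, chosen for illustration

def fragment_lengths(dna, site=SITE):
    """Cut dna just before every occurrence of site; return piece lengths."""
    edges = [0]
    pos = dna.find(site)
    while pos != -1:
        edges.append(pos)
        pos = dna.find(site, pos + 1)
    edges.append(len(dna))
    return [end - start for start, end in zip(edges, edges[1:]) if end > start]

original = "AAAA" + "GAATTC" + "TTTTTT" + "GAATTC" + "CC"
mutated = "AAAA" + "GAATTT" + "TTTTTT" + "GAATTC" + "CC"  # first site lost

print(fragment_lengths(original))  # [4, 12, 8]
print(fragment_lengths(mutated))   # [16, 8]
```

The lab never reads the bases directly in this method; it only sees that the pieces have different lengths, and that difference is the polymorphism.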

Inset: The 7 Daughters of Eve

These are the imaginative bios of the founding mothers of the European haplogroups. But the geography and archaeological stuff is true.

H: (Helena) the most widespread in Europe. 20,000 years ago near the border of France and Spain.

U: (Ursula) lived in Greece 45,000 years ago

T: (Tara) lived 17,000 years ago near the Mediterranean

K: (Katherine) lived about 15,000 years ago near the southern slopes of the Italian Alps

V: (Velda) lived about 15,000 years ago near the Iberian peninsula

J: (Jasmine) lived in the Fertile Crescent during the ice ages and then moved to Europe. (This is the middle east or Mesopotamia). (No time frame given).

X: (Xenia) lived 25,000 years ago and spread to Europe and the Americas.

From Haplogroup to History

This section was a whole example. A brother & sister from Puerto Rico were researching their genealogy and hit a road block on the maternal line. So the sister submitted her mtDNA and had more mutations than anyone in the genealogy DNA list had ever seen. She went to the table of motifs web site and found an African haplogroup with 8 of her mutations. Her brother found documents where a white ancestor asked permission from Spain to bring slaves to Puerto Rico in the 1600's.

Are You Descended From a Cherokee Princess?

(No, the Indians didn't have princesses). If the Indian line came from the maternal line, the mtDNA test can show you which one of 5 haplogroups it could be. Trace Genetics has 4000 Native American records on file and can tell which tribe you come from. If you have a negative result on the mtDNA for Nat. Amer., then the family story isn't true or it wasn't clear as to which line to test.

The Rest of the Story

The Y and the mtDNA are only 2 lines and a tiny fraction of your total ancestry, but they do trace a straight path. There are still 3 billion other bases of DNA to explore. UEPs can occur anywhere at any time. The differences are the SNPs (Single Nucleotide Polymorphisms) and they occur in junk DNA. The chance is 1 out of 50 million that a SNP will occur at any particular locus. But SNPs that happened before man migrated out of Africa can be found anywhere on earth. These kinds of SNPs aren't useful. If a SNP shows up in a man during the settlement of the Americas, there's only a 50% chance he'll pass the SNP down to a child, and that child has only a 50% chance of passing it down, etc. But some SNPs do catch on and appear in a significant portion of the population. These SNPs are the "Ancestry Informative Markers" used by DNAPrint Genomics. The SNPs are selected because they provide clues about geography. If a whole set of SNPs points in the same direction, then most of your ancestry came from that area. C is common and T is rare. Testing larger numbers of markers helps, but the confidence interval is large. If your Maximum Likelihood Estimate is 35% Nat. Amer., then it is 35% plus or minus 15% error (gee, it sounds like the obitos when someone dies and they are "mais ou menos" 60 years).
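Two bits of arithmetic from the paragraph above, as a small sketch. The function names are mine, and the first one follows a single line of descent only (one child per generation), which is a simplification.

```python
def chance_passed_down(generations):
    """Chance a SNP survives a single line of descent.

    Each parent has a 50% chance of handing it to a given child.
    """
    return 0.5 ** generations

def estimate_range(estimate, error):
    """Low and high ends of an estimate given as plus or minus an error."""
    return max(0, estimate - error), min(100, estimate + error)

print(chance_passed_down(5))   # 0.03125
print(estimate_range(35, 15))  # (20, 50)
```

So along any one line the SNP has about a 3% chance of still being there after 5 generations, and a result of 35% plus or minus 15% really means anywhere from 20% to 50%.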

Expect the Unexpected

DNAPrint Genomics was the first company to test Ancestry Informative Markers. There are 4 regions: Sub-Saharan African, Indo-European, East Asian, and Nat. American. Then it went to a story example of a man who swore he was nothing but Pennsylvania German. His test showed he was 21% East Asian. This could be attributed to the invasion of the Huns or more ancient migrations. Then there was the story of the African guy whose test showed 0% African! Many wanting Nat. Amer. ancestry find East Asian results. The test only goes so far back into your genetic make up (like 5 generations) because of the 50-50 chance that the specific allele is passed down (then how did the guy figure out the Huns?)

An Embarrassment of Riches

Each test develops one part of the picture. 

Critics Corner

Some critics say genetic genealogy is misleading at best (only a small part of your tree) and harmful at worst (oversimplified, false notion of race, or identity problems!) It only helps with one branch and may only reflect the heritage back a few generations. "Genetealogy is still in its infancy, and those of us who are already practicing it have made our peace with the inevitable learning curve and growing pains associated with being a bit of a pioneer."

Chapter 6: Next of Kin: Close Relationships (pp. 101-126)

Some family members won't talk about what happened 50-60 years ago: adoption, illegitimacy, a marriage someone would rather forget. But these things do seep out as an insinuation. Some specialized close relationship testing may enable you to remove the road block.

100 Percent

Liver cells look nothing like nerve cells but they all have the same DNA. The revitalization continues...your skin cells are creating new ones, your old blood cells are being replaced, etc.

Capturing DNA

It's everywhere...down the drain when you brush your teeth, lick a stamp, or toss a soda can. Trace amounts are hard to deal with, but with the right conditions, enzymes, and supplies, DNA can recreate itself in a test tube (called PCR or Polymerase Chain Reaction). Scientists can't duplicate the entire DNA in all the chromosomes all at once, but only short segments. For human identification they use the variable regions. Only 0.1% of our DNA accounts for all our differences and there are about 3 million differences, all occurring in junk DNA.

Inset: DNA from Hair

Scientists use mtDNA but it requires the root of the hair and the best results require 5 - 10 hairs with the root intact.

DNA Profiles

All 3 million differences don't need to be checked. The FBI uses only 13 markers in CODIS (COmbined Dna Index System). This is the database of DNA profiles of criminal offenders (wow, Cold Case Files is really starting to make sense!) The DNA profiles are like a DNA fingerprint or DNA signature. They are a series of numbers with no meaning, kind of like the UPC codes at the market. The markers in CODIS are STRs (Short Tandem Repeats). They are on the autosomal chromosomes and come in pairs. CODIS markers have 7 - 20 alleles each, so most people have an uncommon combination of values. The CODIS markers have names like D3S1358 and you could have a 14 and a 15: a 14 from 1 parent and a 15 from the other. You can't tell which parent gave you which result. They are simply listed in numerical order.

How Do Forensic Scientists Use These Numbers?

The Royal Canadian Mounted Police run the frequencies of the alleles and of the alleles in combination. If you run the values for the D3S1358 above, you will find that 13% of the population has the 14 allele and 29% has the 15 allele. But only 7.7% has both together. The odds change as more markers are added: the more markers, the slimmer the chance that someone will match you on all of them. On just 5 markers there's less than a 1 in a million chance that someone matches you on all. The figure for all 13 markers is 1 in almost 3 quadrillion. The DNA profile doesn't tell you about personal traits; it's just a unique combination of numbers.
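
The arithmetic behind figures like these can be sketched in a few lines of Python. This is only an illustration, assuming the usual population-genetics rule that a person carrying two different alleles has a frequency of 2 x p x q, and assuming the markers are independent so that per-marker frequencies can be multiplied. With the rounded 13% and 29% values it gives about 7.5%, close to the 7.7% quoted (the book presumably worked from unrounded frequencies). The function names and the four extra marker frequencies are made up for the example.

```python
def heterozygote_frequency(p, q):
    """Share of the population expected to carry both alleles (2 * p * q)."""
    return 2 * p * q


def profile_frequency(per_marker_frequencies):
    """Multiply per-marker frequencies, assuming the markers are independent."""
    result = 1.0
    for freq in per_marker_frequencies:
        result *= freq
    return result


# D3S1358 with alleles 14 (13%) and 15 (29%), using the rounded figures
d3 = heterozygote_frequency(0.13, 0.29)
print(f"14/15 at D3S1358: {d3:.1%}")

# Hypothetical frequencies for four more markers (invented values)
five = profile_frequency([d3, 0.06, 0.05, 0.08, 0.04])
print(f"Five markers together: about 1 in {1 / five:,.0f}")
```

Each extra marker multiplies the frequency by another small number, which is why the odds shrink so quickly.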

How are Individual DNA Profiles Used?

They can be used to convict the guilty, exonerate the innocent, or identify remains. The Innocence Project uses DNA from old evidence to show proof of innocence. 138 prisoners had been exonerated by the end of 2003. Many of the cold cases are being re-examined with DNA (like in Cold Case Files!). DNA profiles are also used in large-scale disasters and mass graves.

How is Paternity Testing Done?

They use the same set of markers that are used for identity testing. All alleles came from one parent or the other. If you have an allele that came from neither parent, then it came from someone else. (There's a chart with a dad, mom, and kid illustrating this). The case will be stronger if some of the alleles are rare in the population. The frequency that they appear is the paternity index. There is also a maternity test. This could be for an adoptee to find the birth mother.
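
The "every allele came from one parent or the other" rule can be sketched as a simple check. This is a minimal illustration, not how an accredited lab works: real labs also allow for mutations and weigh how rare each allele is. The marker names are real CODIS loci, but the function names and all the allele values are invented.

```python
def consistent_with_parents(child, mother, alleged_father):
    """Check one marker: the child needs one allele from each parent.

    Each argument is a pair of allele values, e.g. (14, 15).
    """
    a, b = child
    return (a in mother and b in alleged_father) or (b in mother and a in alleged_father)


def excluded_markers(child_profile, mother_profile, father_profile):
    """Return the markers where the alleged father cannot be the source."""
    return [
        marker
        for marker, child in child_profile.items()
        if not consistent_with_parents(child, mother_profile[marker], father_profile[marker])
    ]


# Invented profiles for illustration
child = {"D3S1358": (14, 15), "vWA": (16, 18), "FGA": (21, 24)}
mother = {"D3S1358": (14, 17), "vWA": (16, 16), "FGA": (21, 22)}
man = {"D3S1358": (15, 16), "vWA": (17, 19), "FGA": (24, 25)}

print(excluded_markers(child, mother, man))
```

Here the child's 18 at vWA came from neither the mother nor the man tested, so that marker is flagged.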

Adoption Issues

There are free or low-cost registries, but they rely on correct info. If the info is lost, forgotten or deliberately changed, then DNA can help. Identigene started a DNA registry, and the book lists another resource as well.

Adoptions and Paternity Testing: Why Bother?

This was a complete story example. Some people say that they just knew when they met that they found their parent. This was the story of a man who had open heart surgery and wanted to find his daughter so she could know her medical history. But there was some confusion, so they used a DNA test to resolve it. Moral of the story: There is no room for error in reuniting families.

Sibling Similarities

If one parent is deceased, you can use DNA to test for other close relationships. You share 50% of your DNA with each parent (or with each of your children). Full siblings also share 50% of their DNA with each other, but this is an average, not an exact number. There are many charts here showing how the average is about 50%.

More Markers

The more markers, the better. So if the overall average is 50% for parent/child, it'll be 47.9% for sibling testing. For 13 markers there are 4^13 combinations. One parent could be homozygous (2 copies of the same allele). A sibling pair might not have anything in common on any given marker. But siblings should end up with 50% of their alleles in common, much more than the general population. If some alleles are rare, this helps the case.
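
To see where the 50% average for siblings comes from, here is a small sketch that lists every way two siblings can inherit at one marker. It assumes an idealized case where the four parental alleles are all different (the allele values are invented), so it will not reproduce the book's exact percentages.

```python
from itertools import product

# Hypothetical marker where the four parental alleles are all different
father = (14, 15)
mother = (16, 17)

# Each child gets one allele from each parent: 4 equally likely results
child_genotypes = [tuple(sorted(g)) for g in product(father, mother)]

# Count the alleles two siblings have in common, over all 16 pairings
shared_counts = []
for sib1, sib2 in product(child_genotypes, repeat=2):
    shared_counts.append(len(set(sib1) & set(sib2)))

average_share = sum(shared_counts) / (2 * len(shared_counts))
nothing_in_common = shared_counts.count(0) / len(shared_counts)

print(f"Average share of alleles in common: {average_share:.0%}")
print(f"Chance of nothing in common at this marker: {nothing_in_common:.0%}")
```

So a pair of full siblings can easily come up empty on a single marker, which is why the comparison only becomes meaningful across many markers.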

Inset: SNPs vs. STRs

SNPs are bi-allelic polymorphisms (a variant is either present or absent), while STRs have multiple alleles. One SNP is not as informative as one STR. More than 100 SNPs = 13 STRs.

The Romanovs: You Decide

This explained about the Russian royal family killed in 1918 and the burial site discovered in 1991. The remains were commingled with a father, mother, male doctor and 3 kids. They used mtDNA and Prince Philip of England as the next of kin to identify who was who.

Other Close Relationships

The next closest are the grandparents, uncles/aunts (if connected by blood, not marriage), and nieces/nephews. They share 25% of DNA on average. If there is a mother and both paternal grandparents, then paternity calculations can be highly suggestive. Half-siblings, half-uncles, etc., reduce all percentages by 50%.

Cousin Marriages

States have laws about cousin marriages. First cousins share more of their DNA than two people chosen at random, but not all cousin marriages are harmful. If the common grandfather had a harmful gene that doesn't affect him (it's a recessive gene), and his wife (unrelated) also has a recessive gene, then there's only a 50% chance that the grandfather will pass it on to his son, and another 50% chance that the son will pass it on to his own child. By the time you work through each step, there's a 6.25% chance of the recessive showing up.

Did You Inherit Any Genes from Your Famous Ancestor? 

If your great-grandmother had a certain voice (a trait with a genetic component) there's no guarantee that it survived the journey down to you. The farther back you go in time, the less chance you have of inheriting the trait. By the time you go back 10 generations, it's quite possible you haven't inherited anything from a given ancestor. The expected share is less than 1/1000. But every gene you have has come down to you through thousands of generations (this is getting confusing).
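
The "less than 1/1000" figure follows from halving the expected contribution at every generation. A minimal sketch, assuming a simple average where each ancestor n generations back contributes 1/2^n (the actual amount inherited from any one distant ancestor varies and can be zero):

```python
def expected_share(generations_back):
    """Expected fraction of your DNA from one ancestor n generations back."""
    return 0.5 ** generations_back


for n in (1, 2, 3, 10):
    share = expected_share(n)
    print(f"{n:>2} generations back: 1/{2 ** n} = {share:.4%}")
```

At 10 generations the expected share is 1/1024, which is where the "less than 1/1000" comes from.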

Close Kin Mysteries

There are paternity/maternity tests, siblingship tests, and avuncular (aunt/uncle) tests. By comparing the frequencies of the selected alleles of the grandkid with the grandparents, aunts, uncles, cousins, etc, probabilities of relatedness can be calculated. These tests are $450 - $650. 

Selecting a Laboratory for Paternity Testing

There's the legal version and the "curiosity" version. It's the same test, but the legal version needs a 3rd party to collect the sample and use proof of identity so the test can be submitted in court. The American Assoc. of Blood Banks has an accreditation program for paternity testing. Forty labs are accredited throughout the country.  Some smaller labs send their samples to the accredited labs and some smaller labs have more staff to spend the time talking to the customer. Lack of accreditation is not a flaw for curiosity tests, but for legal purposes, go to an accredited lab. 

================END OF PART 2==============================

Part III: How to Do It Yourself

Chapter 7: Joining or Running a Project (pp. 129-142)

The Joys of Joining

The authors feel that DNA has reached its tipping point, which is good for newcomers. There's a reasonable chance that your surname or another branch is being tested.

Finding a Surname Project

There are over 1000 surname projects (a look at Family Tree DNA now (Feb. 2006) shows 2776 surname projects. The book was published only in 2004!). The US's top 5 surnames are Smith, Johnson, Williams, Jones, and Brown. There are surname projects for unusual surnames as well. To find a project, you can use the internet. The National Genealogical Society's "News Magazine" announces various projects. Some society newsletters and family association newsletters publish them. Newsweek and The Wall Street Journal run articles on DNA and genealogy. You can Google "surname DNA" (replacing the word surname with a real surname). You'll find links to projects and/or messages about the projects. There are also web sites with project listings. If you can find no obvious links, you can go to the web sites of the testing companies. You can also try variations of your surname or just the first few letters of your surname to get more matches. (Surnames don't work for Portuguese as they seem to change so often).

Joining a Project

Look at the scope of the project, the results to date and how they are shared, if there are any special requirements, and then join the study. Most web sites have a link for ordering the kit from the testing company. When you use the link, you are part of that study. The administrator/manager will be notified that you ordered. If you join a project, you can get a price break of 25% to 40% less than those that test on their own. The testing companies accept credit cards and some let you be invoiced.

Special Requirements

As DNA testing becomes more established and more mainstream, formalities have developed. There are 2 typical extra requirements. The 1st is to submit the info about your earliest known ancestor and how you (or the person you are testing) are related, generation by generation. This gives more value to everyone in the study and enables you to find a new cluster of cousins. Most administrators post this info on the web sites. The 2nd extra requirement is a brief consent form that allows the company to notify you if they find any matches in their database. Not signing it practically defeats the purpose of testing. Some administrators have a secondary consent to post some of your family data on the internet. Usually it's a disguised code or the name of the earliest known ancestor.

They also want to make sure that they have the right participants (correct line of descent), an acknowledgment that there may not be any matches (especially in the project's early stages), an agreement on who pays, and a timely response. Some now have an indemnification clause for legal claims. None had been made as of publication of the book.

When There's No Web Site

You can check mailing lists and message boards. Family Tree DNA's web site lets you communicate with the administrator. Other companies are going to offer this feature in the near future. (In case you didn't know, Katherine Hope Borges is the Administrator for the Azores DNA and she is on this list).

Launching Your Own Project

Most projects are surname focused. There is a 4 step process for managing a project (if you are going to branch off and start one on your American lines, I suggest you read this book).

The Project Management Process

1) Define the purpose

2) Select the test and vendor

3) Recruit participants

4) Interpret, report, and maintain

This is cyclical usually. You can go back to step 1 with an unexpected finding or to step 2 by someone upgrading their test. A lot of time is on steps 3 and 4 and there are 2 separate chapters dealing with that.

1) Define the purpose - Why? Why are you testing? What are you hoping to learn?

2) Select the test and vendor - Certain tests are available from certain vendors (like African American). There are new companies entering the market all the time. If you want to use different types of tests at different times, think about that before selecting a company. If you are using Y-chromosome testing, more markers mean greater cost, but also more meaningful results. Beyond the tests: Turnaround time - how long does it take to process?; Responsiveness - when emailing or calling, do you get a response? How quickly?; Reporting - besides the standard list of numbers, is there additional analysis online or via hard copy? Are there web site templates to use for reporting?; Database Access - Is there a match making service? How large is its database?; Management Tools - Does the administrator have to play middleman? Can participants order kits directly?; Cost & Payment Options - Check, credit card, order online, send to someone else at another address?; Sample Retention - how long does the company keep the samples? If upgrading the test, can the same sample be used?; Special Services - Can the administrator preorder a batch of kits for a reunion? Does the company have forensics for hair & envelopes? What makes this company special from the others?; Pushing the Envelope - are they leaders or followers? If you want to be an administrator or manager, join the Genealogy-DNA-L list and ask for advice.

3) Recruit Participants - Two skills: find the correct people and deal with them. That is chapter 9.

4) Interpret, Report, and Maintain - This mundane analysis produces the "Ahas!" that will be the life of the project. The administrator will decide how to share the results. That is Chapter 10.

NOTE from Cheri: The notes on the following chapters are not as long as other chapters. If you are interested in starting your own project, then you need to read this book. 

But here's a real brief rundown of what Katherine goes through for us.

Chapter 8: Finding Prospects (pp. 143-168)

When it comes to DNA studies, more is always better.

Two Approaches

You can find them or they can find you. Definition: Reverse Genealogy: the detective work involved in tracing lines from the past to the present. Broadcasting: Make it easy for participants to find you.

Reverse Genealogy vs. Broadcasting

Common surnames use more broadcasting. If you need a certain participant (like when they did Thos. Jefferson & Sally Hemings) you'd use reverse genealogy. 

Reverse Genealogy: Following the DNA Trail

You play detective and find unknown cousins in the process.

Y-DNA: Looking for Lucases

Example of how the males were traced forward so a Y sample of DNA could be obtained.

Inset: Candidate Tie-Breakers:

Someone who has the surname you are researching, someone with a lot of relatives, and someone who is interested and enthusiastic.

mtDNA: A Soldier's Tale

Identifying a soldier's remains from the Korean war.

Reverse Genealogy Guidelines

The tactics in this chapter are helpful for searching in the past 50 - 150 years.

Remember to Surround and Conquer

Include relatives, friends, etc. You work your way back to the target ancestor and then forward in time to the ancestor's descendants.

Remember the Women!

They can participate by proxy (get an uncle, male cousin, etc). Interesting tidbit: Most genealogists are women (63 - 72% depending on which survey you read). Project administrators: 33% female and the women are more likely to run multiple surname projects (Katherine does).

Choose Your Initial Target Wisely

Use the most recently born, the one with the most unusual surname, and a male. Ex: Children born between 1850 - 1870, use the 1870 one. 

When Necessary, Go Backward to Come Forward

Usually we do genealogy in a linear fashion...start with ourselves and move backwards. Reverse genealogy works forward. But zigzagging works too. Go back, come forward, and go back again. Their example: 1930 census - 1900 census - 1969 (SSDI) - 1930 census - 2003 online phone book. Sometimes the lines you are using die out (daughtering out if you are looking for Y-DNA and petering out if you are looking for mtDNA).

Follow the Trail of the Deceased to Find the Living

You can find someone as a kid in the 1930 census, but can't find them today. You need to work with someone who has died, because it's easier to get data on them. They suggest the SSDI, online state death indices, and obits. If you can't find one person, look for their parents, spouses, or siblings.

Best Resource for Reverse Genealogy

Most people start by checking a few web sites. The authors took 10 cases and figured out how much they used each source. They used censuses 80% of the time (then used the surround-and-conquer method) so they could find someone recent who may be alive. Collections of online family trees (such as Ancestry and FamilySearch) were used 70% of the time. They acknowledged there are errors, but there is enough info (or a submitter's email address) to lead you to someone. Online phone directories were used 60% of the time. The Social Security Death Index (SSDI) was used 50% of the time (they used it at more than one site: one allows wildcard searches, and FamilySearch allows surname variations). Also, they used online state vital records 40% of the time (Ancestry has 24 state indices; some states are starting their own; one site has a list; or use the FHLC by state, then county, then vital records); search engines (good for unusual names); and newspapers, cemeteries, and county-based web sites. They suggest several web sites for these.


Good for a broad project, such as anyone with the given surname.

Broadcasting Guidelines

This is more straightforward. You'll want to leave lots of traces to make yourself findable, such as using the message boards. You can join lists and advertise, but you might wish to contact the list administrator and follow the rules. Some lists are very particular about announcing the costs of tests. You can get names from surname and locality boards, make a web site (that way you can share all the info at once instead of lots of individual emails).

Broadcasting Resources

The authors polled DNA administrators to find out what they used and which was most effective. Most used a combination of techniques to find people for their projects.

NOTE: Again, the notes on this chapter are not as long as the others. This is more for starting your own project. Some of it is applicable to us (read the heading called Resistance).

Chapter 9: Contacting and Courting Participants (pp. 169-182)

People who are joining your project will have some questions. 

Excuse Me, May I Borrow Some DNA?

This will become easier as genetic genealogy goes more mainstream. 

It's All About Trust

This is the key. You will need patience. Explain what you know about their ancestors. Take your time.

The Initial Contact

Explain what you're trying to accomplish. Give out the web site, mail info, etc. They probably won't participate on 1st contact. You will get a lot of questions, hopefully.

Questions and Misconceptions

Have some basic articles. DNA 101 on the Blair surname is a good overview (for us too!). Two common misunderstandings: 1) Women can take the Y-DNA test themselves (No they can't, as they don't have a Y-chromosome). Women have to get a male relative to give his DNA (I have to get my dad as he's the one with the Portuguese Y-chromosome). 2) DNA testing will tell you the whole family tree, complete with names and dates attached (it only can put you in a time frame and tell you that you connect to someone within a certain number of years).


Resistance

There will be objections. Some people are suspicious and will think you are trying to scam them. Some people say they will test, but never order the test kit or never return it. Some people will say that you are trying to cut corners on the traditional research. Some people will worry that it will reveal medical info, or that it can go into a criminal database. (Remember, this is "Junk DNA" that doesn't show these characteristics). Most people worry about privacy. Most project managers have confidentiality policies in the consent forms. Some projects are completely private, some are password based, and some use codes or the name of the earliest known ancestor. A few people still worry about needles, but the testing company that we use (and most companies now) use the swab of the mouth. If you want to see a guy swab his mouth, go to . Some people will want an automatic match with someone else. As the database builds, that may happen, but when projects are young (as the Azores one is), it will take some time. (I (Cheri) must commend those people who are the pioneers of the Azores project that have already done their DNA and are willing to bide their time waiting for others to join so they may connect to new cousins). Some people don't like surprises. Imagine if you find a non-paternal event or that your haplogroup shows you're not European!

Money, Money, Money

Prices are coming down. It sounds expensive, but how much money have you spent on ordering film, making copies, gas for the car to get to the FHC, and places you have gone (especially if you have gone to the Azores)? This can tell you where to look first (Note from Cheri: I think that as the database builds over time, those who have no clue what island they are from and have exhausted their paper trail (or don't want to follow the paper trail) can take a DNA sample and will be able to find someone similar so they will know where to look for their ancestors' baptismal record).

Who Pays?

Some people pay for it themselves, some split the cost with a relative, some people pay for their relative (that's my dad..he'll say if I want to know, then I can pay for it!), some (usually surname) studies have family associations or funds to pay for DNA tests, etc.

A Little Help From My Friends

For more ideas, join the Genealogy-DNA-L at Rootsweb or look at

Chapter 10: Interpreting and Sharing Results (pp. 183-211)

After a person DNA tests, they will want to know if they match to anyone else. They can look in the DNA project itself, testing company databases, public access databases, or published technical literature.

Mitochondrial DNA Analysis

Some companies notify participants if they have matches with another.

mtDNA Databases

There's the mtDNA Test Results Log Book and the Mitochondrial DNA Concordance. You would use the 2nd one by searching with the number, e.g. 16293. On mtDNA you want exact matches, as mtDNA changes very slowly.


This is the technical literature at the National Library of Medicine, where you can search for your haplogroup or background. Although it is technical, the discussion section explains it more simply.

Y Chromosome Analysis

Your string of numbers: 13-24-16-10-14-15-11, etc. is your haplotype. You'll compare results with a cousin (or whoever) and then maybe with everyone else in the surname study (we'll be comparing within the island group). Some testing companies will tell you if you match with someone else who is not in the project (if some lost soul joined the Marshall surname project and didn't know that Marshall was an Anglicized version of Machado, you will be notified of that person). Of course you and others will have to sign the release (when you first send in the test) so you can get each other's e-mail addresses.

Y-DNA Databases

* The Y-chromosome Haplotype Reference Database was developed for forensics, but they have stuff about frequency of haplotypes. This originated in Germany, so there's lots of European countries in it.

* Another database, sponsored by DNA Heritage, was the first one made for genealogists. They have statistical summaries about the frequency of the alleles on the markers.

* Another was sponsored by FamilyTree DNA but will accept test results from any company. Anyone can add their own results. You can search by name and haplogroup. It also calculates the genetic distance for you.

* The database from the Sorenson Molecular Genealogy Foundation was built out of the 12,000 + (as of 2003) male participants. The names, dates, and places of ancestors prior to 1900 are given. All you have to do is enter your DNA markers.

Beyond Databases

Google, reading the PubMed, some testing companies will give you more info, etc.

Genetic Distance

When you get a close but not perfect match you will need to calculate the genetic distance. If you have a genetic distance of 0, then you have a perfect match. Then there are the mutations and an explanation here (good thing we have Katherine to figure this stuff out!)
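
As a rough illustration of the idea, here is one common way to count genetic distance: add up the differences in repeat counts, marker by marker. This is an assumption on my part, not necessarily the exact method the book or any testing company uses (some markers are counted differently in practice). The function names and the second haplotype are invented.

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Sum of the repeat-count differences, marker by marker."""
    if len(haplotype_a) != len(haplotype_b):
        raise ValueError("Compare the same markers in the same order")
    return sum(abs(a - b) for a, b in zip(haplotype_a, haplotype_b))


def parse_haplotype(text):
    """Turn a string like '13-24-16' into a list of integers."""
    return [int(value) for value in text.split("-")]


mine = parse_haplotype("13-24-16-10-14-15-11")
cousin = parse_haplotype("13-24-15-10-14-15-12")  # invented for illustration

print(genetic_distance(mine, mine))    # 0 means a perfect match
print(genetic_distance(mine, cousin))  # one step on each of two markers
```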

Most Recent Common Ancestor (MRCA)

If there are 1 or 2 mutations, you can make a guess within a certain range. This way you can decide if you are going to chase this paper trail or that one. 

MRCA Calculation Factors

They are the mutation rate (overall average is 0.2% or 1/500), the number of markers (the more the better, as you can have more confidence in the calculations, although if you have a rare haplotype you could get away with fewer), and the number of generations (depending on who you read, a generation is 20 or 25 years, depending on culture; 35 years has also been mentioned), along with some prior knowledge.

MRCA Calculation

You enter the mutation rate and number of markers. Adding more markers reduces the number of generations to the Most Recent Common Ancestor (MRCA). The authors made this calculator and it is at:
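
To give a feel for why more markers pull the estimate closer, here is a simplified sketch for the case of an exact match. It is not the authors' calculator. It assumes every marker mutates at the same 0.2% rate, that both lines have been collecting mutations since the common ancestor (hence the factor of 2), and that you have no prior knowledge of how far back the ancestor lived. The function name and the marker counts tried are my own choices.

```python
import math


def generations_to_mrca(markers, mutation_rate=0.002, confidence=0.5):
    """Generations within which the common ancestor falls, for an exact match."""
    # Chance that none of the markers changes on either line in one generation
    no_change_per_generation = (1 - mutation_rate) ** (2 * markers)
    return math.log(1 - confidence) / math.log(no_change_per_generation)


for markers in (12, 25, 37):
    g50 = generations_to_mrca(markers)
    g90 = generations_to_mrca(markers, confidence=0.9)
    print(f"{markers} markers: 50% within {g50:.1f} generations, 90% within {g90:.1f}")
```

Under these assumptions, a perfect match on more markers puts the common ancestor within fewer generations, and multiplying by 20 to 35 years per generation turns that into a rough time frame.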

Ancestral Haplotype: Bridging the Mutation Gap

If 2 descendants have identical haplotypes, then the ancestral haplotype will be easy to figure out. As mutations occur and more people join the project, you will be able to establish the various branches.
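
One simple way to guess at an ancestral haplotype, once several descendants have tested, is to take the most common value at each marker (often called the modal haplotype). This is a sketch of that idea under my own assumptions, not the book's procedure, and the results below are invented.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common value at each marker across all participants."""
    modal = []
    for values in zip(*haplotypes):
        most_common_value, _count = Counter(values).most_common(1)[0]
        modal.append(most_common_value)
    return modal


# Invented results for four participants on seven markers
results = [
    [13, 24, 16, 10, 14, 15, 11],
    [13, 24, 16, 10, 14, 15, 11],
    [13, 24, 15, 10, 14, 15, 11],  # one mutation on the third marker
    [13, 24, 16, 10, 14, 15, 12],  # one mutation on the last marker
]

ancestral_guess = modal_haplotype(results)
print(ancestral_guess)

# Participants who differ from the guess hint at separate branches
for person, haplotype in enumerate(results, start=1):
    differences = [i + 1 for i, (a, b) in enumerate(zip(haplotype, ancestral_guess)) if a != b]
    print(f"Participant {person}: differs at marker positions {differences}")
```

With only a few participants the guess can be wrong, especially when two values tie at a marker, so treat it as a starting point.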

More Tools, More Toys!

This is pretty much stuff that Katherine can do to make the web page fancy or whatnot.

Sharing Results

Via public access databases or Web sites, or emailing the participants. More on each below.


You can do private emails until the project gets to be too big. Then you can post via message boards or mailing lists, or newsletter. 

Adding to Databases

Again, these are the databases mentioned earlier, including the mtDNA Test Results Log Book.

Web Sites

This probably has the widest audience. They give lots of outstanding web sites if you want to build your own DNA web site. Most will have the participants clustered by haplotype with mutations somehow highlighted. Many also have the pedigree info of the participants as well.

==================END OF PART 3============================

Part 4: The Future

Chapter 11: What's Next? (pp. 215-235)

The authors state that they are biased here, as they are pro-genetic genealogy. They say that it started at a crawl, has gathered momentum and is about to spread like wildfire.

Growing Popularity of Genetic Genealogy

Dick Eastman predicts that genetic genealogy will become commonplace. There has been an initial reticence, but as success stories start to come in, then more and more people will join projects. They will understand that genetic genealogy is not the same as the DNA testing for criminal or medical purposes. Privacy is maintained as usually the names of ancestors who lived more than 100 years ago are posted. DNA will prove to be a valuable tool to support or strengthen evidence and will be used to steer researchers in the right direction.

Inset: Preserving Samples

Testing prices are coming down, but for those who consider them too steep or can't test as many people as they like, an option exists. Kits for as low as $25 can be obtained now to collect the samples and mail them in later for the full DNA profile.

Expanding Scope of Projects

Many start with Y-chromosome testing, then mtDNA and venture out. Some have a genetic pedigree. Within a few years, DNA testing will become routine.

Faster and Cheaper

The process will be quicker and prices will be lower (greater demand equals more competition equals lower prices). The prices will eventually plateau, but the cost per marker will come down and features will increase.

Inset: Easier Recording and Reporting

Genealogy software programs will have DNA features in them. For now, if you want to know which of your distant cousins is on your Y or mtDNA lines you can use this program:

What DNA Project Managers Want

Established mutation rates, explain the relationship to other kin, etc. Each is explained in the headings below.

More and Better Markers

The initial testing, which began with 10-12 markers, will probably be phased out and tests will have no fewer than 20 markers (fewer markers give too many wrong conclusions). Other markers may start to be used and scientists may be able to get more out of the current markers. Right now, the Most Recent Common Ancestor (MRCA) is a calculated guess, and in the future it may be possible to pinpoint an exact relationship. There will be more research into marker mutation rates.

What About mtDNA and SNPs?

mtDNA testing will begin to cover the entire circular strand (it has begun now on FamilyTree DNA), but this will show some medically informative regions. The SNP (Single Nucleotide Polymorphism) will make it easier to determine the haplogroup and, as they get more samples, the subgroup (which area of Europe) you are from.

Enter Autosomal Testing

There will be more markers (300!), which may make it possible to test ancestry across all branches. This is the Molecular Genealogy Research Project (MGRP) by the Sorenson Molecular Genealogy Foundation. They hope to get enough data from 100,000 people to be able to tell how closely or distantly 2 people are related to each other.

Inset: DNA Adoption Banking

Adoptees and birth parents participate by doing a cheek swab and signing a release.

Age of the Database

The need to compare with others has evolved. (Most of these were mentioned in previous chapters.) As these databases grow, they offer more features such as searching by haplogroup, a genetic distance report, and adding the name and date of the farthest back known ancestor. These databases are free and will probably remain that way.

Inset: Scientists and Genealogists Working Together

If ever asked to support a scientific study by offering your genealogical research, please do it. There is always SMGF's (above). (you give them a mouthwash sample). They are also seeking family clusters to participate, so get a few kits and have many people test. 

THIS IS THE END OF THE BOOK!! I hope your brain has gone from a really dirty glass to one with some water spots on it (or that's what my brain feels like now). This wasn't a hard read and it's broken up into several small sections within each chapter. If your library doesn't have this book, try or your favorite bookstore at the mall.

© Kathy Andrade Cardoza 2019