Yes, but what EXACTLY is a centiMorgan?
19 September 2021
Ah, the humble centiMorgan. The pesky cM.
That semi-upper-case abbreviation (or should I say aberration…) that causes much head-scratching among DNA test takers (and genealogy students!).
I don’t know about you, but when I first took an autosomal DNA test and began using genetic genealogy in my research, one of the things I couldn’t quite get my head around was what EXACTLY a centiMorgan was. I was perfectly happy interpreting shared matches using cM size and availing myself of the various shared cM charts and utilities, but beyond that, it was all a bit murky.
As one of those ex-teacher types, who still has to understand everything to the point of being able to explain it to someone else, it really niggled me that I couldn’t quite grasp what it was. I feared being asked the question ‘so what exactly IS a centiMorgan?’ when talking to people about DNA tests, because I knew I couldn’t give an answer that really satisfied me. My DNA bookshelf has books explaining its use in genealogy but not its actual meaning, but bizarrely, my science books on pure genetics don’t mention centiMorgans at all.
Even now, if you google what is a centiMorgan, you tend to just find one of two main responses:
• A quick definition describing it as a unit of ‘genetic measurement’, then a long explanation of the various expected shared cM amounts when using DNA in genealogy; the higher the number, the closer the match etc
• The story of Alfred Sturtevant and his brainwave in Thomas Hunt Morgan’s ‘Fly Room’, resulting in the first genetic map. This would often be followed by a somewhat baffling explanation (to the layman) about the closeness of genes on a chromosome.
Neither of these really gave me the answer I was searching for. What do the two things have in common? I would read all of these and still think, yes but what exactly is it?
Granted, as I take out my wooden ‘Kings and Queens’ ruler from time to time and measure across a cake tin, I don’t stand there existentially pondering what exactly a centimetre is because I know that somewhere, the definition will have the word ‘metric’ and it will likely give me an alternative physical distance in terms of metres or inches.
But look up the definition of centiMorgan, and the uninitiated gets a deeply perplexing and frankly unsatisfactory explanation such as:
A unit of measure of genetic recombination frequency. One cM is equal to a 1% chance that a marker at one genetic locus will be separated from a marker at another locus due to crossing over in a single generation.
So what exactly IS a centiMorgan? Answer: It’s complicated.
The bottom line is, you really don’t have to have a geneticist’s understanding of a centiMorgan to be able to successfully use DNA test results in your research. You certainly need to understand something about the nature of inheritance and be confident in using shared centiMorgan totals, with an awareness of segment sizes, but you really don’t need to get hung up on it as a concept!
But if you, like me, like to tick all the boxes when learning something new, let’s take a walk from the beginning.
Despite the fact that we humans come in all shapes and sizes, our DNA is actually at least 99% identical. In terms of the raw DNA sequence or ‘code’, we are 98% genetically similar to a chimpanzee, 90% to a cat, 82% to a dog and we even share some genes with a banana!
But it’s actually only the tiny less-than-one-percent by which we differ from other human beings that we examine when we take an autosomal DNA test. ‘Shared’ DNA with a genetic relative is within this tiny percentage. So when we say we ‘share’ 50% of our DNA with a parent, it means that – within this small percentage of differing DNA – we share 50% of the same gene variants (or alleles) at specific points. So whilst we have exactly the same genes (in the same positions, on the same chromosomes, with the exception of genetic disorders) as any other human being, it’s the variant part of the genome we compare with other genetic relatives.
Therefore whilst you might have read that we ‘share’ 50% DNA with both a parent and a banana, it is really talking about different things! The confusion really lies in our use of the word ‘share’.
OK. So we know we have 50% of the same gene variants as each of our parents. This is because half of our chromosomes came from our father and half from our mother.
Genetically healthy individuals have 23 pairs of chromosomes (22 pairs of autosomes and 1 pair of sex chromosomes XX or XY). We number the autosome pairs from 1 – 22, and we each receive a paternal Chromosome 1 and a maternal Chromosome 1, a paternal Ch2 and maternal Ch2 and so on.
The way we ‘receive’ these is through the fusing of the sperm cell and the egg cell during reproduction. The sex cells (haploid) contain only 1 copy of each chromosome, so that, when they fuse, the fertilized egg will have the correct number of copies – 23 pairs.
Nature’s gift of variation to offspring and of fun and enjoyment to genetic genealogists, is expressed in this single set of chromosomes in a cell!
In order that offspring are not genetic clones of each other and the parents (and so are susceptible to being wiped out by the same disease), two processes occur during meiosis – crossing over (aka recombination) and independent assortment.
Picture the germ cell (a sex cell producing sperm or eggs). It’s a typical cell, containing two sets of 23 chromosomes floating about quite happily. When the cell is about to divide to form haploid daughter cells (meiosis), the chromosomes find their ‘partners’ – Chromosome 1 received from the person’s father finds Chromosome 1 received from the person’s mother and so on. How amazing is that! These are known as homologous pairs (homo- same, logous- location, value or structure). These pairs then couple together in a process known as ‘rough pairing’, where they might exchange or ‘cross over’ chunks of genetic material. They then line up in the middle of the cell in their pairs, before splitting up and moving to either side ready to divide into two new cells.
I like to imagine this a bit like a giant dance hall. The dancers hold onto each other and dance a few bars of the song close together, then they move into the middle of the room, some of them swing each other round to swap places, then they separate to either side.
Some of you will be old enough to remember the heady days of when Great Britain didn’t score ‘nul points’ in the Eurovision Song Contest, but actually blooming well won it. Back in 1981, Bucks Fizz were the victors with a song called Making Your Mind Up. Gasps of delight reverberated around the continent when the boys whipped off the girls’ skirts midway through the song (to reveal a little shorts number…the Eurovision wasn’t THAT kind of show). Well, imagine these dancers took it a step further. Mike Nolan didn’t just whip off Cheryl Baker’s skirt, he put it on himself and gave her his trousers!
So it is with chromosomes! What you are eventually left with (after this process has occurred twice) is a haploid sex cell with a single unique version of chromosomes 1 – 22 made up of different combinations of the homologous chromosomes inherited from both parents, plus an X (which may also have recombined with the other X) or a Y (which will not recombine with the X). The typical ‘daughter’ (sperm or egg) cell will always receive exactly 23 chromosomes, but owing to recombination and independent assortment, the proportions of paternal and maternal genetic material in each one will differ.
This is why the only guaranteed percentage in inheritance is that we inherit exactly 50% from each parent. A genetically healthy individual will receive a full copy of each of the 23 chromosomes from each parent.
BUT, that 50% from our father is made up of a combination of the chromosomes he inherited from HIS parents. Thanks to recombination and independent assortment, the 50% we receive might not be exactly 25% from our paternal grandmother and 25% from our paternal grandfather, but 22% and 28% or 15% and 35%, for example.
OK, so what the heck does this have to do with centiMorgans?
Each time a chromosome pair like Mike and Cheryl exchange an item of clothing, it is called a recombination event. Each of these recombination events will create ‘segments’ of DNA. The beginning and end points of these segments (the loci) mark the locations where a crossover occurred. So the Chromosome 1 (Ch1) you inherit from your father might be made up of alternating segments from the two ‘Chromosome 1’s he inherited from his parents. In other words, your paternal Chromosome 1 will be made up of blocks of DNA passed down from both your paternal grandfather and paternal grandmother. As you, in turn, pass these blocks down to your children, and they pass them down to their children, they will diminish in size, as you only pass on 50% of your genes to the next generation. So your grandchildren will only share very small blocks of DNA with your paternal grandparents as there are four generations – or meiotic events – between them.
As you can see in the image below of a single meiotic event, the father’s and the mother’s two copies of Chromosome 1 have been inherited from their parents.
The father received the brown Ch1 from his father (which he in turn had received from HIS parents), and the blue Ch1 from his mother (which she in turn had received from HER parents; you get the picture?). The mother received the green Ch1 from her father and the pink from her mother.
Thanks to recombination, the daughter’s two copies of her Chromosome 1 are a combination of her father’s parents and her mother’s parents (her grandparents).
The first part of her paternal Ch1 in brown was inherited from her paternal grandfather, but then there was a recombination event and you can see the crossover point where her grandfather’s DNA data stops, and the paternal grandmother’s DNA continues to the end. In her maternal Ch1, there were 2 recombination events: the first segment was inherited from her maternal grandfather, then there is a crossover to her maternal grandmother’s DNA, then another crossover back to her grandfather’s DNA.
You can see, then, that it is not exactly 25% from each grandparent.
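To put rough numbers on a picture like this, here is a minimal Python sketch. The crossover positions (40% of the way along the paternal copy, and 30% and 70% along the maternal copy) are invented purely for illustration and are not read off the diagram, and the function name grandparent_shares is hypothetical too.

```python
def grandparent_shares(crossovers, first="grandfather"):
    """Return the fraction of one chromosome copy that came from each grandparent.

    crossovers: sorted crossover positions, as fractions (0 to 1) of the chromosome.
    first: which grandparent's DNA the chromosome starts with.
    """
    other = "grandmother" if first == "grandfather" else "grandfather"
    shares = {first: 0.0, other: 0.0}
    edges = [0.0] + list(crossovers) + [1.0]
    current = first
    # Walk along the chromosome, switching grandparent at every crossover.
    for start, end in zip(edges, edges[1:]):
        shares[current] += end - start
        current = other if current == first else first
    return shares

# Hypothetical crossover positions, chosen only to illustrate the idea.
paternal_copy = grandparent_shares([0.4])        # one recombination event
maternal_copy = grandparent_shares([0.3, 0.7])   # two recombination events
print(paternal_copy)
print(maternal_copy)
```

With these made-up positions, the paternal copy works out at roughly 40% grandfather and 60% grandmother, and the maternal copy at roughly 60% grandfather and 40% grandmother, so neither copy splits neatly down the middle.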
To understand the connection of such recombination events to centiMorgans, we do need to turn briefly to history. Thomas Hunt Morgan (1866-1945), an embryologist and early geneticist, spent 17 years breeding and studying inheritance patterns in fruit flies, Drosophila melanogaster. During this time, he and his team were able to prove the theory of chromosomal inheritance and confirm the presence and linkage of genes on chromosomes.
In doing so, they also realised that the appearance of unexpected combinations of traits in fly offspring not seen in either of their parents, must mean that certain genes normally inherited together are somehow ‘crossing over’ on the chromosome; that somewhere in between two genes coding for two traits, a cut was made on the chromosome, and genetic material switched sides, resulting in ‘recombinant’ offspring.
It was one of his students, the young Alfred Henry Sturtevant, who worked out (in a flash of brilliance one night) that genes must also be linked in a series. He also figured out that the frequency of unusual or recombinant offspring in a generation was likely to be relative to the distance apart of two genes on a chromosome. The further apart the two genes, the more recombinant offspring were produced. The closer together, the fewer recombinants, as this must mean there was less space for a cut, and therefore a crossing over, to occur. Put differently, the higher the percentage of recombination, the further apart the two genes on the chromosome, the longer the section of DNA.
Thomas Hunt Morgan’s illustration of crossing over (1916)
At that time, no-one actually knew how long a chromosome was, so Sturtevant determined that each 1% in the frequency of crossovers occurring between two genes was the equivalent of one map unit of distance. In 1913, Sturtevant went on to create the first ever ‘map’ depicting the rough position of genes, relative to one another on a chromosome, and the units were later named, by Haldane, in honour of Thomas Hunt Morgan. One Morgan has a crossover value of 100% and so one centiMorgan has a crossover value of 1%. The centiMorgan (a 100th of a Morgan) was born.
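Sturtevant’s idea boils down to a very small piece of arithmetic, sketched here in Python. The fly counts are invented for illustration and the function name map_distance_cm is hypothetical; it simply reads the percentage of recombinant offspring as map units.

```python
def map_distance_cm(recombinant_offspring, total_offspring):
    """Genetic distance in centiMorgans from a simple two-gene cross.

    One map unit (1 cM) corresponds to 1% recombinant offspring.
    Reading the percentage directly like this is only a reasonable
    approximation for genes that sit fairly close together.
    """
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100 * recombinant_offspring / total_offspring

# Invented example: 17 recombinant flies among 1,000 offspring.
print(map_distance_cm(17, 1000))
```

In this made-up cross, 17 recombinants in 1,000 offspring is a recombination frequency of 1.7%, so the two genes would be placed about 1.7 map units apart.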
The work of scientists such as Chargaff, Watson, Crick and Franklin led to the eventual discovery of the double helix structure and its ‘rungs’ or base pairs. This, and the subsequent developments in genome sequencing, means that we can now physically map chromosomes (by looking at the base pairs: A-T, C-G) as well as genetically map them (based on linkage and recombination).
Both of these techniques are still used in genome mapping today. Whereas the very earliest genetic maps used genes as ‘landmarks’, modern maps use much more specific markers, such as areas on the chromosome which typically show variability in a population, such as an SNP or ‘snip’ (single nucleotide polymorphism).
There are three main ways that DNA can now be measured: by counting the number of base pairs in a stretch of DNA, by counting the SNPs in a stretch of DNA or by calculating the likelihood of recombination along a stretch of DNA in centiMorgans.
Whilst all these techniques have their specific functions, genetic mapping using centiMorgans is much more useful for us in genetic genealogy as we are mainly interested in how chunks or segments of DNA have been passed down from our ancestors, and where crossovers may have occurred.
Weird though they are, centiMorgans give us the best indication of how ‘related’ two people with shared DNA are. The longer the stretch of linked genes (or segments) we share with someone, the closer we are likely to be related, because fewer generations are likely to separate us from the shared ancestor. My 1st cousin and I will both share large segments of DNA inherited from our common set of grandparents, but my 4th cousin and I may only share very small segments of DNA, passed down from our common set of 3X great grandparents, if any at all.
Another reason for using centiMorgans is to do with the ‘terrain’ of a chromosome. CentiMorgans do not neatly convert to base pairs or ‘megabases’ (a million base pairs), like you might, for example, convert centimetres to inches.
It is generally suggested that 1 cM is roughly the same as around 1 million base pairs or 1 Mb, but the physical length of a centiMorgan can vary depending not only on which chromosome it is on, but also its position on a chromosome!
The ends of chromosomes are much more likely to recombine than areas nearer to the centre (the centromere), so the rate of recombination at the ends will be much higher. This means that 1 cM at the end of one chromosome might only be 50,000 base pairs long, but further up near the centre, it might have to be 5 million base pairs long in order to carry the same chance of recombination!
What does that mean in practical terms?
Let’s say you and Bob share a segment of DNA on a chromosome. This matching segment is 25 million base pairs long. Whilst that describes its physical length, it does not take into account whether or not it lies in an area of chromosome likely to recombine. If it lies in a quieter area near the centromere, it’s more likely to be fairly static and it may have remained the same for many generations (so won’t differentiate between close and distant matches). If it lies in a busier area near the end, it is much more prone to recombination so its value as a comparison tool is much greater. The distance in base pairs won’t tell us this, but if we measure it in centiMorgans, it becomes more relevant: 25 Mb near the centre might only be 5 cM, but 25 Mb at the end could be 40 cM. Going back to the point above, then, one centiMorgan near the centre would have to be physically much longer in base pairs to carry the same ‘worth’ as one centiMorgan at the end.
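The Bob example can be sketched as a small calculation in Python. The two local rates below are simply back-calculated from the figures in the example (5 cM and 40 cM over 25 Mb); they are illustrative assumptions, not values from a real genetic map, and the sketch assumes the rate is constant across the whole segment, which real chromosomes do not guarantee.

```python
def segment_cm(length_mb, cm_per_mb):
    """Genetic length (cM) of a segment, given its physical length and a local rate."""
    return length_mb * cm_per_mb

# Illustrative rates, back-calculated from the example in the text.
QUIET_RATE = 5 / 25    # cM per Mb in a 'quiet' region near the centromere
BUSY_RATE = 40 / 25    # cM per Mb in a 'busy' region near the chromosome end

SEGMENT_MB = 25  # the same physical length in both cases

print(segment_cm(SEGMENT_MB, QUIET_RATE))
print(segment_cm(SEGMENT_MB, BUSY_RATE))
```

The physical length going in is identical both times; only the assumed local recombination rate changes, and that alone moves the answer from about 5 cM to about 40 cM.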
So it’s not helpful to think of centiMorgans in terms of physical size.
There are some good analogies out there though. The one I particularly like compares centiMorgans and base pairs to drive time. Say you wish to drive to pick your friend up who is 5 miles away. Here in the open Lincolnshire countryside with an average speed of 50 – 60 mph, it might take between 5 and 10 minutes. If the 5 mile stretch was from central Ealing to Heathrow in London and I set off at rush hour, it might take me well over an hour!
Similarly a nice leisurely 5 mile bike ride through Lincolnshire’s flat smooth country lanes is likely to take considerably less time than the same distance over rugged paths in the Pyrenees. In all of these cases, the distance is exactly the same, but the time taken to cover that distance has to take into consideration the terrain, the traffic, the roadworks, the traffic lights, the snow, the wandering sheep etc.
What is important to me is knowing what time to set off to meet my friend in order to meet at the specified hour. So in this case, the time the journey will take is ultimately more important than the physical distance.
This is the same with centiMorgans and base pairs. A shared 25 Mb in the wide open ‘Lincolnshire’ of the chromosome is very different from a shared 25 Mb at the bustling, jam-packed ‘West London’ near the end.
What I really need to know is how many centiMorgans the shared segment is, as this will give me a better idea of how close the relationship is, just as an estimation of the time a journey may take based on the surrounding environment will give me a better indication of when to set off than just knowing the distance. So the centiMorgan score has more value to me than the physical distance.
CentiMorgans themselves are an empirical measure (that is, based on observation or experience, from the Latin empiricus (n.), “a physician guided by experience”). To get this data, genetic researchers compared DNA from thousands of parents and children and counted the average number of times a section of DNA was likely to recombine in one generation (one meiosis from parent to child). By analyzing the findings, they worked out that a single germ (sex) cell has on average 36.4 recombinations. With 22 (+1) chromosomes, that averages out at more than one exchange of material per chromosome.
This average number can be said to be the ‘Morgan length’ of the chromosomes (as a ‘Morgan’ is a 100% chance of a crossover in a generation and there is an average of 36.4 crossovers). Multiply this number by 100, and you have the centiMorgan length of the chromosomes, so 3640 cM. Double this, as we have two copies of the chromosomes, and we each have an approximate average of 7280 cM in our genome. (Note, this figure varies slightly between DNA testing companies.)
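For anyone who likes to see the sums written out, here is the same arithmetic as a short Python sketch. The 36.4 figure is the average quoted above; the function and constant names are just illustrative.

```python
AVERAGE_CROSSOVERS_PER_MEIOSIS = 36.4  # average figure quoted in the text

def genome_length_cm(crossovers_per_meiosis, copies=2):
    """Total genetic length in cM: Morgans x 100, times the number of chromosome copies."""
    return crossovers_per_meiosis * 100 * copies

one_set = genome_length_cm(AVERAGE_CROSSOVERS_PER_MEIOSIS, copies=1)
both_sets = genome_length_cm(AVERAGE_CROSSOVERS_PER_MEIOSIS)

# Rounded to whole centiMorgans for display.
print(round(one_set), round(both_sets))  # prints: 3640 7280
```

Swapping in a different average (a testing company’s own figure, say) changes both totals in step, which is why the headline number differs a little from company to company.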
In actual fact, the female recombination rate tends to be much higher than the male rate, and the rate can also vary between populations, but in general, we take the average to be around 36.4. Having to take into account males and females when working with shared centiMorgan amounts between genetic matches would add a whole new level of complexity, but still it’s worth bearing in mind.
The DNA test companies sequence this approximate average of 7280 cM of your DNA and look for any sections which exactly match others in the database. The longer the segments and the greater the number of segments you share with someone, the closer the match. A 3640 cM match shares half your genome and is therefore going to be your parent or child.
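Turning a shared cM total into a rough percentage is a one-line calculation, sketched below in Python. It assumes the approximate 7280 cM total used in this post (your testing company’s total may differ), and the function name percent_shared is hypothetical.

```python
TOTAL_GENOME_CM = 7280  # approximate figure used in this post; varies by company

def percent_shared(shared_cm, total_cm=TOTAL_GENOME_CM):
    """Express a shared cM total as a percentage of the whole genome in cM."""
    return 100 * shared_cm / total_cm

# A parent or child match of 3640 cM, as in the text.
print(percent_shared(3640))  # prints: 50.0
```

This is only the raw percentage; it is not a relationship prediction, which is where tools built on observed shared cM ranges come in.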
We can say then that the centiMorgan is technically and originally a unit of probability of recombination, but it is also used to imply segment size.
Thankfully, DNA testing companies take all of the above into account when comparing our test results with others in the database! They look at segment sizes, locations and SNPs in a segment. If a genetic match to us exceeds the set company thresholds, then it is deemed a relevant match and a relationship estimate is provided. The hardest work is done for us!
Tools like the Shared CentiMorgan Project are invaluable in helping you interpret your genetic matches in terms of the centiMorgans shared once you get your DNA test results back. If you want to take your research up a level, then chromosome mapping using a tool such as DNA Painter, will help you visualise which segments of DNA you received from which ancestor and the various crossover points along the chromosomes.
There are many super-useful videos, magazine articles and blogs out there explaining how best to use centiMorgan values in comparing your matches and how to start out with chromosome mapping.
For those of you still thoroughly discombobulated by the abstract concept of the centiMorgan, don’t worry! You are not alone. My advice is just keep on working with them and think of them as units of relative genetic distance for now. It will come! Thankfully, the only other place you might come across such a weird unit of probability is in quantum theory and we don’t need to use that (yet!) in genealogy.
Let’s end by looking again at one of those seemingly complicated definitions. This one is taken from the ISOGG website:
A centiMorgan (cM) or map unit (m.u.) is a unit of recombinant frequency which is used to measure genetic distance. [not strictly a measure of physical length but we use it to denote distance]
It is often used to imply distance along a chromosome, and takes into account how often recombination occurs in a region. A region with few cMs undergoes relatively less recombination. [a relative length that considers how many crossovers are likely to have occurred in a chunk of DNA. A ‘quieter’ area has fewer crossovers]
The number of base pairs to which it corresponds varies widely across the genome (different regions of a chromosome have different propensities towards crossover). [a busier area is like crossing Paris at rush hour compared with encountering a lone goat when crossing the deserted Russian Steppes]
One centiMorgan corresponds to about 1 million base pairs in humans on average. [but only on average]
The centiMorgan is equal to a 1% chance that a marker at one genetic locus on a chromosome will be separated from a marker at a second locus due to crossing over in a single generation. [there’s a one in a hundred chance that a split will occur between two places on the chromosome when the parent creates a sex cell]
I hope it is all a little clearer! Perhaps the next time you view your DNA match list, those little cMs will no longer fill you with fear and puzzlement.
If you start to panic, just think about those dancing chromosomes!
And if that isn’t surreal enough, here’s a strangely hypnotic video I found by Lighthouse Rock of a chromosome rave….
Fancy a dance, anyone?
Bettinger, Blaine T. (2019) The Family Tree guide to DNA testing and genetic genealogy. 2nd ed. Cincinnati, Ohio: Family Tree Books.
Holton, Graham S. ed. (2019) Tracing your ancestors using DNA: a guide for family historians. Barnsley: Pen & Sword Family History.
Skwarecki, Beth (2018) Genetics 101. Avon, Massachusetts: Adams Media.
Crow, James F. (2004) ‘Haldane’s ideas in biology with special reference to disease and evolution’. In: Dronamraju, Krishna R. ed. Infectious disease and host-pathogen evolution. Cambridge: Cambridge University Press. pp. 14-15.
International Society of Genetic Genealogy. (2020) ‘centiMorgan.’ In: International Society of Genetic Genealogy Wiki. https://isogg.org/wiki/CentiMorgan : accessed 18 September 2021.
Oxford University Press. (2007) ‘Morgan unit.’ In: A dictionary of genetics. Oxford: Oxford University Press. https://www.oxfordreference.com/view/10.1093/oi/authority.20110803100209740 : accessed 18 September 2021.
Lewis, Ricki. (2018) A Common Ancestry Metric Is Based On a Century-Old Discovery by a 19-Year-Old: CentiMorgans Explained. DNA science [blog] 29 November. https://dnascience.plos.org/2018/11/29/a-common-ancestry-metric-is-based-on-a-century-old-discovery-by-a-19-year-old-centimorgans-explained/ : accessed 18 September 2021.
Pearl, Jonny. DNA Painter. https://dnapainter.com/ : accessed 18 September 2021.
MedicineNet. Medical Definition of Centimorgan (cM). https://www.medicinenet.com/centimorgan_cm/definition.htm : accessed 18 September 2021.
Snir, Ran. (2019) What exactly is a centimorgan? Legacy Family Tree webinars [webinar] 07 September. https://familytreewebinars.com/webinar/what-exactly-is-a-centimorgan/ : accessed 18 September 2021.
Williams, Amy. (2020) What is a centiMorgan? Hapi-DNA. [blog] 30 September 2020. https://hapi-dna.org/2020/09/what-is-a-centimorgan/ : accessed 18 September 2021.
Harper, Douglas. (2021) ‘empirical.’ In: Online Etymology Dictionary. https://www.etymonline.com/word/empirical : accessed 18 September 2021.
J. Craig Venter Institute. Genetics and genomics timeline 1910. http://www.genomenewsnetwork.org/resources/timeline/1910_Morgan.php : accessed 18 September 2021.
Scitable. Thomas Hunt Morgan: the fruit fly scientist. https://www.nature.com/scitable/topicpage/thomas-hunt-morgan-the-fruit-fly-scientist-6579789/ : accessed 18 September 2021.
Cold Spring Harbor Laboratory. Alfred Henry Sturtevant (1891-1970). https://dnalc.cshl.edu/view/16297-Biography-11-Alfred-Henry-Sturtevant-1891-1970-.html : accessed 18 September 2021.
Khan Academy. Discovery of the structure of DNA. https://www.khanacademy.org/science/high-school-biology/hs-molecular-genetics/hs-discovery-and-structure-of-dna/a/discovery-of-the-structure-of-dna : accessed 18 September 2021.
Your Genome. How do you map a genome? https://www.yourgenome.org/facts/how-do-you-map-a-genome : accessed 18 September 2021.
Williams, David O., Hart, Graham, Graff, Jamison, Studley, Kevin, Bartlett, Jim, Nisbett, Adam. (2021) Re: What is the definition of centiMorgan in simple terms?. Facebook group discussion. Genetic genealogy tips & techniques, 04 February. https://www.facebook.com/groups/geneticgenealogytipsandtechniques/posts/1085802015216831/ : accessed 18 September 2021.
Haines, Jonathan L. Mapping The Comparison Of Genetic And Physical Distance. https://medicine.jrank.org/pages/2487/Mapping-Comparison-Genetic-Physical-Distance.html : accessed 18 September 2021.
Penn State University. Chromosome behaviour and gene linkage. https://wikispaces.psu.edu/display/Bio110Leap/Chromosome+Behavior+and+Gene+Linkage : accessed 18 September 2021.
Images: Vector. Business avatars. https://www.freevector.com/business-avatars-with-smiling-faces-31250# : accessed 18 September 2021.
Images: Vector. DNA. https://www.vecteezy.com/free-vector/biology : accessed 18 September 2021.
Images: Diagram. Chromosomes crossing over. 1916. Thomas Hunt Morgan. A critique of the theory of evolution. Princeton: Princeton University Press. p. 132. https://commons.wikimedia.org/wiki/File:Morgan_crossover_1.jpg : accessed 18 September 2021.
Images: Photograph. Landscape. 2004. Maarten Pedroli, photographer. https://www.freeimages.com/photo/landscape-2-1554553 : accessed 18 September 2021.
Images: Photograph. Traffic congestion. 2005. Richard Styles, photographer. https://www.freeimages.com/photo/traffic-congestion-1450371 : accessed 18 September 2021.
Images: Photograph. Mountain lake. 2007. Constantin Jurcut, photographer. https://www.freeimages.com/photo/mountain-lake-4-1361997 : accessed 18 September 2021.
Lighthouse Rock. (2018) Dancing chromosomes 1 hour loop. [YouTube video] 04 December. https://www.youtube.com/watch?v=PVNST_bo080&t=750s : accessed 18 September 2021.