Bad statistics in intelligent design

Is Darwin's theory of evolution really on the verge of being overthrown? David Gelernter, a professor of computer science at Yale University, thinks so, but the arguments he presents purporting to debunk “Darwinism” in the spring 2019 issue of the Claremont Review of Books show that creationists have a lot of work to do.

In an article titled “Giving Up Darwin,” he reviews three recent books by intelligent design advocates at the Discovery Institute. Gelernter writes, “The origin of species is exactly what Darwin cannot explain.” And let's be honest here: the alternative, intelligent design, is a euphemism for creationism. Invoking an extraterrestrial origin for life just puts us back to square one on a different planet. As much as its advocates deny it, finding evidence for the existence of God is the basic motivation for intelligent design arguments.

Ribbon 3D structure of a protein (rendered in Chimera)

The statistical argument is a familiar one by now. Imagine, Gelernter says, a protein 150 amino acids long. There are 20 common amino acids, nine of which are essential for humans. He asks:

What are the chances that we can mutate our way to a useful new shape of protein? We can ask basically the same question in a more manageable way: what are the chances that a random 150-link sequence will create such a protein?
. . .
The total count of possible 150-link chains, where each link is chosen separately from 20 amino acids, is 20¹⁵⁰. In other words, many. 20¹⁵⁰ roughly equals 10¹⁹⁵, and there are only 10⁸⁰ atoms in the universe.

Starting from an estimate that the chance of a random sequence of 150 amino acids forming a stable protein that performs some useful function is 1 in 10⁷⁷ (a dubious assumption), and an estimate that throughout history there have been about 10⁴⁰ bacteria, he concludes:

The odds against blind Darwinian chance having turned up even one mutation with the potential to push evolution forward are 10⁴⁰ × (1 / 10⁷⁷) — 10⁴⁰ tries, where your odds of success each time are 1 in 10⁷⁷—which equals 1 in 10³⁷. In practical terms, those odds are still zero.
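The arithmetic in the quoted passages is easy to check. Here is a minimal Python sketch that works in base-10 logarithms so the numbers stay readable. The variable names are mine, and the inputs (20 amino acids, 150 positions, 10⁴⁰ bacteria, a 1-in-10⁷⁷ hit rate) are simply Gelernter's figures as quoted, not values I endorse:

```python
import math

# Figures taken from the quoted passages; the names are illustrative.
AMINO_ACIDS = 20
CHAIN_LENGTH = 150
LOG10_TRIES = 40        # about 10^40 bacteria throughout history
LOG10_HIT_RATE = -77    # claimed 1-in-10^77 chance per random sequence

# log10 of the number of possible 150-link chains: 150 * log10(20)
log10_chains = CHAIN_LENGTH * math.log10(AMINO_ACIDS)

# Expected successes = tries * per-try probability,
# which becomes a sum of exponents in log space.
log10_expected_hits = LOG10_TRIES + LOG10_HIT_RATE

print(f"20^150 is about 10^{log10_chains:.0f}")
print(f"expected hits: about 10^{log10_expected_hits}")
```

Both results agree with the quote, so the multiplication itself is fine. Everything depends on whether the inputs describe how proteins actually arise.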

Sigh. These statistics are based on a willful misunderstanding of basic biochemistry. It's not worth spending too much time on this; it's not a controversial topic among biochemists, even during our most heated lunchtime debates. But Claremont is a pre-eminent journal for conservative intellectualism. It's not necessary to know the technical details, though I'm including some, but it is important for conservatives to understand why this argument is fallacious. Otherwise, they risk having their scientific literacy called into question.

The argument is easily refuted. Let's break it down into three steps: (1) speciation; (2) mutation of one protein into another; and (3) formation of the first protein.

Step 1: The origin of species

Step 1 is easy: speciation is not some magical phenomenon; it is simply the point at which differences in DNA between two groups of animals are so great that the progeny can't survive and reproduce. It's also not a solid line; some closely related species, like lions and tigers, or horses and donkeys, can produce living offspring, but they are often infertile. The reason is well understood: different numbers of chromosomes produce fatal errors during cell division. Speciation is a product of genetic drift and other factors that change the chromosomes, and not the great mystery creationists make it out to be.

Unlike the sexes, of which there are two (and only two), there is not a perfectly sharp line between different species. When genetic drift is sufficient to prevent two kinds of animals from successfully reproducing, they are different species, by definition.

Step 2: Protein mutations

Step 2 is also easy: it is well understood that in modern humans there are 20 amino acids, but only five or six different classes. In case you're interested, they are:

Acidic = aspartate, glutamate
Basic = arginine, lysine, histidine
Aromatic = tyrosine, phenylalanine, tryptophan
Neutral nonpolar/slightly polar = glycine, alanine, cysteine, leucine, isoleucine, valine, methionine
Neutral polar = asparagine, glutamine, serine, threonine
Helix-breaking = proline

(Structural biologists call helix-breaking amino acids a separate class, but others don't.)

Very often, a mutation substitutes an amino acid with one in the same class, such as glutamate for aspartate. This is called a conservative substitution. A conservative substitution usually yields a protein with similar or identical functionality to the original. Biologists use this fact all the time to create distance maps, in which they assume (as a first approximation) a fixed rate of random mutation to classify different species.
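As a rough illustration of the distance-map idea, the sketch below counts the fraction of differing positions between pre-aligned sequences and builds a pairwise table. The three short sequences and the species labels are invented for illustration (they are not real proteins). Real phylogenetic tools use proper alignment and substitution models, not this simple p-distance:

```python
from itertools import combinations

# Hypothetical, pre-aligned toy sequences (one-letter amino acid codes).
# Invented for illustration only; not real protein data.
sequences = {
    "species_A": "MKDLEAGSTV",
    "species_B": "MKELEAGSTV",  # one substitution relative to A (D -> E)
    "species_C": "MRELDAGTTV",  # four substitutions relative to A
}

def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Pairwise distances. Under a fixed mutation rate, a larger distance
# suggests a longer time since the two lineages diverged.
distances = {
    (m, n): p_distance(sequences[m], sequences[n])
    for m, n in combinations(sequences, 2)
}

for (m, n), d in distances.items():
    print(f"{m} vs {n}: {d:.1f}")
```

In this toy data every difference is a conservative substitution (aspartate/glutamate, lysine/arginine, serine/threonine), so all three sequences would be expected to behave similarly even though their distances differ.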

The probability that changing a single amino acid at random in a 150-amino acid protein would produce a fully functional protein is, at a minimum, one in six, or 17%. We know this from countless experiments: site-directed mutagenesis is an old technique that I've used many times. Sometimes, making a non-conservative mutation changes the function; a conservative one rarely does. The only caveat is to avoid introducing a proline, which puts a kink in the alpha-helix and prevents the protein from folding correctly.
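One way to sanity-check the one-in-six figure is to count, from the class list above, how often a random replacement lands in the same class as the original residue. The sketch below assumes that every amino acid is equally common and that every replacement is equally likely. That is my simplification, not how DNA-level mutation actually works, and it treats "same class" as a stand-in for "still functional", which is only approximately true:

```python
# Amino acid classes exactly as listed in the text.
classes = {
    "acidic": ["aspartate", "glutamate"],
    "basic": ["arginine", "lysine", "histidine"],
    "aromatic": ["tyrosine", "phenylalanine", "tryptophan"],
    "neutral nonpolar": ["glycine", "alanine", "cysteine", "leucine",
                         "isoleucine", "valine", "methionine"],
    "neutral polar": ["asparagine", "glutamine", "serine", "threonine"],
    "helix-breaking": ["proline"],
}

total = sum(len(members) for members in classes.values())

# Assumption: the original residue is equally likely to be any of the 20,
# and the replacement is equally likely to be any of the other 19.
# Count ordered (original, replacement) pairs that stay within one class.
same_class_pairs = sum(len(m) * (len(m) - 1) for m in classes.values())
p_conservative = same_class_pairs / (total * (total - 1))

print(f"{total} amino acids in {len(classes)} classes")
print(f"P(replacement stays in the same class) = {p_conservative:.3f}")
```

Under these assumptions the probability comes out at roughly 0.18, in line with the one-in-six estimate.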

Proteins evolve from other proteins by many mechanisms, all of which are consistent with the theory of evolution. They are well established empirically. They include gene duplication, recombination, retrotransposition, RNA splicing, and the classical genetic drift so beloved of population geneticists. I'm not going to explain these, but the bottom line is that a mutation that disrupts the function of an essential protein is not necessarily fatal, because the other copy takes over (thanks in part to gene duplication and to this marvelous invention called sex, which gives us two copies of everything).

This means that new proteins are not made from scratch, but by repurposing existing ones. We can see this happening by noting that some proteins come in many forms that have nearly identical properties. Protein kinase C has eleven. UDP-glucuronosyltransferase has sixteen. And there are many proteins that have entirely different functions but retain strong sequence homology to some other protein, indicating that they evolved from a common ancestor.

Mutations are very common. In the human population there are 180 mutations that cause pyruvate kinase deficiency and 228 mutations in presenilin-1, a protein involved in Alzheimer's disease. It's probably safe to assume that in early times there was a lot more duplication, missense mutation, and other inefficiencies going on.

The simplest type of mutation is a change in a single DNA base. We call these SNPs (single nucleotide polymorphisms). A SNP (pronounced “snip”) is a point mutation that is found in at least 1% of the population. A SNP occurs roughly once every 300 nucleotides, so in a genome of about three billion nucleotides there are on average 10,000,000 SNPs (ten million mutations) in every man, woman and child. Some of them have a big effect on fitness, which is to say survival, often under specific conditions, such as when the person has malaria, or after a traumatic brain injury. So when Gelernter tries to distinguish between “minor” mutations and “major” ones (which he says are rare and fatal), it's a distinction without a difference. You'd have to ignore vast amounts of evidence to claim that proteins don't mutate. Proteins mutate a lot; that's why there are so many different ones.
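The ten-million figure is simple division. As a quick check, assuming a human genome of roughly three billion nucleotides (a commonly quoted round number that I am supplying here) and the one-per-300 spacing given above:

```python
GENOME_SIZE = 3_000_000_000  # assumed: about 3 billion nucleotides
SNP_SPACING = 300            # from the text: one SNP per ~300 nucleotides

snp_count = GENOME_SIZE // SNP_SPACING
print(f"{snp_count:,} SNPs")
```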

Step 3: Where did the first protein come from?

So, how about step 3? Given that there are only six chemical classes of amino acids, if a protein sprang fully formed out of nothingness all at once, the chance of it being some specific 150-amino acid protein would be one in 6¹⁵⁰, or about one in 5.28×10¹¹⁶. That's a low probability.

Of course, there might not have been six classes to start with. It's highly likely that in the early environment, there might have been only one or maybe two amino acids. If there were only two—say, cysteine and glycine—we have 1 in 1.4×10⁴⁵. Still improbable.
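For readers who want to reproduce these two numbers, here is a minimal sketch. Python integers are exact at any size, so the powers can be computed directly and converted to a float only for display. The function name is my own:

```python
CHAIN_LENGTH = 150

def sequence_count(alphabet_size: int, length: int = CHAIN_LENGTH) -> int:
    """Number of distinct chains of the given length over an alphabet."""
    return alphabet_size ** length

for label, size in [("six classes", 6), ("two amino acids", 2)]:
    count = sequence_count(size)
    # float() is safe here: both values are far below the float limit (~1e308).
    print(f"{label}: 1 in {float(count):.2e}")
```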

But who ever said the first protein just popped out of thin air, like Athena popping out of Zeus's forehead, with 150 amino acids, modified with disulfide bonds, crosslinking, and phosphorylation, as we see today? Few if any biochemists think this; it's a classic strawman (or straw-protein) argument, and the statistical argument that is based on it simply falls apart.

Suppose the first proteins were only ten amino acids long. If all six types of amino acids were present, we could have about 60 million (6¹⁰) functionally different proteins. If only two were present, we'd only have 1,024. Those little proteins would be highly inefficient by today's standards, but chemists have found that even short peptides can bind tightly to each other, and they can evolve surprisingly quickly. If a peptide binds two things, it holds them in close proximity to each other and they can react spontaneously. It's a form of simple catalysis.
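The counts for ten-residue peptides are small enough to check by brute force. In this sketch the two-residue case is enumerated explicitly, using cysteine (C) and glycine (G) as the example pair, while the six-class case is computed in closed form as 6¹⁰, which is a little over 60 million:

```python
from itertools import product

PEPTIDE_LENGTH = 10

# Closed-form counts: alphabet_size ** length.
six_class_count = 6 ** PEPTIDE_LENGTH
two_residue_count = 2 ** PEPTIDE_LENGTH

# Brute-force enumeration of every ten-residue peptide built from
# cysteine (C) and glycine (G) only.
peptides = ["".join(p) for p in product("CG", repeat=PEPTIDE_LENGTH)]

print(f"six classes:  {six_class_count:,} possible peptides")
print(f"two residues: {two_residue_count:,} possible peptides")
print(f"enumerated:   {len(peptides):,} (first: {peptides[0]}, last: {peptides[-1]})")
```

A search space of about a thousand sequences, or even sixty million, is tiny compared with the 150-residue case.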

This isn't some ad hoc argument; we rely on the binding property of peptides in common techniques like phage display, where even five or six amino acids are enough to produce strong binding. The body uses short peptides as hormones and signaling molecules, and they're extremely potent. People measure selective binding of peptides every day in the lab: it's actually a big industry nowadays.

Although a peptide can bind avidly to other molecules, simple non-enzymatic catalysis by a peptide is very slow by modern standards. This creates what is called evolutionary pressure: a cell containing a peptide that, by chance, happens to catalyze the reaction a bit better will out-reproduce its chums.

As for where that first little inefficient protein came from, maybe you could call that a gap. But if I were a creationist, I'd give up looking for gaps in evolutionary biology. New ideas are always welcome, but the evidence that proteins evolve and that species are created by evolution is overwhelming.

The flaw in the creationists' argument isn't with their arithmetic, but with their assumptions. Statistics are a useful way of reasoning logically. But, just as with any other kind of logical reasoning, if you start from a false assumption, you can reach any conclusion you want.

may 19 2019, 4:44 am. last edited jun 05 2019, 5:51 am. minor edits jun 24, 2019 8:52 am
