Science of naming and categorizing genes:
In September 2012, the ENCODE Project Consortium announced the results of the second phase of research designed to identify and catalog all the functional elements in the human genome.1 I reported on this announcement in the September 6, 2012 episode of RTB’s Science News Flash podcast, and suggested that the ENCODE results may well be one of the most important scientific achievements in my lifetime, or at least in my time as a professional biochemist.
Since the initial sequencing of the human genome, many skeptics and evolutionary biologists have asserted that the most compelling evidence for human evolution (and the most potent challenge against intelligent design/creationism) is the vast amount of junk DNA in the human genome. And yet, with the results of the ENCODE Project, these arguments evaporate. We can no longer consider the human genome a vast wasteland of junk. Instead, it is an elegant system that displays sophistication in its architecture and operation, far beyond what most evolutionary biologists ever imagined.
When announced, the ENCODE results generated quite a bit of Internet chatter. Many skeptics asserted that the media and design proponents overhyped and misconstrued the project’s discoveries. Shortly thereafter, several papers appeared in the scientific literature highlighting ENCODE’s “flaws.”2 Many evolutionary biologists hope these critiques will undermine the project’s conclusion that 80 percent of the human genome, at minimum, contains functional DNA elements.
Yet, very good reasons exist for thinking that the ENCODE Project’s results are still valid. In fact, I have written several articles that provide a detailed response to each criticism. I conclude that the charges against ENCODE lack technical merit and appear to be motivated by philosophical considerations more than anything else. To read my responses to these criticisms, click on the links below:
- “Responding to ENCODE Skeptics”
- “Do ENCODE Skeptics Protest Too Much?” (part 1, part 2, and part 3)
It is interesting to note that not everyone in the scientific community agrees with the ENCODE skeptics. Molecular geneticist John Mattick, executive director of the Garvan Institute of Medical Research in Australia, believes in ENCODE’s validity.3 In a recent article, Mattick and his coauthor, Marcel Dinger, argue, as I do, that the criticisms of ENCODE are technically unwarranted and are motivated by non-scientific considerations.
One of the chief criticisms leveled at ENCODE relates to its use of a causal definition of function to determine functionality within the human genome. That is, a sequence element in the genome possesses function if it performs an observationally or experimentally identified role. ENCODE skeptics argue that this definition is faulty; instead, the project should have relied on sequence conservation (the so-called selected effect definition) as a way to measure function.
According to the selected effect definition, sequences in genomes can be deemed functional only if they arose through evolutionary processes to perform a particular function. Once these sequences have evolved, natural selection makes them resistant to change because, at this point, any further alteration would compromise the function of the sequence and, consequently, be deleterious. Reduced survivability and reproductive success would then eliminate organisms possessing deleterious sequence variations from the population. Hence, functional sequences are those under the effects of selection. And based on a selected effect definition of function, only 10 percent (not 80) of the human genome could be considered functional.
Mattick and Dinger decry the weakness of the selected effect definition. They argue that the genome’s regulatory regions are much more malleable than the selected effect idea suggests, retaining function in the face of mutational changes. Thus, sequence conservation (one way to detect selection at work) cannot be a valid marker of function. Mattick and Dinger propose and defend differential transcription, an alternative measure of function. They note that during the course of development, the vast majority of the human genome (and the genome of other mammals) is “differentially transcribed in precise cell-specific patterns” to generate RNA molecules with a regulatory role. It is interesting to note that this is a causal definition of function, meaning it relies on cause-and-effect relationships.
In response to Mattick and Dinger’s definition of function, ENCODE skeptics claim that transcription of the genome is noisy (random and arbitrary). As such, transcription cannot be viewed as an indicator of function. In a previous article, I offered a response to this challenge. So, too, do Mattick and Dinger. They state:
Assertions that the observed transcription represents random noise…is more opinion than fact and difficult to reconcile with the exquisite precision of differential cell- and tissue-specific transcription in human cells…4
ENCODE skeptics also complain that the results of the project don’t make sense in light of the C-value paradox. This paradox states that most of an organism’s genome consists of DNA that doesn’t code for proteins or regulate gene expression. Researchers have long held that the non-coding DNA serves no real purpose—they view it as useless junk, vestiges of evolutionary processes.
The concern among ENCODE skeptics is that if the project’s conclusion is valid, then most, if not all, of the human genome contains functional DNA. Thus, the human genome contains very little junk DNA, which would constitute an absurdity in light of the C-value paradox. Therefore, the project’s results cannot be correct according to ENCODE skeptics.
The C-value paradox also explains why organisms less sophisticated than humans have larger genomes. That is, genome size is due to junk DNA and has no relationship to an organism’s complexity. Again, if the ENCODE results are correct, then this phenomenon has no explanation.
But as Mattick and Dinger point out, the large genome sizes of relatively simple organisms appear to stem from duplications of extensive genome regions (a phenomenon referred to as polyploidy). To put it differently, the ENCODE conclusions are fully compatible with the C-value paradox.
Also, Mattick and Dinger rightly point out that the ENCODE skeptics appear to be motivated by non-scientific factors.
There may also be another factor motivating the Graur et al. and related articles (van Bakel et al. 2010; Scanlan 2012), which is suggested by the sources and selection of quotations used at the beginning of the article, as well as in the use of the phrase “evolution-free gospel” in its title (Graur et al. 2013): the argument of a largely non-functional genome is invoked by some evolutionary theorists in the debate against the proposition of intelligent design of life on earth, particularly with respect to the origin of humanity. In essence, the argument posits that the presence of non-protein coding or so-called ‘junk DNA’ that comprises > 90% of the human genome is evidence for the accumulation of evolutionary debris by blind Darwinian evolution, and argues against intelligent design, as an intelligent designer would presumably not fill the human genetic instruction set with meaningless information (Dawkins 1986; Collins 2006). This argument is threatened in the face of growing functional indices of noncoding regions of the genome, with the latter reciprocally used in support of the notion of intelligent design and to challenge the conception that natural selection accounts for the existence of complex organisms (Behe 2003; Wells 2011).5
In other words, there are many in the scientific community who are concerned that the results of the ENCODE Project play right into the hands of creationists and intelligent design proponents—and that’s a reason for dismissing the ENCODE conclusions.
It is safe to say there are scientists who accept the ENCODE project even though the results undermine what many consider to be the most compelling argument for biological evolution—while at the same time highlighting the elegant design of the human genome, a system befitting the work of the Creator. (RTB)
*** Will Myers