The Cavalier Daily
Serving the University Community Since 1890

The death of a hypothesis

In science everything must be questioned, but there seems to be a dangerous tendency to raise certain concepts above reproach. The hypothesis has achieved an almost mythical status as it is inculcated into young minds through the concept of the scientific method, yet this entity has inherent problems in its construction. Within science, there is an underlying struggle between hypothesis-driven and data-driven methods that threatens our traditional view of science.

Hypothesis-driven methods are based upon the scientific method, where one observes phenomena, forms a hypothesis, tests that hypothesis and draws conclusions. In data-driven methods, data is amassed non-selectively and then analyzed to yield emergent patterns that provide insight into the system being studied. Hypothesis-driven methods have been the dominant paradigm in science since the 17th century, but as more complex problems are uncovered, the limitations of such methods become increasingly apparent.

Hypothesis-driven science often entails reductionism; one evaluates only the hypothesis, ignoring the rest of the associated system, which may lead to a false conclusion. Testing a hypothesis this way requires a control condition, but there is rarely a perfect control. Any given control is simply another system that may have anomalies or change during the course of analysis. Throughout history, scientists have been rewarded for dispensing with conventional notions of a given area or having the insight to see data from a different point of view.

Hypothesis-driven science, however, is inherently counterproductive in this regard because it encourages a dualist approach: Either the hypothesis is supported or unsupported, precluding the possibility of evaluating other viewpoints. This frequently leads to a confirmation bias for individuals pursuing a given hypothesis, further constraining progress in a given field as aberrant data may be discarded. Success becomes measured by validating, not invalidating, a hypothesis, so negative results are neglected and hypotheses are retrofitted to data.

In contrast, data-driven methods take a holistic approach to evaluating data such that an interpretation is developed through the process of analysis as opposed to forcing an interpretation onto a dataset. This is advantageous because it prevents issues of confirmation bias and retrofitting yet is able to cope with complex systems. Furthermore, the issue of poor controls is avoided because the controls are fully characterized. Biologists already have employed such strategies in technologies such as microarrays, which can monitor expression levels of thousands of genes in a single experiment, and even in the genetic linkage experiments of the late 1980s and 1990s, which harnessed the power of analyzing large quantities of data to find patterns.

In addition, the boom of data-driven fields like epidemiology, scientific imaging and especially bioinformatics demonstrates the emergence of these methods. The need for carefully safeguarding data is reduced, as it is the analysis that is most important, likely resulting in better distribution of information in the form of databases. This free flow of information will allow better cooperation among scientific groups, something that is often prevented by the structure of hypothesis-driven cultures where emphasis is placed on data, not analysis. Although discovery will depend on computing power and algorithms, these in silico methods will only improve with advancing technology.

The sheer complexity of certain systems, such as cancer progression, warrants a methodology capable of encompassing multiple interpretations of data in parallel. Cancer, even tissue-specific cancer, is a broad, ill-defined category of cellular signaling aberrations manifested as a group of immortalized cells. Often it is not simply a mutation in one protein, but several mutations in various proteins that lead to this condition, making the system quite complex as the mutated proteins often have multiple downstream effectors. These signaling pathways are not linear constructions, but act in a web-like fashion, with complex methods of positive and negative feedback often providing exquisite regulation of function.

It is tantamount to scientific hubris to posit that the human mind can possibly grasp the complexity of cancer signaling without the aid of data-driven methods; there are simply too many permutations. Many suggest that data-driven and hypothesis-driven methods should be married together to result in perfect harmony. Unfortunately, intrinsic to this marriage would be an irreconcilable difference between holism and reductionism. Science must cast off the crutch of the hypothesis if it is to elucidate the causes of complex phenomena.

Michael McDuffie can be reached at mm9kn@virginia.edu.

Michael McDuffie, Cavalier Daily Science Columnist
