How the Replication Crisis Led to Open Science

Remember those basic concepts you learned for the grade-school science fair? Well, I’m going to let you in on a little secret: many of them are used in university-level research. For example, I remember hearing about students replicating their previous year’s projects in middle school and high school. At the time, it felt pointless. What more could you learn from repeating something you’ve already done before?

This was definitely due to my childhood bias against science fair. Every year it felt like I was just making up a project and not actually learning anything. Science fair was mandatory until 9th grade, so high schoolers choosing to build on existing projects seemed like a monumental waste of time. Thankfully, pursuing science in college changed my perspective.

College-level science classes are chock full of research articles, all of which follow a standardized format. For instance, authors use the introduction section to cite background research and explain the purpose of their study. Over time, I noticed researchers citing their own previous publications in the introduction. It’s like the adult version of replicating a science fair project, except this time I could read an explanation of why the replication was necessary. Uncovering a new scientific concept is exciting, but its significance may dissipate if others can’t get similar results. Whether it’s a science fair or academic research, repetition is necessary to cement the validity and importance of an experiment’s outcome.

The Replication Crisis

In the early 2010s, 270 psychologists replicated 100 studies published in 2008, drawing inspiration from replication attempts in cell biology.1 The biology replications had run into issues, including insufficient detail about procedures and results, and succeeded less than 25% of the time. The psychologists tried to circumvent these problems, often by collaborating with the original papers’ authors, but they still maintained that “a large portion of replications produced weaker evidence for the original findings.”

Despite having the necessary background knowledge, combing through the original publications, contacting the respective researchers, and replicating each experiment to the best of their ability, they still wound up with largely inconclusive results. 

This was from only one year of psychology publications.

Low replication rates aren’t restricted to psychology and biology, either. In 2015, the US Federal Reserve Board examined 59 publications from 13 influential economics journals.2 Again, despite assistance from the original authors, only 29 studies (49%) were successfully replicated. Some difference between a study’s original results and a replication’s results is to be expected, but poor replication rates, year after year and across disciplines, suggest a deeper, systemic problem. This issue, referred to as the replication crisis, likely stems from the widespread use of questionable research practices (QRPs).

Lemonade vs. Iced Tea 

Let’s say I went to a nearby park planning to ask 100 people whether they prefer lemonade or iced tea. I hypothesized that lemonade would be the more popular choice, since everyone I know prefers lemonade.

The first 10 people were split down the middle, with 5 picking lemonade and 5 choosing iced tea. Definitely not what I was expecting. A few minutes later, 10 soccer players from a nearby game ran up and all chose lemonade. Now, out of 20 people total, only 5 picked iced tea! Satisfied with the results of my small sample, I packed up my things and headed home.

This is an example of two QRPs: confirmation bias and optional stopping. I treated the soccer players’ choices as confirmation that my hypothesis (based on my bias towards lemonade) was correct, even though the first 10 people were split half and half. Stopping my experiment before reaching 100 people led to a false positive conclusion, so a replication with a new sample of 20 people may yield different results.
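
To see why that’s a problem, here’s a quick simulation sketch in Python. Everything in it is a made-up assumption for illustration, not data from any real study: it assumes people truly have no preference (a 50/50 split) and uses a simple “70% of responses” rule to declare a winner instead of a formal statistical test. It then counts how often “lemonade wins” gets declared if the surveyor peeks after 20 responses and quits while ahead, versus committing to all 100 responses up front.

```python
import random

TRIALS = 10_000          # number of simulated surveys
TRUE_P_LEMONADE = 0.5    # assumption: no real preference, a 50/50 split
FULL_SAMPLE = 100        # the sample size I planned on
PEEK_AT = 20             # where I actually peeked and stopped
WIN_THRESHOLD = 0.70     # call a "winner" if a drink has at least 70% of responses

def lemonade_wins(stop_early: bool) -> bool:
    """Simulate one survey; return True if 'lemonade wins' gets declared."""
    lemonade = 0
    for n in range(1, FULL_SAMPLE + 1):
        lemonade += random.random() < TRUE_P_LEMONADE   # this respondent picks lemonade
        if stop_early and n == PEEK_AT and lemonade / n >= WIN_THRESHOLD:
            return True                                  # packed up and went home while ahead
    return lemonade / FULL_SAMPLE >= WIN_THRESHOLD       # conclusion from the full sample

for label, stop_early in [("peek at 20 and quit while ahead", True),
                          ("always survey all 100", False)]:
    hits = sum(lemonade_wins(stop_early) for _ in range(TRIALS))
    print(f"{label}: 'lemonade wins' declared in {hits / TRIALS:.2%} of surveys")
```

Under those assumptions, quitting while ahead declares a winner on the order of 6% of the time even though nobody actually has a preference, while sticking to the full 100 responses almost never does. Same question, same population; the only difference is when the surveyor decided to stop.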

Jess’s Replication

I told my friend Jess about my experiment, and she had some issues with my methods. Her family’s annual reunion was around the corner, and they anticipated about 100 people in attendance: the perfect opportunity for Jess to replicate my study with an appropriate sample size.

Caught up in organizing the reunion, she decided not to waste time coming up with a hypothesis. Handing out drinks and logging the data on her phone would be good enough, right? After everyone went home, she looked at her results and noticed that about 70 people chose iced tea and 30 people chose lemonade. 

Jess marched over to my house the next day claiming her larger sample size supported her hypothesis about people preferring iced tea. Annoyed by how smug she looked, I probed her for more information and learned that she also engaged in a QRP: hypothesizing after her results were known (HARKing). Jess asserted that her results supported her hypothesis even though she didn’t have one before collecting data.

HARKing, optional stopping, and other QRPs have led to many published papers whose positive, statistically significant results show little to no significance when replicated. Conducting research with as much transparency as possible not only reduces the chance that a researcher will resort to QRPs, but also makes studies immensely easier to replicate. Luckily, open science addresses this exact issue.

The Solution

At its core, the open science movement strives to “increase openness, integrity, and reproducibility” in published research.4 One of the first steps in the open science process is preregistration: researchers use an online platform to log their hypotheses, study protocol, and planned statistical analyses before collecting data. All preregistered information is stamped with the date and time of submission and is accessible to peer reviewers as well as other professionals in the field. Additionally, researchers can upload a preprint of their article—an accessible, near-final version of their paper before it’s peer-reviewed for publication.

Had open science methods been in widespread use at the time of the original psychology, biology, and economics studies, the replication attempts might have been more likely to obtain similar results. All that accessible information clarifies the experimental process for everyone involved and helps ensure that studies are conducted as honestly as possible.

All that said, just because a replication (or several replications) led to different results doesn’t always mean that the initial conclusions were incorrect; science is imperfect, and differences in results can stem from a multitude of variables. At the end of the day, it’s important to remember that the goal of research isn’t to be right; it’s to uncover the truth behind things we can’t yet explain. Unexpected results may spur new research questions and lead to advancements of equal or greater importance. Addressing the replication crisis with open science is still relatively new, but it’s a fantastic start to maintaining scientific integrity in the modern era.


References and related websites
  1. Estimating the reproducibility of psychological science
  2. Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say “Usually Not”
  3. Measuring the Prevalence of Questionable Research Practices With Incentives for Truth Telling
  4. Center for Open Science Mission Page
  5. PLOS Open Science Page
  6. Vox Unexplainable article and podcast on the psychology replication crisis