The Upworthy Research Archive

The Upworthy Research Archive is an open dataset of thousands of A/B tests of headlines conducted by Upworthy from January 2013 to April 2015. At the time of release, it is the largest open-access collection of randomized behavioral studies available for research and education. We hope it doesn’t stay that way for long (see below if you wish to contribute data).

Project announcement: Announcing the Upworthy Research Archive: Help us advance human understanding by studying this massive dataset of headline A/B tests

What can I do with the Upworthy Research Archive?

We hope this dataset will be used in three ways: to conduct academic research, to serve as a resource for educators, and to inform the implementation of A/B tests by organizations.

We expect that this dataset will help advance knowledge in many fields, including:

How can I learn more about Upworthy?

We suggest the following references:

Figure: Venn diagram of Upworthy's focus on awesome, meaningful, visual content

What is the structure of the data?

About the Archive contains a full description of the data, references and slides.

Confirmatory Research with the Upworthy Research Archive

Multiple comparisons and overfitting represent serious risks to scientific understanding with a dataset of this size. By doing the extra work of supporting cross-validation, we hope to maximize the amount of highly credible science that results from this dataset.
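To give a rough sense of why multiple comparisons matter at this scale, here is a minimal sketch. It assumes the tests are independent and that every null hypothesis is true, which is a simplification; the function names are ours for illustration and are not part of the archive. Under those assumptions, the chance of at least one false positive across m tests at significance level alpha is 1 - (1 - alpha)^m.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive among m independent tests
    of true null hypotheses, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha
    (the Bonferroni correction, one common and conservative choice)."""
    return alpha / m


if __name__ == "__main__":
    # Show how quickly the familywise rate grows with the number of tests.
    for m in (1, 20, 100):
        print(m, round(familywise_error_rate(0.05, m), 3))
```

With alpha = 0.05 and 20 tests, the familywise rate is already about 64%, which is why hypotheses developed on one set of experiments are best confirmed on a separate set.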

We are providing an Exploratory Dataset of 4,873 experiments to support academic research and teaching. For researchers who plan to conduct confirmatory research that tests hypotheses, we are keeping a larger Confirmatory Dataset of 22,743 experiments in reserve.

Here’s the process for accessing the data:

  1. Request the exploratory dataset. Everyone who asks will receive this data.
  2. Use the exploratory data to develop a registered report that includes a confirmatory hypothesis and analysis plan
  3. Submit the registered report for peer review to an academic journal
  4. When the journal agrees to publish the resulting paper, send us confirmation, along with the accepted registered report
  5. We will then provide you with the confirmatory dataset
  6. We ask all researchers to share their final analysis code with us, so we can write a paper about how the initial set of researchers used the Upworthy Research Archive

What is a Registered Report?

To learn more, see the Center for Open Science introduction to registered reports.

Generally, Registered Reports are a form of “results-blind” peer review. A journal will evaluate the submission in terms of the appropriateness of the analysis strategy for addressing the theoretical question.

With the Upworthy Research Archive, researchers can use the Exploratory Dataset to understand the structure of the data and write code to analyze it. Journals will then review the scientific merit of the Registered Report, and if they agree to publish it, the code will be run on the Confirmatory Dataset to produce the final results.
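As an illustration of what "write code once, run it on both datasets" could look like, here is a minimal sketch of a two-proportion z-test on click rates, wrapped in a single `analyze` function. Everything here is an assumption for illustration: the `Arm` type, its `impressions` and `clicks` fields, and the choice of test are hypothetical, and the numbers in the usage example are made up. Consult the archive's data description for the actual structure.

```python
import math
from typing import NamedTuple


class Arm(NamedTuple):
    """One headline variant in an A/B test (field names are hypothetical)."""
    impressions: int
    clicks: int


def two_proportion_z_test(a: Arm, b: Arm) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in click rates."""
    p_a = a.clicks / a.impressions
    p_b = b.clicks / b.impressions
    pooled = (a.clicks + b.clicks) / (a.impressions + b.impressions)
    se = math.sqrt(pooled * (1 - pooled) * (1 / a.impressions + 1 / b.impressions))
    if se == 0:
        # No variation at all (zero clicks or all clicks): nothing to test.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def analyze(experiments: list[tuple[Arm, Arm]]) -> list[tuple[float, float]]:
    """The analysis fixed in the registered report: the same function is
    applied, unchanged, to exploratory and later to confirmatory data."""
    return [two_proportion_z_test(a, b) for a, b in experiments]


if __name__ == "__main__":
    # Made-up numbers standing in for one exploratory experiment.
    exploratory = [(Arm(impressions=1000, clicks=50), Arm(impressions=1000, clicks=30))]
    print(analyze(exploratory))
```

Keeping the analysis in one function with no dataset-specific branches makes it easier to show reviewers that the confirmatory results came from exactly the code described in the registered report.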

What journals accept Registered Reports?

To date, 242 academic journals have published Registered Reports. The full list can be found here, under the “Participating Journals” tab.

Our process is slightly different from traditional Registered Reports in that we will act as the data managers. This is not the first example of this model; the Attitudes, Identities, and Individual Differences (AIID) Study and Dataset successfully pioneered the use of a central Confirmatory dataset for Registered Reports.

What if my favorite journal is unfamiliar with Registered Reports?

Some journals may not be familiar with this process. Different disciplines have adopted Registered Reports at different rates. If an editor has a question about the process, ask them to email Charlie Ebersole (cebersole@virginia.edu). Our colleagues at the Center for Open Science have offered to provide resources and guidance to editors as needed.

What if my Registered Report is rejected by the journal?

You’re welcome to submit to another journal using the same process. If at this point you decide you’d rather produce exploratory research, we can share the confirmatory dataset.

Before sharing the confirmatory dataset for exploratory research, we will need you to agree not to share this data with anyone until we decide to end the Registered Reports initiative. You will also need to acknowledge in any paper that your analysis is exploratory.

To prioritize confirmatory research, we will delay support for exploratory research until a first round of confirmatory research has been reviewed.

Sounds great—how do I get started?

For access to the Exploratory Dataset, email Charlie Ebersole (cebersole@virginia.edu).

I operate a publisher and want to add to the archive by donating our historical A/B tests

We live in a time of unprecedented behavioral research by news publishers, advertisers, and tech companies. By donating your historical A/B tests, you can contribute to education and to breakthroughs across multiple scientific fields.

Our team can help you assess the potential scientific value of your archives and chart a privacy-preserving way to contribute to open knowledge. Please contact J. Nathan Matias <nathan.matias@cornell.edu> if this interests you.

How can I use the archive with my students?

J. Nathan Matias has developed a first draft of open educational resources for undergraduate classes. Please contact Nathan if you are interested in using the archive for your class.

Advisory Board

Acknowledgments

Kevin, Nathan, and Marianne are grateful to the following people and organizations for providing key support and input in the creation of the Upworthy Research Archive.

Disclaimer

While we are very grateful to Good/Upworthy for donating this data for scientific and educational purposes, this project is independent from the company. This website does not speak for Good/Upworthy in any way.