Scientific data disappearing at alarming rate


© Science Photo Library

Research data are rapidly being lost to science as time passes, Canadian researchers have confirmed. As individual researchers are not preserving their data for posterity, there is a pressing need for tougher rules on data-sharing in public archives, the team concludes.

Governments, funding agencies and journals are already introducing policies to ensure that research data are available on public archives. They are increasingly concerned that authors are often unable or unwilling to share their data, making them poor stewards of their research, particularly in the long term.

In a systematic analysis of data availability over time, the Canadian team, led by Timothy Vines from the University of British Columbia, confirmed that the older the article, the harder it was to recover the data. They report that defunct email addresses and obsolete storage devices were the main obstacles to data sharing.

To prevent differences in data storage media and research community practices from confusing the results, the team focused on recovering data from one specific area: articles containing morphological data from plants or animals that made use of a particular analysis. The sample comprised 516 articles published between 1991 and 2011.

The team found at least one apparently working email address for 74% of papers, either in the article itself or by searching online. After requesting the data, they received 101 data sets (19%) and were told that another 20 (4%) were still in use and could not be shared. So, in total, 23% of data sets were confirmed as extant.

For papers where the authors gave the status of their data, the odds of a data set being extant fell by 17% for each year since publication. What’s more, the odds that the team could find a working email address for the first, last or corresponding author on a paper fell by 7% a year.
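To give a feel for what a 17% yearly fall in odds means, the short Python sketch below compounds that drop over several years and converts the resulting odds into a probability. It reads the reported figure as a constant multiplicative drop in odds each year. The function names and the baseline odds of 1.0 (an even chance at publication) are assumptions made purely for illustration; they are not figures from the study.

```python
def odds_after(years, baseline_odds, yearly_drop=0.17):
    """Odds after `years`, assuming the same proportional drop every year."""
    return baseline_odds * (1.0 - yearly_drop) ** years


def odds_to_probability(odds):
    """Convert odds (p / (1 - p)) back into a probability p."""
    return odds / (1.0 + odds)


# Hypothetical starting point: even odds at publication (NOT a study figure).
BASELINE_ODDS = 1.0

for years in (0, 5, 10, 20):
    odds = odds_after(years, BASELINE_ODDS)
    prob = odds_to_probability(odds)
    print(f"{years:>2} years: odds {odds:.3f}, probability {prob:.1%}")
```

A fall in odds is not the same as a fall in probability: the odds shrink by the same factor every year, while the change in probability depends on where the odds started.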

‘I’m surprised the numbers are not higher,’ says Peter Murray-Rust of the department of chemistry at the University of Cambridge, UK. He estimates ‘data decay’ at around 50% a year. And this is worse in chemistry than in life sciences, he adds. ‘Chemists hate sharing data,’ he says. ‘For example, in computational chemistry and materials science, essentially no primary data is published.’ An important step towards solving this problem is a radical rethink of how graduate students manage their data, he suggests.

