Editorial

Data Archiving

Michael C. Whitlock, Mark A. McPeek, Mark D. Rausher, Loren Rieseberg, and Allen J. Moore

1. Department of Zoology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada (former Editor‐in‐Chief, The American Naturalist)
2. Department of Biological Sciences, Dartmouth College, Hanover, New Hampshire 03755 (Editor‐in‐Chief, The American Naturalist)
3. Department of Biology, Duke University, Durham, North Carolina 27708 (Editor‐in‐Chief, Evolution)
4. Department of Botany, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada (Chief Editor, Molecular Ecology)
5. Centre for Ecology and Conservation, School of Biosciences, University of Exeter, Cornwall Campus, Penryn TR10 9EZ, United Kingdom (Editor‐in‐Chief, Journal of Evolutionary Biology)

Science depends on good data. Data are central to our understanding of the natural world, yet most data in ecology and evolution are lost to science, except perhaps in summary form, very quickly after they are collected. Once the results of a study are published, the data on which those results are based are often stored unreliably, subject to loss by hard drive failure and (even more likely) by the researcher forgetting the specific details required to use the data (Michener et al. 1997). Moreover, most data are never available to the broader community, even after publication of the results; in most cases this unavailability becomes permanent following the eventual death of the researchers involved. In ecology and evolutionary biology, we are losing nearly all of this important legacy.
Yet these data, even after the main results for which they were collected are published, are invaluable to science, for meta‐analysis, new uses, and quality control. With the increasing use of meta‐analysis to summarize multiple studies, it has become clear that necessary summary statistics are often not published. In many cases, a study can be used only if the original data are available to the meta‐analysts. Furthermore, data often can be used in ways beyond the questions that sparked their collection; for example, many studies contain information that can serve later as a baseline for detecting population trends, even decades later. The availability of data for published studies also allows error checking, making science more open and letting us more rapidly reach accurate conclusions. Finally, papers that have had data archived are more useful to—and more cited by—other scientists. One study found that papers that archived their data were cited 69% more often than papers that did not (Piwowar et al. 2007). Data that are properly archived are saved for posterity, and archives also function to preserve data in a usable form for the original authors. Moreover, if data sets are put into a readily interpretable format while the methods and structure of the data are foremost in the scientists’ minds, those data can be used later more easily by those scientists and others. The example of GenBank shows the value of the availability of data for all of these reasons. The modern synthetic use of DNA sequence data would not be possible without the near‐universal use of GenBank as a public archive. Moreover, GenBank would not be nearly as complete as it is without the communal decision to archive all DNA sequence data, a decision initially introduced by journals. For these reasons and perhaps others, a survey has shown that over 95% of scientists in evolution and ecology think that data should be publicly archived (S. Carrier, J. Greenberg, H. Lapp, R. Scherle, A. 
Thompson, T. Vision, and H. White, unpublished manuscript).

To promote the preservation and fuller use of data, The American Naturalist, Evolution, the Journal of Evolutionary Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology will soon introduce a new data‐archiving policy. The policy has been enacted by the Executive Councils of the societies owning or sponsoring the journals. For example, the policy of The American Naturalist will state:

This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.

This policy will be introduced approximately a year from now, after a period when authors are encouraged to voluntarily place their data in a public archive. Data that have an established standard repository, such as DNA sequences, should continue to be archived in the appropriate repository, such as GenBank. For more idiosyncratic data, the data can be placed in a more flexible digital data library such as the National Science Foundation–sponsored Dryad archive at http://datadryad.org.

When the policy is fully in place, authors will archive the data required to support the conclusions in their published paper, along with sufficient details so that a third party can reasonably interpret those data correctly.
In most cases this will require a short additional text document, with details specifying the meaning of each column in the data set. The preparation of such shareable data sets will be easiest if these files are prepared as part of the data analysis phase of the preparation of the paper, rather than after acceptance of a manuscript. The data should usually be saved at the individual level, although what is most important is that the data be saved in a way that makes the most sense for later usability. Summary statistics (like means and standard deviations) are not sufficient, because they would not provide enough information for later analysis. At the same time, data in their rawest form, such as videotapes, field notebooks, or sequencing trace files, are not required. For example, if a study used videotape to determine the time required for animals to make a choice in a T‐maze, the data should be recorded with the time for each subject.

The data‐archiving policy is designed to address several concerns that some researchers may have about data sharing. To protect the ability of individual researchers to use the data that they have collected, the policy allows an embargo period after publication. While the data will be entered into an archive at the time of publication, the data may be restricted from public view for up to a year. This allows the original researcher time to publish other papers based on the data set. The policy also allows longer embargo periods at the discretion of the editor in exceptional cases. In addition, the requirement is only for data that have already been used in the publication in question; other data from the same research project that have not yet been described in a publication need not be archived. Finally, data that are particularly sensitive, such as location information for endangered species subject to poaching, should not be archived in a publicly accessible format.
Human subject data should be anonymized (see the recommendations of the National Human Subjects Protection Advisory Committee 2002).

Throughout the history of ecology and evolution, enormous quantities of valuable data have been lost to future science, for a variety of technical and cultural reasons. There are no longer any meaningful technical barriers to long‐term storage, and just as in the case of DNA sequence data, it is time for the culture of our shared use of data to evolve. With this general data policy, we think that science will reap great benefits for generations to come.

Literature Cited

Michener, W. K., J. W. Brunt, J. J. Helly, T. B. Kirchner, and S. G. Stafford. 1997. Nongeospatial metadata for the ecological sciences. Ecological Applications 7:330–342.

National Human Subjects Protection Advisory Committee. 2002. Recommendations on public use data files. http://www.aera.net/humansubjects/NHRPACFinalPUDF.pdf.

Piwowar, H. A., R. S. Day, and D. B. Fridsma. 2007. Sharing detailed research data is associated with increased citation rate. PLoS ONE 2(3):e308, doi:10.1371/journal.pone.0000308.

The American Naturalist, Volume 175, Number 2, February 2010. Published for The American Society of Naturalists. Article DOI: https://doi.org/10.1086/650340. © 2010 by The University of Chicago.