CRESCYNT Toolbox: Data Repositories – Estate Planning for your Data

“Hypotheses come and go but data remain.” – Ramón y Cajal

Taking care of our data for the long term is more than good practice. It lets us share our data, defend our work, reassess conclusions, collaborate with colleagues, and examine broader scales of space and time. It is also estate planning for our data, and a primary way of communicating with future scientists and managers.

Here are some great options for long-term data storage, highlighting repositories friendly to coral reef science.

First, there are some important repository networks useful for coral reef data – these can unify standards and offer collective search portals: we like DataONE (members here) and bioCaddie (members here).

KNB – the Knowledge Network for Biocomplexity offers open and private data uploads; ecological orientation. DataONE network.

NOAA CoRIS: Coral Reef Information System – often free to use and can accept coral reef related data beyond NOAA’s own data; contact them first.

BCO-DMO – Biological and Chemical Oceanography Data Management Office – if you have an NSF grant that requires data storage here, you’re fortunate. Good data management guidelines and metadata templates, excellent support staff. Now a DataONE member.

Dataverse – supported by Harvard endowments. There are multiple organizational dataverses – the Harvard Dataverse is free to use. bioCaddie member.

Zenodo – free to use, supported by the European Commission (this is a small slice of CERN’s enormous repository for the Large Hadron Collider). Assigns DOIs. We invite you to include the “Coral Reef” community when you upload. bioCaddie member.

NCBI – the National Center for Biotechnology Information is very broadly accepted for ‘omics data of all types. A bioCaddie member.

DataCite – not a repository, but if you upload a dataset to a repository that does not assign its own DOIs, you can get one at DataCite and include it when publishing your datasets.
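Once a dataset has a DOI, it can be cited as a link through the public doi.org resolver. Here is a minimal Python sketch of that step. The function name `doi_to_url` and the sample DOI are made up for illustration (the sample is a placeholder, not a real dataset), and the validity check is only a rough one.

```python
from urllib.parse import quote

# Common ways a DOI gets written in papers and metadata records.
_KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Return a resolver link for a DOI (hypothetical helper, illustration only)."""
    cleaned = doi.strip()
    # Strip a leading "doi:" or resolver address if one is present.
    for prefix in _KNOWN_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Rough check only: DOIs start with "10." and contain a "/".
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the "/" separator.
    return "https://doi.org/" + quote(cleaned, safe="/")


# Placeholder DOI for illustration; substitute the one your repository assigns.
print(doi_to_url("doi:10.5281/zenodo.0000000"))
```

Putting the full link in a data availability statement lets readers reach the dataset's landing page with one click.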

We’ve not listed more costly repositories such as Dryad (focused on journal requirements) or repositories restricted to institutions. What about other storage options such as GitHub, Amazon Web Services, websites? Those have important uses, but are not curated repositories with long-term funding streams, so are not the best data legacy options.

Most of these repositories allow either private (closed) or public (open) access, or later conversion to open access. Some have APIs for automated access within workflows. These are repositories we really like for storing and accessing coral reef work. Share your favorite long-term data repository – or experiences with any of the repositories listed here – in the comments.
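As an illustration of automated access, here is a minimal Python sketch that builds a search request for Zenodo's public records API. It assumes the endpoint `https://zenodo.org/api/records` and its `q`, `size`, and `communities` query parameters, so check Zenodo's current API documentation before relying on them. The community identifier `coral-reef` and the function name are hypothetical. The sketch only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Assumed endpoint for Zenodo's public search API; verify against current docs.
ZENODO_RECORDS_ENDPOINT = "https://zenodo.org/api/records"


def build_zenodo_search_url(query, community=None, size=10):
    """Build a search URL for Zenodo records (hypothetical helper)."""
    params = {"q": query, "size": size}
    if community:
        # Restrict results to a single Zenodo community.
        params["communities"] = community
    return f"{ZENODO_RECORDS_ENDPOINT}?{urlencode(params)}"


# "coral-reef" is a placeholder identifier, not a confirmed community name.
url = build_zenodo_search_url("coral bleaching", community="coral-reef", size=5)
print(url)
```

In a workflow, the resulting URL could be fetched with `urllib.request.urlopen` and the JSON response parsed with the `json` module. Other repositories expose their own, different APIs.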

WELCOME to CRESCYNT – the Coral Reef Science and Cyberinfrastructure Network

The Coral Reef Science & Cyberinfrastructure Network (CRESCYNT) is a multi-tiered and multidisciplinary network of coral reef researchers, ocean scientists, cyberinfrastructure specialists, and computer scientists, and we invite you to join us.

As an EarthCube Research Coordination Network, our goals are to foster a dynamic, diverse, durable, and creative community; to collectively consider and develop standards and resources for open data, research documentation, and data interoperability while making best use of work already accomplished by others; and to offer input to those groups within EarthCube who will ultimately create the data architecture for all of EarthCube. Along the way CRESCYNT expects to collect and share community resources and tools, and to offer training opportunities in topics prioritized by our members through widely accessible formats such as webinars and their recordings. We will also work to nurture unforeseen collaborative opportunities that emerge from our integrated collective work.

Because the coral reef community has exceptionally diverse data structures and analysis requirements that must be met to advance integrative science, it is an exemplar of cyberinfrastructure-enabled advances for other geoscience communities. The CRESCYNT network is working to match the data sources, data structures, and analysis needs of the coral reef community with current advances in data science, visualization, and image processing from multiple disciplines to advance coral reef research and meet the increasing challenges of conservation. The network has begun assembling to coordinate, plan, and prioritize cyberinfrastructure needs within the coral reef community.

The structure of CRESCYNT is a network of networks, currently including 18 disciplinary nodes and 7 technological nodes, where each network node represents an area of coral reef science (disciplinary nodes: e.g., microbial diversity, symbiosis regulation, disease, physiology & fitness, reef ecology, fish & fisheries, conservation & management, biogeochemistry, oceanography, paleontology, geology) or an area of computer science or technical practice (technological nodes: e.g., visualization, geospatial analysis & mapping, image analysis, legacy & dark data, database management). These nodes may expand, coalesce, or divide to meet the needs and interests of the subdisciplinary communities, while maintaining connections to CRESCYNT through node coordinators and ongoing network activities. We invite you to become a member of CRESCYNT, join one or more nodes that would advance your own work, collaborate on shared resources and tools for the coral reef community, and ensure that the data architecture and cyberinfrastructure of EarthCube will meet the needs of the coral reef community, and that broader data interoperability within EarthCube will benefit both coral reefs and our ability to answer complex questions.

PLEASE VISIT OUR WEBSITE at http://crescynt.org to enroll in CRESCYNT, join a node, work on tasks, discuss data and research priorities, and help determine the future shape of cyberinfrastructure for supporting coral reef research and other geoscience work. This collaborative work is supported by the National Science Foundation’s EarthCube initiative.
