Science is a team sport. Collaborations allow us to ask more ambitious science questions, but also intensify the need to connect disparate datasets across scales of time and space. Solving data interoperability challenges requires technology that is not yet in place, so we’re taking the initiative to review potential solutions.
A platform from EarthCube is still some way off, but we have a chance to start assembling and testing the tools already at hand and in use in coral reef research workflows. The process also helps us ground the ideal in the practical.
What are some criteria for a great infrastructure platform?
Ideally, solutions are: 1) modular, so when an improved tool is available it can be incorporated without restructuring the system; 2) free or low cost, so solutions are sustainable for most research labs; and 3) open source, allowing continued development from multiple disciplines and directions. However, we also want to start where people are, with the tools we’re already using. Many of these are less than ideal, but we make them work. That’s our starting place, and we want to hear about all of your tools.
It is tempting to set up a workbench for the challenge of analysis alone, but in a coral reef research lab we immediately crash into the realities of group data collection, field and lab work, physical specimens, and intersecting projects. Each of these adds another layer of challenge. In the long run, infrastructure should help capture data and metadata at the source, and ease tracking, analysis, and replicability.
Good infrastructure solves more problems than it creates in compliance, skill demand, and management. An effective system helps graduate students and postdocs develop robust skills in managing data into the future, with guidelines that work for people, labs, and collaborators. Interoperability challenges must be solved for datasets that range from remote sensing to ecological surveys to bioinformatics work. Data cleaning, analysis, visualization, and mapping must be supported in flexible ways to clearly communicate research insights.
We have an opportunity to construct a preliminary array of tools and workflows on a cyberinfrastructure workbench for coral reef research. For this to be a broadly useful start, we need to know what you already like using and what more you wish you had. Dream big to start with, and along the way we’ll acknowledge the distinction between perfect and good-enough solutions.