Metadata Dream Team responds to request for recommendations for coral reef data

CRESCYNT-DDStudio participants at UCSD Scripps, 2018-08-15. Clockwise from top right (back): Ilya Zaslavsky, Zachary Mason, Samantha Clements, Karen Stocks, Ted Habermann, David Valentine, Hannah Ake, Eric Lingerfelt, Gary Hudman, Sarah O’Connor, Ouida Meier. Not pictured: Gastil Gastil-Buhl, Stephen Richard, Tom Whitenack.

When two major workshops held by the EarthCube CRESCYNT Coral Reef Science and Cyberinfrastructure Network concluded in March 2018, there were some clear and interesting outcomes beyond the practical training and data exploration goals accomplished. Both workshops were structured around Data Science for Coral Reefs. At the end of the first, focused on Data Rescue and data management, participants decided that the most important new topic they had learned about was metadata and its uses. At the end of the second, focused on Data Integration and Team Science, people had realized how essential good metadata is for making datasets at disparate scales work together well. The metadata lessons were important emergent outcomes. Participants asked that data and metadata experts get together, work through the data challenges that had arisen, and recommend metadata practices and standards that would suit the coral reef community and its very broad range of data types, repositories, and pre-repository research, storage, sharing and analytical metadata needs.

We were luckily able to do exactly that with one final workshop. Through a jointly staged CRESCYNT-DDStudio workshop, we pulled together a group of metadata experts, coral reef data managers, representative scientists, and the EarthCube Data Discovery Studio’s scientists and software developers focused on metadata enhancement for finding and using data.

Special guests included Ted Habermann (Metadata 2020 project, co-author of “The influence of community recommendations on metadata completeness”); Stephen Richard (experience with and metadata standards authoring); coral reef data experts Gastil Gastil-Buhl (Moorea Coral Reef LTER), Hannah Ake (BCO-DMO), and Sarah O’Connor and Zachary Mason (NOAA NCEI’s user metadata writing interface and CoRIS), representing the three biggest formal repositories for coral reef research data in the US or sponsored by NSF; Eric Lingerfelt, the EarthCube Technical Officer; guests from Scripps; and DDH team members Ilya Zaslavsky, Karen Stocks, Gary Hudman, David Valentine, and Tom Whitenack with their broad and integrative metadata, software, and domain expertise. Ilya and Karen kindly hosted the group at UCSD’s San Diego Supercomputer Center and Scripps Institution of Oceanography.

Important outcomes from the workshop were mutualistic for the two projects. For CRESCYNT, they included cross-mapping an essential set of metadata (as defined by appropriate community repositories) to web standards and producing a draft ISO metadata profile for coral reef data at two levels of dataset access: (1) discovery and sharing (a simpler form with freeform text entry in many of the fields), and (2) understanding and usability at the workbench level (a more detailed form with options to supply more highly specified fields). We will finish writing these and offer them to the coral reef community for feedback and potential adoption.
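To make the two-level idea concrete, here is a minimal sketch of how a record might be checked against a simpler discovery level and a more detailed workbench level. The field names, the `missing_fields` helper, and the sample record are all hypothetical and chosen only for illustration; they are not taken from the draft ISO profile, which may name and group its fields quite differently.

```python
# Hypothetical field lists for illustration only; the draft profile
# described above may define different fields at each level.
DISCOVERY_FIELDS = ["title", "abstract", "creator", "location", "date_range"]
WORKBENCH_FIELDS = DISCOVERY_FIELDS + ["methods", "units", "taxonomic_coverage"]

LEVELS = {"discovery": DISCOVERY_FIELDS, "workbench": WORKBENCH_FIELDS}


def missing_fields(record, level="discovery"):
    """Return the required fields that are absent or empty at the given level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    missing = []
    for name in LEVELS[level]:
        value = record.get(name)
        # Treat None and whitespace-only text as "not filled in".
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# A made-up record that fills in only the discovery-level fields.
record = {
    "title": "Example benthic photo survey",
    "abstract": "Free-text description of the dataset.",
    "creator": "A. Researcher",
    "location": "Example Reef",
    "date_range": "2018-01/2018-03",
}

print(missing_fields(record, "discovery"))
print(missing_fields(record, "workbench"))
```

In this sketch the workbench level simply extends the discovery level, so a record that is ready for sharing can be upgraded later by filling in the additional fields.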

For the Data Discovery Studio (formerly known as Data Discovery Hub), important outcomes included exploration of the use of the enhanced metadata at different repositories and in science use cases (including the coral reef use case), a deep dive into focusing the future trajectory of Data Discovery Studio, and some initial planning for an upcoming data science competition that will involve the coral reef data (details to be announced). Read more about DDStudio and its broader work, and be on the alert for a Data Discovery Science Competition in January 2019!

We gratefully acknowledge the generosity of our hosts, workshop travel support from NSF, the active work and engagement of our participants, and the organizations that allowed their employees time to attend and contribute to this collective effort.

>>>Go to the blog Masterpost or the CRESCYNT website or NSF EarthCube.<<<

New Opportunities – Spring 2018

We’ve been busy lately, including following up on the Data Rescue and Data Integration workshops with more materials to share (you can already access most of them here!), and wanted to share some new opportunities.
EarthCube All Hands Meeting
The EarthCube All Hands Meeting will be held June 6-8, 2018 in Washington, DC (our abstracts here). We are very excited to be able to bring a coral reef science guest or two to this meeting through reimbursement of travel and registration costs. Two people who work with coral reef data from a repository perspective will also attend with us as part of the CRESCYNT community. Please email us if you’re interested! (Registration is $250 before May 11.)
CoralNET Software Update
CoralNET software for automated analysis of coral reef benthic imagery is getting a revision, and its developers would like your input. What feature improvements and new developments would you like to see? Add your feedback to this thread. More background here. (If you haven’t tried it yet, this is the time to do it – fresh users are a great source of important feedback.)
Nat’l Academies Workshop on Interventions to Increase Resilience of Coral Reefs


Attendance is free and open to the public, online or in person.

View the agenda and register.


On Preserving Coral Reef Imagery and Related Data – by James W Porter

James Porter’s coral photo monitoring project in Discovery Bay

In preparation for an upcoming Data Science for Coral Reefs: Data Rescue workshop, Dr. James W. Porter of the University of Georgia spoke eloquently about his own efforts to preserve historic coral reef imagery captured in Discovery Bay, Jamaica, from as early as 1976. It’s a story from the trenches with a senior scientist’s perspective, outlining the effort and steps needed to accomplish preservation of critical data, in this case characterizing a healthy reef over 40 years ago.

Enjoy this insightful 26-min audio description, recorded on 2018-01-04.


Transcript from 2018-01-04 (lightly edited):

This is Dr. Jim Porter from the University of Georgia. I’m talking about the preservation of a data set that is at least 42 years old now and started with a photographic record that I began making in Discovery Bay, Jamaica on the north coast of Jamaica in 1976. I always believed that the information that photographs would reveal would be important specifically because I had tried other techniques of line transecting and those were very ephemeral. They were hard to relocate in exactly the same place. And in addition to that they only captured a line’s worth of data. And yet coral reefs are three dimensional and have a great deal of material on them not well captured in the linear transect. So those data were… I was very consistent about photographing from 1976 to 1986.

But eventually funding ran out and I began focusing on physiological studies. But toward the end of my career I realized that I was sitting on a gold mine. So, the first thing that’s important when considering a dataset and whether it should be preserved or not is the individual’s belief in the material. Now it’s not always necessary for the material to be your own for you to believe in it. For instance, I’m working on Tom Goreau, Sr.’s collection which I have here at the University of Georgia. I neither made it nor in any way contributed to its preservation but I’ve realized that it’s extremely important and therefore I’m going to be spending a lot of time on it. But in both cases, the photographic record from Jamaica, as well as the coral collection itself – those two activities have in common my belief in the importance of the material.

The reason that the belief in the material is so important is that the effort required to capture and preserve it is high, and you’ve got to have a belief in the material in order to take the steps to assure the QA/QC of the data you’re preserving, as well as the many hours required to put it into digital format. And believing in the material then should take another step, which is a very self-effacing review of whether you believe the material to be of real significance to others. There’s nothing wrong with memorabilia. We all keep scrapbooks and photographs that we like – things relating to friends and family, and times that made us who we are as scientists and people. However, the kind of data preservation that we’re talking about here goes beyond that – could have 50 or 100 years’ worth of utility.

Those kinds of data really do require them to be of some kind of value, and the value could either be global, regional, or possibly even local. Many local studies can be of importance in a variety of ways: the specialness of the environment, or the possibility that people will come back to that same special environment in the future. The other thing that then is number two on the list – first is belief in the material – second is you’ve got to understand that the context in which you place your data is much more important to assure its survival and utility than the specificity of the data. Numbers for their own sake are numbers. Numbers in the service of science become science. It is the context in which you place your data that will assure its future utility and preservation.

CRESCYNT Data Science for Coral Reefs Workshop 1 – Data Rescue


We’re extremely pleased to be able to offer two workshops in March 2018 at NCEAS. The first is CRESCYNT Data Science for Coral Reefs Workshop 1: Data Rescue. Apply here.

When: March 7-10, 2018
Where: NCEAS, Santa Barbara, California, USA

Workshop description:

Recommended for senior scientists with rich “dark” data on coral reefs that needs to be harvested and made accessible in an open repository. Students or staff working with senior scientists are also encouraged to apply. Days 1 and 2 of the workshop will cover the basic principles of data archiving and data repositories, including Darwin Core and EML metadata formats, how to write good metadata, how to archive data on the KNB data repository and elsewhere, data preservation workflow and best practices, and how to improve data discoverability and reusability. Participants will then spend approximately 2 days working in pairs to archive their own data using these principles, so applying with a team member from your research group is highly recommended.

The workshop is limited to 20 participants. We encourage you to apply via this form. Workshop costs will be covered with support from the NSF EarthCube CRESCYNT RCN. Participants will publish data during the workshop process, and we anticipate widely sharing workshop outcomes, including workflows and recommendations. Because coral reef science embodies a wide range of data types (spreadsheets, images, videos, field notes, large ‘omics text files, etc.), anticipate some significant pre-workshop prep effort.

Related post: CRESCYNT Toolbox – Estate Planning for Your Data

