CRESCYNT Toolbox – EarthCube Sci-Tech Matchup – Lightning Talks

Get a fast intro to new Ready-for-Science EarthCube Tools!

We’ve helped arrange a series of lightning talks that will feature tools developed by EarthCube “building block” projects for direct use by scientists. Many EarthCube-built tools are designed to serve as internal components of an EarthCube platform. Other tools were built for scientists as direct end users, and a collection of these is now ripe for adoption. Will some of these help you with your research work?
The current collection will be shown over a span of two online sessions. Find log-in details for Wed., Feb. 15 and Fri., Feb. 17 (no RSVP required – just show up).
Join us!

Wednesday, Feb. 15, 2017
4-5pm EST / 1-2pm PST / 11am-12pm HST – login link HERE
GeoDataspace/GeoTrust, Tanu Malik
ECOGEO Virtual Machine, Elisha Wood-Charlson
LinkedEarth, Julien Emile-Geay
OntoSoft, Yolanda Gil
Flyover Country, Amy Myrbo

Friday, Feb. 17, 2017
4-5:30pm EST / 1-2:30pm PST / 11am-12:30pm HST – login link HERE
CHORDS, Mike Daniels
SuAVE, Ilya Zaslavsky
CINERGI, Ilya Zaslavsky
X-DOMES Ontology Registry, Janet Fredericks
X-DOMES SensorML Registry, Janet Fredericks
iSamplesIGSN, Kerstin Lehnert
GeoDeepDive, Shanan Peters
Digital Crust, Shanan Peters
ECITE, Sara Graves
Earth System Bridge, Scott Peckham

UPDATE: All of the videos from the first rounds of talks are now on the EarthCube YouTube channel – here’s the playlist. Slides are now accessible at the EarthCube Tools Inventory, including additional presentations.


CoralNet: deploying deep learning in the shallow seas – by Oscar Beijbom


Having dedicated my PhD to automating the annotation of coral reef survey images, I have seen my fair share of surveys and talked to my fair share of coral ecologists. In these conversations, I always heard the same story: collecting survey images is quick, fun and exciting. Annotating them is, on the other hand, slow, boring, and excruciating.

When I started CoralNet (coralnet.ucsd.edu) back in 2012 the main goal was to make the manual annotation work less tedious by deploying automated annotators alongside human experts. These automated annotators were trained on previously annotated data using what was then the state-of-the-art in computer vision and machine learning. Experiments indicated that around 50% of the annotation work could be done automatically without sacrificing the quality of the ecological indicators (Beijbom et al. PLoS ONE 2015).
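To make "ecological indicators" concrete: one commonly used indicator derived from point annotations is percent cover, the share of annotated points that carry each label. The snippet below is only a minimal sketch of that arithmetic, not CoralNet code; the function name `percent_cover` and the example labels are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def percent_cover(point_labels: Iterable[str]) -> Dict[str, float]:
    """Return the percentage of annotated points assigned to each label.

    `point_labels` holds one label per annotated point in an image or survey.
    The labels used in the example below are hypothetical.
    """
    labels = list(point_labels)
    total = len(labels)
    if total == 0:
        # No points annotated, so no cover estimate can be made.
        return {}
    counts = Counter(labels)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Four annotated points: two coral, one sand, one algae.
cover = percent_cover(["coral", "coral", "sand", "algae"])
```

If automated labels yield cover estimates close to those from fully manual labels, the indicator is preserved even though less manual work was needed.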

The Alpha version of CoralNet was thus created and started gaining popularity across the community. I think this was partly due to the promise of reduced annotation burden, but also because it offered a convenient online system for keeping track of and managing the annotation work. By the time we started working on the Beta release this summer, the Alpha site had over 300,000 images with over 5 million point annotations – all provided by the global coral community.

There was, however, a second purpose of creating CoralNet Alpha. Even back in 2012 the machine learning methods of the day were data-hungry. Basically, the more data you have, the better the algorithms will perform. Therefore, the second purpose of creating CoralNet was quite simply to let the data come to me rather than me chasing people down to get my hands on their data.

At the same time, the CoralNet Alpha site was starting to buckle under increased usage. Long queues started to build up in the computer vision backend as power users such as NOAA CREP and Catlin Seaview Survey uploaded tens of thousands of images to the site for analysis assistance. The time was ripe for an update.

As it turned out, the timing was fortunate. A revolution has happened in the last few years with the development of so-called deep convolutional neural networks. These large and immensely powerful nets can learn from vast databases and perform far better than methods from the previous generation.

During my postdoc at UC Berkeley last year, I researched ways to adapt this new technology to the coral reef image annotation task in the development of CoralNet Beta. Leaning on the vast database accumulated in CoralNet Alpha, I tuned a net with 14 hidden layers and 150 million parameters to recognize over 1,000 types of coral substrates. The results, which are in preparation for publication, indicate that between 80% and 100% of the annotation work can be automated, depending on the survey. Remarkably, in some situations the classifier is more consistent with the human annotators than those annotators are with themselves. Indeed, we show that the combination of confident machine predictions with human annotations beats both the human and the machine alone!
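One way to picture combining confident machine predictions with human annotations is a simple confidence threshold: the machine label is kept when the classifier score is high enough, and the point is otherwise left to a human expert. The snippet below is a minimal sketch under that assumption, not the method used in CoralNet Beta; the names `Point` and `hybrid_annotate` and the 0.9 threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Point:
    machine_label: str                 # label proposed by the classifier
    confidence: float                  # classifier score in [0, 1]
    human_label: Optional[str] = None  # filled in only if a human annotated it


def hybrid_annotate(points: List[Point],
                    threshold: float = 0.9) -> Tuple[List[Optional[str]], float]:
    """Keep confident machine labels and fall back to human labels otherwise.

    Returns the final label for each point and the fraction of points that
    were labeled automatically. The threshold value is an assumed example.
    """
    labels: List[Optional[str]] = []
    automated = 0
    for p in points:
        if p.confidence >= threshold:
            labels.append(p.machine_label)
            automated += 1
        else:
            # Low confidence: defer to the human annotation (None if not yet done).
            labels.append(p.human_label)
    fraction = automated / len(points) if points else 0.0
    return labels, fraction


points = [
    Point("coral", 0.97),
    Point("sand", 0.55, human_label="algae"),
    Point("algae", 0.92),
]
labels, automated_fraction = hybrid_annotate(points)
```

Raising the threshold sends more points to human annotators, and lowering it automates more of the work, so the level of automation can be tuned per survey.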

Using funding from NOAA CREP and CRCP, I worked together with UCSD alumnus Stephen Chan to develop CoralNet Beta: a major update that includes migration of all hardware to Amazon Web Services and a brand new, highly parallelizable computer vision backend. Using the new backend, the 350,000 images on the site were re-annotated in one week! Software updates include improved search, import, export and visualization tools.

With the new release in place we are happy to welcome new users to the site; the more data the merrier!

_____________

– Many thanks to Oscar Beijbom for this guest posting as well as significant technological contributions to the analysis and understanding of coral reefs. You can find Dr. Beijbom on GitHub, or see more of his projects and publications here. You can also find a series of video tutorials on using CoralNet (featuring the original Alpha interface) on CoralNet’s Vimeo channel, and technical details about the new Beta version in the release notes.
