CRESCYNT Toolbox: Learning to love R more (or R is for reproducible)

[Image: Caribbean reef shark (Wikimedia Commons, Albert Kok)]

Like sharks, we are driven to keep learning: constantly take in new flows, or die. In a recent workshop, coral reef scientists were asked, “How many of you use R?” and 60% raised a hand. Asked “How many of you are comfortable with and love using R?”, only about 15% kept a hand up.

Here’s where to go to learn to love R more.

You likely already know of the R Project, free and open source software for statistical computing and graphics. You may also know the reliability of the Comprehensive R Archive Network (CRAN) repository, which many favor over other sources of community-generated code because of its metadata and testing requirements; it now hosts over 9,300 packages (sorted by date and name).

You may also know the elegance of RStudio, the excitement of putting your own interactive code online in RStudio’s Shiny, some great cheat sheets, the most popular R packages, and Stack Overflow as a great place to find answers to your R questions.

You may not know of the new R course finder, an online directory you can search and filter to find the best online R course for your next step (note that even the paid courses listed often have free versions or segments). There are also YouTube videos for learning R, like twotorials (two-minute tutorials) and YaRrr! (because pirates), which comes with a book.

A very recent book is getting rave reviews from both statistics and programming viewpoints: The Book of R by Tilman Davies (preview it here). The author writes:

“The Book of R …represents the introduction to the language that I wish I’d had when I began exploring R, combined with the first-year fundamentals of statistics as a discipline, implemented in R….   Try not to be afraid of R. It will do exactly what you tell it to – nothing more, nothing less. When something doesn’t work as expected or an error occurs, this literal behavior works in your favor….   Especially in your early stages of learning…try to use R for everything, even for very simple tasks or calculations you might usually do elsewhere. This will force your mind to switch to ‘R mode’ more often, and it’ll get you comfortable with the environment quickly.”

R is such a stellar example of free and open source software, with a very robust community (e.g., great stuff at r-bloggers), that it is a surprise to learn how lucky we are that it IS open source, as heard in this interview with prominent R developer Hadley Wickham on the podcast.

We’ll soon host a guest blog post on some exploratory coral symbiont data analyses, visualizations, and comments generated in R Markdown, which is RStudio’s method for preserving code and output in one running web document. The work is beautiful and useful, and it highlights the use of an electronic notebook as a way to capture and share data exploration, analysis, and visualization, and to tell a data story. (A major advance in that software was announced this week in the form of R Notebook, which will ship within the next couple of months.)

Why is it worth learning to love R more?

R helps make sure your data work is reproducible (such an issue for science), repeatable (valuable for any processing you have to do periodically), and reusable (on other datasets or data versions, or by colleagues or your future self).

A couple of high-level languages, like R and Python, are becoming more popular each year and are finding their way into analytical platforms as general-purpose tools. They will serve as primary sources of flexibility in cyberinfrastructure platforms now available or under development. Our future selves will thank us for the learning investment.

[Figure: 2016 top 10 tools for analytics and data science, from “R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results”]

Update: speaking of interviews with R makers, here’s an October 2016 interview with JJ Allaire, the creator of RStudio, Shiny, and R Markdown. His advice for people new to R:

“I would suggest that they get a copy of the R for Data Science book written by Hadley Wickham and Garrett Grolemund…. Also, when you have questions or run into problems don’t give up. There’s a lot of great activity around R on stackoverflow and other places and there’s an excellent chance you’re going to find the answers to your questions if you look carefully for them.”