Instructors:
Matt Bixley, Mik Black, Murray Cadzow, Ngoni Faya, Tanya Major
Helpers:
Justin Farquhar, Cecilia Wang
General Information
Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct
research. Its target audience is researchers who have little to no prior computational experience,
and its lessons are domain specific, building on learners' existing knowledge to enable them to quickly
apply skills learned to their own research.
Participants will be encouraged to help one another
and to apply what they have learned to their own research problems.
Who:
The course is aimed at graduate students and other researchers.
You don't need to have any previous knowledge of the tools
that will be presented at the workshop.
Where:
Science Library seminar room, 730 Cumberland Street, Dunedin, Otago, New Zealand.
Get directions with OpenStreetMap or Google Maps.
Requirements: Participants must bring a laptop with a
Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).
Accessibility: We are committed to making this workshop
accessible to everybody.
The workshop organizers have checked that:
The room is wheelchair / scooter accessible.
Accessible restrooms are available.
Materials will be provided in advance of the workshop, and
large-print handouts are available if you notify the
organizers in advance. If we can help make learning easier for
you (e.g. sign-language interpreters, lactation facilities), please
get in touch (using contact details below) and we will
attempt to provide them.
Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.
Surveys
Please be sure to complete these surveys before and after the workshop.
To participate in a Data Carpentry workshop,
you will need access to the software described below.
In addition, you will need an up-to-date web browser.
The setup instructions for the Data Carpentry Ecology workshops (with R)
can be found below.
Data
You can download all of the data used in this workshop by clicking
this download link. The file is 38.4 MB.
Clicking the download link will automatically download all of the files to your default download directory as a single compressed
(.zip) file. To expand this file, double click the folder icon in your file navigator application (for Macs, this is the Finder
application).
For a full description of the data used in this workshop see the data page.
To interact with spreadsheets, we can use LibreOffice, Microsoft Excel, Gnumeric, OpenOffice.org, or other programs.
Commands may differ a bit between programs, but the general ideas for thinking about spreadsheets are the same. For this workshop,
we recommend using either Microsoft Excel (paid software) or LibreOffice (free and open source). Other spreadsheet programs may
not have all of the features we will be exploring in this workshop.
To install LibreOffice, go to their download page. The website should
automatically select the correct option for your operating system. Click the “Download” button. You will go to a page that asks about a
donation, but you don’t need to make one. Your download should begin automatically. Once the installer is downloaded, double click on it (you may need to open your Downloads folder) and LibreOffice should install.
R and RStudio
R and RStudio are separate downloads and installations. R is the
underlying statistical computing environment, but using R alone is no
fun. RStudio is a graphical integrated development environment (IDE) that makes
using R much easier and more interactive. You need to install R before you
install RStudio. After installing both programs, you will need to install
some specific R packages within RStudio. Follow the instructions below for
your operating system, and then follow the instructions to install
tidyverse and RSQLite.
Windows
If you already have R and RStudio installed
Open RStudio, and click on “Help” > “Check for updates”. If a new version is
available, quit RStudio, and download the latest version of RStudio.
To check which version of R you are using, start RStudio; the first thing
that appears in the console indicates the version of R you are
running. Alternatively, you can type sessionInfo(), which will also display
which version of R you are running. Go to
the CRAN website and check
whether a more recent version is available. If so, please download and install
it. You can check here for
more information on how to remove old versions from your system if you wish to do so.
Under Installers select RStudio x.yy.zzz - Windows Vista/7/8/10 (where x, y, and z represent version numbers)
Double click the file to install it
Once it’s installed, open RStudio to make sure it works and you don’t get any
error messages.
macOS
If you already have R and RStudio installed
Open RStudio, and click on “Help” > “Check for updates”. If a new version is
available, quit RStudio, and download the latest version of RStudio.
To check the version of R you are using, start RStudio; the first thing
that appears in the console indicates the version of R you are running. Alternatively, you can type sessionInfo(), which will
also display which version of R you are running. Go to
the CRAN website and check
whether a more recent version is available. If so, please download and install
it.
Under Installers select RStudio x.yy.zzz - Mac OS X 10.6+ (64-bit)
(where x, y, and z represent version numbers)
Double click the file to install RStudio
Once it’s installed, open RStudio to make sure it works and you don’t get any
error messages.
Linux
Follow the instructions for your distribution
from CRAN; they explain how
to get the most recent version of R for common distributions. For most
distributions, you could use your package manager (e.g., for Debian/Ubuntu run
sudo apt-get install r-base, and for Fedora sudo yum install R), but we
don’t recommend this approach as the versions it provides are
usually out of date. In any case, make sure you have at least R 3.5.1.
Under Installers select the version that matches your distribution, and
install it with your preferred method (e.g., with Debian/Ubuntu sudo dpkg -i
rstudio-x.yy.zzz-amd64.deb at the terminal).
Once it’s installed, open RStudio to make sure it works and you don’t get any
error messages.
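The minimum version mentioned above (R 3.5.1) can also be checked from a terminal. The following is only a minimal sketch, not part of the official setup instructions: the helper name version_at_least is hypothetical, and it assumes a sort command that supports the -V (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="3.5.1"

if command -v R >/dev/null 2>&1; then
  # The first line of `R --version` has the form: R version x.y.z (date) ...
  installed=$(R --version | head -n 1 | awk '{print $3}')
  if version_at_least "$installed" "$required"; then
    echo "R $installed is recent enough."
  else
    echo "R $installed is older than $required; please update."
  fi
else
  echo "R was not found on your PATH."
fi
```

If the script reports an older version, installing R by following the CRAN instructions for your distribution should bring it up to date.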
After installing R and RStudio, you need to install the tidyverse and
RSQLite packages. Start RStudio by double-clicking the icon and then type:
install.packages(c("tidyverse", "RSQLite")). You can also do this by going to Tools -> Install Packages and
typing the names of the packages you want to install, separated by a comma.
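After the installation finishes, you may want to confirm that both packages are available. The following is a minimal sketch run from a terminal rather than from RStudio. It assumes Rscript (which ships with R) is on your PATH, and the helper names r_vector and missing_r_packages are hypothetical, not part of the workshop materials.

```shell
# Hypothetical helper: turn package names into an R vector literal,
# e.g. `r_vector tidyverse RSQLite` prints c("tidyverse", "RSQLite").
r_vector() {
  out=""
  for pkg in "$@"; do
    out="${out:+$out, }\"$pkg\""
  done
  printf 'c(%s)\n' "$out"
}

# Hypothetical helper: print the requested packages that R cannot find.
missing_r_packages() {
  Rscript -e "cat(setdiff($(r_vector "$@"), rownames(installed.packages())), sep = '\n')"
}

if command -v Rscript >/dev/null 2>&1; then
  missing=$(missing_r_packages tidyverse RSQLite)
  if [ -z "$missing" ]; then
    echo "tidyverse and RSQLite are installed."
  else
    echo "Still missing: $missing"
  fi
else
  echo "Rscript was not found on your PATH."
fi
```

If a package is reported as missing, running the install.packages() command again inside RStudio is usually enough.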
The Bash Shell
Bash is a commonly-used shell that gives you the power to do simple
tasks more quickly.
On Windows, in the Git installer, click on "Next" four times (two times if you've previously
installed Git). You don't need to change anything
in the Information, location, components, and start menu screens.
From the dropdown menu select "Use the nano editor by default" and click on "Next".
Ensure that "Git from the command line and also from 3rd-party software" is selected and
click on "Next". (If you don't do this, Git Bash will not work properly, and you will have to
remove the Git Bash installation, re-run the installer, and select the "Git from the
command line and also from 3rd-party software" option.)
Ensure that "Use the native Windows Secure Channel library" is selected and click on "Next".
Ensure that "Checkout Windows-style, commit Unix-style line endings" is selected and click on "Next".
Ensure that "Use Windows' default console window" is selected and click on "Next".
Ensure that "Enable file system caching" and "Enable Git Credential Manager" are selected
and click on "Next".
Click on "Install".
Click on "Finish".
If your "HOME" environment variable is not set (or you don't know what this is):
Open command prompt (Open Start Menu then type cmd and press [Enter])
Type the following line into the command prompt window exactly as shown:
setx HOME "%USERPROFILE%"
Press [Enter]; you should see SUCCESS: Specified value was saved.
Quit command prompt by typing exit then pressing [Enter]
This will provide you with both Git and Bash in the Git Bash program.
The default shell in some versions of macOS is Bash, and
Bash is available in all versions, so no need to install anything.
You access Bash from the Terminal (found in
/Applications/Utilities).
See the Git installation video tutorial
for an example of how to open the Terminal.
You may want to keep Terminal in your dock for this workshop.
To see if your default shell is Bash, type echo $SHELL in
Terminal and press the enter/return key. If the message printed does
not end with '/bash', then your default is something else, and you can run
Bash by typing bash.
On Linux, the default shell is usually Bash, and there is usually no need to
install anything.
To see if your default shell is Bash, type echo $SHELL in
a terminal and press the enter/return key. If the message printed does
not end with '/bash', then your default is something else, and you can run
Bash by typing bash.
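The check described above can be wrapped in a few lines. This is a minimal sketch for a macOS or Linux terminal; the function name is_bash is hypothetical and only illustrates the "ends with /bash" test.

```shell
# Hypothetical helper: report whether a shell path ends with /bash.
# With no argument it inspects $SHELL, your default shell.
is_bash() {
  case "${1:-${SHELL:-}}" in
    */bash) echo "yes" ;;
    *)      echo "no" ;;
  esac
}

if [ "$(is_bash)" = "yes" ]; then
  echo "Your default shell is Bash."
else
  echo "Your default shell is not Bash; type bash to start it."
fi
```

Passing a path explicitly, for example is_bash /bin/zsh, lets you try the test without changing your own settings.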