Session 6: Geospatial analyses and how to parallelize them in R
Session Rules
- Green Light, Red Light - Use the Zoom participant feedback indicators to show us whether you are following along successfully or need help. To access participant feedback, click the “Participants” icon to open the participants pane/window. Click the green “yes” to indicate that you are following along successfully, or the red “no” to indicate that you need help. Ideally, you will have either the red or green indicator displayed throughout the entire tutorial. We will pause every so often to work through solutions for participants displaying a red light.
- Chat questions/comments take first priority - Send your questions/comments in the chat either to everyone (preferred) or privately to the chat moderator (Ryan Lucas) to have them read out loud anonymously. We will answer chat questions first and call on people who have written in the chat before we take questions from raised hands.
- Share your video when speaking - If your internet plan/connectivity allows, please share your video when speaking.
- Keep yourself on mute - Please mute yourself when not speaking.
Learning objectives
This session will include tutorials exploring examples of handling geospatial data, performing geospatial calculations, and applying parallel processing approaches to geospatial processing workflows in R. RStudio via Open OnDemand (see Session 4) will be used for a portion of the tutorials.
- Read in and manipulate raster data with the terra and stars packages
- Read in and manipulate vector data with the sf package
- Time chunks of code in your R script
- Identify package functions with parallelization options built-in
- Parallelize R code of many independent geospatial tasks
Agenda
This session will be an interactive tutorial:
- Geospatial packages
- Parallel processing packages
- Vector tutorial
- Raster tutorial
- Vector-raster tutorial
Tutorial material
Written versions of these tutorials, modified to be accessible to any SCINet user, are available on the Geospatial Workbook.
The workshop-specific instructions are kept below.
Steps to prepare for the tutorial:
1. Log in to Atlas Open OnDemand at atlas-ood.hpc.msstate.edu/. Your username is typically firstname.lastname. For the password, enter your SCINet account password followed immediately by the 6-digit verification code (e.g. from a Google Authenticator app on your phone), with no spaces. Do not add a ‘+’ between your password and code.
2. Copy the Session 6 material from the workshop project space to your temporary workshop folder. To get a shell for this, use the Clusters tab at the top of your Open OnDemand page and select ‘Atlas Shell Access’ (if prompted for a password, enter your SCINet account password without the verification code). If you are comfortable connecting via ssh from a terminal or PowerShell instead, feel free to do so.
If you have already made your workshop folder in previous sessions, you will only need to run the following commands, replacing firstname.lastname with your actual name:
cd /90daydata/shared/firstname.lastname
cp -r /project/geospatialworkshop/session6/ .
If you have not created your workshop folder yet, run these commands instead, replacing firstname.lastname with your actual name:
cd /90daydata/shared
mkdir firstname.lastname
cd firstname.lastname
cp -r /project/geospatialworkshop/session6/ .
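The two cases above (folder already created, or not yet) can also be handled by a single sequence, because mkdir -p does nothing when the folder already exists. The following is a minimal sketch rather than part of the workshop material: the function name prepare_workshop_dir and its three arguments are hypothetical, introduced here only for illustration.

```shell
# Hypothetical helper (not part of the workshop material).
# Arguments: base folder, your folder name, and the material to copy.
prepare_workshop_dir() {
  base_dir="$1"
  user_name="$2"
  source_dir="$3"

  # -p creates the folder if needed and is a no-op if it already exists
  mkdir -p "$base_dir/$user_name" || return 1

  # Move into the workshop folder, as in the manual steps
  cd "$base_dir/$user_name" || return 1

  # Strip any trailing slash so the directory itself is copied as ./session6
  cp -r "${source_dir%/}" .
}
```

On Atlas, the call matching the manual steps would be: prepare_workshop_dir /90daydata/shared firstname.lastname /project/geospatialworkshop/session6/ (replacing firstname.lastname with your actual name). Running it a second time simply copies the material again into the same session6 folder.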
3. Launch an RStudio session. Choose the following values from the menu:
- Account: geospatialworkshop
- Slurm Partition: atlas
- QOS: normal
- R version: 4.1
- Number of hours: 3
- Number of tasks: 16
- Memory required: 64G
Click Launch.
4. The tutorials: The first two tutorials will follow R Markdown documents in RStudio. For the third tutorial, we will submit a job to SLURM directly. If your shell from Step 2 has expired when we start the third tutorial, please reconnect and change directory to your session 6 folder:
cd /90daydata/shared/firstname.lastname/session6
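For orientation, submitting a job to SLURM directly usually means writing a batch script whose #SBATCH lines request resources, then passing it to sbatch. The sketch below is an illustration only, not the script used in the third tutorial: the file names example_job.sh and my_analysis.R are hypothetical, and the resource values are simply assumed to mirror the RStudio launch settings listed above. The actual job may need different values or extra setup steps (for example, loading R on the compute node).

```shell
# Write a hypothetical batch script (example_job.sh is a placeholder name).
# The quoted 'EOF' keeps the shell from expanding anything inside the script.
cat > example_job.sh <<'EOF'
#!/bin/bash
# Resource requests; values assumed to mirror the RStudio launch form
#SBATCH --account=geospatialworkshop
#SBATCH --partition=atlas
#SBATCH --qos=normal
#SBATCH --time=03:00:00
#SBATCH --ntasks=16
#SBATCH --mem=64G

# Hypothetical R script name; replace with the script from the tutorial
Rscript my_analysis.R
EOF

# Show how the script would be submitted, without submitting anything here
echo "Submit with: sbatch example_job.sh"
```

Once a job has been submitted, squeue -u followed by your username lists your jobs that are still pending or running.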