Let’s load in the packages that we will use today. You will need to install the raster
package.
library(tidyverse)
library(tmap)
library(sf)
library(maps)
library(raster)
In the past few lectures you’ve learned some basics of spatial analysis in R. In particular you’ve learned how to perform basic mathematical operations and data wrangling operations on geospatial data object, make basic maps, and access spatial data directly from calls to the USGS, NOAA, as well as the US Census geographic database (using tigris
).
You’ve already dealt with several types of spatial data:
Point data (e.g. location of state capitals)
Line data (e.g. geologic faults, streams, roads,..)
Polygon data (e.g. country borders)
Today you’ll be introduced to raster data. The rasters we will deal with today are geographic raster. A geographic raster is simply gridded data and typically consisted of a matrix of equally spaced cells with each cell containing the value(s) of a measurement(s).
A great example of raster data would be global elevation data. In this case you the world would be divided into grid cells (e.g. 1 degree latitude by 1 degree longitude) and for each grid cell an elevation value would be reported. Below I’m showing a raster of elevation data for Schenectady.
## trying to read file: C:\Users\stahlm\Documents\Teaching_UnionCollege\Environmental_Data_Analysis\stahlm.github.io\ENS_215\Winter_2023\Lectures\https:\github.com\stahlm\stahlm.github.io\raw\master\ENS_215\Winter_2022\Lectures\Data\elevation_schdy.tif
We often encounter geographic raster data in the environmental sciences – in particular, the availability of satellite data (remote sensing) makes rasters widespread in this field of study. You’ll find raster data available from a range of organization and research groups (e.g. NASA, USGS, NOAA, climate modeling researchers,…)
Let’s load in a raster of the world’s topography (elevation). FYI, this raster reports elevation in units of meters.
raster_world_elev <- raster("https://github.com/stahlm/stahlm.github.io/raw/master/ENS_215/Winter_2022/Lectures/Data/World_elev.TIFF")
# Data was obtained from: https://neo.sci.gsfc.nasa.gov/view.php?datasetId=SRTM_RAMP2_TOPO
We can examine the details of the raster by simply typing the object name
raster_world_elev
## class : RasterLayer
## dimensions : 720, 1440, 1036800 (nrow, ncol, ncell)
## resolution : 0.25, 0.25 (x, y)
## extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
## crs : +proj=longlat +datum=WGS84 +no_defs
## source : World_elev.TIFF
## names : World_elev
This gives you a summary of the raster’s attributes. We see the dimensions (number of rows, columns, cells). In this case we can see that the raster has 1,036,800 cells. Thus there are 1,036,800 grid cells with elevation data. We see that the resolution (i.e. the dimensions of each grid cell) is 0.25 degree lat and 0.25 degree long. We also see the spatial extent of th raster and coordinate reference system info. The summary above also reports the min and max values. It is good to check that these values make sense – in this case they do not look amiss.
We can take a quick look at the data by using the plot()
function from the raster
package
raster::plot(raster_world_elev)
You can see that this doesn’t look quiet right. The issue here is that missing data is stored as a numeric value (in this case a very large one) and it is messing up the mapping of the values to a color scale. Let’s diagnose what exactly what is going on, that way we can identify a potential solution to the issue.
Statistics on the raster values should allow us to determine the value that is being used as a placeholder for missing data. We can get stats using the cellStats()
function. Note that the function takes the dataset as the first argument and the stat of interest as the second argument.
cellStats(raster_world_elev,
stat = max)
## [1] 99999
Ok, so we see that missing data is stored as 99999. This is a fairly common practice, but it could introduce errors in our analysis. Since the raster is topographic data, the creator of the data (NASA) decided to assign the value of 99999 for any pixel in the ocean.
Let’s replace the 99999 with NA so that we don’t end up introducing any errors in our upcoming analysis. To do this we simply need to think back to the base R from the first weeks of class. Below we’ll perform this reassignment
raster_world_elev[raster_world_elev == 99999] <- NA
Discuss with your neighbor what this code is doing. Make sure you understand what is going on.
Now let’s plot the data again.
raster::plot(raster_world_elev)
You can see that we have fixed the issue.
We will often need to perform mathematical operations on rasters. For instance you may want to convert the units from meters to feet (multiplication) or you might want to perform some other mathematical operation (e.g. take the log of the values). Any valid mathematical operation(s) can be performed on the raster.
Let’s convert our elevation data from units of meters into feet (1 m = 3.28084 ft). We will simply multiply each value in the raster by factor that represents the conversion from meters to feet.
raster_world_elev__ft <- raster_world_elev * 3.28084
As you might know, the pressure in the Earth’s atmosphere decreases with elevation. So at sea-level we might experience 1 atmosphere of pressure, while on the top of Mt. Everest we would experience a much lower pressure of ~0.34 atmospheres.
Since atmospheric pressure is a function of elevation, using our elevation raster we could generate a map of the likely atmospheric pressure experienced at each point on Earth.
The elevation-pressure relationship can be approximated by the following equation
\(p_{0}*exp(- \displaystyle \frac{g*h*M}{T_{0}*R_{0}})\)
In the equation above, \(h\) is the elevation in meters. The parameters are described (and defined) in the code block below:
coeff_p0 <- 1 # Sea level standard atmospheric pressure (Atmospheres)
coeff_T0 <- 288.16 # Sea level standard temperature(K)
coeff_g <- 9.80665 # Earth-surface gravitational acceleration (m/s^2)
coeff_M <- 0.02896968 # Molar mass of dry air (kg/mol)
coeff_R0 <- 8.314462618 # Universal gas constant (J/(mol*K))
Let’s calculate the atmospheric pressure using our elevation raster.
raster_atmos_pressure <- coeff_p0 * exp( -(coeff_g * raster_world_elev * coeff_M)/(coeff_T0 * coeff_R0) )
Now let’s plot the raster we have created.
raster::plot(raster_atmos_pressure)
Plotting the raster, we can see that it works! Pretty cool, we were able to calculate the atmospheric pressure everywhere on the Earth’s surface using our elevation raster! This is just one example, but as you can see any mathematical operations that you might need to perform can be done on a raster.
We also often need to compute values based on several rasters. For instance we may have rasters of average monthly precipitation across the US. In this case we would have 12 rasters (one for each month) and we might want to determine that average annual precipitation for each grid cell (i.e. the sum of each cell’s monthly values). In this case we would simply need to sum the twelve rasters, which would simply be done as
raster_1 + raster_2 + ... + raster_12
Let’s work with an example involving just two rasters. Here we will load in a raster of the average daytime temperatures for March and a raster of the average nighttime temperatures for March (units are degrees C).
raster_day_temps <- raster("https://github.com/stahlm/stahlm.github.io/raw/master/ENS_215/Winter_2022/Lectures/Data/Day_temp_2001_march.tif")
raster_day_temps[raster_day_temps == 99999] <- NA
raster_night_temps <- raster("https://github.com/stahlm/stahlm.github.io/raw/master/ENS_215/Winter_2022/Lectures/Data/Night_temp_2001_march.TIFF")
raster_night_temps[raster_night_temps == 99999] <- NA
# Data from: https://neo.sci.gsfc.nasa.gov/view.php?datasetId=MOD_LSTD_CLIM_M
# Data from: https://neo.sci.gsfc.nasa.gov/view.php?datasetId=MOD_LSTN_CLIM_M
As an environmental scientist you might be interested in knowing the daily temperature range in March experienced at all locations on Earth as this would have many implications including: ecological/habitat suitability implications; energy requirements (e.g. heating and cooling demand); provide an indication of soil moisture (wetter areas would experience less fluctuation between day and night,…).
Go ahead and calculate the difference between day and night temperature and save this to a new raster called raster_temps_diff
raster_temps_diff <- raster_day_temps - raster_night_temps
Here’s what your results should look like
If you want to see how the data are distributed you can use the hist()
function from the raster
package. Here’s how the data distribution should look
Take a few minutes to think about what you are observing. Discuss what you see with your neighbors. I’ll come around and check-in with you too. In particular think about
Now let’s learn how to re-classify raster data. This is where we reassign values to new values based on some criteria. In the case of our temperature range data, we might want to reclassify the data based on the intra-daily temperature fluctuations as follows:
To reclassify the raster we will need to define a matrix that has the upper and lower bounds of each class and the value that we will like to assign cells that fall into that class. Let’s define our reclassification raster below.
reclass_df <- c(-Inf, 3, 1,
3, 6, 2,
6, 15, 3,
15, Inf, 4
)
reclass_m <- matrix(reclass_df,
ncol = 3,
byrow = TRUE)
reclass_m
## [,1] [,2] [,3]
## [1,] -Inf 3 1
## [2,] 3 6 2
## [3,] 6 15 3
## [4,] 15 Inf 4
Now, we will use the reclassify()
function to reclassify the raster.
raster_temps_diff_reclass <- raster::reclassify(raster_temps_diff,
reclass_df)
Plotting the raster, you can see that our reclassification works – we now have four values (classes) in our raster.
raster::plot(raster_temps_diff_reclass,
col = terrain.colors(4))
In many cases we will want to obtain the values of a raster at specified locations/points. To do this we can use the extract()
function from the raster
package. Let’s test this out by extracting the land surface elevation values from our global elevation raster at the location of every capital city.
For we will need to load in a dataset that has world cities and their coordinates. This information is found in the world.cities
dataset from the maps
package.
cities_world <- world.cities
The world.cities dataset has many cities in addition to the world capitals, so let’s use the code below to filter the dataset to just include the world capitals.
cities_world <- cities_world %>%
filter(capital == 1)
If you hover your mouse over the cities_world
object in your Environment pane you will see that it is a data.frame object. In order to use it to extract values from a raster we will need to convert it to a simple features data frame. We can do this by using the st_as_sf()
function from the sf
package. This function is easy to use - all we need to do is give it the object to convert and then tell the function where name of the columns that have the longitude and latitude data. Run the code below and then confirm that your cities_world_sf
is in fact an sf data frame.
cities_world_sf <- cities_world %>%
st_as_sf(coords = c("long", "lat"))
Now that we have an sf object with the world capitals, let’s use the raster::extract()
function to extract the elevations at each of those locations.
The extract()
function takes the raster to use as the first argument and it takes the spatial object that has the locations of interest as the second argument. Note that in the code below, we are going to add these extracted elevations as a new column called elev__m
to our cities_world_sf
object.
cities_world_sf$elev__m <- raster::extract(raster_world_elev, cities_world_sf)
Now that you’ve run the code, take a look at the cities_world_sf
object to confirm that you extracted the elevations you wanted. Note that some of the cities will have NA
values (this is because the resolution of our elevation raster is fairly low and thus some cities near the coast fell on pixels that have NA
values in the land elevation raster).
Note that there are a number of other features/functionality that you can implement when extracting raster data. For example, you can extract all the values within a buffer (radius) of each point, you can extract values along a lines feature, etc. We won’t cover those approaches today, but you can always ask me how to do this if you are interested and/or you can find great examples in the book Geocomputation with R or elsewhere online.
You will often want to clip (crop) a raster to a specified area. For instance you might want to take the global elevation raster that we’ve been using and clip it to just the United States. To clip (crop) a raster you will need to specify bounds to clip to.
We generally do this by first masking the raster (using mask()
). This keeping values for only the raster cells that fall within the polygon defined by the shape we would like to ultimately crop to – all cells that fall outside the polygon are set to NA. Then we crop()
the raster so that is extent is within the bounding box of the desired region (i.e. rectangle that bounds the polygon we are cropping to).
Let’s go ahead and use the mask()
function from the raster package. However, we will first load in a spatial object that has the borders of the South Africa as we will use this to set our mask and crop bounds.
borders_data <- spData::world # borders of all countries
# Get just South Africa's borders
borders_data_southaf <- borders_data %>%
filter(name_long == "South Africa")
Think back to what you learned earlier this week and use the tmap
package to make a quick map of borders_us
in order to confirm that you have the borders you want. You should get something that looks like this.
Now we will go ahead and mask our elevation data to this region.
raster_elev_southaf <- raster::mask(raster_world_elev,
borders_data_southaf)
Let’s plot the masked raster
raster::plot(raster_elev_southaf)
You can see that we now only have raster data that falls within South Africa. However, the extent of the raster is still the extent of the original global raster. We now need to crop()
the raster to the bounds of South Africa.
raster_elev_southaf <- raster::crop(raster_elev_southaf,
borders_data_southaf)
raster::plot(raster_elev_southaf)
Now we have our masked and clipped raster. Note that the hole in the SW of the country is note an error, it is where the country of Lesotho is (this little detail makes South Africa a nice example to demonstrate masking and cropping).
tmap
So far we’ve just been mapping the rasters using the very basic (but easy and quick) plot()
function. Now let’s learn how to use tmap
, which you learned earlier this week to map rasters.
As you already know the basics of tmap
, adding a raster is very easy. We simply use the tm_raster()
function.
Let’s make a map with the borders of every country and the global elevation raster
tm_shape(raster_world_elev) +
tm_raster() +
tm_shape(borders_data) +
tm_borders() +
tm_layout(legend.position = c("LEFT", "center"))
## stars object downsampled to 1414 by 707 cells. See tm_shape manual (argument raster.downsample)
If you want to change how the breaks in the color scheme are defined you can change use the style
argument in tm_raster()
to specify from a number of built in options for automatically setting breaks. You can also manually define the breaks. Let’s try using the cont
option to use a continuous color scale (as opposed to the discrete one used above).
tm_shape(raster_world_elev) +
tm_raster(style = "cont") +
tm_shape(borders_data) +
tm_borders() +
tm_layout(legend.position = c("LEFT", "center"))
## stars object downsampled to 1414 by 707 cells. See tm_shape manual (argument raster.downsample)
Check out the help file for tm_raster
to see additional options.
We’ve just scratched the surface with what you can do with the analysis of raster data, though this should give you the tools to many important analysis and to learn additional techniques outside of class.
You are interested in studying high elevation habitats and you would like to identify the areas of the world that would be ideal for your research. Make a world map where that shows the borders of every country with the the countries filled in with a solid color. Overlain on this map show areas of the world that have an elevation > 1500 m.
If you have time, use the skills you learned earlier this week to make your map look nice. Also try making some additional maps with the world elevation data if you have time.
Here’s what my map looks like
## stars object downsampled to 1414 by 707 cells. See tm_shape manual (argument raster.downsample)