We learned about if/else/else if statements in an earlier lecture. Now we are going to learn about loops, which are another type of control structure.
Loops allow us to work through the items in an object and apply operations to each of them without having to repeat our code. For instance, we may have a list of names that we would like to print one-by-one to the screen. We could write out a print() statement for each item in the list, but that becomes tedious and error-prone as the list grows. In a case like this we can use a loop instead.
Remember to consult your R cheatsheets (for today’s lecture, the Base R cheatsheet is particularly helpful).
Let’s load the tidyverse package, just in case you want to use some of its functionality.
library(tidyverse)
Ok, now let’s try out a basic example, so that you learn the structure of for loops.
for(i in 1:10){
  print(i)
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
Make sure you understand what’s going on above. Now modify the loop so that it prints out \(i^2\) on each loop iteration.
Pay very close attention to the syntax of the loop. If the syntax is incorrect you will get an error.
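To see each piece of the syntax in a slightly different setting, here is a small sketch (not part of the original exercise) that adds up the numbers 1 through 10. The variable name `total` is just an illustrative choice.

```r
total <- 0              # start the running total at zero
for (i in 1:10) {       # i takes the values 1, 2, ..., 10 in turn
  total <- total + i    # the body runs once for each value of i
}                       # the closing brace ends the loop body
print(total)
```

Note the three parts: the loop variable and the values it takes inside the parentheses, the opening curly brace, and the closing curly brace. Leaving out any of these typically produces a syntax error.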
Ok, now let’s try looping over a list of some majors available at Union. Here’s the list.
majors_union <- c("Environmental Science","Geology","English",
                  "Chemistry","Math","History","Computer Science")
Now we would like to print this list out.
for(i_major in 1:7){
  print(majors_union[i_major])
}
## [1] "Environmental Science"
## [1] "Geology"
## [1] "English"
## [1] "Chemistry"
## [1] "Math"
## [1] "History"
## [1] "Computer Science"
You can see that we’ve looped over the majors_union vector using an index variable that started at 1 and increased by one on each iteration of the loop. The loop ran for 7 iterations (which we specified at the start of the loop) and then stopped.
Take a look at the i_major variable. Do you see what is happening to i_major on each iteration of the loop? Note how we use i_major to access the ith element of majors_union on each loop iteration.
You can also loop through a vector directly, using the elements of majors_union as the values that the loop variable takes.
for(i_major in majors_union){
  print(i_major)
}
## [1] "Environmental Science"
## [1] "Geology"
## [1] "English"
## [1] "Chemistry"
## [1] "Math"
## [1] "History"
## [1] "Computer Science"
The above loop steps through each element in the majors_union vector, moving to the next element on each loop iteration. On each iteration, i_major holds the current element.
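If you need the index as well as the element, one option is to build the index with `seq_along()` rather than typing the length by hand. The sketch below uses a short, hypothetical vector called `majors_example` so that it runs on its own.

```r
# Hypothetical example vector (a shortened stand-in for majors_union)
majors_example <- c("Geology", "English", "Math")

# seq_along() returns 1, 2, ..., length(majors_example)
for (i_major in seq_along(majors_example)) {
  print(paste(i_major, majors_example[i_major]))
}

# With an empty vector, seq_along() gives no indices at all,
# so a loop over it would simply not run
empty_vec <- character(0)
print(length(seq_along(empty_vec)))
```

This way the loop adapts automatically if elements are added to or removed from the vector.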
In some situations we’ll want to add a counter variable to our loop. This becomes particularly useful when we start to nest if statements inside of our loops (you’ll learn about this later in this lesson). Let’s add a counter variable to the loop we created above. This variable will keep track of how many times the loop is cycled through and thus will tell us how many majors are in our majors_union variable.
counter_majors <- 0 # Initialize the variable to zero
for(i_major in majors_union){
  print(i_major)
  counter_majors <- counter_majors + 1 # add one to the counter every time the loop is run
}
## [1] "Environmental Science"
## [1] "Geology"
## [1] "English"
## [1] "Chemistry"
## [1] "Math"
## [1] "History"
## [1] "Computer Science"
Take a look at the counter_majors variable. Does the value make sense?
Make two vectors. One vector should have the names of the months (you can type out the vector of names, or you can use a vector built into R that already has the names; a quick Google search should reveal how to do this). The other vector should have the number of days in each month. Create a loop that prints out messages like the ones below:
January has 31 days
February has 28 days
March has…
Hint: use the paste() function to combine text. You will nest the paste() function inside your print() statement.
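As a rough illustration of the hint, the sketch below nests paste() inside print() while looping over two parallel vectors. It deliberately uses different data (a few weekday names and their letter counts) so the exercise is left for you; `day_names` and `num_letters` are made-up names.

```r
# Two parallel vectors: element i of one belongs with element i of the other
day_names   <- c("Monday", "Wednesday", "Sunday")
num_letters <- nchar(day_names)   # number of characters in each name

for (i_day in seq_along(day_names)) {
  # paste() joins its arguments into one string, separated by spaces
  print(paste(day_names[i_day], "has", num_letters[i_day], "letters"))
}
```

The key idea is that the same index is used to pull the matching element out of each vector.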
# Your code here
Challenge: Once you’ve completed the exercise above, create a new code block that has the same loop, but this time, for the month of February, print a statement that says “February has 28 days (29 on leap year)”. You can accomplish this by nesting an if/else statement in your loop.
Talk this over with your neighbors if you get stuck.
# Your code here
We can nest loops inside of other loops. This is often very handy when we want to loop over multiple related variables. Let’s take a look at a simple example.
We have a 5 x 5 matrix with the numbers 1 to 25 in it. First take a look at the x_mat matrix to make sure you understand what you’ve got.
x_mat <- matrix(1:25, 5, 5, byrow = TRUE)
Now let’s print each element out row-by-row (i.e. start in row 1 and print each element out one-by-one, then go to row 2 and do the same,…)
for(i_row in 1:5){
  for(j_col in 1:5){
    print(x_mat[i_row, j_col])
  }
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11
## [1] 12
## [1] 13
## [1] 14
## [1] 15
## [1] 16
## [1] 17
## [1] 18
## [1] 19
## [1] 20
## [1] 21
## [1] 22
## [1] 23
## [1] 24
## [1] 25
Look at the structure of the code above and make sure you understand what is going on.
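Nested loops can also fill in a matrix rather than just read from one. The following is a minimal sketch that builds a small multiplication table; the names `n_size` and `mult_table` are hypothetical and chosen only for this illustration.

```r
n_size <- 3                                        # size of the table
mult_table <- matrix(0, nrow = n_size, ncol = n_size)

for (i_row in seq_len(n_size)) {        # outer loop: one pass per row
  for (j_col in seq_len(n_size)) {      # inner loop: runs fully for each row
    mult_table[i_row, j_col] <- i_row * j_col
  }
}
print(mult_table)
```

For every single value of the outer variable, the inner loop runs through all of its values before the outer loop moves on.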
Do you see how I “hard-coded” the dimensions of the matrix into the loop (i.e. specified that there are 5 rows and 5 columns)? This is generally bad practice, as it makes your code very inflexible. Imagine we are loading in a dataset that is stored in a matrix and we don’t know the dimensions beforehand (or we want to load in different datasets that have different dimensions). If we “hard-code” the dimensions into the loop, then our code will throw an error if the dataset has fewer than 5 rows or fewer than 5 columns, and it will silently skip part of the matrix if the dataset has more than 5 rows or more than 5 columns.
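To make the failure concrete, here is a small sketch that runs the hard-coded loop on a hypothetical 3 x 4 matrix (`small_mat`). The loop is wrapped in tryCatch() only so that the error message can be captured and printed instead of stopping the script.

```r
small_mat <- matrix(1:12, nrow = 3, ncol = 4)   # hypothetical 3 x 4 matrix

loop_result <- tryCatch({
  for (i_row in 1:5) {            # hard-coded: assumes 5 rows
    for (j_col in 1:5) {          # hard-coded: assumes 5 columns
      small_mat[i_row, j_col]     # fails once an index is out of range
    }
  }
  "no error"
}, error = function(e) conditionMessage(e))

print(loop_result)   # the captured error message
```

The loop fails as soon as it asks for a column that does not exist, which is the inflexibility described above.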
We can fix this issue by getting the dimensions of the data and storing them in variables that are used in the loop.
Recreate the loop from the example above, but specify the number of rows and columns in the loop using variables. (Hint: you can use the dim() function to determine the dimensions of an object, or the nrow() and ncol() functions.)
# Your code here
While loops begin by testing a condition. If the condition is TRUE, the body of the loop is executed. The loop keeps executing until the test condition is no longer TRUE. While loops have many uses, but a note of caution: a while loop will run forever if the test condition never becomes FALSE.
Let’s take a look at a simple example of a while loop. Before you run this code, predict the first and last value that will be printed to your console.
x_val <- 30 # initialize x_val
while(x_val > 10){
  print(x_val)
  x_val <- x_val - 1 # on each loop iteration, subtract 1 from x_val
}
## [1] 30
## [1] 29
## [1] 28
## [1] 27
## [1] 26
## [1] 25
## [1] 24
## [1] 23
## [1] 22
## [1] 21
## [1] 20
## [1] 19
## [1] 18
## [1] 17
## [1] 16
## [1] 15
## [1] 14
## [1] 13
## [1] 12
## [1] 11
Like you do with all of your code, pay careful attention to the syntax used when creating a while loop.
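One common way to guard against an accidental infinite loop is to count iterations and stop once a cap is reached. The sketch below is only one possible approach; `n_iter` and `max_iter` are hypothetical names, and the cap of 100 is an arbitrary choice.

```r
x_doubling <- 1     # value that is doubled on each iteration
n_iter <- 0         # counts how many times the loop has run
max_iter <- 100     # safety cap (an arbitrary choice for this sketch)

while (x_doubling < 1000) {
  x_doubling <- x_doubling * 2
  n_iter <- n_iter + 1
  if (n_iter >= max_iter) {   # stop even if the condition is still TRUE
    warning("Stopping: reached max_iter")
    break
  }
}
print(x_doubling)
print(n_iter)
```

Here the test condition becomes FALSE on its own long before the cap is hit, but the cap would end the loop if a mistake in the body kept the condition TRUE.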
Create your own while loop and test it out.
# Your code here
Control structures can be nested within one another. This allows for even greater control in your programming. For example, you can nest an if statement within a for loop.
Let’s take a look at an example. In this example let’s load in air temperature data in Albany for November 2018.
library(readr)
Alb_temps <- read_csv("https://stahlm.github.io/ENS_215/Data/Albany_Temperatures_Nov_2018.csv",
                      skip = 3)
## Rows: 30 Columns: 4
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## dbl (4): Day, Max, Avg, Min
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Now that you’ve loaded in the data, take a look at it. The data frame has the maximum, average, and minimum temperature (in deg F) for each of the days in November 2018. Make sure you understand each of the variables (columns) before moving ahead.
Let’s loop over each day and determine the freezing risk (imagine you are storing something outside and want to know if it was at risk of freezing).
num_days <- nrow(Alb_temps) # store the number of rows (days) to the num_days variable
freeze_temp <- 32 # water freezing temperature in degrees F
for(i_day in 1:num_days){
  if(Alb_temps$Avg[i_day] > freeze_temp){
    print(paste("On November", Alb_temps$Day[i_day], ": Low risk of freezing"))
  } else {
    print(paste("On November", Alb_temps$Day[i_day], ": High risk of freezing"))
  }
}
## [1] "On November 1 : Low risk of freezing"
## [1] "On November 2 : Low risk of freezing"
## [1] "On November 3 : Low risk of freezing"
## [1] "On November 4 : Low risk of freezing"
## [1] "On November 5 : Low risk of freezing"
## [1] "On November 6 : Low risk of freezing"
## [1] "On November 7 : Low risk of freezing"
## [1] "On November 8 : Low risk of freezing"
## [1] "On November 9 : Low risk of freezing"
## [1] "On November 10 : Low risk of freezing"
## [1] "On November 11 : Low risk of freezing"
## [1] "On November 12 : Low risk of freezing"
## [1] "On November 13 : Low risk of freezing"
## [1] "On November 14 : High risk of freezing"
## [1] "On November 15 : High risk of freezing"
## [1] "On November 16 : Low risk of freezing"
## [1] "On November 17 : Low risk of freezing"
## [1] "On November 18 : Low risk of freezing"
## [1] "On November 19 : Low risk of freezing"
## [1] "On November 20 : Low risk of freezing"
## [1] "On November 21 : High risk of freezing"
## [1] "On November 22 : High risk of freezing"
## [1] "On November 23 : High risk of freezing"
## [1] "On November 24 : High risk of freezing"
## [1] "On November 25 : Low risk of freezing"
## [1] "On November 26 : Low risk of freezing"
## [1] "On November 27 : Low risk of freezing"
## [1] "On November 28 : Low risk of freezing"
## [1] "On November 29 : Low risk of freezing"
## [1] "On November 30 : Low risk of freezing"
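The same pattern can be combined with a counter variable, as introduced earlier, to count how many days meet a condition instead of printing a message for each one. The sketch below uses a short, made-up vector of temperatures (`example_avg`) rather than the Albany data so that it runs on its own.

```r
example_avg <- c(45, 38, 31, 28, 33, 30)  # made-up daily averages, deg F
freeze_limit <- 32                        # water freezing temperature, deg F
counter_high_risk <- 0                    # initialize the counter to zero

for (i_day in seq_along(example_avg)) {
  # only add to the counter when the day is at or below freezing
  if (example_avg[i_day] <= freeze_limit) {
    counter_high_risk <- counter_high_risk + 1
  }
}
print(counter_high_risk)
```

Because the counter is updated inside the if statement, it only increases on the days that satisfy the condition.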
[Figure omitted. Source: Wikipedia]
# Your code here
Load in the daily temperature data for Albany International Airport and, for each year from 1939 through 2021, determine the number of days where the minimum temperature was less than or equal to 32 degrees F. Save your results to a data frame.
The data can be loaded with the following code:
df_met <- read_csv("https://github.com/stahlm/stahlm.github.io/raw/master/ENS_215/Data/Albany_GHCND_2840632.csv")
Ask me and/or discuss with your neighbors if you have any questions or want to go over the approach. FYI, there are many ways that you might implement this solution.
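As one possible starting point, the sketch below shows the general idea of filling a results data frame inside a loop. It uses a tiny made-up data frame, and the column names `year` and `tmin` are assumptions for illustration only; check the actual column names in df_met before adapting it.

```r
# Made-up data: the column names here are assumptions, not those of df_met
df_example <- data.frame(
  year = c(2000, 2000, 2000, 2001, 2001, 2002),
  tmin = c(30, 35, 32, 40, 28, 50)
)

years <- unique(df_example$year)

# Create the results data frame up front, with one row per year
df_results <- data.frame(year = years, n_freeze_days = NA_integer_)

for (i_year in seq_along(years)) {
  is_this_year <- df_example$year == years[i_year]   # rows for this year
  # sum() of a logical vector counts the TRUE values
  df_results$n_freeze_days[i_year] <- sum(df_example$tmin[is_this_year] <= 32)
}
print(df_results)
```

Creating the results data frame before the loop, and filling in one row per iteration, keeps the loop body short and avoids growing the data frame as you go.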
Note: The daily temperature data for Albany was obtained through the National Oceanic and Atmospheric Administration’s (NOAA) Global Historical Climatology Network daily (GHCNd) database. This is an excellent resource for daily meteorological records for more than 100,000 sites around the world, with many sites having data going back many decades or more.
If you have any remaining time you should continue practicing with conditional programming (if/else) and control flow (for loops, while loops). Come up with some of your own ideas that you would like to test out. Also you can look through the Base R cheatsheet and make sure you are familiar with the topics presented there.