The Quarto Notebooks that we work in allow us to incorporate text, code, and output all in one place. This is a huge benefit when you want to create a report from your work in R. Quarto Notebooks are excellent for producing computationally reproducible research.
A Quarto Notebook (.qmd) is a blend of R (which is the code portion of your Notebook) and Markdown (the text portion). Markdown is simply a system for formatting document features (e.g. text, margins, bullets, table of contents,…).
There are tons of formatting options you can specify when working in Quarto, and this allows us to create attractive and easy-to-read documents. We’ll learn a few basics today that will greatly improve how your Quarto Notebooks look when you output your reports.
Note
If you have coded in RStudio before you may have used R Markdown Notebooks. Quarto Notebooks are very similar to R Markdown Notebooks, though Quarto is a newer version that has some added benefits. In this class we will aim to use Quarto Notebooks throughout the term so that we can take advantage of some of these newer features. Furthermore, Quarto Notebooks support the use of many other programming languages in addition to R.
Basic Elements of Quarto files
The Quarto cheatsheet and the R Markdown cheatsheet that I handed out have examples similar to those below, as well as more advanced topics that we won’t cover today.
File Header
All of your Quarto Notebooks have a file header (also called a YAML Header). This is required for specifying how your file will look and what format it will be output to when you generate your reports.
The header is at the top of the Notebook and has three dashes --- at the top and bottom.
Here’s the header that I used on this current Notebook. These settings specified how I want my file to look when it is rendered to a report.
Code
---
title: "R Programming and Markdown Basics"
author: "_Mason Stahl_ (ENS-215)"
date: "2026-01-08"
date-format: "MMMM D, YYYY"
format:
  html:
    code-fold: show
    code-tools:
      source: false
    df-print: paged
    theme:
      light: journal
      dark: darkly
    page-layout: full
    toc: true
    toc-float: true
---
Prior to rendering your document you can view how the formatted HTML file will look by using the visual tab.
When you are done and ready to generate your report, you can Render your document to an HTML file by clicking the Render button at the top of your editor window.
R Code
As you already know, we can include R Code in our Quarto Notebooks. We can add code blocks by hitting Ctrl + Alt + i (PC) or Cmd + Option + i (Mac).
You can also generate a code block by typing ```{r} on one line, then hitting Enter and typing ``` on the line below.
Give both of these approaches a try.
Example code block
Code
x <- 10
# comments in a code block are created by putting the hashtag symbol before the comment
x + 5
[1] 15
Remember you can run a code block by hitting Ctrl + Shift + Enter (PC) or Cmd + Shift + Enter (Mac). To see the other Run options you can click the Run dropdown button in the top right of your Editor window.
Markdown Syntax
Since your Quarto Notebook (qmd) file is essentially a plain text file (i.e. you can’t modify how the text looks in your editor like you can in Microsoft Word), you need to use special characters to specify how your text should be formatted in your output report.
Section headers/titles like the ones you see separating the sections of this document are created by putting a # at the start of a line of text. To create smaller section headers, add more hashtags to the start of the line. For instance, ## will create a smaller section header and ### will create an even smaller one.
Bold font is created by putting ** at the start and end of the section of text you want in bold, with no spaces between the asterisks and the text. For instance, you would type **text I want in bold**.
Italics are created with either _text inside is in italics_ or *text inside is in italics*
To make code show up verbatim, put the ` symbol around the text you want to appear verbatim. Note that the symbol is NOT the single quote but is the symbol that appears to the left of the 1 on your keyboard.
Superscripts such as X² are done with ^superscripted text here^. So X² is created by X^2^
Subscripts such as Xᵢ are done with ~subscripted text here~. So Xᵢ is created by X~i~
I can also create bulleted lists using the + at the start of a line.
I can create numbered lists by typing something like this
Code
1. First item
2. Second item
    i) sub-item
    ii) another sub-item
And the list would look like this in my report.
1. First item
2. Second item
    i. sub-item
    ii. another sub-item
Line breaks: to make a line break show up in your formatted document, you need to put TWO SPACES at the end of the line and then hit ENTER. The line break will only show up in your rendered document if the line ends with two spaces.
Exercise
Spend some time testing out the different Markdown formatting options you learned above
R Programming basics
Basic operations and calculations
As you’ve already seen by now you can use R as a calculator. Below is a list of some basic operations.
Code
2 + 1 # Add
[1] 3
Code
15 - 4 # Subtract
[1] 11
Code
9 * 2 # Multiply
[1] 18
Code
3^4 # Exponents
[1] 81
Code
120 / 8 # Divide
[1] 15
Code
5 %% 2 # Modulus
[1] 1
Code
4 > 2 # Greater than
[1] TRUE
Code
2 < 5 # Less than
[1] TRUE
Code
5 <= 5 # Less than or equal
[1] TRUE
Code
8 >= 2 # Greater than or equal
[1] TRUE
Code
2 == 2 # Equality: notice that it is TWO equal signs!
[1] TRUE
Code
5 != 7 # Not equals
[1] TRUE
Note that when you run a code block it is sending the code to the console. You can also type code directly into the console and it will be evaluated. This can be handy for a quick one-off calculation; however, for running many operations we’ll stick to using a Notebook.
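The operators listed above can be combined in a single expression. The short sketch below, using arbitrary numbers chosen only for illustration, shows how R applies the usual order of operations and how parentheses change the grouping.

```r
# Exponents are evaluated first, then multiplication, then addition
result_default <- 2 + 3 * 4^2      # same as 2 + (3 * 16)
result_grouped <- (2 + 3) * 4^2    # parentheses force the addition first

result_default
result_grouped

# Comparison operators return a logical value that can be stored like any other
is_larger <- result_grouped > result_default
is_larger

# %% gives the remainder of a division, %/% gives the whole-number part
17 %% 5
17 %/% 5
```

When in doubt about how an expression will be grouped, adding parentheses costs nothing and makes the intent clear to the reader.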
Assigning values to a variable
Typically we’ll be re-using the results from some calculation, so we’ll want to assign them to a variable. In R we use <- to assign values to objects. So x <- 10 would mean that the object x is assigned a value of 10.
Code
x <- 10 # assign a variable
# to print out the value of x to the console I can simply type out the variable on its own line of code
x
[1] 10
Code
y <- (2 * x) + 5 # you can use mathematical operations and previously declared variables when assigning a new variable
y
[1] 25
Code
z <- x + y + 0.1234
z
[1] 35.1234
Variables can take non-numeric values. The objects below take strings (i.e. text) as their values.
Code
studentName_1 <- "Bob"
studentName_1
[1] "Bob"
Code
studentName_2 <- "Jess"
studentName_2
[1] "Jess"
Notice how I gave the objects descriptive names. Also notice how I used a consistent naming format. You should put thought into how you name objects. This will make your code much easier to read and much faster to write.
Object names cannot begin with a number, contain spaces, or (most) special characters. You may use underscores and periods in object names. Also note that object names are case sensitive.
So if you have an object a, then typing out A would NOT be referring to the object that you named a.
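The naming rules above can be checked with a quick sketch. The object names here are hypothetical and chosen only for illustration; exists() simply reports whether an object with a given name has been created.

```r
# Hypothetical object names used only for this demonstration
sample_depth <- 12.5      # valid: letters and an underscore
sample.depth2 <- 20       # valid: periods and trailing numbers are allowed

# Object names are case sensitive, so the differently-cased name was never created
exists("sample_depth")    # TRUE
exists("Sample_Depth")    # FALSE
```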
Examining your Environment
Now take a look at your Environment tab. You’ll see all of the objects that we’ve assigned thus far. If you want to see all of the objects in your environment you use the ls() function.
Code
ls() # this prints out the names of all of the objects currently in my environment
[1] "studentName_1" "studentName_2" "x" "y"
[5] "z"
To remove an object from your workspace you can use the rm() function
Code
rm(x)
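One way to confirm that rm() did what you expected is to check the environment before and after. This is a minimal sketch that uses a hypothetical object, temp_value, so that nothing from the examples above is disturbed.

```r
# temp_value is a hypothetical object created just for this demonstration
temp_value <- 42
"temp_value" %in% ls()   # TRUE: the object is listed in the environment

rm(temp_value)
"temp_value" %in% ls()   # FALSE: the object has been removed
```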
Refresher Exercise 1:
Create two objects named number_1 and number_2 and give them the values of 2.5 and 10, respectively
Create two more objects named string_1 and string_2, give them any character string that you would like.
Now using number_1, number_2, and the power of math create an object called number_3 that equals 25
Create two more objects whose value is of your choosing
List the objects in your workspace
Remove string_2
Try to add string_1 and number_1. What happens?
Data types and data structures
Everything in R is an object. The data assigned to a given object can be categorized by its data type. Data can be organized into different structures and these structures can often accommodate a mix of different data types.
Data types
Any value stored in a data object can be characterized by its data type.
The basic data types in R are:
| Example | Type |
| --- | --- |
| "a", "swc" | character |
| 2, 15.5 | numeric |
| 2L | integer |
| TRUE, FALSE | logical |
| 1+4i | complex |
| 62 6f 62 | raw |
We will almost always be dealing with character, numeric, and logical data types in this class.
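If you are ever unsure what type a value has, the class() function will tell you. The sketch below runs it on values like those in the table; charToRaw() is used here only as one convenient way to produce raw data.

```r
class("swc")      # "character"
class(15.5)       # "numeric"
class(2L)         # "integer" (the L suffix requests an integer)
class(TRUE)       # "logical"
class(1+4i)       # "complex"

# charToRaw() converts a character string into its underlying bytes
raw_bytes <- charToRaw("bob")
raw_bytes         # 62 6f 62
class(raw_bytes)  # "raw"
```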
In many cases the data you deal with may have missing values or other issues. Values such as missing data (NA), not a number (NaN), and infinity (Inf) will come up from time to time. We’ll learn techniques for dealing with these throughout the term.
Infinity can arise as such
Code
1/0
[1] Inf
Not a number can arise as follows
Code
0/0
[1] NaN
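Base R has helper functions for spotting these special values. The sketch below applies them to a small made-up vector; note that is.na() flags both NA and NaN, while is.nan() flags only NaN.

```r
# A made-up vector containing an ordinary number and each special value
special_values <- c(5, NA, 0/0, 1/0, -1/0)

is.na(special_values)        # TRUE for NA and also for NaN
is.nan(special_values)       # TRUE only for NaN
is.infinite(special_values)  # TRUE for Inf and -Inf
is.finite(special_values)    # TRUE only for ordinary numbers
```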
Getting help (reminder)
If you get stuck, remember that along with me and your classmates, Google can almost always point you in the right direction. Your textbook is also a great resource.
In addition to these resources, R has built-in help files. Let’s practice with these.
To get help you type ?term_of_interest in your console or in your Notebook (and then run the code block) and help will appear in the Help window to the right.
In your console get help for the na.omit() function. Take a minute to look at the help file and understand what it is showing. All help files are similarly formatted.
Try getting help for another function that you are interested in.
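As a sketch of the options, here are a few equivalent ways to look up the same function. The ? shortcut and help() open the same help file, and args() is a quick alternative when you only need to see a function's arguments.

```r
?na.omit          # shorthand for help("na.omit")
help("na.omit")   # the same help file, requested with a function call
args(na.omit)     # prints the function's arguments without opening the help file
```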
Data Structures
Data can be stored in R as a number of different data structures. The structure that you choose to assign data to will depend on the features/characteristics of your data.
The data structures available in base R include:
vector
list
matrix
data frame
factors
tables
Vectors
Common and basic data structure in R.
Can be a vector of character, logical, integer, or numeric data. However, a given vector can only contain one data type.
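A minimal sketch of these points, using made-up values and hypothetical object names: c() combines values into a vector, and mixing types in one call shows what the one-type rule means in practice.

```r
# Hypothetical vectors created with c(), which combines values into a vector
site_depths <- c(1.5, 3.2, 8.0)             # numeric
site_names  <- c("north", "south", "east")  # character
site_dry    <- c(TRUE, FALSE, FALSE)        # logical

class(site_depths)
length(site_names)

# Mixing types forces R to convert everything to a single type
mixed <- c(1.5, "north", TRUE)
mixed          # every element is now a character string
class(mixed)   # "character"
```

This silent conversion is worth watching for: a single stray text entry in a column of numbers will turn the whole vector into character data.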
Try taking the sum of a_missing by using the sum() function. Do you see any issues?
In some cases we will want to remove missing data entries so that we can just examine the entries where we have values. Let’s remove the NAs from a_missing and assign the new data to a new object called a_cleaned
Code
a_cleaned <- na.omit(a_missing)
Look at a_cleaned. Does it look like everything worked?
Now try taking the sum of a_cleaned.
You can also use na.exclude() to remove missing values.
Code
na.exclude(a_missing) # similar to na.omit(), but behaves differently with some functions
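The steps above can be sketched end to end. The a_missing vector below is a hypothetical stand-in with invented values, not the one from class. The sketch also shows the na.rm argument of sum(), which skips missing values without creating a new object.

```r
# Hypothetical stand-in for a_missing; the values are invented for illustration
a_missing <- c(4, 8, NA, 15, NA, 23)

sum(a_missing)                 # NA: a single missing value makes the whole sum missing
sum(a_missing, na.rm = TRUE)   # 50: na.rm = TRUE skips the NAs inside the calculation

a_cleaned <- na.omit(a_missing)
length(a_cleaned)              # 4: the two NAs were dropped
sum(a_cleaned)                 # 50
```

Which approach to use is a judgment call: na.rm = TRUE leaves the original data untouched, while na.omit() gives you a cleaned object to reuse.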
Think of some more examples where you might use factors. Can you think of both ordered and unordered examples?
Did you encounter any variables in your first lab that could be treated as a factor?
Data frames
We are going to be using these all the time in this class and in data analysis in general. They are similar in structure to a spreadsheet that you might open in Excel.
Data frames are made up of rows and columns. Each column is a vector and all columns must be of the same length. Basically anything that you save as a delimited text or Excel file (.csv, .xls, or .xlsx) can be read into R as a data frame.
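As a small sketch of reading a delimited file, the block below first writes a made-up table to a temporary .csv file so that it runs on its own, then reads it back with read.csv(). The object and column names are hypothetical. Excel files generally need an add-on package (readxl is one common choice), which is not shown here.

```r
# Write a small made-up table to a temporary .csv file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(site = c("A", "B", "C"), depth_m = c(1.5, 3.2, 8.0)),
          csv_path, row.names = FALSE)

# read.csv() reads the delimited text file back in as a data frame
well_data <- read.csv(csv_path)
class(well_data)   # "data.frame"
well_data
```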
Data frames have a number of important attributes that you’ll interact with, in particular column names, row names, and dimensions.
We can load in data to a data frame or create one from scratch. We’ll create one below using the data.frame() function.
Code
numbers <- c(1:26, NA)
lettersNew <- c(NA, letters) # letters is a special object available from base R
logical <- c(rep(TRUE, 13), NA, rep(FALSE, 13))
examp_df <- data.frame(lettersNew, numbers, logical, stringsAsFactors = FALSE)
To look at the first few rows and last few rows
Code
head(examp_df) # first rows
Code
tail(examp_df) # last rows
To access a variable (column) from a data frame you use the $ operator
Code
examp_df$lettersNew # access the lettersNew variable
[1] NA "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
[20] "s" "t" "u" "v" "w" "x" "y" "z"
Try accessing some other variables from this data frame
You can also access a data frame by specifying the rows and columns of interest. We use bracket notation [] to do this. You specify the row(s) and then the column(s) of interest within the bracket.
Code
examp_df[2,3] # access the data in row 2 and column 3
[1] TRUE
To access all of the indices in a row or column, leave the index blank
Code
examp_df[2,] # access the data across all of the columns of row 2
Can you access all of the rows of column 3?
Once you’ve done that, assign this subset of the data to a new object called examp_df_subset
What data type is examp_df_subset?
To access a range of rows and/or columns you can use the : operator in your indexing statement
Code
examp_df[1:4,2:3] # access the data found in rows 1 through 4 and columns 2 through 3
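Bracket indexing accepts more than positions and ranges. The sketch below rebuilds examp_df so that it runs on its own, then shows three further variations: selecting columns by name, picking rows that are not next to each other with c(), and dropping a column with a negative index.

```r
# Rebuild the example data frame so this block runs on its own
numbers <- c(1:26, NA)
lettersNew <- c(NA, letters)
logical <- c(rep(TRUE, 13), NA, rep(FALSE, 13))
examp_df <- data.frame(lettersNew, numbers, logical, stringsAsFactors = FALSE)

# Columns can be selected by name instead of by position
examp_df[1:4, c("numbers", "logical")]

# c() picks out rows that are not next to each other
examp_df[c(1, 14, 27), ]

# A negative index drops the given rows or columns instead of keeping them
examp_df[1:4, -1]
```

Selecting by name tends to be more robust than selecting by position, since the code keeps working if the column order changes.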
Access the data in rows 10:20 and all of the columns in examp_df
Access only the even rows in columns 1 and 2 of examp_df
Below are some other useful functions for examining data frames
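The specific functions intended here are not shown, so the following is a sketch of base R functions commonly used to examine a data frame, applied to a rebuilt copy of examp_df. Treat the selection as an assumption rather than the original list.

```r
# Rebuild the example data frame so this block runs on its own
numbers <- c(1:26, NA)
lettersNew <- c(NA, letters)
logical <- c(rep(TRUE, 13), NA, rep(FALSE, 13))
examp_df <- data.frame(lettersNew, numbers, logical, stringsAsFactors = FALSE)

dim(examp_df)      # 27 3: number of rows, then number of columns
nrow(examp_df)     # 27
ncol(examp_df)     # 3
names(examp_df)    # the column names
str(examp_df)      # compact overview: each column's type and first few values
summary(examp_df)  # per-column summaries (numeric and logical columns also report NAs)
```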
$letters
[1] "x" "y" "z"
$animals
[1] "cat" "dog" "bird" "fish"
$numbers
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
[19] 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[37] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
[55] 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
[73] 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
[91] 91 92 93 94 95 96 97 98 99 100
$df
lettersNew numbers logical
1 <NA> 1 TRUE
2 a 2 TRUE
3 b 3 TRUE
4 c 4 TRUE
5 d 5 TRUE
6 e 6 TRUE
7 f 7 TRUE
8 g 8 TRUE
9 h 9 TRUE
10 i 10 TRUE
11 j 11 TRUE
12 k 12 TRUE
13 l 13 TRUE
14 m 14 NA
15 n 15 FALSE
16 o 16 FALSE
17 p 17 FALSE
18 q 18 FALSE
19 r 19 FALSE
20 s 20 FALSE
21 t 21 FALSE
22 u 22 FALSE
23 v 23 FALSE
24 w 24 FALSE
25 x 25 FALSE
26 y 26 FALSE
27 z NA FALSE
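The printed output above has the form R uses when it prints a list: each element appears under its own $name. As a hedged sketch, a list with the same element names and values could be built as follows. The object name examp_list is hypothetical, and examp_df is rebuilt so the block runs on its own.

```r
# Rebuild the example data frame so this block runs on its own
numbers <- c(1:26, NA)
lettersNew <- c(NA, letters)
logical <- c(rep(TRUE, 13), NA, rep(FALSE, 13))
examp_df <- data.frame(lettersNew, numbers, logical, stringsAsFactors = FALSE)

# examp_list is a hypothetical name; the element names follow the printed output
examp_list <- list(
  letters = c("x", "y", "z"),
  animals = c("cat", "dog", "bird", "fish"),
  numbers = 1:100,
  df      = examp_df
)

names(examp_list)             # "letters" "animals" "numbers" "df"
examp_list$animals            # access an element by name with $
examp_list[["numbers"]][1:5]  # double brackets return the element itself
class(examp_list$df)          # "data.frame"
```

Unlike a vector, a list can hold elements of different types and lengths, including a whole data frame.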
Exercises
Create a vector named vec_seq that goes from 0 to 99 by 1. Print the vector results to the console using the print() function.
Create another vector named vec_fracs with the following sequence 0/1, 1/2, 2/3, 3/4, 4/5,…,99/100. Print the vector results to the console.
Access every other element of vec_fracs starting with the 2nd element and print this subset to the console. Thus you would access elements 2, 4, 6, 8,…,100.
Create a character vector that has five first names. Create another vector that has five last names. Then create a third vector that has the first names listed in the first five elements and the last names listed in the last five elements.
Now create a vector that combines the first and last names, however each entry should be in the format Lastname, Firstname. Hint: look at the help for the paste() function to see how you might do this.