23-Jan-2023
One of the best tools you have for interpreting and understanding your data is your eyes!
A visual representation of your data can reveal patterns and interesting features that would have been difficult or impossible to identify by looking at a data table.
\(Graphics + Eyes + Brain = Understanding\)
What is the relationship between life expectancy and per capita GDP?
You have 30 seconds to assess the relationship. GO!!
country | gdpPercap | lifeExp |
---|---|---|
Afghanistan | 974.5803 | 43.828 |
Albania | 5937.0295 | 76.423 |
Algeria | 6223.3675 | 72.301 |
Angola | 4797.2313 | 42.731 |
Argentina | 12779.3796 | 75.320 |
Australia | 34435.3674 | 81.235 |
Austria | 36126.4927 | 79.829 |
Bahrain | 29796.0483 | 75.635 |
Bangladesh | 1391.2538 | 64.062 |
Belgium | 33692.6051 | 79.441 |
Benin | 1441.2849 | 56.728 |
Bolivia | 3822.1371 | 65.554 |
Bosnia and Herzegovina | 7446.2988 | 74.852 |
Botswana | 12569.8518 | 50.728 |
Brazil | 9065.8008 | 72.390 |
Bulgaria | 10680.7928 | 73.005 |
Burkina Faso | 1217.0330 | 52.295 |
Burundi | 430.0707 | 49.580 |
Cambodia | 1713.7787 | 59.723 |
Cameroon | 2042.0952 | 50.430 |
Canada | 36319.2350 | 80.653 |
Central African Republic | 706.0165 | 44.741 |
Chad | 1704.0637 | 50.651 |
Chile | 13171.6388 | 78.553 |
China | 4959.1149 | 72.961 |
Colombia | 7006.5804 | 72.889 |
Comoros | 986.1479 | 65.152 |
Congo, Dem. Rep. | 277.5519 | 46.462 |
Congo, Rep. | 3632.5578 | 55.322 |
Costa Rica | 9645.0614 | 78.782 |
Cote d’Ivoire | 1544.7501 | 48.328 |
Croatia | 14619.2227 | 75.748 |
Cuba | 8948.1029 | 78.273 |
Czech Republic | 22833.3085 | 76.486 |
Denmark | 35278.4187 | 78.332 |
Djibouti | 2082.4816 | 54.791 |
Dominican Republic | 6025.3748 | 72.235 |
Ecuador | 6873.2623 | 74.994 |
Egypt | 5581.1810 | 71.338 |
El Salvador | 5728.3535 | 71.878 |
Equatorial Guinea | 12154.0897 | 51.579 |
Eritrea | 641.3695 | 58.040 |
Ethiopia | 690.8056 | 52.947 |
Finland | 33207.0844 | 79.313 |
France | 30470.0167 | 80.657 |
Gabon | 13206.4845 | 56.735 |
Gambia | 752.7497 | 59.448 |
Germany | 32170.3744 | 79.406 |
Ghana | 1327.6089 | 60.022 |
Greece | 27538.4119 | 79.483 |
Guatemala | 5186.0500 | 70.259 |
Guinea | 942.6542 | 56.007 |
Guinea-Bissau | 579.2317 | 46.388 |
Haiti | 1201.6372 | 60.916 |
Honduras | 3548.3308 | 70.198 |
Hong Kong, China | 39724.9787 | 82.208 |
Hungary | 18008.9444 | 73.338 |
Iceland | 36180.7892 | 81.757 |
India | 2452.2104 | 64.698 |
Indonesia | 3540.6516 | 70.650 |
Iran | 11605.7145 | 70.964 |
Iraq | 4471.0619 | 59.545 |
Ireland | 40675.9964 | 78.885 |
Israel | 25523.2771 | 80.745 |
Italy | 28569.7197 | 80.546 |
Jamaica | 7320.8803 | 72.567 |
Japan | 31656.0681 | 82.603 |
Jordan | 4519.4612 | 72.535 |
Kenya | 1463.2493 | 54.110 |
Korea, Dem. Rep. | 1593.0655 | 67.297 |
Korea, Rep. | 23348.1397 | 78.623 |
Kuwait | 47306.9898 | 77.588 |
Lebanon | 10461.0587 | 71.993 |
Lesotho | 1569.3314 | 42.592 |
Liberia | 414.5073 | 45.678 |
Libya | 12057.4993 | 73.952 |
Madagascar | 1044.7701 | 59.443 |
Malawi | 759.3499 | 48.303 |
Malaysia | 12451.6558 | 74.241 |
Mali | 1042.5816 | 54.467 |
Mauritania | 1803.1515 | 64.164 |
Mauritius | 10956.9911 | 72.801 |
Mexico | 11977.5750 | 76.195 |
Mongolia | 3095.7723 | 66.803 |
Montenegro | 9253.8961 | 74.543 |
Morocco | 3820.1752 | 71.164 |
Mozambique | 823.6856 | 42.082 |
Myanmar | 944.0000 | 62.069 |
Namibia | 4811.0604 | 52.906 |
Nepal | 1091.3598 | 63.785 |
Netherlands | 36797.9333 | 79.762 |
New Zealand | 25185.0091 | 80.204 |
Nicaragua | 2749.3210 | 72.899 |
Niger | 619.6769 | 56.867 |
Nigeria | 2013.9773 | 46.859 |
Norway | 49357.1902 | 80.196 |
Oman | 22316.1929 | 75.640 |
Pakistan | 2605.9476 | 65.483 |
Panama | 9809.1856 | 75.537 |
Paraguay | 4172.8385 | 71.752 |
Peru | 7408.9056 | 71.421 |
Philippines | 3190.4810 | 71.688 |
Poland | 15389.9247 | 75.563 |
Portugal | 20509.6478 | 78.098 |
Puerto Rico | 19328.7090 | 78.746 |
Reunion | 7670.1226 | 76.442 |
Romania | 10808.4756 | 72.476 |
Rwanda | 863.0885 | 46.242 |
Sao Tome and Principe | 1598.4351 | 65.528 |
Saudi Arabia | 21654.8319 | 72.777 |
Senegal | 1712.4721 | 63.062 |
Serbia | 9786.5347 | 74.002 |
Sierra Leone | 862.5408 | 42.568 |
Singapore | 47143.1796 | 79.972 |
Slovak Republic | 18678.3144 | 74.663 |
Slovenia | 25768.2576 | 77.926 |
Somalia | 926.1411 | 48.159 |
South Africa | 9269.6578 | 49.339 |
Spain | 28821.0637 | 80.941 |
Sri Lanka | 3970.0954 | 72.396 |
Sudan | 2602.3950 | 58.556 |
Swaziland | 4513.4806 | 39.613 |
Sweden | 33859.7484 | 80.884 |
Switzerland | 37506.4191 | 81.701 |
Syria | 4184.5481 | 74.143 |
Taiwan | 28718.2768 | 78.400 |
Tanzania | 1107.4822 | 52.517 |
Thailand | 7458.3963 | 70.616 |
Togo | 882.9699 | 58.420 |
Trinidad and Tobago | 18008.5092 | 69.819 |
Tunisia | 7092.9230 | 73.923 |
Turkey | 8458.2764 | 71.777 |
Uganda | 1056.3801 | 51.542 |
United Kingdom | 33203.2613 | 79.425 |
United States | 42951.6531 | 78.242 |
Uruguay | 10611.4630 | 76.384 |
Venezuela | 11415.8057 | 73.747 |
Vietnam | 2441.5764 | 74.249 |
West Bank and Gaza | 3025.3498 | 73.422 |
Yemen, Rep. | 2280.7699 | 62.698 |
Zambia | 1271.2116 | 42.384 |
Zimbabwe | 469.7093 | 43.487 |
What is the relationship between life expectancy and per capita GDP?
You have 30 seconds again…this time it should be much, much easier!
Graphics are critical at all stages of a project – from the initial data acquisition and exploration to the final product that conveys results to others (colleagues, the public, …).
Graphics reveal patterns and features in data that statistics (e.g. mean, median, correlation) may fail to convey/capture.
Consider the four datasets constructed by the statistician Francis Anscombe.
All four datasets have nearly identical summary statistics:
Property | Value |
---|---|
Mean of x | 9 |
Sample variance of x | 11 |
Mean of y | 7.5 |
Sample variance of y | 4.125 |
Corr. between x and y | 0.816 |
Linear regression line | y = 3.00 + 0.500x |
Based on the above table you would be led to believe that the data look roughly the same.
Anscombe’s quartet (source: Wikipedia)
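As a quick check, the statistics in the table can be recomputed from the `anscombe` data frame that ships with R (columns `x1` to `x4` and `y1` to `y4`). The sketch below is one way to do this with base R only; the name `stats` is just an illustrative choice. It then draws the four scatterplots, which is where the datasets stop looking alike.

```r
# anscombe is part of base R (datasets package): columns x1..x4 and y1..y4
stats <- t(sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- coef(lm(y ~ x))  # intercept and slope of the least-squares line
  c(mean_x = mean(x), var_x = var(x),
    mean_y = mean(y), var_y = var(y),
    corr = cor(x, y),
    intercept = unname(fit[1]), slope = unname(fit[2]))
}))
rownames(stats) <- paste("dataset", 1:4)
print(round(stats, 3))

# One scatterplot per dataset, each with its fitted regression line
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y, main = paste("Dataset", i), xlim = c(3, 20), ylim = c(2, 14))
  abline(lm(y ~ x))
}
par(op)  # restore the previous plotting layout
```

The rows of `stats` agree with each other to about two decimal places, while the four panels show clearly different shapes.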
Take home message: You should examine your data graphically!
You will create graphics for many different purposes throughout this class and your career.
The style, detail, and level of refinement will be a function of your goals.
Good examples of these types of figures are found in:
R
R has excellent graphic making capabilities that allow you to create figures of the highest quality.
In fact, many figures you see in scientific journals and in the popular press are made in R (many of the graphics in the NY Times are made in R!).
Base graphics are provided by the `graphics` package that comes bundled with R; add-on packages such as `lattice` and `ggplot2` offer alternatives.
Most of the figures in this class will be made with the `ggplot2` package.
In this and upcoming lectures you will learn how to make static graphics in R.
You will learn fundamental concepts about how to visualize different types of data and how to generate these visualizations in R.
Later in the term we will cover how to make interactive (dynamic) graphics and basic maps in R.
`ggplot2` package
The `ggplot2` package implements what is called the grammar of graphics. This is a system for describing how a graph is constructed: complex graphs are built by combining elements, much like you would construct a sentence in a natural language.
`ggplot2` is widely used (you’ve already been using it) and there are countless learning resources freely available. It is also used to teach data visualization.
The grammar of graphics is based on the concept that [1]:
A graphic is created by mapping the `data` variables to the `aes`thetic attributes of `geom`etric objects.
The three essential components of a graphic are:
- `data`: dataset containing the mapped variables
- `geom`: geometric object that the data is mapped to (e.g. points, lines, bars, …)
- `aes`: aesthetic attributes of the geometric object. The aesthetics control how the data variables are mapped to the geometric objects (e.g. x/y position, size, shape, color, …)

Additional components that can be added include:
The basic template for creating a graphic in `ggplot2` specifies the dataset together with:
- the `geom` function you want to use (e.g. `geom_point()`)
- the aesthetic mappings for the `geom`, e.g. `x = gdpPercap, y = lifeExp, color = continent`
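As a rough sketch of this template, assuming the commonly used form `ggplot(data = ...) + geom_*(mapping = aes(...))`, the code below plots a handful of rows copied from the country table at the top of these notes. The `gap_small` name and the `continent` column are additions for illustration (the table itself has no continent column).

```r
library(ggplot2)

# A few rows copied from the country table earlier in these notes.
# The continent column is NOT in that table; it is added here only so
# the color mapping has a variable to work with.
gap_small <- data.frame(
  country   = c("Burundi", "India", "Brazil", "France", "Japan", "Australia"),
  continent = c("Africa", "Asia", "Americas", "Europe", "Asia", "Oceania"),
  gdpPercap = c(430.0707, 2452.2104, 9065.8008, 30470.0167, 31656.0681, 34435.3674),
  lifeExp   = c(49.580, 64.698, 72.390, 80.657, 82.603, 81.235)
)

# data: gap_small; geom: points; aes: x/y position and color
p <- ggplot(data = gap_small) +
  geom_point(mapping = aes(x = gdpPercap, y = lifeExp, color = continent))

print(p)
```

Swapping `geom_point()` for another geom function, or changing the mappings inside `aes()`, changes the graphic without touching the data.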
Minard’s illustration of Napoleon’s March (source: Wikipedia)
Let’s see how we would construct Minard’s figure using the grammar of graphics:
Where? | data variable | aes() | geom_ |
---|---|---|---|
top map | longitude | x | path |
" | latitude | y | path |
" | army size | size | path |
" | army direction (forward vs retreat) | color | path |
bottom graph | date | x | line and text |
" | temperature | y | line and text |
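The mapping in the table can be sketched in code as follows. The two small data frames hold made-up placeholder values, not Minard’s actual figures, and the column names are hypothetical. One assumption to flag: the table lists `size` for army size, but in ggplot2 3.4.0 and later the thickness of lines is mapped with `linewidth`, so that is what the sketch uses.

```r
library(ggplot2)

# Placeholder values for illustration only; these are NOT Minard's data.
troops <- data.frame(
  longitude = c(24, 28, 32, 36, 36, 32, 28, 24),
  latitude  = c(55, 55, 55, 56, 55.5, 54.5, 54.5, 54.5),
  army_size = c(400, 300, 200, 100, 100, 50, 25, 10),
  direction = rep(c("forward", "retreat"), each = 4)
)
temps <- data.frame(
  date        = as.Date(c("1812-10-15", "1812-11-01", "1812-11-15", "1812-12-01")),
  temperature = c(0, -10, -20, -30)
)

# Top map: longitude -> x, latitude -> y, army size -> line thickness,
# direction -> color, all drawn with a path geom
top_map <- ggplot(troops, aes(x = longitude, y = latitude)) +
  geom_path(aes(linewidth = army_size, color = direction, group = direction),
            lineend = "round")

# Bottom graph: date -> x, temperature -> y, drawn with line and text geoms
bottom_graph <- ggplot(temps, aes(x = date, y = temperature)) +
  geom_line() +
  geom_text(aes(label = temperature), vjust = -0.5)

print(top_map)
print(bottom_graph)
```

Each row of the table corresponds to one argument inside `aes()`, and the last column picks the `geom_` function.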
[1] See md chapter 3.