Introduce ggplot2 & today’s data sets (Exercise 1)
Understand the basics of ggplot2 (Exercise 2, 3)
Geom functions (Exercise 4-8)
Geom_point properties (Exercise 9)
Position adjustments (Exercise 10)
Coordinate and Scale Functions (Exercise 11)
Zooming into a plot (Exercise 12)
Faceting (Exercise 13)
Themes (Exercise 14)
Labels & Legends (Exercise 15)
Adding lines to plots (Exercise 16)
Making multi-panel figures (Exercise 17)
Saving a plot (Exercise 18)
ggplot2
ggplot2 is based on the “grammar of graphics”, the idea that you can build every graph from the same components: a data set, a coordinate system, and “geoms” - visual marks that represent data points.
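As a minimal sketch of how these components fit together (the choice of carat and price as variables is only an illustration, not necessarily the plot used in class):

```r
library(ggplot2)

# Data set + aesthetic mapping + geom.
# The coordinate system is not named, so the default (Cartesian) is used.
p_basic <- ggplot(data = diamonds, mapping = aes(x = carat, y = price)) +
  geom_point()

p_basic  # printing the object draws the plot
```

Each component is added with `+`, so a plot can be built up and stored one piece at a time.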
Today’s datasets
In this class, we will use a data set that ships with ggplot2: diamonds, which contains the prices and quality attributes (carat, cut, color, clarity, and so on) of 53,940 diamonds.
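A quick way to get a feel for the data before plotting, sketched with base R functions only:

```r
library(ggplot2)  # loading ggplot2 makes the diamonds data set available

dim(diamonds)      # number of rows and columns
names(diamonds)    # column names
head(diamonds, 3)  # first three rows
```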
There are many interesting data sets you can install as R packages for learning to manipulate and plot data.
data: carat, cut, color, clarity, depth, table, price, x, y, z
[53940x10]
mapping: x = ~carat, y = ~price, colour = ~cut, size = ~carat
faceting: <ggproto object: Class FacetNull, Facet, gg>
compute_layout: function
draw_back: function
draw_front: function
draw_labels: function
draw_panels: function
finish_data: function
init_scales: function
map_data: function
params: list
setup_data: function
setup_params: function
shrink: TRUE
train_scales: function
vars: function
super: <ggproto object: Class FacetNull, Facet, gg>
-----------------------------------
geom_point: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
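The listing above has the shape of what summary() prints for a ggplot object. The call that produced it is not shown, so the following is a reconstruction that reads the aesthetics off the `mapping:` line and the layer off the `geom_point` line:

```r
library(ggplot2)

# Mapping taken from the "mapping:" line of the listing; one geom_point layer
p_summary <- ggplot(diamonds, aes(x = carat, y = price, colour = cut, size = carat)) +
  geom_point()

summary(p_summary)  # prints the data, mapping, faceting, and layer details
```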
ggplot is powerfully simple for making complex plots
shape, size, fill, color, and transparency - Exercise 9
R has 26 built-in point shapes, identified by the numbers 0 to 25.
Some look similar: 0, 15, and 22 are all squares, but they interact differently with the color and fill aesthetics.
Hollow shapes (0-14) have a border determined by color, solid shapes (15-20) are filled with color, and the filled shapes (21-25) have a border set by color and an interior set by fill.
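One way to see how the shape codes respond to color and fill is to draw all of them at once. This is a sketch; the helper data frame `shapes` and its grid layout are made up for the illustration:

```r
library(ggplot2)

# One row per shape code (0 to 25), arranged on a 9-column grid
shapes <- data.frame(
  shape = 0:25,
  x = 0:25 %% 9,
  y = -(0:25 %/% 9)
)

p_shapes <- ggplot(shapes, aes(x = x, y = y)) +
  # color is set for every shape; fill only shows on the filled shapes
  geom_point(aes(shape = shape), size = 5, color = "black", fill = "red") +
  geom_text(aes(label = shape), nudge_y = -0.3) +
  scale_shape_identity() +  # use the numbers directly as shape codes
  theme_void()

p_shapes
```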
Note that aesthetics can also be defined within a geom.
This is useful if you use two different geoms that share an aesthetic.
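For instance, in the sketch below (variables chosen for illustration), color is mapped only inside geom_point(), so the trend line from geom_smooth() is not split by cut:

```r
library(ggplot2)

p_geom_aes <- ggplot(diamonds, aes(x = carat, y = price)) +
  # color applies to the points only
  geom_point(aes(color = cut), alpha = 0.3) +
  # x and y are inherited from ggplot(); color is a fixed value here
  geom_smooth(method = "lm", color = "black")
```

Had `color = cut` been placed in the top-level aes(), both geoms would inherit it and geom_smooth() would fit one line per cut.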
We won’t go into these functions too much today, but here is a brief overview:
The coordinate system determines how the x and y aesthetics combine to position elements in the plot. The default coordinate system is Cartesian (coord_cartesian()), which can be tweaked with coord_map(), coord_fixed(), coord_flip(), and coord_trans(), or completely replaced with coord_polar().
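A small sketch of two of these functions applied to the same bar chart (the bar chart itself is an assumption made for the example):

```r
library(ggplot2)

p_bars <- ggplot(diamonds, aes(x = cut, fill = cut)) +
  geom_bar()

p_flip  <- p_bars + coord_flip()   # swap x and y: horizontal bars
p_polar <- p_bars + coord_polar()  # replace Cartesian with polar coordinates
```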
Scales control the details of how data values are translated to visual properties. There are 20+ scale functions. We will look at one; the ggplot2 cheatsheet is your friend for the rest.
Logarithmic axes - 1
Note the difference between axis labels in these two examples.
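The two examples are not reproduced here. A plausible pair, sketched under the assumption that they contrast transforming the data with transforming the scale, is:

```r
library(ggplot2)

# Option 1: transform the data. The axis is labelled in log10 units.
p_log_data <- ggplot(diamonds, aes(x = carat, y = log10(price))) +
  geom_point(alpha = 0.2)

# Option 2: transform the scale. The axis spacing is logarithmic,
# but the labels stay in the original price units.
p_log_scale <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.2) +
  scale_y_log10()
```

The point clouds look the same in both; only the axis labelling differs.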
We’re not going to cover it, but you can also customize pre-existing themes.
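For reference only, a brief sketch of what such a customization can look like: start from a built-in theme and override individual elements with theme(). The specific tweaks are arbitrary choices for illustration.

```r
library(ggplot2)

p_theme <- ggplot(diamonds, aes(x = cut, y = price)) +
  geom_boxplot() +
  theme_bw() +  # start from a pre-existing theme
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),  # tilt the x labels
    panel.grid.minor = element_blank()                  # drop minor grid lines
  )
```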
Labels & Legends - Exercise 15
Use labs() to add / change plot labels.
library(ggplot2)
library(cowplot)  # for theme_cowplot()

ggplot(data = diamonds, mapping = aes(x = cut, y = log(price), fill = cut)) +
  geom_boxplot() +
  labs(
    x = "Cut",
    y = "Price (log)",
    fill = "Cut",  # legend title; the plot maps fill, not color
    title = "Distribution of diamond prices by cut",
    subtitle = "Data come from a random sample of 1000 diamonds",
    caption = "Source: diamonds dataset from ggplot2"
  ) +
  theme_cowplot()
Additional points
How to add a line to a plot? (Exercise 16)
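A sketch of the reference-line geoms ggplot2 provides for this; the intercepts and slope are arbitrary values picked so the lines fall inside the data range:

```r
library(ggplot2)

p_lines <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.2) +
  geom_hline(yintercept = 5000, linetype = "dashed") +      # horizontal line
  geom_vline(xintercept = 1, color = "red") +               # vertical line
  geom_abline(intercept = 0, slope = 5000, color = "blue")  # line y = 5000 * x
```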
How to combine multiple plots into a figure? (Exercise 17)
More information on using plot_grid (from package cowplot) is here
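A minimal sketch of plot_grid() with two panels; the two plots are placeholders chosen for the example:

```r
library(ggplot2)
library(cowplot)  # provides plot_grid()

p_left  <- ggplot(diamonds, aes(x = carat, y = price)) + geom_point(alpha = 0.2)
p_right <- ggplot(diamonds, aes(x = cut, y = price)) + geom_boxplot()

# Arrange the two plots side by side and label the panels A and B
fig <- plot_grid(p_left, p_right, labels = c("A", "B"), ncol = 2)
fig
```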
Saving plots (Exercise 18)
Saves the last plot as a 5 in x 5 in file named “plot_final.png” in the working directory. The file type is matched to the file extension.
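The call being described is presumably ggsave(); a sketch with the file name and size taken from the description above (the plot itself is a placeholder):

```r
library(ggplot2)

p_save <- ggplot(diamonds, aes(x = carat, y = price)) + geom_point(alpha = 0.2)
print(p_save)  # display the plot so it is the last plot

# With no plot argument, ggsave() saves the last plot.
# width and height are in inches by default; ".png" selects the PNG device.
ggsave("plot_final.png", width = 5, height = 5)
```

Passing `plot = p_save` explicitly is a safer habit in scripts, since it does not depend on which plot happened to be drawn last.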