Introduce ggplot2 & today’s data sets (Exercise 1)
Understand the basics of ggplot2 (Exercise 2, 3)
Geom functions (Exercise 4-8)
Geom_point properties (Exercise 9)
Position adjustments (Exercise 10)
Coordinate and Scale Functions (Exercise 11)
Zooming into a plot (Exercise 12)
Faceting (Exercise 13)
Themes (Exercise 14)
Labels & Legends (Exercise 15)
Adding lines to plots (Exercise 16)
Making multi-panel figures (Exercise 17)
Saving a plot (Exercise 18)
ggplot2
ggplot2 is based on the “grammar of graphics”, the idea that you can build every graph from the same components: a data set, a coordinate system, and “geoms” - visual marks that represent data points.
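As a minimal sketch of those three components, assuming ggplot2 is installed and using its bundled diamonds data (the object name basic_plot is only illustrative):

```r
library(ggplot2)

# 1. data set + aesthetic mapping, 2. a geom, 3. a coordinate system.
# coord_cartesian() is already the default; it is spelled out here only
# to make the third component visible.
basic_plot <- ggplot(data = diamonds, mapping = aes(x = carat, y = price)) +
  geom_point() +
  coord_cartesian()

basic_plot
```

Every later example follows the same pattern: start from ggplot() and add layers with +.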
Today’s datasets
In this class, we will use a data set that ships with ggplot2: diamonds, which contains the prices and quality attributes of about 54,000 diamonds.
There are many interesting data sets you can install as R packages for learning to manipulate and plot data.
data: carat, cut, color, clarity, depth, table, price, x, y, z
[53940x10]
mapping: x = ~carat, y = ~price, colour = ~cut, size = ~carat
faceting: <ggproto object: Class FacetNull, Facet, gg>
compute_layout: function
draw_back: function
draw_front: function
draw_labels: function
draw_panels: function
finish_data: function
init_scales: function
map_data: function
params: list
setup_data: function
setup_params: function
shrink: TRUE
train_scales: function
vars: function
super: <ggproto object: Class FacetNull, Facet, gg>
-----------------------------------
geom_point: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity
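The listing above has the shape of what summary() prints for a ggplot object. The code that produced it is not shown, so the following is an assumed reconstruction based only on the "mapping:" line and the geom_point layer in the listing; summary_plot is a hypothetical name.

```r
library(ggplot2)

# Assumed reconstruction: the four mappings are copied from the
# "mapping:" line of the listing; the single layer is geom_point().
summary_plot <- ggplot(diamonds,
                       aes(x = carat, y = price, colour = cut, size = carat)) +
  geom_point()

# Prints the data columns, mappings, faceting, and layers of the plot.
summary(summary_plot)
```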
ggplot is powerfully simple for making complex plots
shape, size, fill, color, and transparency - Exercise 9
R has 26 built-in point shapes, identified by the numbers 0 to 25.
Some are similar: 0, 15, and 22 are all squares, but interact differently with color and fill aesthetics.
Hollow shapes (0-14) have a border determined by color; solid shapes (15-20) are filled with color; and the filled shapes (21-25) have a border set by color and an interior set by fill.
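To see how the three squares respond to color and fill, here is a small sketch. The data frame squares and the chosen colors are made up for illustration.

```r
library(ggplot2)

# Three square shapes: 0 (hollow), 15 (solid), 22 (filled).
squares <- data.frame(x = 1:3, y = 1, shape = c(0, 15, 22))

shape_plot <- ggplot(squares, aes(x = x, y = y)) +
  # color and fill are set to constants, so they sit outside aes()
  geom_point(aes(shape = shape),
             color = "blue", fill = "orange", size = 8, stroke = 2) +
  # use the numbers in the shape column directly as shape codes
  scale_shape_identity()

shape_plot
```

Only shape 22 should show the orange interior; shapes 0 and 15 ignore fill and use color alone.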
Note that aesthetics can also be defined within a geom.
This is useful if you use two different geoms that share an aesthetic.
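A short sketch of that idea, with the object name layered_plot chosen only for illustration: both geoms accept color, but here it is mapped to cut for the points only.

```r
library(ggplot2)

layered_plot <- ggplot(diamonds, aes(x = carat, y = price)) +
  # color is mapped inside this geom, so only the points are colored by cut
  geom_point(aes(color = cut), alpha = 0.3) +
  # the smoother does not see the color mapping: one black line overall
  geom_smooth(method = "lm", color = "black")

layered_plot
```

Moving color = cut into the ggplot() call instead would make geom_smooth() fit one line per cut.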
We won’t go into these functions too much today, but here is a brief overview:
The coordinate system determines how the x and y aesthetics combine to position elements in the plot. The default coordinate system is Cartesian (coord_cartesian()), which can be tweaked with coord_map(), coord_fixed(), coord_flip(), and coord_trans(), or completely replaced with coord_polar().
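As a rough illustration, the same bar chart can be drawn under different coordinate systems. The object names bars, flipped, and polar are hypothetical.

```r
library(ggplot2)

# Default Cartesian coordinates
bars <- ggplot(diamonds, aes(x = cut, fill = cut)) +
  geom_bar()

flipped <- bars + coord_flip()   # swap the x and y axes
polar   <- bars + coord_polar()  # replace Cartesian with polar coordinates

flipped
polar
```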
Scales control the details of how data values are translated to visual properties. There are 20+ scale functions. We will look at one; the ggplot2 cheatsheet is your friend for the rest.
Logarithmic axes - 1
Note the difference between axis labels in these two examples.
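The two examples themselves are not reproduced in this text. One plausible pair, offered here as an assumption rather than the original code, contrasts transforming the variable with transforming the scale:

```r
library(ggplot2)

# Option 1: transform the data inside aes().
# The axis is labelled in log10 units (roughly 2.5 to 4.5).
log_in_aes <- ggplot(diamonds, aes(x = carat, y = log10(price))) +
  geom_point(alpha = 0.2)

# Option 2: transform the scale.
# Positions are logarithmic, but the axis is labelled in original prices.
log_scale <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.2) +
  scale_y_log10()

log_in_aes
log_scale
```

The points land in the same places in both plots; only the tick labels and the axis title differ.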
We’re not going to cover it, but you can also customize pre-existing themes.
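For reference, a minimal sketch of what such customization can look like: start from a complete theme and override individual elements with theme(). It uses theme_bw() from ggplot2 so that no extra package is needed; the particular tweaks are arbitrary.

```r
library(ggplot2)

themed <- ggplot(diamonds, aes(x = cut, y = price)) +
  geom_boxplot() +
  theme_bw() +                     # pre-existing complete theme
  theme(                           # overrides applied on top of it
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.minor = element_blank()
  )

themed
```

The theme() call has to come after the complete theme, otherwise the complete theme resets the overrides.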
Labels & Legends - Exercise 15
Use labs() to add / change plot labels.
ggplot(data = diamonds,
       mapping = aes(x = cut, y = log(price), fill = cut)) +
  geom_boxplot() +
  # the legend comes from the fill aesthetic, so it is labelled via fill
  labs(x = "Cut",
       y = "Price (log)",
       fill = "Cut",
       title = "Distribution of diamond prices by cut",
       subtitle = "Data come from a random sample of 1000 diamonds",
       caption = "Source: diamonds dataset from ggplot2") +
  theme_cowplot()
Additional points
How to add a line to a plot? (Exercise 16)
p <- ggplot(data = diamonds_subset,
            mapping = aes(x = carat, y = price, color = cut)) +
  geom_point(alpha = 0.8) +
  theme_cowplot()
p + geom_line()
p + geom_hline(aes(yintercept = 5000))
p + geom_vline(aes(xintercept = 2))
p + geom_smooth(method = lm)
`geom_smooth()` using formula = 'y ~ x'
p + geom_abline(aes(intercept = 0.5, slope = 5000))
How to combine multiple plots into a figure? (Exercise 17)
plot1 <- p
plot2 <- p + geom_hline(aes(yintercept = 5000))
plot3 <- p + geom_vline(aes(xintercept = 2))
plot4 <- p + geom_abline(aes(intercept = 0.5, slope = 5000))
all_plots <- plot_grid(
  plot1, plot2, plot3, plot4,
  labels = c("A", "B", "C", "D"),
  nrow = 2
)
all_plots
# we have 4 legends, which is too many - can they be removed?
# Yes, but it is not exactly straightforward
legend <- get_legend(plot1 + theme(legend.position = "bottom"))
Warning in get_plot_component(plot, "guide-box"): Multiple components found;
returning the first one. To return all, use `return_all = TRUE`.
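One way to finish the shared-legend figure is sketched below, under several assumptions. diamonds_subset is rebuilt as a random sample of 1000 rows, which may not match the original subset. The legend is extracted with cowplot's get_plot_component() using the pattern "guide-box-bottom", which assumes ggplot2 3.5.0 or later, where legend components are named by position. The names no_legend, panels, shared_legend, and final_figure are hypothetical.

```r
library(ggplot2)
library(cowplot)

# Assumed stand-in for diamonds_subset: a random sample of 1000 rows.
set.seed(1)
diamonds_subset <- diamonds[sample(nrow(diamonds), 1000), ]

p <- ggplot(diamonds_subset, aes(x = carat, y = price, color = cut)) +
  geom_point(alpha = 0.8) +
  theme_cowplot()

# Drop the legend from every panel.
no_legend <- theme(legend.position = "none")
panels <- plot_grid(
  p + no_legend,
  p + geom_hline(yintercept = 5000) + no_legend,
  p + geom_vline(xintercept = 2) + no_legend,
  p + geom_abline(intercept = 0.5, slope = 5000) + no_legend,
  labels = c("A", "B", "C", "D"),
  nrow = 2
)

# Extract one legend, asking specifically for the bottom guide box so
# that only a single component matches (assumes ggplot2 >= 3.5.0).
shared_legend <- get_plot_component(
  p + theme(legend.position = "bottom"),
  "guide-box-bottom"
)

# Stack the panel grid on top of the legend; the legend gets 10% of the height.
final_figure <- plot_grid(panels, shared_legend, ncol = 1, rel_heights = c(1, 0.1))
final_figure
```

Requesting the bottom guide box by name is intended to avoid the "Multiple components found" warning shown above, which appears when the pattern "guide-box" matches more than one component.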