R Bootcamp Problem Set 6

Author

Your name here

Published

October 21, 2024

Problem Set

Use the data files in the data/ directory to answer the questions.

For this problem set, you may help each other, but you may not post correct answers in Slack.

The problem set is due 12pm on Sept 1.

Grading rubric

Everything is good: full points
Partially correct answer: depends on how many steps are correct
Reasonable attempt: half points

Libraries

Load the libraries you need for analysis below.

library(tidyverse)

-- Attaching core tidyverse packages ------------------------ tidyverse 2.0.0 --
v dplyr     1.1.3     v readr     2.1.4
v forcats   1.0.0     v stringr   1.5.0
v ggplot2   3.4.3     v tibble    3.2.1
v lubridate 1.9.2     v tidyr     1.3.0
v purrr     1.0.2     
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
i Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(here)

here() starts at /Users/jayhesselberth/devel/rnabioco/molb-7950

library(palmerpenguins)

Question 1 - 5 points

Run the following chunk:

set.seed(42)
x <- sample(1000, replace = TRUE)

Now use logical indexing to find the number of values > 450 in x.

# either approach is correct
sum(x > 450)

[1] 566

length(x[x > 450])

[1] 566
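To see why both answers agree, here is a minimal sketch of logical indexing on a small vector. The vector `v` and the names `is_large`, `n_large`, and `large_values` are made up for illustration and are not part of the problem set.

```r
# Hypothetical small vector, used only for illustration
v <- c(120, 980, 451, 450, 733)

# The comparison returns one TRUE/FALSE value per element
is_large <- v > 450

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
n_large <- sum(is_large)

# Subsetting with the logical vector keeps only the matching values
large_values <- v[is_large]

print(n_large)
print(large_values)
```

Note that 450 itself is excluded because the comparison is strict. The two approaches would only disagree if the vector contained missing values, which `sample()` does not produce here.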

Question 2 - 5 points

Count the number of penguins of each species in the penguins tibble using forcats::fct_count().

Count the number of island + sex combinations using dplyr::count(), and sort the result by count.

forcats::fct_count(penguins$species)

# A tibble: 3 x 2
  f             n
  <fct>     <int>
1 Adelie      152
2 Chinstrap    68
3 Gentoo      124

dplyr::count(penguins, island, sex, sort = TRUE)

# A tibble: 9 x 3
  island    sex        n
  <fct>     <fct>  <int>
1 Biscoe    male      83
2 Biscoe    female    80
3 Dream     male      62
4 Dream     female    61
5 Torgersen female    24
6 Torgersen male      23
7 Biscoe    <NA>       5
8 Torgersen <NA>       5
9 Dream     <NA>       1
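As a rough illustration of what dplyr::count() does with sort = TRUE, the sketch below compares it with an explicit group, summarise, and arrange pipeline. The `toy` data frame and the names `short` and `long` are hypothetical and are not taken from the penguins tibble.

```r
library(dplyr)

# Hypothetical toy data, not taken from the penguins tibble
toy <- tibble(
  island = c("A", "A", "B", "A", "B", "B", "B"),
  sex    = c("f", "m", "f", "f", NA, "f", "f")
)

# Shortcut form, as used in the answer above
short <- count(toy, island, sex, sort = TRUE)

# Roughly equivalent long form: group, tally, then sort by descending count
long <- toy |>
  group_by(island, sex) |>
  summarise(n = n(), .groups = "drop") |>
  arrange(desc(n))

print(short)
print(long)
```

In both forms, rows with a missing sex are kept as their own group, which is why the penguins result above includes <NA> rows.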

Question 3 - 5 points

Use stringr::str_c() to combine upper and lowercase letters from letters and LETTERS with a slash.

Your answer should look like: "A/a" "B/b" "C/c" etc.

combined <- str_c(LETTERS, "/", letters)

Use stringr::str_split() or one of its variants to split up the strings you made above and extract the letter after the slash.

str_split_i(combined, "/", 2)

 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"

str_split(combined, "/") |>
  purrr::map(2) |>
  unlist()

 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
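For comparison, the same split-and-extract step can be sketched in base R without stringr. This is only an alternative illustration, not an accepted answer to the question, and the names `combined_base`, `pieces`, and `after_slash` are made up for this sketch.

```r
# Same pairs as in the answer above, built with base R's paste0()
combined_base <- paste0(LETTERS, "/", letters)

# strsplit() returns a list with one character vector per input string
pieces <- strsplit(combined_base, "/", fixed = TRUE)

# Take the second piece of each element; vapply() enforces a character result
after_slash <- vapply(pieces, function(p) p[2], character(1))

print(after_slash)
```

Like str_split(), strsplit() returns a list, which is why an extra extraction step is needed, whereas str_split_i() returns the requested piece directly as a character vector.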

Question 4 - 5 points

Create a ggplot using the diamonds data set.

Generate a ggridges::geom_density_ridges() for the prices, with a different fill color for each cut.
Recolor the densities using ggplot2::scale_fill_brewer(), choosing a specific palette.
Make the outline of the densities black, and change their alpha to 0.2.
Change the theme to cowplot::theme_minimal_grid().
Remove the legend (google: “remove legend from ggplot2”).
Add an informative title and subtitle using ggplot2::labs().

library(ggridges)
library(cowplot)


Attaching package: 'cowplot'

The following object is masked from 'package:lubridate':

    stamp

ggplot(
  diamonds,
  aes(
    x = price,
    y = cut,
    fill = cut
  )
) +
  geom_density_ridges(color = "black", alpha = 0.2) +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal_grid() +
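  theme(legend.position = "none") +
  # example wording only; any informative title and subtitle is acceptable
  labs(
    title = "Diamond price distributions by cut",
    subtitle = "Ridgeline densities of price for each cut in the diamonds data set"
  )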

Picking joint bandwidth of 458

Submit

Be sure to click the “Render” button to render the HTML output.

Then paste the URL of this Posit Cloud project into the problem set on Canvas.