R Bootcamp Problem Set 6

Author

Your name here

Published

October 21, 2024

Problem Set

Use the data files in the data/ directory to answer the questions.

For this problem set, you may help each other, but you may not post correct answers in Slack.

The problem set is due 12pm on Sept 1.

Grading rubric

  • Everything is good: full points
  • Partially correct answer: depends on how many steps are correct
  • Reasonable attempt: half points

Libraries

Load the libraries you need for analysis below.

-- Attaching core tidyverse packages ------------------------ tidyverse 2.0.0 --
v dplyr     1.1.3     v readr     2.1.4
v forcats   1.0.0     v stringr   1.5.0
v ggplot2   3.4.3     v tibble    3.2.1
v lubridate 1.9.2     v tidyr     1.3.0
v purrr     1.0.2     
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
i Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
here() starts at /Users/jayhesselberth/devel/rnabioco/molb-7950

Question 1 - 5 points

Run the following chunk:

set.seed(42)
x <- sample(1000, replace = TRUE)

Now use logical indexing to find the number of values > 450 in x.

# either is correct
sum(x > 450)
[1] 566
length(x[x > 450])
[1] 566
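
As a minimal sketch of why the two answers agree, consider a small hypothetical vector `v` (not part of the problem set). The comparison returns one logical value per element; `sum()` counts the `TRUE` values because `TRUE` is coerced to 1, while indexing with the logical vector keeps only the matching elements.

```r
# Hypothetical toy vector, chosen only for illustration
v <- c(100, 500, 450, 900, 451)

# The comparison yields one TRUE/FALSE per element
is_big <- v > 450

# TRUE counts as 1 and FALSE as 0, so sum() counts the matches
n_by_sum <- sum(is_big)

# Logical indexing keeps only the elements where is_big is TRUE
n_by_length <- length(v[is_big])

print(is_big)
print(c(n_by_sum, n_by_length))
```

The two approaches differ only when the vector contains `NA`: `sum()` then returns `NA` unless `na.rm = TRUE` is set, while the indexed vector keeps an `NA` element that `length()` counts.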

Question 2 - 5 points

Count the number of penguins of each species in the penguins tibble using forcats::fct_count().

Count the number of island and sex combinations using dplyr::count(), and sort the result by count.

forcats::fct_count(penguins$species)
# A tibble: 3 x 2
  f             n
  <fct>     <int>
1 Adelie      152
2 Chinstrap    68
3 Gentoo      124
dplyr::count(penguins, island, sex, sort = TRUE)
# A tibble: 9 x 3
  island    sex        n
  <fct>     <fct>  <int>
1 Biscoe    male      83
2 Biscoe    female    80
3 Dream     male      62
4 Dream     female    61
5 Torgersen female    24
6 Torgersen male      23
7 Biscoe    <NA>       5
8 Torgersen <NA>       5
9 Dream     <NA>       1
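
The difference between the two functions may be easier to see on a small example. This sketch uses a hypothetical tibble `toy` with made-up columns `site` and `kind` (assumptions for illustration, not the penguins data): `fct_count()` takes a single factor, whereas `count()` takes a data frame plus the columns to group by.

```r
library(dplyr)
library(forcats)
library(tibble)

# Hypothetical toy data, not the penguins tibble
toy <- tibble(
  site = factor(c("north", "south", "north", "north", "south")),
  kind = factor(c("a", "a", "b", NA, "a"))
)

# fct_count() takes a single factor and returns columns f and n
site_counts <- fct_count(toy$site)

# count() takes a data frame plus grouping columns;
# sort = TRUE orders rows by descending n, and NA forms its own group
combo_counts <- count(toy, site, kind, sort = TRUE)

print(site_counts)
print(combo_counts)
```

As in the penguins result above, rows with a missing value are kept as their own `NA` group rather than dropped.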

Question 3 - 5 points

Use stringr::str_c() to combine upper and lowercase letters from letters and LETTERS with a slash.

Your answer should look like: "A/a" "B/b" "C/c" etc.

combined <- str_c(LETTERS, "/", letters)

Use stringr::str_split() or one of its variants to split up the strings you made above and extract the letter after the slash.

str_split_i(combined, "/", 2)
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
str_split(combined, "/") |>
  purrr::map(2) |>
  unlist()
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
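
The two answers above return the same vector because `str_split()` produces a list with one character vector per input string, and `str_split_i()` pulls the i-th piece out of each. A short sketch with hypothetical inputs `keys` and `vals` (names invented for illustration) shows the shapes involved, plus a third variant using `simplify = TRUE`.

```r
library(stringr)

# Hypothetical inputs for illustration
keys <- c("alpha", "beta", "gamma")
vals <- c("1", "2", "3")

# str_c() is vectorised: element i of each input is pasted together
pairs <- str_c(keys, "=", vals)

# str_split() returns a list with one character vector per input string
pieces <- str_split(pairs, "=")

# str_split_i() returns the i-th piece of each string as a character vector
second <- str_split_i(pairs, "=", 2)

# simplify = TRUE returns a character matrix; column 2 holds the same pieces
second_from_matrix <- str_split(pairs, "=", simplify = TRUE)[, 2]

print(pairs)
print(second)
```

Note that `str_split_i()` requires stringr 1.5.0 or later, which matches the version loaded in this document.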

Question 4 - 5 points

Create a ggplot using the diamonds data set.


Attaching package: 'cowplot'
The following object is masked from 'package:lubridate':

    stamp
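# geom_density_ridges() comes from the ggridges package and
# theme_minimal_grid() from cowplot; both must be loaded first
ggplot(
  diamonds,
  aes(
    x = price,
    y = cut,
    fill = cut
  )
) +
  # one price density ridge per cut, semi-transparent fill
  geom_density_ridges(color = "black", alpha = 0.2) +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal_grid() +
  # the y axis already labels each cut, so the legend is redundant
  theme(legend.position = "none")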
Picking joint bandwidth of 458

Submit

Be sure to click the “Render” button to render the HTML output.

Then paste the URL of this Posit Cloud project into the problem set on Canvas.