-- Attaching core tidyverse packages ------------------------ tidyverse 2.0.0 --
v dplyr 1.1.3 v readr 2.1.4
v forcats 1.0.0 v stringr 1.5.0
v ggplot2 3.4.3 v tibble 3.2.1
v lubridate 1.9.2 v tidyr 1.3.0
v purrr 1.0.2
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
i Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Rows: 100 Columns: 7
-- Column specification --------------------------------------------------------
Delimiter: ","
chr (1): ensembl_transcript_id
dbl (6): rna_0h_rep1, rna_0h_rep2, rna_0h_rep3, rna_14h_rep1, rna_14h_rep2, ...
i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Question 2
Explore the dataset. Is this dataset tidy? If not, why not?
This data frame is a 100-row subset of transcript-level gene expression data in which transcript abundance was measured at two time points of a treatment, in triplicate. The column names follow the format molecule_time_replicate (for example, rna_0h_rep1).
First, explore the structure of the dataset using some of the functions we learned in class.
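As a rough sketch of what this exploration could look like, the block below builds a small stand-in tibble and runs a few common inspection functions on it. The object name expr_toy, the transcript IDs, and all values are hypothetical and made up for illustration. The column names are borrowed from the column specification printed above, but only two replicates per time point are included to keep the example short. Which functions were covered in class is not stated here, so this selection is an assumption.

```r
library(tibble)
library(dplyr)

# Hypothetical toy data: IDs and values are invented for illustration only.
expr_toy <- tibble(
  ensembl_transcript_id = c("tx_A", "tx_B", "tx_C"),
  rna_0h_rep1  = c(1.2, 0.0, 5.4),
  rna_0h_rep2  = c(1.5, 0.1, 5.0),
  rna_14h_rep1 = c(3.3, 0.0, 2.1),
  rna_14h_rep2 = c(3.0, 0.2, 2.4)
)

dim(expr_toy)      # number of rows and columns
names(expr_toy)    # column names
glimpse(expr_toy)  # one line per column: type and first few values
head(expr_toy, 2)  # first two rows
summary(expr_toy)  # per-column summary statistics
```

On the real data, the same calls would be applied to whatever object read_csv() returned.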
Comment on whether this dataset is tidy, and if not, list the reasons why. Hint: In a tidy data frame, every column represents a single variable and every row represents a single observation.
Answer
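No, it is not tidy. The column names encode values of variables (molecule, time point, and replicate) instead of naming variables, so time point and replicate do not have their own columns. As a result, each row holds six abundance measurements for one transcript rather than a single observation.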
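One way to make this point concrete is to split the measurement column names on the underscore and look at the pieces. The sketch below uses base R only. The vector measure_cols holds four of the column names from the column specification above, and the labels molecule, time, and replicate are taken from the stated naming format. The object names parts and name_key are hypothetical.

```r
# Four of the measurement column names, following molecule_time_replicate.
measure_cols <- c("rna_0h_rep1", "rna_0h_rep2", "rna_14h_rep1", "rna_14h_rep2")

# Split each name on "_" and stack the pieces into a three-column matrix.
parts <- do.call(rbind, strsplit(measure_cols, "_", fixed = TRUE))
colnames(parts) <- c("molecule", "time", "replicate")

# Pair each original column name with the variable values hidden inside it.
name_key <- data.frame(column = measure_cols, parts, stringsAsFactors = FALSE)
print(name_key)
```

Each column name breaks into three values, which shows that three variables are stored in the header rather than in columns of their own.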
Question 3
How will you reshape the data frame so that each row has only one experimental observation?