R Bootcamp - Day 2 - Exercises

Author

Jay Hesselberth

Published

October 21, 2024

Tidying exercises

Data sets for tidying - Exercise 2

  • Explore table1, table2, table3, table4a, table4b, and table5, which all display the number of tuberculosis (TB) cases documented by the World Health Organization in Afghanistan, Brazil, and China in 1999 and 2000.
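
A minimal sketch of the exploration step, assuming the tables are the example data sets that ship with the tidyr package (loaded here with library(tidyr)). Printing each object shows its dimensions, column types, and first rows.

```r
library(tidyr)  # assumed source of table1 ... table5

# Printing a tibble shows its size, column types, and rows
table1   # one row per country and year; cases and population in columns
table2   # cases and population stacked in a single "count" column
table3   # cases and population combined in a single "rate" string
table4a  # cases, with one column per year
table4b  # population, with one column per year
table5   # like table3, but with the year split into century and year
```

All six tables hold the same values; only the layout differs.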

Getting familiar with the data - Exercise 3

R provides many functions to examine the features of a data object:

  • View() - opens the table in an Excel-like interface; not recommended for large tables

  • class() - what kind of object is it (high-level)?

  • typeof() - what is the object’s data type (low-level)?

  • is_tibble() - is the object a tibble? Use the is.*() or is_*() family of functions to confirm a data type

  • str() - what is the structure of the object?

  • attributes() - does it have any metadata?

  • Let’s explore table1
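
One way to run the functions listed above on table1 is sketched below. It assumes table1 comes from tidyr and that is_tibble() is taken from the tibble package; View() is commented out because it only works in an interactive session.

```r
library(tidyr)   # provides table1
library(tibble)  # provides is_tibble()

class(table1)          # "tbl_df" "tbl" "data.frame"
typeof(table1)         # "list": a data frame is a list of columns
is_tibble(table1)      # TRUE
is.data.frame(table1)  # TRUE: a tibble is also a data frame
str(table1)            # column names, types, and first values
attributes(table1)     # names, row.names, and class

# View(table1)         # interactive only: opens a spreadsheet-style viewer
```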

Getting familiar with the data - Exercise 4

Some of the useful functions for data.frames / tibbles are as follows:

  • head() - shows the first 6 rows (by default)
  • tail() - shows the last 6 rows (by default)
  • dim() - returns the dimensions of data frame (i.e. number of rows and number of columns)
  • nrow() - number of rows
  • ncol() - number of columns
  • names() or colnames() - both show the names attribute for a data frame
  • glimpse() - shows a compact, transposed overview with one line per column
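
The functions above could be tried on the example tables as follows. This is a sketch that assumes the tidyr tables, and it uses glimpse() from the tibble package (dplyr exports the same function).

```r
library(tidyr)   # provides table1 and table2
library(tibble)  # provides glimpse()

head(table2)          # first 6 rows
head(table2, n = 3)   # n changes how many rows are shown
tail(table2)          # last 6 rows

dim(table1)           # number of rows, then number of columns
nrow(table1)
ncol(table1)

names(table1)         # column names
colnames(table1)      # same result for a data frame

glimpse(table1)       # one line per column: name, type, first values
```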

Exercise 5 - Getting familiar with data - summary, hist, & table

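summary: A generic function that produces a summary of an object. It is often used on the results of model fitting functions; for a data frame or tibble it reports per-column statistics (e.g. minimum, median, mean, and maximum for numeric columns).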

hist: Takes in a vector of values and plots a histogram.

table: Uses the cross-classifying factors to build a contingency table of the counts at each combination of factor levels.
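
A short sketch of all three functions on the example tables, assuming table1 and table2 from tidyr. The object names country_counts and type_counts are illustrative only.

```r
library(tidyr)  # provides table1 and table2

summary(table1)        # per-column summaries of the whole tibble
summary(table1$cases)  # summary of a single numeric vector

# hist() needs a numeric vector; in a non-interactive session the plot
# is written to the default graphics device instead of a plot window
hist(table1$cases, main = "TB cases", xlab = "Cases")

# One factor: how many rows does each country have?
country_counts <- table(table1$country)
country_counts

# Two factors: rows per combination of country and type
type_counts <- table(table2$country, table2$type)
type_counts
```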

pivot_wider - Exercise 6

What will the output look like?

If you want to save the output, assign it to a new variable. This new variable will appear in your Environment tab.
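
As an illustration, table2 is a natural candidate for pivot_wider() because it stacks two variables in one column. The sketch below assumes table2 from tidyr; table2_wide is a hypothetical variable name.

```r
library(tidyr)  # provides pivot_wider() and table2

# Spread the values of "type" into new columns, filled from "count"
table2_wide <- pivot_wider(
  table2,
  names_from = type,    # new column names come from this column
  values_from = count   # new column values come from this column
)

table2_wide  # 6 rows: country, year, cases, population
```

The result has the same layout as table1, with one row per country and year.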

pivot_longer - Exercise 7

What will the output look like?
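
A sketch of pivot_longer() on table4a, where the year values are stored as column names. It assumes table4a from tidyr; table4a_long is a hypothetical variable name.

```r
library(tidyr)  # provides pivot_longer() and table4a

# Column names that are not valid R names need backticks
table4a_long <- pivot_longer(
  table4a,
  cols = c(`1999`, `2000`),  # columns to gather
  names_to = "year",         # old column names go here
  values_to = "cases"        # old cell values go here
)

table4a_long  # 6 rows: country, year, cases
```

Note that year is a character column in the result, because it was built from column names.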

separate - Exercise 8

What will the output look like?
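
One possible use of separate() is splitting the rate column of table3 into its two parts. This sketch assumes table3 from tidyr; table3_sep is a hypothetical name, and convert = TRUE is an optional choice that turns the new columns into numbers.

```r
library(tidyr)  # provides separate() and table3

# rate holds strings such as "745/19987071" (cases/population)
table3_sep <- separate(
  table3,
  rate,
  into = c("cases", "population"),  # names for the new columns
  sep = "/",                        # split on the slash
  convert = TRUE                    # convert the pieces to numeric types
)

table3_sep  # 6 rows: country, year, cases, population
```

Without convert = TRUE, cases and population remain character columns.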

separate_rows - Exercise 9
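
For comparison with separate(), here is a sketch of separate_rows() on the same rate column, assuming table3 from tidyr. Instead of creating new columns, it puts each piece of the split string on its own row. table3_rows is a hypothetical name.

```r
library(tidyr)  # provides separate_rows() and table3

# Each "cases/population" string becomes two rows
table3_rows <- separate_rows(table3, rate, sep = "/")

table3_rows  # 12 rows: country, year, rate
```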

unite - Exercise 10
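
unite() is the inverse of separate(): it pastes several columns into one. The sketch below assumes table5 from tidyr, where the year is split into century and year; table5_united and full_year are hypothetical names.

```r
library(tidyr)  # provides unite() and table5

# Paste century and year together with no separator (the default is "_")
table5_united <- unite(table5, "full_year", century, year, sep = "")

table5_united  # 6 rows: country, full_year, rate
```

The new column is a character column; the input columns are dropped by default (remove = TRUE).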

Missing values