MOLB 7950 Syllabus
Course Overview
MOLB 7950 is a hands-on tutorial of skills and theory needed to process, analyze, and visualize output from large biological data sets. We emphasize the R statistical computing environment.
🗓️ Class will run from Aug 26 - Oct 30
📍 Classes will be held in-person at locations found on the schedule page.
🕘 Class time is 9:00-10:30am
MOLB 7950 is a three credit hour course.
The course is divided into blocks:
Bootcamp
The Bootcamp block covers R programming and introduces important statistical concepts and approaches. We will also cover data types you will encounter during biological data analysis and approaches for their analysis.
During the Bootcamp block, we will meet every day for 90 minutes to cover fundamental concepts you will need throughout the course.
Experimental blocks
After Bootcamp, we will cover experimental approaches used to analyze DNA and RNA. Each block spans ~4 weeks, with each week focused on a particular type of experiment (see below). Each block covers the statistical concepts needed for rigorous analysis and the approaches used to process raw data into results (tables and figures) using reproducible coding techniques.
In most weeks we will discuss and analyze data from a publication. You are responsible for reading the week’s material before class begins on Monday.
Block experiments
The DNA block covers genome sequencing for identifying mutations, and two approaches for analyzing chromatin state (ChIP-seq and MNase-seq).
The RNA block covers RNA-seq, alternative splicing, differential gene expression, and RNA:protein interactions.
Schedule
Classes begin on August 26 and end on October 30. Dates are from the Fall 2024 Academic Calendar.
During the Bootcamp block, classes will be held every day, Mon-Fri from 9:00-10:30am.
During the DNA & RNA blocks, we will have in-class exercises and discussion on Mon-Wed-Fri 9:00-10:30am.
Location
Classes will be held in-person in a variety of different rooms. Please check the schedule page to see each class’s room assignment. All classes will be recorded and made available through Canvas.
Policies
Attendance
Class attendance is a firm expectation; frequent absences or tardiness are considered cause for a grade reduction.
If you are sick, please let us know (e-mail Srinivas and Matt) and stay home.
Anticipated absences for reasons other than sickness should be reported to the instructors of the relevant block as soon as possible so that we can plan possible accommodations.
We will record all lectures on Panopto and they will be available online through Canvas.
Late and missed work
We have a late work policy for homework assignments:
If a problem set is late but submitted within 24 hours of the due date/time, the grade will be reduced by 50%.
If a problem set is returned any later, no credit will be given.
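The two rules above amount to a simple piecewise calculation. As a minimal sketch in Python (the function name `late_adjusted_score` and the example due date are hypothetical and not part of any course tooling; treating a submission exactly 24 hours late as still "within 24 hours" is also an assumption), it could be written as:

```python
from datetime import datetime, timedelta


def late_adjusted_score(raw_score: float, due: datetime, submitted: datetime) -> float:
    """Apply the late policy: full credit on time, 50% within 24 hours, else 0."""
    if submitted <= due:
        return raw_score
    # Assumption: exactly 24 hours late still counts as "within 24 hours".
    if submitted - due <= timedelta(hours=24):
        return raw_score * 0.5
    return 0.0


# Hypothetical due date, for illustration only.
due = datetime(2024, 9, 3, 17, 0)
on_time = late_adjusted_score(18.0, due, datetime(2024, 9, 3, 16, 30))
one_hour_late = late_adjusted_score(18.0, due, datetime(2024, 9, 3, 18, 0))
two_days_late = late_adjusted_score(18.0, due, datetime(2024, 9, 5, 17, 0))
```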
All regrade requests must be discussed with the professor within one week of receiving your grade. There will be no grade changes after the final project.
Diversity & Inclusiveness
Our view is that students from all backgrounds and perspectives will be well served by this course, that students’ learning needs should be addressed both in and out of class, and that the diversity students bring to this class is a resource, strength, and benefit.
Disability Policy
Students with disabilities who need accommodations are encouraged to contact the Office of Disability, Access & Inclusion as soon as possible to ensure that accommodations are implemented in a timely fashion.
Honor code
Academic dishonesty will not be tolerated and is grounds for dismissal from the class with a failing grade (“F”). For other information, please consult the Graduate Student Handbook.
ChatGPT will probably be able to answer most coding questions you ask of it. While it is useful for fleshing out an initial approach from pseudocode, we do not recommend using it, as working through these conceptual approaches yourself is an essential foundation for building expertise in bioinformatic analysis.
Problem Sets
Problem sets will be assigned at the end of each class.
You can use external resources but must explicitly cite where you have obtained code (both code you used directly and “paraphrased” code / code used as inspiration). Any reused code that is not explicitly cited will be treated as plagiarism.
You can discuss the content of assignments with others in this class. If you do so, you must acknowledge your collaborator(s) at the top of your assignment, for example: “Collaborators: Hillary and Bernie”. Failure to acknowledge collaborators will result in a grade of 0. You may not copy code and/or answers directly from another student. If you copy other work, both parties will receive a grade of 0.
The problem set with the lowest score for each student will be dropped.
Rather than copying someone’s work, ask for help. You are not alone in this course!
Professionalism
- Please refrain from texting or using your computer for anything other than coursework during class.
Assignments and Grading
The course measures learning through daily problem sets, a final project, and your participation.
Type | % of grade |
---|---|
Problem Sets | 60 |
Final Project | 20 |
Participation | 20 |
Grades will be assigned as follows:
Percent total points | Grade |
---|---|
>= 95 | A |
>= 90 | A- |
>= 85 | B+ |
>= 80 | B |
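To see how the weights and cutoffs above combine, here is a rough sketch in Python. The function names are hypothetical, and two details are assumptions rather than stated policy: problem sets are treated as equally weighted percentages, and the lowest one is dropped before averaging (per the Problem Sets policy). Scores below 80 return `None` because the table above lists no grades for them.

```python
from typing import Optional, Sequence

# Weights (percent of final grade) and cutoffs taken from the tables above.
WEIGHTS = {"problem_sets": 60, "final_project": 20, "participation": 20}
CUTOFFS = [(95.0, "A"), (90.0, "A-"), (85.0, "B+"), (80.0, "B")]


def problem_set_percent(scores: Sequence[float]) -> float:
    """Average problem set percentages after dropping the single lowest one.

    Assumption: every problem set carries equal weight.
    """
    if not scores:
        raise ValueError("need at least one problem set score")
    kept = sorted(scores)[1:] if len(scores) > 1 else list(scores)
    return sum(kept) / len(kept)


def course_percent(problem_sets: float, final_project: float, participation: float) -> float:
    """Combine the three component percentages using the course weights."""
    total = (
        WEIGHTS["problem_sets"] * problem_sets
        + WEIGHTS["final_project"] * final_project
        + WEIGHTS["participation"] * participation
    )
    return total / 100


def letter_grade(percent: float) -> Optional[str]:
    """Return the letter for a percentage, or None if it is below the listed cutoffs."""
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return None


# Made-up scores, for illustration only.
ps = problem_set_percent([70, 90, 100])  # the 70 is dropped
total = course_percent(ps, final_project=90, participation=100)
grade = letter_grade(total)
```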
Problem sets
We reinforce concepts with problem sets assigned at the end of class that should take ~60 minutes to complete.
Problem sets assigned on Friday will be more substantial, requiring ~1-2 hours to complete.
Together the problem sets constitute 60% of your grade.
Assigned | Due | Grades By | Who grades | Time to complete (approx) |
---|---|---|---|---|
Mon @ 12pm | Tues @ 5pm | Wed @ 5pm | Instructors / TAs | 60 min |
Tues @ 12pm | Wed @ 5pm | Thurs @ 5pm | Instructors / TAs | 60 min |
Wed @ 12pm | Thurs @ 5pm | Fri @ 5pm | Instructors / TAs | 60 min |
Thurs @ 12pm | Fri @ 5pm | Tues @ 5pm | Instructors / TAs | 60 min |
Fri @ 12pm | Mon @ 5pm | Wed @ 5pm | Instructors / TAs | 1-2 hr |
Final projects
Final projects can be completed in groups of 1-3 people. Projects will involve analysis of existing public data sets and end with a short presentation during the last week of class. The final project constitutes 20% of your grade.
Grading Rubrics
Problem Set Rubric
Problem sets are worth 60% of your grade. Values in parentheses represent point values for each level from 20 points total. This rubric will be assessed at the end of the semester.
Criteria | Expert | Competent | Needs Improvement |
---|---|---|---|
Coding style | Student has gone beyond what was expected and required, coding manual is followed, code is well commented | Coding style lacks refinement and has some errors, but code is readable and has some comments | Many errors in coding style, little attention paid to making the code human readable |
Coding strategy | Complicated problem broken down into sub-problems that are individually much simpler. Code is efficient, correct, and minimal. Code uses appropriate data structure (list, data frame, vector/matrix/array). Code checks for common errors | Code is correct, but could be edited down to leaner code. Some “hacking” instead of using suitable data structure. Some checks for errors. | Code tackles complicated problem in one big chunk. Code is repetitive and could easily be functionalized. No anticipation of errors. |
Presentation: graphs | Graph(s) carefully tuned for desired purpose. One graph illustrates one point | Graph(s) well chosen, but with a few minor problems: inappropriate aspect ratios, poor labels. | Graph(s) poorly chosen to support questions. |
Presentation: tables | Table(s) carefully constructed to make it easy to perform important comparisons. Careful styling highlights important features. | Table(s) generally appropriate but possibly some minor formatting deficiencies. | Table(s) with too many, or inconsistent, decimal places. Table(s) not appropriate for questions and findings. Major display problems. |
Achievement, mastery, cleverness, creativity | Student has gone beyond what was expected and required, e.g., extraordinary effort, additional tools not addressed by this course, unusually sophisticated application of tools from course. | Tools and techniques from the course are applied very competently and, perhaps, somewhat creatively. Chosen task was acceptable, but fairly conservative in ambition. | Student does not display the expected level of mastery of the tools and techniques in this course. Chosen task was too limited in scope. |
Ease of access for instructor, compliance with course conventions for submitted work | Access as easy as possible, code runs! | Satisfactory | Not an earnest effort to reduce friction and comply with conventions and/or code does not run |
Participation rubric
Attendance & participation is worth 20% of your grade. Values in parentheses represent point values for each level from 20 points total. This rubric will be assessed at the end of the semester.
Criteria | Expert | Competent | Needs improvement |
---|---|---|---|
Attendance (physically present for class, or coordinating with instructor when absent) | Attends class regularly (5) | Attends most classes (4) | Attends some classes (0-3) |
Preparation (activities required for in-class participation, like surveys and software installation) | Completes requested activities prior to class (5) | Completes most requested activities prior to class, sometimes needs to finish during class (4) | Rarely completes requested activities prior to class, often takes class time to complete (0-3) |
Engagement (in-class activities like coding exercises and discussion) | Actively engages in class activities (10) | Sometimes engages in class activities (8) | Doesn’t engage in class activities (0-7) |
Acknowledgements & Attribution
Instructor contributions
Several people have contributed to course development over the past several years.
- Sujatha Jagannathan contributed the original R bootcamp material.
- Srinivas Ramachandran contributed material for the DNA block, including lecture material and examples for yeast chromatin accessibility and factor mapping.
- Matt Taliaferro contributed material for the RNA block, including lecture material and examples for RNA expression and splicing analysis.
- Kent Riemondy and Kristen Wells contributed material for single-cell RNA sequencing.
- Jay Hesselberth and Neel Mukherjee revamped much of this material in Fall 2023.
External resources
We have borrowed from several (open licensed) resources for course content, including:
- Stats 545 at UBC, particularly their grading rubrics
- Courses from Mine Çetinkaya-Rundel, particularly inspiration for quarto websites