Statistics 206: Introduction to Data Science

Course Description: Statistics 206 is an introduction to data science. We will cover most of the topics from a standard first-term statistics course, with additional emphasis on computing and data analysis. You will learn to think critically about the strengths and weaknesses of different types of data and of different approaches to analyzing data. You will also develop skills for managing and analyzing data using a programming language.

Computing: You will need access to a computer with a web browser and an internet connection. We will be using a cloud-based system for data analysis, so you will not need to install any software on your computer (except perhaps a VPN client). If you wish to take this class and do not have access to a computer, please contact the instructor.

Attendance and format: All lectures and labs for Stats 206 will take place in person. There will be no recording or remote attendance option. Students should regularly attend lectures and labs.

Communication and participation: You will have plenty of opportunity to participate actively during lectures and labs. Questions and comments are always welcome. You may email the instructor with questions, but most questions about course content should be posted to the on-line forum (details TBA).

Labs: Regular in-person lab attendance is expected.

Exams: Two midterm exams will be held in person during class time, on Tuesday, October 12th, and Thursday, November 18th. Students must be present in class on those days to take the exams. The final exam is cumulative, covering the entire semester, and will be held on Monday, December 13th, from 1:30 to 3:30. Students must be present in person to take the final exam. The exams will focus on concepts from statistics, probability, data analysis, and computing, and will not use computers. All exams will be closed-book.

Homework: You will have semi-regular homework, roughly one assignment every two weeks. These assignments will be completed using Jupyter notebooks in Python and submitted through Canvas. You may discuss the homework with other students, but all code, calculations, and written responses must be your own. Any copying of code, mathematics, or text from another student or any other source is academic dishonesty.

Quizzes and homework: On-line quizzes will be administered during most weeks when no homework assignment is due (roughly every other week). When a quiz is scheduled, it will be available on Tuesday from 8:00 am until midnight EST. Within this 16-hour period, you will have a timed window (around 20 minutes) to complete the quiz. You will know beforehand which weeks will have quizzes. You may use any materials (course notes, materials from the internet, etc.) for the quizzes and homework. You must not discuss the quiz with any other person until the quiz solutions are posted.

Academic integrity: Any effort to cheat, gain an unfair advantage over another student, or obtain credit for other people’s work is a violation of academic integrity and will be handled strictly at the instructor’s discretion.

Grading: Your final grade will be based on the following components: three exams (20% each), homework (20%), quizzes (15%), and lab attendance (5%). The lowest homework grade and the lowest quiz grade will be dropped. We will approximately follow a straight grading scale, in which overall scores in the range 90-100 will receive A’s (A-/A/A+), scores between 80 and 89 will receive B’s (B-/B/B+), and scores between 70 and 79 will receive C’s (C-/C/C+). Scores below 70 will be assessed at the instructor’s discretion.
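
To see how the grading weights combine, here is a minimal Python sketch. It is an illustration only, not the official grade calculation. It assumes that every component is scored on a 0-100 scale, that homework and quiz scores are averaged with equal weight after the lowest one is dropped, and that lab attendance is recorded as a percentage. The function names and the sample scores are hypothetical.

```python
def drop_lowest_mean(scores):
    """Average a list of scores after dropping the single lowest one.

    Assumption: all assignments count equally within a component.
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop the lowest")
    kept = sorted(scores)[1:]  # discard the smallest score
    return sum(kept) / len(kept)


def overall_score(exams, homework, quizzes, lab_attendance):
    """Combine the components using the weights stated in the syllabus.

    exams: three scores (two midterms and the final), 20% each
    homework: homework scores, 20% total, lowest dropped
    quizzes: quiz scores, 15% total, lowest dropped
    lab_attendance: attendance as a percentage (assumed), 5%
    """
    if len(exams) != 3:
        raise ValueError("expected exactly three exam scores")
    return (
        0.20 * sum(exams)
        + 0.20 * drop_lowest_mean(homework)
        + 0.15 * drop_lowest_mean(quizzes)
        + 0.05 * lab_attendance
    )


def letter_band(score):
    """Map an overall score to the approximate letter band.

    Assumption: cutoffs at exactly 90, 80, and 70. The syllabus says
    the scale is followed only approximately.
    """
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    return "instructor's discretion"


# Hypothetical student: made-up scores for illustration only.
example = overall_score(
    exams=[80, 90, 100],
    homework=[100, 50, 100],
    quizzes=[60, 80, 100],
    lab_attendance=100,
)
print(round(example, 2), letter_band(example))
```

In this made-up example, the homework score of 50 and the quiz score of 60 are dropped before averaging, so a single weak assignment in each category does not lower the overall score.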

Office hours: We will hold joint office hours with the other section of the course. Students from either section may attend office hours with GSIs or instructors from either section. The office hour schedule is to be determined.