Syllabus

Statistics 504: Practice and communication in applied statistics

Instructors

Kerby Shedden (kshedden@umich.edu)

Octavio Mesner (omesner@umich.edu)

Overview

The goal of this course is to provide master’s-level students with hands-on experience using a variety of techniques from modern applied statistics. Most course material is presented through case studies involving data drawn from various fields. Lectures will provide background about each case study along with discussion of relevant methodologies. Students will then conduct independent data analyses and produce brief written reports. Evaluation will be based on attaining insight from the data, effective communication of findings, and appropriate use of statistical methodology, as shown in the written reports.

Participation in class discussions is an essential part of the class. Regular attendance and active participation are expected from all students.

Some major themes of this course are:

  • Formulating meaningful and tractable questions based on research goals and consistent with available data

  • Devising, documenting, and implementing analysis strategies

  • Communicating findings

  • Leveraging knowledge of statistical foundations and theory when engaging in applied research

  • Understanding the capabilities and limitations of statistical methods

  • Understanding the value and limitations of different types of data

  • Interpreting analytic results

  • Developing data manipulation and computing skills, especially for large and complex data sets

  • Building knowledge about several applied research areas where data-driven investigation plays a major role

Pre-requisites

Students are expected to have mastered the essential foundations of statistics at the undergraduate and master’s levels, and should possess a solid understanding of ideas such as sampling, variation, bias, and uncertainty. Students should have substantial prior exposure to core statistical methods including regression and multivariate analysis.

Coursework

Students will conduct independent analyses of datasets throughout the semester and write about their findings. Datasets, reading materials, and writing prompts will be provided by the instructor.

There are no textbooks or other materials to purchase for the course.

Grading

Writing assignments will be due roughly every 7 to 10 days and will be submitted online using Google Docs.

There are no exams for this class.

Writing assignments will generally be short (around 2 pages). See below for guidelines on what we are expecting in the writing assignments.

The instructors or GSI will read and evaluate every assignment, providing individual feedback. Assignments will be graded on a 100-point scale.

All assignments will count equally toward the final grade. Since assignments are submitted electronically and are announced at least 7 days in advance, late assignments will generally not be accepted. Exceptions to this policy may be made at the instructor’s discretion, but only in cases of serious and unanticipated personal emergencies.

All writing is to be done individually, and should primarily reflect each student’s own ideas. Students are welcome and encouraged to discuss with one another the statistical methods, coding strategies, and topics relating to the data and motivating questions for each assignment.

Plagiarism, including copying any material prepared by another person, copying from other students (with or without permission), or allowing other students to write all or part of your assignment, will be handled strictly following U-M policies regarding academic integrity.

Computing

You are free to use whatever computing tools you choose. Code will not be submitted or evaluated in this course. The instructors will mainly use R or Python when providing code for illustration.

Most of the datasets used in this course will be large. In addition to statistical analyses, substantial data manipulations will be required. If you are using R, you will likely want to use the data.table or dplyr packages. If you are using Python, you will likely want to use the pandas library. However, these are only suggestions, and you are free to use whatever software and libraries you choose.
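As one illustration of the kind of data manipulation that large datasets can require, here is a minimal sketch in Python with pandas. It computes group means by reading a file in chunks rather than loading it all at once. The column names (`site`, `year`, `value`), the tiny in-memory CSV, and the helper `grouped_mean_in_chunks` are hypothetical and are not taken from any course dataset.

```python
import io

import pandas as pd

# Hypothetical data standing in for a large file on disk.
csv_text = """site,year,value
A,2020,1.0
A,2021,3.0
B,2020,2.0
B,2021,4.0
A,2020,5.0
"""


def grouped_mean_in_chunks(source, by, column, chunksize=2):
    """Compute group means without holding the whole file in memory."""
    partials = []
    # With chunksize set, read_csv yields DataFrames of at most that many rows.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        # Store per-chunk sums and counts so the means can be combined exactly.
        partials.append(chunk.groupby(by)[column].agg(["sum", "count"]))
    # Add up the partial sums and counts for each group across all chunks.
    totals = pd.concat(partials).groupby(level=0).sum()
    return totals["sum"] / totals["count"]


means = grouped_mean_in_chunks(io.StringIO(csv_text), by="site", column="value")
print(means)
```

With a real file, a path would replace the `io.StringIO` object and `chunksize` would be much larger. Sums and counts are accumulated rather than per-chunk means because averaging the per-chunk means would give the wrong answer when chunks contain different numbers of rows per group.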

The code used to produce the analyses discussed in the course lectures will be available on the course GitHub site.

Expectations for student writing

This is both an applied statistics course and a technical communications course. The coursework will consist of writing and data analysis. Only the writing will be graded (you will not submit any code for evaluation). All statisticians must frequently communicate and document their findings in written form. Writing about your analytic plans and research findings, and writing reviews and critiques of other people’s writing, are all excellent ways to organize your thoughts, strengthen your arguments, and identify weaknesses in your claims.

For most of the writing assignments in this course, you should imagine that you are writing a memo or email to be read by collaborators or colleagues. Your “audience” consists of people familiar with the data and scientific (or industrial) context behind the data, and who are also knowledgeable about statistics.

Below are some guidelines for writing in this course. These are a few of the most important things to keep in mind. We will expand on them considerably during the semester.

  • Write in an appropriate academic or business tone. Do not write informally or casually, but also avoid excessively formal language.

  • Organize your content so that your writing has a focused message, with each paragraph contributing distinctly to communicating this message.

  • Use simple, direct language. Avoid convoluted expressions, hidden or needlessly subtle meanings, unusual vocabulary, and digressions that do not contribute to your main message.

  • Express your arguments in as plain and simple terms as is practical. Do not write in a way that requires the reader to re-read your writing multiple times to understand your point. Favor short sentences in the active voice with one clause, and paragraphs that focus on one topic and do not extend for more than half a page.

  • Write for an audience that may include non-native English speakers. Avoid colloquialisms and obscure cultural references.

  • You are expected to be able to write in grammatically correct English, appropriate for graduate-level coursework. It is understood that there are many non-native English speakers in the course, and occasional minor grammatical issues will be overlooked.