Advanced data analysis (taught by Shravan Vasishth)

Basic information

This graduate-level course will be taught every winter. In 2013-14 the times are Mondays 12-2PM in the CIP Pool of Haus 14 in Golm. This course aims to teach Bayesian data analysis for psycholinguistic research. I presuppose that you have taken my introductory course, which I teach in summer, and/or actively attended Reinhold Kliegl's lectures. You will have some difficulty with this course if you: don't know how to do a one-sample and a paired-sample t-test in R; have never fit a linear mixed model using lmer; don't know off-hand what the sampling distribution of the sample mean is, or what a confidence interval is; or can't (correctly) explain what a p-value is, what Type I and Type II errors are, and what statistical power is.
Note: I will review all these topics at the start of this course, which will help in case you have forgotten things. If you have never seen this review material before, it will be hard going.

Lecture Notes and Schedule

The Moodle page is here.
The current version of the lecture notes is available here. Please note that I will be adding a chapter or two towards the end of this course. The homework assignments are in the lecture notes. I will spend one week on each topic id, except maybe Gibbs sampling and Metropolis-Hastings, which might take two weeks.
The schedule is as follows.
Topic id  Content                                               Data for class  Homework
1         Introduction and review                               ch1 data        set up JAGS and test it
2         Probability theory and probability distributions      -               HW 1
3         Important distributions                               -               work through chapter
4         Jointly distributed random variables                  -               work through chapter
5         Maximum Likelihood Estimation                         -               play with optim etc.
6         Basics of Bayesian statistics                         -               HW 2, 3, 4, 5
7         Gibbs sampling and the Metropolis-Hastings algorithm  -               HW 8
8         Using JAGS for modeling                               -               HW 6
9         Priors                                                -               HW 7, 8
10        Regression models                                     -               HW 9, 10
11        Linear mixed models                                   -               HW 11, 12

Psycholinguistically interesting linear mixed model examples in JAGS and Stan

Here are some of the most frequently fit linear mixed models, written in JAGS and Stan. I use the Gibson and Wu (2012) dataset.
The data are here.
All JAGS and Stan code is here.

Grading

Grading is based on homework assignments (50%) and a project (50%). The project will consist of a complete Bayesian data analysis of either your own research data or a dataset that I will provide.

In the classroom

I would greatly appreciate it if you do not chat with each other while I am talking or while we are discussing something in class. I expect active participation, which means asking questions and discussing the various issues that come up. Please do not use cell phones or computers to chat, message or call anyone, and do not check your email. Please come to class on time so that we can start punctually. Note: Deutsche Bahn is not a reliable system at any time of the year. Please make sure you leave home well in time.