Shravan Vasishth

Professor, Dept. of Linguistics, University of Potsdam, 14476 Potsdam, Germany
Speaker, Language Cluster, Cognitive Science
Phone: +49-(0)331-977-2950 | Fax: - 2087 | Email: vasishth at uni-potsdam.de
GPG public key, Orcid ID, google scholar, github, bitbucket, statistics blog, vasishth lab blog
Back to home page

Bayesian Linear Mixed Models using Stan: A tutorial for psychologists, linguists, and cognitive scientists

by Tanner Sorensen, Sven Hohenstein, Shravan Vasishth, In Press: Quantitative Methods for Psychology
The published paper is available here.
You can download all the source code from here. Please look at the doc directory for the paper and run the code embedded in the Rnw file.

The reader may also benefit from this more introductory review of Bayesian methods intended for linguists and psychologists: Statistical methods for linguistic research: Foundational Ideas - Part II. Part I covers frequentist methods (it is in press with Language and Linguistic Compass) and is available here.

Further examples

  1. The RePsychLing package, by Bates, Kliegl, Vasishth, and Baayen has examples of linear mixed models using Stan. See, for example, the R markdown file KBStan.Rmd in vignettes; the code here is the most efficient of the examples shown as the computations are done in matrix form.
  2. More Stan and JAGS examples are here.

Papers we have published (or submitted or in press) using Stan and JAGS

[1]
Stefan L. Frank, Thijs Trompenaars, and Shravan Vasishth. Cross-linguistic differences in processing double-embedded relative clauses: Working-memory constraints or language statistics? Cognitive Science, page n/a, 2015. accepted pending minor revisions. [ code | pdf ]
An English double-embedded relative clause from which the middle verb is omitted can often be processed more easily than its grammatical counterpart, a phenomenon known as the grammaticality illusion. This effect has been found to be reversed in German, suggesting that the illusion is language specific rather than a consequence of universal working memory constraints. We present results from three self-paced reading experiments which show that Dutch native speakers also do not show the grammaticality illusion in Dutch, whereas both German and Dutch native speakers do show the illusion when reading English sentences. These findings provide evidence against working memory constraints as an explanation for the observed effect in English. We propose an alternative account based on the statistical patterns of the languages involved. In support of this alternative, a single recurrent neural network model that is trained on both Dutch and English sentences indeed predicts the cross-linguistic difference in grammaticality effect.

[2]
Philip Hofmeister and Shravan Vasishth. Distinctiveness and encoding effects in online sentence comprehension. 5:1-13, 2014. Article 1237. [ DOI | code | pdf ]
In explicit memory recall and recognition tasks, elaboration and contextual isolation both facilitate memory performance. Here, we investigate these effects in the context of sentence processing: targets for retrieval during online sentence processing of English object relative clause constructions differ in the amount of elaboration associated with the target noun phrase, or the homogeneity of superficial features (text color). Experiment 1 shows that greater elaboration for targets during the encoding phase reduces reading times at retrieval sites, but elaboration of non-targets has considerably weaker effects. Experiment 2 illustrates that processing isolated superficial features of target noun phrases—here, a green word in a sentence with words colored white—does not lead to enhanced memory performance, despite triggering longer encoding times. These results are interpreted in the light of the memory models of Nairne, 1990, 2001, 2006, which state that encoding remnants contribute to the set of retrieval cues that provide the basis for similarity-based interference effects.

[3]
Samar Husain, Shravan Vasishth, and Narayanan Srinivasan. Strong Expectations Cancel Locality Effects: Evidence from Hindi. PLoS ONE, 9(7):1-14, 2014. [ code | pdf ]
Expectation-driven facilitation (Hale, 2001; Levy, 2008) and locality-driven retrieval difficulty (Gibson, 1998, 2000; Lewis & Vasishth, 2005) are widely recognized to be two critical factors in incremental sentence processing; there is accumulating evidence that both can influence processing difficulty. However, it is unclear whether and how expectations and memory interact. We first confirm a key prediction of the expectation account: a Hindi self-paced reading study shows that when an expectation for an upcoming part of speech is dashed, building a rarer structure consumes more processing time than building a less rare structure. This is a strong validation of the expectation-based account. In a second study, we show that when expectation is strong, i.e., when a particular verb is predicted, strong facilitation effects are seen when the appearance of the verb is delayed; however, when expectation is weak, i.e., when only the part of speech “verb” is predicted but a particular verb is not predicted, the facilitation disappears and a tendency towards a locality effect is seen. The interaction seen between expectation strength and distance shows that strong expectations cancel locality effects, and that weak expectations allow locality effects to emerge.

[4]
Shravan Vasishth, Zhong Chen, Qiang Li, and Gueilan Guo. Processing Chinese Relative Clauses: Evidence for the Subject-Relative Advantage. PLoS ONE, 8(10):1-14, 10 2013. [ code | pdf ]
A general fact about language is that subject relative clauses are easier to process than object relative clauses. Recently, several self-paced reading studies have presented surprising evidence that object relatives in Chinese are easier to process than subject relatives. We carried out three self-paced reading experiments that attempted to replicate these results. Two of our three studies found a subject-relative preference, and the third study found an object-relative advantage. Using a random effects Bayesian meta-analysis of fifteen studies (including our own), we show that the overall current evidence for the subject-relative advantage is quite strong (approximate posterior probability of a subject-relative advantage given the data: 78–80%). We argue that retrieval/integration based accounts would have difficulty explaining all three experimental results. These findings are important because they narrow the theoretical space by limiting the role of an important class of explanation—retrieval/integration cost—at least for relative clause processing in Chinese.

[5]
Dario Paape and Shravan Vasishth. Local coherence and preemptive digging-in effects in German. MS, accepted pending minor revisions, Language and Speech, 2015.
SOPARSE (Tabor & Hutchins, 2004) predicts so-called local coherence effects: locally plausible but globally impossible parses of substrings can exert a distracting influence during sentence processing. Additionally, it predicts digging-in effects: the longer the parser stays committed to a particular analysis, the harder it becomes to inhibit that analysis. We investigated the interaction of these two predictions using German sentences. Results from a self-paced reading study show that the processing difficulty caused by a local coherence can be reduced by first allowing the globally correct parse to become entrenched, which supports SOPARSE’s assumptions.

[6]
Samar Husain, Shravan Vasishth, and Narayanan Srinivasan. Integration and prediction difficulty in Hindi sentence comprehension: Evidence from an eye-tracking corpus. Journal of Eye Movement Research, 8(2):1-12, 2015. [ pdf ]
This is the first attempt at characterizing reading difficulty in Hindi using naturally occurring sentences. We created the Potsdam-Allahabad Hindi Eyetracking Corpus by recording eye-movement data from 30 participants at the University of Allahabad, India. The target stimuli were 153 sentences selected from the beta version of the Hindi-Urdu treebank. We find that word- or low-level predictors (syllable length, unigram and bigram frequency) affect first-pass reading times, regression path duration, total reading time, and outgoing saccade length. An increase in syllable length results in longer fixations, and an increase in word unigram and bigram frequency leads to shorter fixations. Longer syllable length and higher frequency lead to longer outgoing saccades. We also find that two predictors of sentence comprehension diffi- culty, integration and storage cost, have an effect on reading difficulty. Integration cost (Gibson, 2000) was approximated by calculating the distance (in words) between a dependent and head; and storage cost (Gibson, 2000), which measures difficulty of maintaining predictions, was estimated by counting the number of predicted heads at each point in the sentence. We find that integration cost mainly affects outgoing saccade length, and storage cost affects total reading times and outgoing saccade length. Thus, word-level predictors have an effect in both early and late measures of reading time, while predictors of sentence comprehension difficulty tend to affect later measures. This is, to our knowledge, the first demonstration using eye-tracking that both integration and storage cost influence reading difficulty.


Please send comments and corrections regarding this page to : vasishth SQUIGGLE ling dot uni hyphen potsdam dot de.