Projects and Research Interests

This page lists the projects I'm involved in, and more generally the research areas I'm interested in. I've divided the stuff here into a thematic section, and a more technical section on the kinds of methods I'm using to approach these phenomena.

Thematically Ordered List of Projects and Interests

Dialogue

Generally speaking, I am interested in how people make sense of other people's (linguistic) actions, and how they produce (linguistic) actions that make sense to other people. The most interesting domain for such research in my opinion is what is also the most natural setting in which (linguistic) actions occur: dialogue.
In particular, I have looked or am looking at the following phenomena.
  • Fragments. For my doctoral thesis (submitted to the School of Informatics at the University of Edinburgh, and successfully defended in August 2003), I developed an account of the interpretation of a certain kind of non-sentential utterance occurring in dialogue, namely one where the utterance, despite its "incomplete" syntactic form, is intended to convey a proposition, a question or a request. Perhaps the most prominent type of such utterance is the short answer, as in "A: Who came to the party? B: Peter.", but I have investigated many other types as well. (The thesis was supervised by Alex Lascarides and Claire Grover.)

    I've worked within the framework of SDRT (Asher & Lascarides 2003), an extension of "classical" DRT with rhetorical relations and ideas from AI-based pragmatics. SDRT started life as a discourse-based (i.e. text-based) theory, but has lately been extended to dialogue, and I have made use of these extensions in my approach.

    Together with Alex Lascarides I've also worked on an implementation of SDRT called RUDI. The system interprets (certain aspects of) dialogue utterances from the domain of appointment scheduling; more specifically, it computes a representation that is a simplified version of an SDRS.

    [ The results of this research have been published as (Schlangen, Lascarides and Copestake, 2001), (Schlangen, 2002), (Schlangen and Lascarides, 2002), (Schlangen and Lascarides, 2003a), (Schlangen and Lascarides, 2003b), (Schlangen, Lascarides and Copestake, 2003). Some newer research on this topic has been published as (Schlangen, 2005).]

  • One part of the thesis I was never really happy with was the treatment of Clarification Ellipsis (e.g. "A: I talked to Peter. B: Peter Miller?"), and so since coming to Potsdam I've worked some more on this, and more generally on Clarification Requests (CRs). Using the models of communication by Clark (1996) and Allwood (1995) as a starting point, and relating them to SDRT's processing model, I've developed a detailed model of the possible causes for requesting clarification. I've also extended RUDI to deal with uncertain input and to produce clarifications. This is based on a generalized notion of confidence score, where confidence from speech recognition is combined with semantic / pragmatic confidences.
    [ This was published as (Schlangen, 2004). ]

    I also supervised a Master's student here in Potsdam, Kepa Rodriguez, who did a corpus study of CRs in task-oriented spoken German dialogues. We found some nice correlations between intonation and interpretation of CRs.
    [ Published as (Rodriguez and Schlangen, 2004). ]

    This interest led us to propose a project on this topic to the EU "Marie Curie Action", which was positively reviewed and is now (Oct. 05) finally starting up. The aim is to investigate strategies for "Dealing with Uncertainty" (the official project title) about what the input to a spoken dialogue system was. Two post-docs (Andrea Corradini, formerly NISlab, Denmark, and Raquel Fernandez, formerly King's College London) are now working full-time on the project; I'm an associated member and the scientific coordinator.
    [ Dedicated website here. ]

  • Following up on the dialogue system side of my previous work, I'm also interested in dialogue management (DM) in dialogue systems. Together with Manfred Stede, I'm working on a text-based dialogue system that implements a novel DM strategy, which we call "Dialogue Management by Topic Structure". The idea basically is that the domain-knowledge of the system is encoded in some sort of an ontology, and that this knowledge also drives the dialogue through a sequence of local planning decisions that guide the user through the topic. We've called the system The Wanderer, and hope to have a prototype online soon.
    [ Published as (Stede and Schlangen, 2004). ]

  • While trying to build a computer system that talks, it occurred to me that another aspect of dialogue that I had previously abstracted away from is actually quite interesting: time. Speakers can make use of it to mean things, and also have to coordinate their use of it. As an example of the former, I looked at significant silences, a use of time that produces non-linguistic events that mean something; as an example of the latter, I've done some machine learning experiments to work out features that help coordinate turn taking. The larger idea is to build a model (or rather, communicating models) that integrates content and form management.
    [ The former was presented as an invited talk at "Constraints in Discourse 2005" (Schlangen 2005), which I will hopefully have the time to write up soon; the latter is a paper at Interspeech 2006 (Schlangen 2006). ]
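
    The generalized confidence score mentioned above for clarification requests can be illustrated with a small sketch. This is not RUDI's actual implementation: the level names, the weighted geometric mean used to combine the scores, and the two thresholds are all assumptions made purely for illustration.

```python
import math


def combined_confidence(scores, weights=None):
    """Combine per-level confidences (each in (0, 1]) into one score.

    Assumption: a weighted geometric mean is used, so that a single
    very low confidence pulls the overall score down noticeably.
    """
    if weights is None:
        weights = {level: 1.0 for level in scores}
    total_weight = sum(weights[level] for level in scores)
    # Clamp to a tiny positive value so that log() is always defined.
    log_sum = sum(
        weights[level] * math.log(max(scores[level], 1e-9))
        for level in scores
    )
    return math.exp(log_sum / total_weight)


def decide(confidence, accept_at=0.8, clarify_at=0.4):
    """Map a combined confidence to a (hypothetical) system reaction."""
    if confidence >= accept_at:
        return "accept"
    if confidence >= clarify_at:
        return "clarify"
    return "reject"


# Hypothetical example: recognition is good, but the reading is
# semantically and pragmatically doubtful.
example = {"asr": 0.9, "semantic": 0.5, "pragmatic": 0.2}
reaction = decide(combined_confidence(example))
```

    With these made-up numbers the combined score falls between the two thresholds, so the sketch reacts with a clarification request rather than accepting or rejecting the input outright.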

Discourse

It's not all dialogue that glitters, though. Text (or discourse) still holds many challenges for computational linguistics, some of which are the following.
  • Rhetorical Parsing is the process of automatically deriving the rhetorical structure of a text. For a while I worked on the ROSIE project, a cooperation between the University of Edinburgh and Stanford University, where we used RUDI to suggest rhetorical relations for annotation of dialogues. (See (Schlangen, Baldridge and Lascarides, 2003).) RUDI is a fully symbolic, "deep" processing system. Since coming here, I've become interested in using shallower (and consequently, I hope, more robust) methods for inferring rhetorical structure.
    Together with Manfred Stede and Michael Grabski I've written a project proposal (to be submitted to DFG, the German Research Council) for a project in this area.

  • Last, but not least, there is also the project that actually pays (half of) my rent: robust parsing of / information extraction from medical texts. In this project, a cooperation with the Charite (Berlin's research hospital) and the Freie Universität Berlin, we are working on automatically parsing pathology reports into a semantic web representation. (This project is funded by DFG, the German Research Council.)
    Besides technical issues, there are some interesting research questions here (non-standard syntax, discourse phenomena, semantically / ontology-guided parsing, interfacing logical forms and OWL (semantic web language) etc.).
    [ Published as (Schlangen, Stede and Paslaru Bontas, 2004), (Schlangen, Hanneforth and Stede, 2005), (Paslaru Bontas, Schlangen and Niepage, 2005), (Paslaru Bontas, Schlangen and Schrader, 2005). ]

Methods

  • Formal Semantics and Automated Reasoning. Logics are among the best tools we have for modeling language, and also for processing it. (This is not restricted to semantics. A grammar formalism needs a formal semantics as well, which is why I've preferred HPSG so far.)
    I am interested in using standard theorem provers for language processing tasks, such as those described above. For example, for computing rhetorical relations RUDI makes use of an implementation of a fragment of a non-monotonic logic called Common-Sense Entailment, which in turn makes use of a standard first order theorem prover. (See (Schlangen and Lascarides, 2002) for a description of the non-monotonic prover.)

  • Shallow Approaches. To my own surprise, I've become interested in shallow approaches to NLP as well. I've even run machine learning experiments. We'll see what will come out of this... [Actually, an ACL paper is what came out of this: (Schlangen, 2005). And now, in 2006, an Interspeech paper as well: (Schlangen 2006).]


[last changed - 07/07/2006]