Discourse Marker Annotation

Many discourse connectives also have nondiscourse, or sentential, readings. Automatic discourse structure analysis therefore faces a disambiguation problem even before the question of which discourse relation is signaled becomes relevant. We focus on a set of nine German connectives and characterize the task of determining their discourse or sentential reading. Starting from an analysis of the utility of state-of-the-art PoS taggers, we describe a series of experiments in training the Brill tagger to identify connectives. Our results indicate that there is a relatively simple baseline approach, and that retraining the tagger can improve on it, but only slightly.
