Many discourse connectives also have non-discourse, or sentential,
readings. Therefore, for automatic discourse structure analysis, a
disambiguation problem arises even before the question of which
discourse relation is signaled becomes relevant. We focus on a set of
German connectives and characterize the task of determining their
discourse/sentential reading. Starting from an analysis of the utility
of state-of-the-art PoS taggers, we describe a series of experiments
in training the Brill tagger to identify connectives. Our results
indicate that a relatively simple baseline approach exists, and that
retraining the tagger improves on it, though only slightly.
For more information, see our paper "Disambiguating potential connectives".