Discourse Annotation

A workshop immediately following ACL '04 in Barcelona, Spain

July 25-26, 2004
Full Paper Submissions: March 22, 2004

Preliminary Program

A preliminary version of the program for the workshop is available here.

Workshop Overview

Advances in language technology draw on a combination of annotated empirical data and linguistic theory. The richer the annotation, the more that can potentially be learned and applied to unseen data. Thus the Penn TreeBank (PTB), with its part-of-speech (POS) tags and syntactic annotation, has been more useful than corpora annotated for POS-tags alone, and PropBank, in which PTB is annotated with predicate-argument relations, will be useful for more applications than the PTB alone.

Two gross features of PTB and PropBank are that they annotate sentence/clause-level features and that they were undertaken with communal agreement (albeit somewhat contentious at first). Similar, largely communal projects have been undertaken for dialogue annotation, including MATE (now NITE).

Discourse annotation (in contrast with sentence-level annotation) has taken a somewhat different course. While an early communal effort (DRI) to annotate discourse structure according to a consensus framework failed to achieve its goal, recognition remained of the value of discourse-annotated corpora. The result has been that diverse grass-roots efforts have been producing individual corpora annotated for a wide variety of phenomena.

Groups involved in these efforts appear to be using (or planning to use) these corpora for a range of applications that include: empirical testing of theoretical claims/hypotheses; supporting second-language acquisition of discourse-sensitive linguistic devices; training resolution procedures for co-referring expressions or other anaphors that can be used in annotating additional texts or in supporting technologies such as information extraction, question answering, summarization, and/or text generation; training discourse parsers that can be used for annotating additional texts or for reducing the amount of manual effort needed in the process; and probabilistic sentence and text realization.

The workshop is neutral as to whether consensus annotation is possible for every type of discourse phenomenon. Its aims are rather to:

With these aims in mind, we solicit papers on:

As well as being presented, the papers will be used to structure the above-mentioned small group discussions and feedback sessions.

Submissions are limited to original, unpublished work. Papers should be written in English.

Schedule

Paper submissions due at midnight GMT on March 22, 2004
Notification of acceptance for papers: April 30, 2004
Camera-ready papers due: May 24, 2004
Workshop date: July 25-26, 2004

Co-chairs

Professor Bonnie Webber
School of Informatics
University of Edinburgh
2 Buccleuch Place
Edinburgh EH8 9LW
UK
email: bonnie@inf.ed.ac.uk
phone: +44 131 650 4190
fax: +44 131 650 4587

Professor Donna Byron
Dept. of Computer and Information Science
Ohio State University
395 Dreese Laboratory
2015 Neil Avenue
Columbus, Ohio 43210
USA
email: dbyron@cis.ohio-state.edu
phone: +1 614-292-6370
fax: +1 614-292-2911

donna byron
Last modified: Thu Jan 8 17:55:08 EST 2004