Interface Corroboration

Formal semantic representations, serving as a generic interface to parsing (abstractly, a linguistic API), are subject to constraints from multiple disciplines: (computational) linguistics prioritizes representational (or explanatory) adequacy and compositional compatibility with syntax and morphology; formal logic focuses on mathematical properties and support for logical inference; and application development emphasizes more practical requirements, namely sufficiently detailed yet easy-to-comprehend representations that are stable over time.

Reconciling such different points of view remains a major scientific challenge and a traditional barrier to wider uptake of semantic parsing for HLT applications. The project cannot fully resolve these issues, but it will seek to become a catalyst (and ‘clearing house’) for a recent surge of international interest in harmonizing and documenting semantic representations, for improved application support.

The different types of stakeholders are all represented among project partners, and additional key players (from DELPH-IN and beyond) will be invited to participate in a longer-term community effort towards semantic standardization. This effort will take a data-centered approach, based on a careful manual selection from UGC samples compiled in track (A), combined with a ‘catalog’ of frequent semantic phenomena. The data will be released publicly, challenging interested research groups to submit structural semantic annotation (manually constructed analyses or hand-corrected parser outputs). In a first round of investigation, these representations will be aligned (at the level of sub-string positions) and provide the basis for an in-depth, by-invitation workshop.
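As a rough illustration of what alignment at the level of sub-string positions could look like, the following minimal Python sketch groups semantic nodes from different submissions by the character span they are anchored to. The `Node` record, the `align_by_span` helper, the group names, and the labels are all hypothetical and chosen for illustration only; the project text does not specify a submission format or an alignment procedure.

```python
from collections import defaultdict
from typing import NamedTuple


class Node(NamedTuple):
    """One semantic node anchored to a character span (hypothetical format)."""
    source: str  # which group submitted the analysis
    start: int   # inclusive character offset into the sentence
    end: int     # exclusive character offset into the sentence
    label: str   # predicate or concept name in that group's framework


def align_by_span(nodes):
    """Group nodes from different submissions that cover the same sub-string."""
    groups = defaultdict(list)
    for node in nodes:
        groups[(node.start, node.end)].append(node)
    # Sort by position so the alignment reads left to right.
    return dict(sorted(groups.items()))


# Made-up example data: two groups annotating the same two-word sentence.
text = "dogs bark"
nodes = [
    Node("group_a", 0, 4, "_dog_n_1"),
    Node("group_b", 0, 4, "dog"),
    Node("group_a", 5, 9, "_bark_v_1"),
    Node("group_b", 5, 9, "bark-01"),
]

aligned = align_by_span(nodes)
for (start, end), members in aligned.items():
    labels = ", ".join(f"{n.source}:{n.label}" for n in members)
    print(f"{text[start:end]!r} [{start}:{end}] -> {labels}")
```

Exact span matching is the simplest possible criterion; analyses that tokenize differently would need a looser notion of overlap, which this sketch deliberately leaves out.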

Reflecting on this input, the project will prepare gold-standard semantic analyses for this data (focusing on English, but aiming for a cross-linguistic perspective), accompanied by descriptive summaries of core analyses and the underlying motivations. In some cases, these summaries may just point out stand-in solutions where a fully satisfactory analysis is not yet available. Annotation and documentation will be made available to external collaborators in a Wiki-based interface (linking to an on-line demonstrator for the parsers, for user experimentation); a phase of community critique and discussion will prepare a second topical workshop organized by the project. The final revision of gold-standard analyses and documentation, together with a more general summary of formal and application-oriented desiderata for computational semantics, will constitute the final results of this track, serving as greatly improved interface documentation and as a foundation for mid- and long-term standardization.

By Stephan Oepen, Lilja Øvrelid
Published Sep. 27, 2012 1:13 PM - Last modified Sep. 27, 2012 1:22 PM