This is the news blog of the Sprachwissenschaftliches Institut (Department of Linguistics) at Ruhr-Universität Bochum.










Talk by Tristan Miller on Tuesday, 17 October 2017 (16:00, room GB 3/159)

Wednesday, 4 October 2017. Filed under 'Vortragsreihe' (lecture series).

The Sprachwissenschaftliches Institut cordially invites you to a talk by

Tristan Miller (Technische Universität Darmstadt)

Sense-based clustering for the interpretation of humorous ambiguity.

Word sense disambiguation (WSD), the task of determining which meaning a word carries in a particular context, is a core research problem in computational linguistics. Though it has long been recognized that supervised approaches to WSD can yield impressive results, they require an amount of manually annotated training data that is often too expensive or impractical to obtain. This is a particular problem for processing the sort of lexical-semantic anomalies employed for deliberate effect in humour and wordplay. Knowledge-based techniques, by contrast, rely only on pre-existing lexical-semantic resources (LSRs) such as dictionaries and thesauri. In this talk, we address the task of extending the efficacy and applicability of knowledge-based WSD, both generally and for the particular case of English puns. In the first part of the talk, we present two approaches for bridging the “lexical gap” problem and thereby improving WSD coverage and accuracy. The first approach supplements the word’s context and the LSR’s sense descriptions with entries from a distributional thesaurus. The second approach enriches an LSR’s sense information by aligning its senses to those of other, complementary LSRs and clustering them. In the second part of the talk, we describe how these techniques, along with evaluation methodologies from traditional WSD, can be adapted for the “disambiguation” of puns, or rather for the automatic identification of their double meanings.
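To make the “lexical gap” concrete, here is a minimal sketch of one well-known knowledge-based method, simplified Lesk, which scores each sense by the word overlap between the context and the sense's dictionary gloss. It is an illustration only and is not taken from the talk: the sense inventory, the glosses, the stopword list, and the thesaurus entries are all made-up toy data, and the function names (`tokens`, `expand`, `lesk_scores`) are hypothetical.

```python
import re

# Toy stopword list (an assumption for this sketch, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "on", "that", "and", "he"}


def tokens(text):
    """Lowercase the text, split on non-letters, and drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def expand(words, thesaurus):
    """Add the related terms that the thesaurus lists for each word."""
    expanded = set(words)
    for word in words:
        expanded.update(thesaurus.get(word, ()))
    return expanded


def lesk_scores(context, senses, thesaurus=None):
    """Score each sense by the size of the context/gloss word overlap.

    If a thesaurus is given, both the context and the glosses are
    expanded with related terms before the overlap is counted.
    """
    ctx = tokens(context)
    if thesaurus:
        ctx = expand(ctx, thesaurus)
    scores = {}
    for sense_id, gloss in senses.items():
        gloss_words = tokens(gloss)
        if thesaurus:
            gloss_words = expand(gloss_words, thesaurus)
        scores[sense_id] = len(ctx & gloss_words)
    return scores


# Hypothetical two-sense inventory for "bank" with invented glosses.
SENSES = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water",
}

# Hypothetical distributional thesaurus: word -> distributionally similar words.
THESAURUS = {
    "stream": {"river", "water", "creek"},
    "money": {"cash", "loan"},
}

CONTEXT = "He sat on the bank and watched the stream."

plain = lesk_scores(CONTEXT, SENSES)
enriched = lesk_scores(CONTEXT, SENSES, THESAURUS)
print(plain)     # no gloss shares a word with the context: the lexical gap
print(enriched)  # "stream" brings in "water", which matches the river gloss
```

Without the thesaurus, neither gloss shares a word with the context, so the method cannot choose a sense. With it, the context gains terms related to "stream" and the river sense receives a non-zero score. For a pun, one could inspect the two highest-scoring senses instead of only the best one; that is a guess about how such a scorer might be adapted, not a description of the systems presented in the talk.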