Talk by Varada Kolhatkar on Tuesday, 3 February 2015 (16:00, GB 3/159)

Friday, 19 December 2014. From the category 'Lecture Series'. The Sprachwissenschaftliches Institut (Department of Linguistics) invites you to a talk by

Varada Kolhatkar (Universität Hamburg)
Resolving Shell Nouns

Shell nouns are abstract nouns, such as fact, issue, idea,
and problem, which facilitate efficiency by avoiding repetition of long
stretches of text. An example is shown in (1). Shell nouns encapsulate
propositional content, and the process of identifying this content is
referred to as shell noun resolution.

(1) Living expenses are much lower in rural India than in New York, but this
fact is not fully captured if prices are converted with currency exchange
rates.

My research presents the first computational work on resolving shell nouns.
The research is guided by three primary questions: first, how an automated
process can determine the interpretation of shell nouns; second, the extent
to which knowledge derived from the linguistics literature can help in this
process; and third, the extent to which speakers of English are able to
interpret shell nouns.

I start with a pilot study to annotate and resolve occurrences
of "this issue" in Medline abstracts. The results illustrate the
feasibility of annotating and resolving shell nouns, at least in this
closed domain. Next, I move to developing general algorithms to resolve
a variety of shell nouns in the newswire domain. The primary challenge was
that no annotated data was available, so I developed a number of
computational methods for resolving shell nouns that do not rely on manually
annotated data. For evaluation, I developed annotated corpora for shell
nouns and their content using crowdsourcing. The annotation results showed
that the annotators agreed to a large extent on the shell content.
The evaluation of resolution methods showed that knowledge derived from
the linguistics literature helps in the process of shell noun resolution, at
least for shell nouns with strict semantic and syntactic expectations.