Talk by Stefan Evert on Thursday, 20 January 2011, 4-6 pm

Monday, 17 January 2011. Category: 'Lecture series'. The Department of Linguistics (Sprachwissenschaftliches Institut) invites you to a talk by Stefan Evert (Osnabrück): Distributional Semantic Models and Language Change
Distributional semantic models (DSM) -- also known as "word space" or "distributional similarity" models -- are based on the assumption that the meaning of a word can (at least to a certain extent) be inferred from its usage, i.e. its distribution in text. Therefore, these models dynamically build semantic representations -- in the form of high-dimensional vector spaces -- through a statistical analysis of the contexts in which words occur. DSMs are a promising technique for solving the lexical acquisition bottleneck by unsupervised learning, and their distributed representation provides a cognitively plausible, robust and flexible architecture for the organisation and processing of semantic information.
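As a minimal illustration of the idea (not taken from the talk), the following Python sketch builds a toy word space from raw co-occurrence counts and compares words by cosine similarity. The corpus, the window size of 2 and the function names `build_dsm` and `cosine` are assumptions made for this example; realistic DSMs use far larger corpora and usually add a weighting scheme and dimensionality reduction, which are omitted here.

```python
from collections import Counter, defaultdict
import math


def build_dsm(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented toy corpus, for illustration only.
sentences = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the mouse".split(),
    "the dog chases the cat".split(),
    "the car needs fuel".split(),
]

dsm = build_dsm(sentences)
sim_cat_dog = cosine(dsm["cat"], dsm["dog"])
sim_cat_car = cosine(dsm["cat"], dsm["car"])
print(f"cat/dog: {sim_cat_dog:.2f}  cat/car: {sim_cat_car:.2f}")
```

In this toy corpus "cat" and "dog" share more contexts than "cat" and "car", so their vectors end up closer together. That is the distributional assumption in miniature.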

Despite renewed interest and rapidly growing research activity in recent years -- as shown by the series of workshops on DSM at Context 2007, ESSLLI 2008, EACL 2009, CogSci 2009, NAACL-HLT 2010, ACL 2010, ESSLLI 2010 and ACL 2011 -- our understanding of the semantic knowledge encoded by DSMs, their full potential and their limitations is still incomplete. While these models have been used successfully in a wide range of NLP applications, it therefore remains unclear whether distributional semantics can also provide new and valuable information for linguistic theory.

The first part of my talk is a general introduction to distributional semantics, including an overview of the many parameters of DSMs, the range of distributional representations they offer, as well as typical DSM applications and evaluation tasks.

In the second part of the talk, I present recent joint work with Cristina Sanchez Marco (UPF Barcelona) on semantic change in Spanish participial constructions. This study is based on a diachronic corpus of more than 40 million words in 651 documents from the 12th to the 20th century, annotated with parts of speech and lemmata. In addition to tracking the occurrence frequency of different participial constructions in the corpus, we explore the use of DSM representations to detect changes in their meaning and to uncover possible causes of semantic change.
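One simple way to picture how DSM representations might reveal a change in meaning is to build a separate context vector for the same word in each time slice and compare consecutive slices. The Python sketch below does this on invented English data. It is not the method used in the study described above; the period labels 1 to 3, the target word and the names `context_vector` and `usage_similarity` are all hypothetical.

```python
from collections import Counter
import math


def context_vector(target, sentences, window=2):
    """Count the words occurring within `window` tokens of `target`."""
    vec = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            vec.update(tokens[j] for j in range(lo, hi) if j != i)
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def usage_similarity(target, corpus_by_period, window=2):
    """Cosine similarity of the target's contexts between consecutive periods."""
    periods = sorted(corpus_by_period)
    vecs = {p: context_vector(target, corpus_by_period[p], window) for p in periods}
    return {(a, b): cosine(vecs[a], vecs[b]) for a, b in zip(periods, periods[1:])}


# Invented toy data: three consecutive periods, labelled 1 to 3.
corpus_by_period = {
    1: ["farmers broadcast seed over fields".split(),
        "they broadcast seed by hand".split()],
    2: ["farmers broadcast seed in spring".split()],
    3: ["stations broadcast news by radio".split(),
        "they broadcast music on radio".split()],
}

similarities = usage_similarity("broadcast", corpus_by_period)
for (a, b), sim in similarities.items():
    print(f"period {a} -> {b}: {sim:.2f}")
```

A drop in similarity between two consecutive periods only indicates that the contexts of the word have shifted. Deciding whether that reflects a genuine change in meaning, rather than a change in genre or topic across documents, still requires linguistic analysis.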