Computational Pragmatics (CompPrag2016)

Workshop at the 38th Annual Conference of the German Linguistics Society (DGfS) in Konstanz, February 24-26, 2016.

Computational pragmatics can be understood in two different senses. First, it can be seen as a subfield of computational linguistics, where it has a long tradition; phenomena addressed in this tradition include computational models of implicature, dialogue act planning, discourse structuring and coreference resolution (Bunt & Black 2000, among others). Second, it can refer to a rapidly growing field at the interface between linguistics, cognitive science and artificial intelligence. An example is the rational speech act model (Frank & Goodman 2012), which uses Bayesian methods to model cognitive aspects of the interpretation of sentence fragments and implicatures. Computational pragmatics is of growing interest to linguistic pragmatics, first, due to the availability of theories that are precise enough to form the basis of NLP systems (e.g. game-theoretic pragmatics, SDRT, RST), and second, due to the additional opportunities that computational pragmatics provides for advanced experimental testing of pragmatic theories. As such, it enhances theoretical, experimental and corpus-based approaches to pragmatics.

In this workshop, we want to bring together researchers from both branches of computational linguistics, as well as linguists with an interest in formal approaches to pragmatics. Topics of the workshop include, but are not limited to, the following issues:

  • implicature calculation and its implementation in NLP systems: interaction with information structure, discourse relations, dialogue goals etc.
  • computational models of experimental results and computational systems as a means for experimental research
  • corpus annotation of pragmatic phenomena

Organizers

References

  1. Bunt, H. & Black, W. 2000. The ABC of Computational Pragmatics. In: Bunt, H. & Black, W. (eds.), Abduction, Belief and Context in Dialogue: Studies in Computational Pragmatics, 1–46.
  2. Frank, M. C., & Goodman, N. D. (2012). Predicting pragmatic reasoning in language games. Science, 336(6084), 998.

Program

Wednesday, 24.2.

14:00 - 15:00

As a sequel to Bunt and Black (2000), which presented a characterization of the field of computational pragmatics and a survey of its main issues, this paper discusses some of the most interesting developments in the field in the last 15 years. Current research is dependent on large-scale annotated corpora. The paper includes an overview of such corpora and accompanying software tools. Of the pragmatic phenomena that have received attention in such corpora, the use of dialogue acts in spoken interaction stands out. Dialogue acts, which have become popular for modeling the use of language as the performance of actions in context, are realized by ‘functional segments’ of communicative behavior; these may be discontinuous, may overlap, and may contain parts contributed by different speakers.
Based on the DIT++ taxonomy of dialogue acts, the ISO 24617-2 standard for dialogue act annotation has been defined, including the Dialogue Act Markup Language DiAML, which supports the annotation of functional segments with multiple communicative functions, type of semantic content, speaker and addressee(s), functional and feedback dependences, pragmatic qualifiers, and rhetorical relations. The context-update semantics of DiAML accounts for inference relations among dialogue acts.
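As a concrete illustration, the following minimal sketch (our own, in Python rather than DiAML's XML serialization; attribute names loosely follow published ISO 24617-2 examples and are not taken from this abstract) shows the kind of information bundle that such an annotation attaches to a functional segment:

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class DialogueAct:
        act_id: str                     # e.g. "da1"
        segment: str                    # the functional segment (may be discontinuous)
        sender: str
        addressees: List[str]
        communicative_function: str     # e.g. "inform", "answer", "autoPositive"
        dimension: str                  # dimension of the semantic content, e.g. "task"
        functional_dependence: Optional[str] = None  # e.g. an answer points to its question
        feedback_dependence: Optional[str] = None
        qualifiers: List[str] = field(default_factory=list)  # e.g. "uncertain"

    # One functional segment can carry several dialogue acts at once
    # (multifunctionality), e.g. an answer that also gives positive feedback:
    segment_acts = [
        DialogueAct("da2", "yes, at nine", "p1", ["p2"], "answer", "task",
                    functional_dependence="da1"),
        DialogueAct("da3", "yes, at nine", "p1", ["p2"], "autoPositive",
                    "autoFeedback"),
    ]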
Computational pragmatics contributes to dealing with the fundamental challenge of pragmatics, to understand how language interacts with context, by providing computational models of interpretation, generation, inferencing and learning. What is still missing, however, is the use of powerful context models. Much of the work that takes context information into account considers only the linguistic context, i.e. the preceding discourse. This is virtually the only kind of context information that is available in corpora, and therefore available for applying machine learning techniques; as a result, only a fraction of the relevant context information is taken into consideration. Ideally, dialogue and discourse corpora should include information from richer context models, including e.g. speaker and hearer beliefs, mutual beliefs, communicative goals, multimodal perceptual information and social relations. Manually adding this information to corpora hardly seems feasible in view of its complexity; a challenge for computational pragmatics is therefore to develop new computational tools that make it feasible.

15:00 - 16:00

Mixed motives are a mixture of congruent, i.e. joint, motives and incongruent, partially conflicting motives of the interlocutors in a dialogue. Motives refer to objectives or situations that interlocutors would like to bring about, in the sense of a motivational state. We use the term mixed-motive dialogues for all gradations between collaborative dialogues with exclusively congruent interlocutor motives, e.g. when solving a PC problem together, and non-collaborative dialogues with purely incongruent motives, e.g. in a pro/contra debate. Adopting Schelling's idea of mixed-motive games, we consider these dialogues as situations in which participants face a conflict between their motives to cooperate and to compete with each other, e.g. in sales conversations, where bargainers have to make concessions to reach a compromise agreement but at the same time must compete to strike a good bargain. In everyday life, interlocutors are able to resolve this conflict between cooperation and competition with trade-offs between selfishness and fair play, creating dialogues that are perceived as fair.
Despite the ubiquity of mixed-motive dialogues in everyday life, this topic has received little attention in dialogue planning, in contrast to the well-studied collaborative and non-collaborative cases. Supporting this rarely considered dialogue type with dialogue systems in real-world environments therefore remains a challenge.
Our objective is to investigate dialogue systems that support mixed-motive dialogues between users and indirect, absent interlocutors, for instance between customers and retailers in sales conversations. Motives adopted on behalf of the indirect interlocutors, together with motives anticipated for the users, constitute the mixed motives that the dialogue system processes when generating answers to posed questions. Since it is impossible to completely satisfy all interlocutors' motives at every point of a mixed-motive dialogue, we draw on Herbert Simon's (1956) concept of satisficing to capture the idea of finding the best available alternative in the sense of sufficiently satisfying the motives of all interlocutors. The system therefore plans satisficing answers, which lead to mixed-motive dialogues that all interlocutors perceive as fair with regard to the absolute and relative satisfaction of their motives. Restricted to question-answering settings, our contribution is an approach to satisficing answer planning in mixed-motive QA dialogues by means of a game-theoretic equilibrium approach. Based on this approach, we implemented a text-based QA system that acts as a sales assistant in an online-shopping scenario. The validity of the approach was evaluated, with promising results, in an empirical end-user study (n=120) with the QA system.
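A minimal sketch of this selection idea, under our own simplifying assumptions (the abstract does not spell out the equilibrium computation; the candidate answers, satisfaction scores and threshold below are hypothetical), might look as follows:

    from typing import Dict, List, Optional

    def satisficing_answer(candidates: List[str],
                           satisfaction: Dict[str, Dict[str, float]],
                           threshold: float = 0.5) -> Optional[str]:
        """satisfaction[answer][interlocutor] is a motive-satisfaction score in [0, 1]."""
        # Keep answers that sufficiently satisfy every interlocutor's motives.
        ok = [a for a in candidates
              if all(s >= threshold for s in satisfaction[a].values())]
        if not ok:
            return None  # no satisficing answer exists
        # Among those, maximize the minimum satisfaction (a fairness criterion).
        return max(ok, key=lambda a: min(satisfaction[a].values()))

    answers = ["recommend product A", "recommend product B"]
    scores = {"recommend product A": {"customer": 0.7, "retailer": 0.6},
              "recommend product B": {"customer": 0.9, "retailer": 0.4}}
    print(satisficing_answer(answers, scores))  # -> "recommend product A"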

16:00 - 16:30

Coffee break

16:30 - 17:30

In this talk, we will report on our research on interactive natural language generation (NLG) in the context of situated dialogue systems. In situated communication, the meaning of a sentence is always relative to the environment in which it is uttered, requiring us to model both in parallel. Within this task, we are particularly interested in generating referring expressions (REs), i.e. noun phrases that effectively identify a given object within the scene. More specifically, our research on situated NLG focuses on generating instructions that help a human user solve a given task in a virtual 3D environment. This domain offers greatly reduced technical complexity, and correspondingly greater reliability, compared to situated communication in real-life environments. Furthermore, data collection and evaluation can be carried out with experimental subjects recruited over the Internet.
We will report on the GIVE Challenge, an NLG evaluation challenge organized by our group and built on top of this idea. Since 2009, we have developed a set of tools for recording, analyzing and modeling user behavior around this challenge scenario, and have collected hundreds of hours of interactions between NLG systems and experimental subjects, which we can use to train and evaluate our systems. Crowdsourcing, the practice of collecting data from participants all over the world, today allows us to test new hypotheses cheaply and efficiently.
Beyond this training and evaluation setting, we will also report on our work on the interactive, situated generation of REs. Generating REs, reacting to misunderstandings and establishing common ground are some of the pragmatic phenomena we must take into account. We have developed a data-driven approach that allows us to generate the "best" RE for any given situation. Unlike some earlier research, we take "best" to mean the RE that maximizes the chance that the listener will understand it correctly. We then exploit the interactivity of the environment by tracking the listener's behavior in the virtual environment in real time. We have implemented a system that automatically detects whether the listener has understood the RE correctly and generates corrective feedback if a misunderstanding has occurred.
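Schematically, the selection step can be sketched as follows; this is an illustration under our own assumptions, not the GIVE systems' code, and the toy comprehension model and scene are invented:

    from typing import Callable, List

    def best_re(candidate_res: List[str],
                target: str,
                scene: dict,
                p_understood: Callable[[str, str, dict], float]) -> str:
        """p_understood is a trained model of P(listener resolves RE to target | RE, scene)."""
        return max(candidate_res, key=lambda re: p_understood(re, target, scene))

    # Hypothetical toy model: a locative modifier helps when a distractor
    # shares the target's colour.
    def toy_model(re: str, target: str, scene: dict) -> float:
        distractors = [o for o in scene["objects"] if o != target]
        ambiguous = any(scene["colour"][d] == scene["colour"][target]
                        for d in distractors)
        return 0.9 if ("left" in re or not ambiguous) else 0.4

    scene = {"objects": ["b1", "b2"], "colour": {"b1": "blue", "b2": "blue"}}
    print(best_re(["the blue button", "the blue button on the left"],
                  "b1", scene, toy_model))  # -> "the blue button on the left"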

17:30 - 18:30

In this talk I will, firstly, summarise the state of the art of the Generation of Referring Expressions, viewed as the construction of computational models of human reference production; in this first part of the talk, I will ask what algorithms in this area are able to do well and what it is that they still struggle to do. In the second part of the talk, I will argue that the most difficult problems for the Generation of Referring Expressions arise from situations in which reference is something other than the "simple" identification of a referent by means of knowledge that the speaker shares with the hearer; I will give examples of these epistemically problematic situations and of the generation algorithms that try to address them. The talk offers a sneak preview of my book "Computational Models of Referring: A Study in Cognitive Science", which is soon to appear with MIT Press.

Thursday, 25.2.

09:00 - 10:00

Consider the problem of generating and interpreting non-literal utterances in the context of (1), where "Rewe" and "Edeka" are supermarkets.

    • Q: Does Rewe sell turnips?
    • A:
      a. Edeka sells turnips.
      b. ?Rewe sells carrots.
      c. #Rewe sells soap.
Intuitively, (1-a) is licensed by the presumption that the questioner/hearer wants to buy turnips, and conveying that Edeka sells them would be helpful in accomplishing this goal. But why wouldn't the hearer have simply asked "Where can I get some turnips?"? A strategy of answering that wh-question by breaking it down into yes/no sub-questions (see Büring 2003) makes sense if two conditions are met. First, the questioner expects the answerer to supply a single candidate store rather than an exhaustive list. Second, the questioner has a preferred outcome: perhaps for reasons of convenience or price, he/she would rather go to Rewe for turnips. Asking about Rewe first avoids an outcome where the questioner is led to a sub-optimal supermarket. In this case, a helpful answerer does well to supply the alternative in (1-a), but only if Rewe does not sell turnips. With this in mind, the hearer will draw the implicature from (1-a) that Rewe does not sell turnips. A similar implicature can be drawn from (1-b), but one has the intuition that (1-a) is a better answer than (1-b). And more clearly, (1-c) is downright infelicitous. This should fall out as a direct consequence of how (un)likely it is that the supplied alternative helps accomplish the questioner's goal.
Recently, game theory has proven to be a useful formal tool for modeling reasoning of this kind, and it has begun to be applied to problems of language generation in a computational setting (Stevens et al., 2015). We propose a framework for developing methods that solve generation and interpretation tasks in parallel using iterated game-theoretic reasoning over algorithms. A discourse situation is modeled as a cooperative Bayesian game between two interlocutors, taking into account their conversational and domain-level goals. The strategies are algorithms for generating and for interpreting/reacting to propositions. Starting with a principled default speaker strategy, the algorithms are iteratively refined to better achieve the players' goals until a fixed point is reached, à la Franke (2009). Pragmatic inferences are made on the basis of conditions on algorithm outputs. We illustrate our approach by applying it to (1).
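To give a flavor of such iterated reasoning, the following toy iterated-best-response loop in the spirit of Franke (2009) derives the implicature from (1-a); the two-state signaling game is our own drastic simplification of (1), iterating over interpretation tables rather than full generation algorithms:

    # States: whether Rewe has turnips; messages: the two candidate answers.
    states = ["rewe_has", "rewe_lacks"]
    messages = ["rewe_sells", "edeka_sells"]
    # Toy literal semantics: "edeka_sells" is assumed true in both states.
    semantics = {"rewe_sells": {"rewe_has"},
                 "edeka_sells": {"rewe_has", "rewe_lacks"}}

    def best_listener(speaker):
        # Interpret m as the states it is sent in (fall back to literal meaning).
        return {m: ({s for s in states if speaker[s] == m} or semantics[m])
                for m in messages}

    def best_speaker(hearer):
        # In each state, send the true message with the most specific interpretation.
        return {s: min((m for m in messages if s in semantics[m]),
                       key=lambda m: len(hearer[m]))
                for s in states}

    listener = {m: set(sem) for m, sem in semantics.items()}  # level 0: literal
    for _ in range(5):                                        # iterate to a fixed point
        speaker = best_speaker(listener)
        listener = best_listener(speaker)

    # "edeka_sells" is now interpreted as {"rewe_lacks"}: the implicature from (1-a).
    print(speaker, listener)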

10:00 - 11:00

The question of whether and when pragmatic enrichments, like scalar implicatures, can occur in non-matrix position is crucial for understanding pragmatic inferences and processing in general. Here, we would like to address the associated disambiguation problem (cf. Chemla and Singh, 2014): any theory of implicature-like meaning enrichments should ideally specify, for any sentence-context pair, which candidate readings are preferred, and to what extent even dispreferred readings may be selected.
With this goal in mind, we turn to probabilistic computational pragmatics, which aims to bridge classical formal pragmatic theory and the demands of empirical data analysis. In particular, we look at a joint-inference model in which the listener infers not only the most likely world state that could have triggered the speaker's utterance, but also the speaker's intended meaning, modeled here as a topic proposition (a special kind of question under discussion). In keeping with previous probabilistic pragmatics models that build on Frank and Goodman (2012)'s rational speech act model, we define a chain of a naive listener R0, a Gricean speaker S1 and a pragmatic interpreter R2, where each component builds on the previous one. The main innovation of this model is that the speaker's choice of utterance depends on a choice of topic proposition, which in turn depends on the actual world state. Speakers are assumed to select topic propositions probabilistically, so that more informative (surprising) propositions are more likely to be selected. Utterances should then make the to-be-communicated topic proposition likely, given conventional semantic meaning. Listeners then jointly infer the world state and the topic proposition on the basis of the utterance.
We show how this joint-inference model makes appealing predictions about complex sentences with scalar implicature triggers in line with recent empirical data about preference in disambiguation (Franke et al., 2015). We also argue that the joint-inference model offers many possibilities for linking model predictions to experimental conditions.
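For reference, the underlying rational-speech-act chain, without the topic-proposition layer that constitutes the abstract's innovation, can be written down in a few lines; the lexicon, priors and rationality parameter below are standard toy choices, not the paper's model:

    import math

    states = ["none", "some_not_all", "all"]
    utterances = ["some", "all"]
    truth = {"some": {"some_not_all", "all"}, "all": {"all"}}
    prior = {s: 1 / 3 for s in states}   # flat world-state prior (toy choice)
    alpha = 4.0                          # speaker rationality parameter

    def normalize(d):
        z = sum(d.values())
        return {k: v / z for k, v in d.items()}

    def R0(u):   # naive listener: condition the prior on literal truth
        return normalize({s: prior[s] * (s in truth[u]) for s in states})

    def S1(s):   # Gricean speaker: soft-max of informativity w.r.t. R0
        return normalize({u: math.exp(alpha * math.log(R0(u)[s] or 1e-10))
                          for u in utterances})

    def R2(u):   # pragmatic interpreter: Bayesian inversion of S1
        return normalize({s: prior[s] * S1(s)[u] for s in states})

    print(R2("some"))   # mass shifts to "some_not_all": the scalar implicature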


References
Chemla, Emmanuel and Raj Singh (2014). “Remarks on the Experimental Turn in the Study of Scalar Implicature (Part I & II)”. In: Language and Linguistics Compass 8.9, pp. 373–386, 387–399.
Frank, Michael C. and Noah D. Goodman (2012). “Predicting Pragmatic Reasoning in Language Games”. In: Science 336.6084, p. 998.
Franke, Michael et al. (2015). “Embedded Scalars, Preferred Readings and Intonation: An Experimental Revisit”. Under review, Journal of Semantics.

11:00 - 11:30

Coffee break

11:30 - 12:30

Probabilistic models of human cognition have been widely successful at capturing the ways that people represent and reason with uncertain knowledge. The Rational Speech Act framework uses probabilistic modeling tools to formalize natural language understanding as social reasoning: literal sentence meaning arises through probabilistic conditioning, and pragmatic enrichment is the result of listeners reasoning about cooperative speakers. I will consider how this framework provides a theory of the role of context in language understanding. In particular, I will show that when uncertainty about the speaker is included in the pragmatic inference, several of the most subtle aspects of language emerge: vagueness (in scalar adjectives and generics) and presupposition accommodation.

Friday, 26.2.

11:30 - 12:00

We present an extension of LFG’s Abstract Knowledge Representation (AKR) component that integrates a model of the Common Ground (CG) and allows for the calculation of pragmatic inferences. The system uses a rule set based on Gunlogson’s (2002) discourse model. We illustrate our implementation with respect to a set of German discourse particles. These particles arguably contribute information that is pertinent to the CG (e.g., Zimmermann 2011).
Our pragmatic parser for dialogues uses the existing AKR framework built on top of LFG’s syntactic architecture (e.g., Bobrow et al. (2007) and Crouch & King (2006)) within the XLE grammar development platform. The platform integrates an XFR rewriting system that allows for packed rewriting of XLE’s syntactic output. It produces semantic representations that allow for Entailment and Contradiction Detection (ECD; Bobrow et al. 2007). We extend this component to produce a semantic/pragmatic representation that is dynamically updatable for pragmatic reasoning. On the one hand, our system enriches AKRs with pragmatically relevant information, e.g. speaker, speech time and the state of information in the discourse. On the other hand, we modify the ECD system such that it determines discourse moves (conversational actions) and accordingly updates the AKR that represents the discourse.
To illustrate the system, we use German discourse particles to demonstrate how grammatical information interacts with pragmatic information. Concretely, our pragmatic parser interprets the meaning that the German discourse particles ja, doch and wohl add to utterances in discourse-like structures. We show how dynamic pragmatic inferencing takes place within the AKR system on the basis of the information contributed by the particles.
In sum, we present an extension of a meaning component that has been used for information retrieval and reasoning in a question-answering system. Our extension provides a model of the CG and allows for dynamic reasoning about the information in the CG. Furthermore, our system provides a treatment of German discourse particles that is computationally elegant and linguistically well motivated.
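To give an impression of the kind of rule such a system encodes (this is our own schematic rendering, not the XLE/XFR implementation), common-ground updates for the three particles might be sketched as follows, with the rule content simplified from the descriptive literature (e.g., Zimmermann 2011):

    from dataclasses import dataclass, field
    from typing import Optional, Set

    @dataclass
    class CommonGround:
        shared: Set[str] = field(default_factory=set)        # mutually accepted propositions
        speaker_only: Set[str] = field(default_factory=set)  # weakened speaker commitments

    def assert_with_particle(cg: CommonGround, p: str, particle: Optional[str]):
        if particle == "ja":       # p is presented as uncontroversial / already given
            cg.shared.add(p)
        elif particle == "doch":   # p is reasserted against a contextually salient not-p
            cg.shared.discard("not(" + p + ")")
            cg.shared.add(p)
        elif particle == "wohl":   # weakened commitment: p not added to the CG proper
            cg.speaker_only.add("probably(" + p + ")")
        else:                      # plain assertion: propose p for the CG
            cg.shared.add(p)

    cg = CommonGround()
    assert_with_particle(cg, "rain", "wohl")  # "Es regnet wohl."
    assert_with_particle(cg, "cold", "ja")    # "Es ist ja kalt."
    print(cg.shared, cg.speaker_only)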


References:
Asher, N. & A. Lascarides. 2003. Logics of Conversation. Cambridge University Press.
Bobrow, D. G., B. Cheslow, C. Condoravdi, L. Karttunen, T. H. King, R. Nairn & A. Zaenen. 2007. PARC’s bridge and question answering system. GEAF 2007 Workshop.
Condoravdi, C., D. Crouch, R. Stolle, V. de Paiva & D. G. Bobrow. 2003. Entailment, intensionality and text understanding. Human Language Technology Conference.
Crouch, D. & T. Holloway King. 2006. Semantics via f-structure rewriting. LFG06.
Gunlogson, C. 2002. Declarative questions. SALT XII.
Zimmermann, M. 2011. Discourse particles. In P. Portner, C. Maienborn & K. von Heusinger (eds.), Semantics (HSK 33.2).

12:00 - 12:30

The work in progress presented here is a contribution to argumentation mining in the German legal text domain. The focus of this abstract is the construction of a corpus of argumentative sequences and argumentation structures in German legal decisions, which will later provide models of these layers for conditional random field-based sequence labelling and tree kernels for structure classification. The most closely related works are Mochales and Moens (2011) and Stab et al. (2014). However, no corpus of German legal decisions is available so far, and building a gold-standard corpus of this genre will be an important addition to all related fields of research. The data collection has been compiled from a free online service and consists of 100 private-law decisions. For pre-processing, a genre-specific sentence tokenizer has been trained. The annotation framework chosen for the study is WebAnno (Yimam et al. 2013).
The study divides into two subtasks: the first is the labelling, at sentence level, of all argumentative sequences in the justification section of a decision document.
The second annotation task is to enrich each premise with structural information on its local argumentative elements at word-token level.
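For the first subtask, a minimal sketch under stated assumptions (the study names conditional random fields but neither a library nor a feature set; the sklearn-crfsuite package and the placeholder features below are our choices) could look like this:

    import sklearn_crfsuite

    def sentence_features(doc, i):
        sent = doc[i]
        return {
            "bias": 1.0,
            "n_tokens": len(sent.split()),
            "has_legal_cue": any(w in sent.lower()
                                 for w in ("gemäß", "daher", "mithin")),
            "position": i / len(doc),
        }

    # One training document = a sequence of sentences with BIO labels.
    docs = [["Der Kläger begehrt Schadensersatz.",
             "Gemäß § 823 BGB ist daher zu haften.",
             "Mithin war der Klage stattzugeben."]]
    labels = [["O", "B-ARG", "I-ARG"]]

    X = [[sentence_features(d, i) for i in range(len(d))] for d in docs]
    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
    crf.fit(X, labels)
    print(crf.predict(X))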
Besides being part of the argumentation mining study, the corpus will deliver valuable information for discourse related studies in the German legal domain and can contribute to comparative studies among different argumentative text genres.


References
R. Mochales and M. F. Moens. Argumentation mining. Artificial Intelligence and Law, 19(1):1–22, 2011.
C. Stab, C. Kirschner, J. Eckle-Kohler, and I. Gurevych. Argumentation mining in persuasive essays and scientific articles from the discourse structure perspective. Frontiers and Connections between Argumentation Theory and Natural Language Processing, Bertinoro, Italy, 2014.
S. M. Yimam, I. Gurevych, R. E. de Castilho, and C. Biemann. WebAnno: A flexible, web-based and visually supported system for distributed annotations. In ACL (Conference System Demonstrations), pages 1–6, 2013.

12:30 - 13:00

Every morning, while reading the newspaper, we are confronted with many different kinds of omission, which we often do not even notice. We read headlines such as:

    (1) a. Größte Dürre seit einem halben Jahrhundert ('Biggest drought in half a century')
        b. Kampfjet in Bayern abgestürzt ('Fighter jet crashed in Bavaria') (zeit.de)

In the first case, we take a (definite or indefinite) article to be missing, as an article is obligatory before singular nouns in German. In (1b), the structure additionally lacks a copula verb, as in "(Ein) Kampfjet ist in Bayern abgestürzt".
These kinds of ellipsis are found not only in headlines. We claim that text types can be profiled on the basis of their distribution of ellipses. To this end, we built a corpus containing more than 10 different text types (spoken and written language) in order to compare the patterns. A major challenge was the reliable detection and annotation of the missing elements: how can we find the missing article (<art>) more or less automatically?
    (2) a. Geh <art> Schritt zurück! ('Take <art> step back!')
           [pos="VVIMP"] . #a:[pos!="ART"] & #a . [pos="NN"]
        b. Geh <art> großen Schritt zurück! ('Take <art> big step back!')

Since the STTS tagset does not distinguish between singular and plural nouns, the query for patterns like (2a) yields no satisfying output, and cases like (2b) complicate the task further. We therefore carried out the annotation by hand in order to obtain reliable annotations.
Secondly, I want to discuss the different forms of article omission in the light of Information Theory. A central claim is that such omissions are a way to "densify" an utterance in order to reduce redundancy. For this purpose, we train language models on different text types and calculate Information Density (ID, i.e. -log2 P(w|c)), as in (3) for headlines.
    1. "Die Stadt hat wenig Chancen'' (Total #: 43)
      -log2 P(Die | <s>) = -log2 P(0.040307) = 4.6328
      -log2 P(Stadt | Die) = -log2 P(0.00440483) = 7.7953
    2. "Stadt droht durch Erosion unterzugehen'' (Total #: 73)
      -log2 P(Stadt | <s>) = -log2 P(0.000309407) = 11.6582

In (3), the ID of the noun without a preceding article (11.6582) is higher, and the utterance hence much more "dense", than when the article is realized (7.7953); the ID of the article itself is quite low (4.6328). Thus, one puzzle I want to address in the talk is why the article is sometimes realized and sometimes omitted. Furthermore, what role does the Uniform Information Density hypothesis (e.g. Jaeger 2010) play here? A further aim of this (ongoing) work is to compare the values across different text types: the claim that each text type has a certain "profile" should be reflected in different probability values and hence different ID profiles. The aim is to present first profiles and to discuss further possibilities in CompPrag, since other types of ellipsis are waiting to be analysed.
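The ID computation itself is straightforward; here is a sketch with a maximum-likelihood bigram model (the abstract does not specify the LM toolkit or smoothing, and the counts below are toy values, not the study's data):

    import math
    from collections import Counter

    # Toy counts (invented for illustration): frequencies of bigrams and of
    # their first words in some headline corpus.
    bigram = Counter({("<s>", "Die"): 40, ("Die", "Stadt"): 4, ("<s>", "Stadt"): 1})
    unigram = Counter({"<s>": 1000, "Die": 900})

    def information_density(w, context):
        """ID(w | context) = -log2 P(w | context), estimated from counts."""
        return -math.log2(bigram[(context, w)] / unigram[context])

    print(information_density("Die", "<s>"))    # low ID: the article is highly predictable
    print(information_density("Stadt", "<s>"))  # high ID: the article-less noun carries more information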

13:00 - 13:30

In this paper, we use the phenomenon of 'embarrassed laughter' as a case study for one approach to corpus pragmatics. We construct a set of interlinked ontologies by comparing the transcription practice of various collections of data as summarised by Hepburn and Varney (2013), making explicit the implicit knowledge underlying those transcription practices about the characteristics of laughter that have been treated as interactionally relevant. These ontologies allow us to see the essentially combinatorial nature of certain pragmatic phenomena and therefore also allow us to develop strategies for searching for relevant data. We then illustrate how such search strategies can work with the example of 'embarrassed laughter'. Such laughter often occurs early in an interaction (especially in first encounters) and following long pauses. We can therefore establish a set of search criteria (laughter AND (start of interaction OR long pause)) to try to find possible instances of this phenomenon in varied collections of data such as those which form part of the Australian National Corpus. Our approach acknowledges the complexity of the factors that may be relevant to the identification of any pragmatic phenomenon without relying on the prior identification of instances in any specific dataset, and it can generate candidate sets of examples across varied datasets while relying on features that are annotated in standard practice. We suggest that looking for clusters of features which characterize pragmatic phenomena, and organizing our knowledge of these features with ontologies, constitutes a very promising approach in the field of corpus pragmatics.
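As a sketch of how such criteria translate into a concrete filter (the attribute names below are hypothetical stand-ins for whatever a given corpus's annotation layers provide):

    from typing import Dict, List

    def candidate_embarrassed_laughter(utterances: List[Dict],
                                       early_cutoff: float = 30.0,  # seconds into the interaction
                                       long_pause: float = 1.0):    # minimum pause length in seconds
        """Implements: laughter AND (start of interaction OR long pause)."""
        return [u for u in utterances
                if u["is_laughter"]
                and (u["time_offset"] < early_cutoff
                     or u["preceding_pause"] >= long_pause)]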


Reference:
Hepburn, Alexa & Scott Varney. 2013. Beyond ((Laughter)): Some Notes on Transcription. In Phillip Glenn & Elizabeth Holt (eds.), Studies of Laughter in Interaction, 25–38. London: Bloomsbury Academic.
http://www.bloomsburycollections.com/book/studies-of-laughter-in-interaction (16 August, 2015).

13:30 - 14:00

Concluding remarks