Discourse analysis and annotation for contrastive linguistics and translation studies.
Ekaterina Lapshinova-Koltunski (Universität des Saarlandes)
In this talk, I will report on ongoing work on discourse analysis in a multilingual context. I will present two approaches to the analysis of coreference and discourse-related phenomena: (1) top-down, or theory-driven: here we start from linguistic knowledge derived from existing frameworks, define the linguistic categories to analyse, and create an annotated corpus that can be used either for further linguistic analysis or as training data for NLP applications; (2) bottom-up, or data-driven: here we start from a set of shallow features that we believe are discourse-related, extract these structures from a large amount of data, and analyse them from a linguistic point of view, trying to describe and explain the observed phenomena in terms of existing theories and grammars.