Towards Multilingual Coreference Resolution
|Other Titles:||Multilinguale Koreferenz-Resolution|
|Authors:||Zhekova, Desislava|
|Supervisor:||Kübler, Sandra|
|1. Expert:||Bateman, John, PhD|
|2. Expert:||Kübler, Sandra|
|Abstract:||
The current work investigates the problems that arise when coreference resolution is treated as a multilingual task. We assess the issues encountered when using a framework based on the mention-pair coreference resolution model, with memory-based learning for the resolution process. Along the way, we revise three essential subtasks of coreference resolution: mention detection, mention head detection and feature selection. For each of these subtasks we propose various multilingual solutions, including heuristic, rule-based and machine learning methods. We carry out a detailed analysis covering eight languages (Arabic, Catalan, Chinese, Dutch, English, German, Italian and Spanish), for which datasets were provided by the only two multilingual shared tasks on coreference resolution held so far: SemEval-2 and CoNLL-2012. Our investigation shows that, although complex, the coreference resolution task can be approached in a multilingual and even language-independent way. We propose machine learning methods for each of the subtasks affected by the transition to a multilingual setting, evaluate them, and compare their performance to that of rule-based and heuristic approaches. Our results confirm that machine learning provides the flexibility needed for the multilingual task and that the minimal requirement for a language-independent system is a part-of-speech annotation layer for each of the target languages. We also show that the performance of the system can be improved by introducing further layers of linguistic annotation, such as syntactic parses (either constituency or dependency parses), named entity information and predicate-argument structure. Additionally, we discuss the problems that occur in the proposed approaches and suggest possibilities for their improvement.
|Keywords:||Coreference Resolution, Anaphora, Machine Learning, Natural Language Processing|
|Issue Date:||20-Dec-2013|
|URN:||urn:nbn:de:gbv:46-00103541-11|
|Institution:||Universität Bremen|
|Faculty:||FB10 Sprach- und Literaturwissenschaften|
|Appears in Collections:||Dissertationen|
Items in Media are protected by copyright, with all rights reserved, unless otherwise indicated.