Citation link (URN)
https://nbn-resolving.de/urn:nbn:de:gbv:46-00103541-11

Towards Multilingual Coreference Resolution

Publication date
2013-12-20
Authors
Zhekova, Desislava
Supervisor
Kübler, Sandra
Reviewer
Kübler, Sandra
Abstract
This work investigates the problems that arise when coreference resolution is treated as a multilingual task. We assess the issues encountered by a framework that uses the mention-pair coreference resolution model with memory-based learning for the resolution process. Along the way, we revisit three essential subtasks of coreference resolution: mention detection, mention head detection, and feature selection. For each of these subtasks we propose multilingual solutions, including heuristic, rule-based, and machine learning methods. We carry out a detailed analysis covering eight languages (Arabic, Catalan, Chinese, Dutch, English, German, Italian, and Spanish), for which datasets were provided by the only two multilingual shared tasks on coreference resolution held so far: SemEval-2 and CoNLL-2012. Our investigation shows that, although complex, the coreference resolution task can be approached in a multilingual and even language-independent way. We propose machine learning methods for each of the subtasks affected by the transition to a multilingual setting, evaluate them, and compare them with rule-based and heuristic approaches. Our results confirm that machine learning provides the flexibility needed for the multilingual task and that the minimal requirement for a language-independent system is a part-of-speech annotation layer for each language addressed. We also show that system performance can be improved by adding further layers of linguistic annotation, such as syntactic parses (constituency or dependency), named entity information, and predicate-argument structure. Additionally, we discuss the problems that occur in the proposed approaches and suggest ways to improve them.
Keywords
Coreference Resolution; Anaphora; Machine Learning; Natural Language Processing
Institution
Universität Bremen
Department
Fachbereich 10: Sprach- und Literaturwissenschaften (FB 10)
Document type
Dissertation
Secondary publication
No
Language
English
Files
Name
00103541-1.pdf
Size
4.13 MB
Format
Adobe PDF
Checksum
MD5: 53bb96ce94d1376f62fa240a4e9f2515
