
Citation link: http://nbn-resolving.de/urn:nbn:de:gbv:46-00103541-11
00103541-1.pdf
OpenAccess
 

Towards Multilingual Coreference Resolution


File: 00103541-1.pdf
Size: 4.23 MB
Format: Adobe PDF
Other Titles: Multilinguale Koreferenz-Resolution
Authors: Zhekova, Desislava 
Supervisor: Kübler, Sandra 
1. Expert: Bateman, John, PhD 
2. Expert: Kübler, Sandra 
Abstract: 
This work investigates the problems that occur when coreference resolution is treated as a multilingual task. We assess the issues that arise when a framework based on the mention-pair coreference resolution model, with memory-based learning for the resolution process, is used. Along the way, we revisit three essential subtasks of coreference resolution: mention detection, mention head detection and feature selection. For each of these aspects, we propose various multilingual solutions, including heuristic, rule-based and machine learning methods. We carry out a detailed analysis that includes eight different languages (Arabic, Catalan, Chinese, Dutch, English, German, Italian and Spanish) for which datasets were provided by the only two multilingual shared tasks on coreference resolution held so far: SemEval-2 and CoNLL-2012. Our investigation shows that, although complex, the coreference resolution task can be targeted in a multilingual and even language-independent way. We propose machine learning methods for each of the subtasks affected by the transition to a multilingual setting, evaluate them, and compare their performance to that of rule-based and heuristic approaches. Our results confirm that machine learning provides the needed flexibility for the multilingual task and that the minimal requirement for a language-independent system is a part-of-speech annotation layer for each of the languages addressed. We also show that system performance can be improved by introducing further layers of linguistic annotation, such as syntactic parses (either constituency or dependency parses), named entity information and predicate-argument structure. Finally, we discuss the problems that occur in the proposed approaches and suggest possible improvements.
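Illustration (not part of the dissertation): the abstract names the mention-pair model and memory-based learning without spelling them out, so the following is a minimal sketch of the general idea only. It assumes that mentions are already detected, that each mention has a head word and a part-of-speech tag, and that a plain nearest-neighbour lookup with an overlap distance stands in for a memory-based learner. The toy mentions, the feature set and all function names are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy mentions with gold entity ids (assumed data, for illustration).
mentions = [
    {"index": 0, "head": "Merkel", "pos": "NNP", "entity": 1},
    {"index": 1, "head": "Merkel", "pos": "NNP", "entity": 1},
    {"index": 2, "head": "Paris", "pos": "NNP", "entity": 2},
]


def pair_features(antecedent, anaphor):
    """Symbolic features for one mention pair (an illustrative feature set)."""
    return (
        antecedent["head"].lower() == anaphor["head"].lower(),  # head match
        antecedent["pos"],                                      # POS of antecedent head
        anaphor["pos"],                                         # POS of anaphor head
        min(anaphor["index"] - antecedent["index"], 3),         # capped mention distance
    )


def make_instances(mention_list):
    """Mention-pair model: one labelled instance per ordered pair of mentions."""
    return [
        (pair_features(a, b), a["entity"] == b["entity"])
        for a, b in combinations(mention_list, 2)
    ]


def overlap_distance(x, y):
    """Number of feature positions on which two instances disagree."""
    return sum(1 for p, q in zip(x, y) if p != q)


def knn_predict(memory, features, k=1):
    """Classify a pair by majority vote among the k nearest stored instances."""
    nearest = sorted(memory, key=lambda inst: overlap_distance(inst[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# "Training" in this setting just means storing the labelled pairs in memory.
memory = make_instances(mentions)
```

Calling `knn_predict(memory, pair_features(m1, m2))` for two unseen mentions returns `True` when the nearest stored pairs are coreferent and `False` otherwise. Because the only annotation this sketch relies on is a part-of-speech tag per head word, it loosely mirrors the minimal requirement stated in the abstract.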
Keywords: Coreference Resolution, Anaphora, Machine Learning, Natural Language Processing
Issue Date: 20-Dec-2013
Type: Dissertation
URN: urn:nbn:de:gbv:46-00103541-11
Institution: Universität Bremen 
Faculty: FB10 Sprach- und Literaturwissenschaften 
Appears in Collections: Dissertationen

  




Items in Media are protected by copyright, with all rights reserved, unless otherwise indicated.
