Citation link: https://doi.org/10.26092/elib/3755
Access: Open Access
License: CC BY-NC-ND 4.0

Text Mining and Document Classification Workflows for Chinese Administrative Documents


File: WeSIS_Technical_Papers_No 17 (1).pdf
Size: 6.08 MB
Format: Adobe PDF
Authors: Müller, Armin 
Publisher: SFB Globale Entwicklungsdynamiken von Sozialpolitik (SFB 1342) 
Abstract: 
Background: The political system of the People’s Republic of China combines political centralization with administrative decentralization, which makes it one of the most decentralized political systems in the world. Social insurance illustrates this: the national level enacts general laws and regulations, which provincial governments further specify at the first sub-national level. Social insurance systems such as health insurance or unemployment insurance, however, are typically organized at the second or third sub-national level. The governments and administrations of prefectural cities and counties pool the funds within their jurisdictions and enact the regulations that ultimately determine inclusiveness and the scope of benefits.

Aim: This paper presents approaches for reconstructing regulatory differences at the sub-national level and for leveraging the results in quantitative and qualitative analysis. It introduces the ongoing document analysis work in project B05 of the CRC 1342 in Bremen. It also enables researchers in social-scientific China studies to sort large amounts of regulatory documents by relevance and to connect regulatory data to survey data or sub-national time series.

Content: This technical paper presents, step by step, the creation of a database to organize the documents, along with two workflows for extracting information for qualitative and quantitative analysis. The two workflows do not exhaust the possibilities of the approach; they are examples used in ongoing publication projects. A complementary GitHub repository provides the code files needed for implementation.

Complementary GitHub repository: https://github.com/arminmueller81/health_insurance_coverage
Keywords: text as data; text classification; machine learning; neural networks; China; administrative documents; legislation
Issue Date: Mar-2025
Project: SFB Globale Entwicklungsdynamiken von Sozialpolitik (SFB 1342) 
Funders: Deutsche Forschungsgemeinschaft (DFG)
Grant number: 374666841
Series: Wesis - technical papers 
Volume: 17
Type: Report
Secondary publication: no
DOI: 10.26092/elib/3755
URN: urn:nbn:de:gbv:46-elib88742
Institution: Universität Bremen 
Faculty: Zentrale Wissenschaftliche Einrichtungen und Kooperationen 
Institute: SFB Globale Entwicklungsdynamiken von Sozialpolitik (SFB 1342) 
Appears in Collections: Forschungsdokumente

  




This item is licensed under a Creative Commons License.
