Citation link (DOI)
10.26092/elib/3755

Text Mining and Document Classification Workflows for Chinese Administrative Documents

Publication date
2025-03
Authors
Müller, Armin
Abstract
Background: The political system of the People’s Republic of China combines political centralization with administrative decentralization, which makes it one of the most decentralized political systems in the world. Social insurance illustrates this: the national level enacts general laws and regulations, which provincial-level governments, the first sub-national level, then specify further. Social insurance systems such as health insurance or unemployment insurance, however, are typically organized at the second or third sub-national level. The governments and administrations of prefectural cities and counties pool the funds within their jurisdictions and enact the regulations that ultimately determine inclusiveness and the scope of benefits.

Aim: This paper presents approaches for reconstructing regulatory differences at the sub-national level and for leveraging the results in quantitative and qualitative analysis. It introduces the ongoing document analysis work in project B05 of the CRC 1342 in Bremen. It also enables researchers in social-scientific China studies to sort large amounts of regulatory documents by relevance and to connect regulatory data to survey data or sub-national time series.

Content: This technical paper presents, step by step, the creation of a database to organize the documents, and two workflows for extracting information for qualitative and quantitative analysis. The two workflows do not exhaust the possibilities of the approach; they are examples used in ongoing publication projects. A complementary GitHub repository provides the code files needed for implementation.

Complementary GitHub repository: https://github.com/arminmueller81/health_insurance_coverage
Keywords
text as data; text classification; machine learning; neural networks; China; administrative documents; legislation
Institution
Universität Bremen  
Faculty
Zentrale Wissenschaftliche Einrichtungen und Kooperationen  
Institute
SFB Globale Entwicklungsdynamiken von Sozialpolitik (SFB 1342)  
Document type
Report
Series
Wesis - technical papers  
Volume
17
Secondary publication
No
License
https://creativecommons.org/licenses/by-nc-nd/4.0/
Language
English
Files
Name: WeSIS_Technical_Papers_No 17 (1).pdf
Size: 5.94 MB
Format: Adobe PDF
Checksum (MD5): 8a919109d9af9979623a941707b20119
