Filling Data Gaps in the Measurement of Income Inequality. A Complete Dataset of National GINI Coefficients 1995-2019
Publication date
2026-03
Abstract
Income inequalities are a major societal challenge (Grusky 2018; Polacko 2021). Despite the criticism it has attracted, the Gini coefficient, especially when based on income, remains the most important indicator for measuring the extent and development of income inequality within a country. Unfortunately, Gini coefficients based on comparable methodologies are available only to a very limited extent. The most comprehensive dataset with consistent definitions for net income is the WIID Gini. With around 900 data points, it covers only about 22% of the 4,000 possible country-year combinations for the selected sample of 160 countries between 1995 and 2019.
We pursue two objectives: (1) to close existing data gaps through statistical imputation, thereby creating a consistent and plausible dataset of Gini coefficients from 1995 to 2019 for 160 countries with more than 1 million inhabitants, and (2) to identify the socioeconomic and political indicators that most strongly influence these imputations. To achieve this, missing values are estimated using a gradient boosting machine (GBM) that draws on over 1,400 socioeconomic and political indicators from the WeSIS database.
With this novel dataset, we enable researchers to broaden their inquiry into the causes and effects of socio-economic inequality on a previously unattainable scale.
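As a rough illustration of the imputation idea described in the abstract, the following minimal sketch fits a tiny gradient boosting model (regression stumps, squared-error loss) on country-years with an observed Gini value and predicts the missing ones. It is not the authors' implementation: the two indicator columns, all numbers, and the function names (`fit_gbm`, `impute_gini`) are hypothetical, and the actual study draws on over 1,400 WeSIS indicators.

```python
from statistics import mean

def fit_stump(X, residuals):
    """Pick the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = mean(left), mean(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    if best is None:  # no feature varies: fall back to a constant prediction
        m = mean(residuals)
        return (0, float("inf"), m, m)
    return best[1:]  # (feature index, threshold, left value, right value)

def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_gbm(X, y, n_rounds=200, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = mean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate

def predict_gbm(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)

def impute_gini(rows):
    """Train on observed country-years, then fill in the missing Gini values."""
    observed = [r for r in rows if r["gini"] is not None]
    model = fit_gbm([r["features"] for r in observed],
                    [r["gini"] for r in observed])
    completed = []
    for r in rows:
        missing = r["gini"] is None
        value = predict_gbm(model, r["features"]) if missing else r["gini"]
        completed.append({**r, "gini": value, "imputed": missing})
    return completed

# Invented toy panel: two hypothetical indicators per country-year.
rows = [
    {"country": "A", "year": 2000, "features": [0.30, 12.0], "gini": 45.0},
    {"country": "A", "year": 2001, "features": [0.31, 11.9], "gini": None},
    {"country": "A", "year": 2002, "features": [0.35, 11.0], "gini": 43.0},
    {"country": "B", "year": 2000, "features": [0.70, 4.0], "gini": 28.0},
    {"country": "B", "year": 2001, "features": [0.71, 4.02], "gini": None},
    {"country": "B", "year": 2002, "features": [0.75, 4.1], "gini": 27.0},
]
completed = impute_gini(rows)
for r in completed:
    print(r["country"], r["year"], round(r["gini"], 1), "imputed" if r["imputed"] else "observed")
```

Observed values pass through unchanged and only the gaps are filled, so each imputed entry stays distinguishable through the `imputed` flag. Because the predictions come from trees, they are piecewise constant: a missing country-year receives the value the model learned for observed country-years with similar indicator values.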
Keywords
socio-economic inequality; Gini coefficient; imputation; machine learning; gradient boosting; social policy
Institution
Document type
Book
Series
Volume
0024
Secondary publication
No
Language
English
Files
Name
WeSIS_Technical_Papers_No 24 (1).pdf
Size
7.7 MB
Format
Adobe PDF
Checksum
(MD5):1814819a13799aa2900ba1da6c154a5f
