DEFINITION OF INEQUALITY MEASURES AND THEIR VARIABILITY
19.4.1 Definition of the Dependent Variable
This section describes how the dependent variable—household income inequality—is measured in the empirical work under review. It is important to note right from the outset that in an overwhelming majority of cases researchers do not have full discretion over which inequality measure they will analyze or include in their models.
This is, in most cases, limited by the availability of data, especially in country-level comparisons based on secondary data. The variable list of the large international secondary data sets (such as WIID) heavily constrains the choice. The larger the data set in terms of country coverage, the more this is likely to be the case (because the possibility of having new, harmonized indicators diminishes with the size of the surveys). Only a few measures are usually available, of which the Gini coefficient is by far the most often used, followed by various quintile or decile share ratios (S80/S20 or S90/S10) and, sometimes, percentile ratios such as P90/P10 or ratios of some other percentile values.
None of the above-mentioned measures are overly sensitive to the tails of the income distribution, and therefore the analyses based on them may miss important changes within the distribution. This could partly be overcome by the use of more tail-sensitive measures such as D9/D5 ratios, generalized entropy-type measures of inequality (Theil, MLD), or Atkinson-class measures. However, it has also become important to pay attention to polarization measures that compare the values of a comparison distribution to the values of a reference distribution (Alderson and Doran, 2013; Handcock and Morris, 1999; Morris et al., 1994; Wolfson, 1997). The share of the population classified by cutpoints of the comparison distribution can show how it falls into similarly defined categories of the reference distribution, allowing us to compare the relative positions of people in various parts of the distribution.[265]
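The measures named above can be sketched in a few lines of Python. This is a minimal, unweighted illustration that assumes a plain vector of strictly positive equivalized incomes; the function names are hypothetical, and real survey work would need sampling weights and the design information discussed later in this section.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of an unweighted vector of non-negative incomes."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    # Rank-based formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

def share_ratio(incomes, top=0.2, bottom=0.2):
    """Income share of the richest `top` fraction divided by that of the
    poorest `bottom` fraction (S80/S20 with the default arguments)."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    k_top, k_bottom = int(round(n * top)), int(round(n * bottom))
    if k_top == 0 or k_bottom == 0:
        raise ValueError("sample too small for the requested shares")
    return x[n - k_top:].sum() / x[:k_bottom].sum()

def percentile_ratio(incomes, upper=90, lower=10):
    """Ratio of two percentile values (P90/P10 with the default arguments)."""
    x = np.asarray(incomes, dtype=float)
    return np.percentile(x, upper) / np.percentile(x, lower)

def mean_log_deviation(incomes):
    """MLD, a generalized entropy measure that is sensitive to the bottom tail."""
    x = np.asarray(incomes, dtype=float)
    return np.mean(np.log(x.mean() / x))

def theil(incomes):
    """Theil index, a generalized entropy measure that is more top-sensitive."""
    r = np.asarray(incomes, dtype=float) / np.mean(incomes)
    return np.mean(r * np.log(r))

# Toy data, not taken from any survey.
toy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(gini(toy), share_ratio(toy), percentile_ratio(toy),
      mean_log_deviation(toy), theil(toy))
```

For the toy vector the Gini is about 0.30 and the S80/S20 ratio is 19/3; all five measures return their equality value (0 or 1) when every income is the same.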
Studies investigating developments of tail-sensitive overall inequality measures or polarization measures, however, remain rare in the literature, because these measures are, unlike Gini coefficients, much less available for international comparisons.[266] On the other hand, using the Gini and other middle-sensitive measures also has advantages, especially when sampling variability due to small sample sizes is an issue.
Further, in some studies (for example, political science explanations or analyses of the effects of redistribution), the dependent variable is not the value of an inequality measure such as the Gini coefficient of net disposable incomes in itself, but the difference between the pre-tax and -transfer Gini on the one hand and the post-tax and -transfer Gini on the other. This difference serves as a measure of redistribution in many analytic papers (e.g., Bradley et al., 2003; Iversen and Soskice, 2006) and as a proxy for how politics and policies affect inequalities.
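The redistribution measure described here is a simple difference of two Gini coefficients. The sketch below assumes two unweighted income vectors for the same units (market income and disposable income); the function names are hypothetical, and the relative variant is an added convenience that the studies cited above do not necessarily use.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of an unweighted vector of non-negative incomes."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

def redistribution(market_incomes, disposable_incomes):
    """Return (absolute, relative) redistribution.

    absolute: pre-tax and -transfer Gini minus post-tax and -transfer Gini
    relative: absolute reduction as a share of the pre-tax Gini (assumption:
              reported here only as an illustration)
    """
    g_market = gini(market_incomes)
    g_disposable = gini(disposable_incomes)
    absolute = g_market - g_disposable
    return absolute, absolute / g_market

# Toy data: taxes and transfers compress the distribution.
market = [0, 10, 20, 30, 100]
disposable = [10, 15, 22, 28, 70]
print(redistribution(market, disposable))
```

A positive absolute value means that taxes and transfers reduce measured inequality; it is this difference, rather than either Gini on its own, that enters the regression as the dependent variable.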
The range of available inequality indicators also constrains the features of inequality that can be analyzed in international comparisons. If only inequality measures insensitive to the tails are available and analyzed, there is a risk that important changes in the income distribution are missed or noticed too late.
19.4.2 Variability of the Dependent Variable
Trends and patterns of inequalities in countries in the OECD area are analyzed in depth in Chapters 7–9 of this volume. Overviews of the developments of income inequality have been presented in a large number of studies; some of the recent core publications include OECD (2008, 2011),[267] Alderson and Doran (2013), Brandolini and Smeeding (2009), Ward et al. (2009), Toth (2014), Ferreira and Ravallion (2009), Salverda et al. (2014), and Nolan et al. (2014).
One of the most fundamental questions in comparisons of inequality is the variability of the measures used to characterize inequality in society, both across countries and over time. The large and rapidly growing income distribution literature (Atkinson and Bourguignon, 2000; Salverda et al., 2009) presents various narratives about the development of income inequality. The major narrative dominating the literature is proposed by the landmark studies of the OECD (2008, 2011) and by various papers based on the data collections of the LIS. According to this narrative, within-country inequalities increased in a majority of OECD countries from the 1980s at least until the outbreak of the Great Recession (OECD, 2008, 2011, 2013a; see also Atkinson, Rainwater and Smeeding, 1995; Gottschalk and Smeeding, 2000; Brandolini and Smeeding, 2009; Chapter 8 of this volume).
As the most recent OECD (2011) study stresses, in a large majority of OECD countries the income of the richest 10% of households has grown faster than that of the poorest 10%. The Gini coefficient increased on average from 0.286 in the mid-1980s to 0.316 in the late 2000s. Of the 22 countries for which a long time series is available, 17 have witnessed increasing inequality. For seven of these, the Gini coefficient increased by more than four points over the period. In only five of the 22 countries did inequality remain stable or decline. This narrative describes the inequality trends that are dominant in the era of the “great U-turn” of inequality developments.
After an analysis of the GINI Inequality and Poverty Database, Toth (2014) concludes that over the past three decades inequality has indeed increased on average across the countries included in the analysis (25 EU countries, to which the United States, Canada, Korea, Japan and Australia are added); the whole range of Gini coefficients was at a higher level at the end of the period (the minimum/maximum levels moved from 0.20/0.33 to 0.23/0.37). The same work also stresses that the growth in inequality was far from uniform. In some countries (mostly continental European welfare states such as Austria, Belgium and France), the level of inequality remained largely unchanged or fluctuated around the same level, whereas in others it increased substantially. The latter trend was experienced by some European transition countries (Bulgaria, Estonia, Lithuania, Latvia, Romania and Hungary) and, to a lesser but still considerable extent, by the Nordic countries, most notably Sweden and Finland. It was also found that inequality may sometimes decline for shorter or longer periods. Such spells of decline were observed in Estonia, Bulgaria and Hungary, for example, sometimes after sharp increases.
Finally, over time it seems possible indeed that countries shift between inequality regimes (Toth, 2014).
After decades of a gradual but incessant increase of inequality, some of the Nordic countries, for example, while still being part of the group of low-inequality countries, are no longer at the lowest end of the inequality “league table.” The United Kingdom moved from being a middle-level inequality country in the 1970s to the group of high-level inequality countries by 1990. Also, some of the transition countries such as the Baltic countries, Romania or Bulgaria witnessed very large changes that have put their inequality levels in a different range (see also Toth and Medgyesi, 2011). Chapter 8 of this volume provides a more detailed account of post-1970 trends in within-country inequality in OECD and a range of middle-income countries.
19.4.3 Reliability of the Dependent Variable
Population surveys from which data on inequality are computed cover only a sample of the population. As a result, any statistic chosen to describe features of the distribution has a sampling variance. The variability of the sample estimate about its expected value in hypothetical repetitions of the sample (the sampling variance) may be due to sampling and nonsampling errors. Most surveys are based on a complex sample design (allowing, for example, for a stratification of base populations to draw the sample, for a clustering of cases, for differential techniques providing equal probability of getting into the sample, etc.). Nonsampling errors (of coverage, wording, nonresponse, imputation, weighting, etc.) add to the uncertainty of the selected statistics.
All inequality measures (Gini figures, P90/P10 ratios, etc.) used in international comparisons are estimates from samples that, in most cases, have different designs and are based on partially (or not at all) harmonized surveys. In addition, inequality indices are not like simple ratios from samples; most of them are computed from complicated formulae, which makes the indices nonlinear.
It is therefore very important to understand to what extent secondary uses (i.e., multivariate and multicountry analyses of drivers of inequality) can account for such uncertainties.
Inference for inequality and poverty measures calculated from properly documented microdata can be carried out by “direct” or formula-based (asymptotic) methods and by experimental methods (based on resampling techniques such as the bootstrap or the jackknife, for example) (see Kovacevic and Binder, 1997; Biewen and Jenkins, 2006; Osier et al., 2013; and others). Both types of methods are used in various research contexts, but their results are rarely reported in official statistics and in secondary datasets. The way inference is calculated has been shown to matter: Davidson and Flachaire (2007), for example, found that in the case of complex sample design, bootstrapping may lead to inaccurate inference, even for very large samples. Sticking to point estimates only is nonetheless clearly problematic, in part because it creates false images of certainty in inequality statistics and in part because it misguides interpretations of intertemporal change and cross-country differentials. While the degree of accuracy that may be worth pursuing is open to discussion (as Osier et al., 2013, stress, there is a need to address a trade-off between statistical accuracy and operational efficiency when choosing estimation methods for standard errors), overlooking the issue is clearly the worst option.
To properly estimate sampling variance, the sample design, weighting procedures, imputation practices and the actual computation formula of the statistic have to be taken into account. The effects of these factors have been tested in various papers. As Goedeme (2013) and Biewen and Jenkins (2006) stress, ignoring the clustering of individuals in households for poverty indexes (which are derived from incomes measured at the household level but analyzed at the individual level) may lead to a serious underestimation of standard errors for the analyzed poverty measures.
Taking clustering into account yields a fairly good proxy for the “true” estimates obtained in settings where sample design variables are not missing. Little is known about similar tests for inequality measures.
Van Kerm and Pi Alperin (2013) tested how their measures of inequality reacted to the presence or elimination of extreme values in the surveys they analyzed, and they found that the measures became arbitrarily large when outliers were left in the sample. However, other measures such as poverty rates remained more robust to the presence or elimination of extreme values (Van Kerm, 2007).
An essential requirement for computation of variance estimates for inequality measures is that microdata be available for analysis. Most secondary data sets lack any indication of not only the standard error estimates but also essential properties of the samples they have been drawn from. This makes it especially difficult for comparative studies using secondary data sets to assess reliability of their findings.
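Where microdata are at hand, a resampling estimate of the standard error of the Gini that respects the clustering of individuals in households can be sketched as follows. This is a deliberately simplified illustration: it assumes unweighted data and a hypothetical household identifier, it ignores stratification and imputation, and, as noted above with reference to Davidson and Flachaire (2007), bootstrap inference is not guaranteed to be accurate.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of an unweighted vector of non-negative incomes."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

def cluster_bootstrap_se(incomes, household_ids, n_reps=500, seed=0):
    """Bootstrap standard error of the Gini, resampling whole households.

    Individuals who share a household id are always drawn together, so the
    within-household correlation of incomes is preserved in every replicate.
    """
    incomes = np.asarray(incomes, dtype=float)
    # Group the row indices of individuals by household.
    members = {}
    for i, h in enumerate(household_ids):
        members.setdefault(h, []).append(i)
    groups = [np.array(rows) for rows in members.values()]

    rng = np.random.default_rng(seed)
    estimates = np.empty(n_reps)
    for b in range(n_reps):
        # Draw as many households as in the sample, with replacement.
        drawn = rng.integers(0, len(groups), size=len(groups))
        rows = np.concatenate([groups[g] for g in drawn])
        estimates[b] = gini(incomes[rows])
    return estimates.std(ddof=1)

# Toy data: four two-person households, each person carrying the
# household's equivalized income.
incomes = [10, 10, 20, 20, 35, 35, 80, 80]
households = [1, 1, 2, 2, 3, 3, 4, 4]
se_clustered = cluster_bootstrap_se(incomes, households)
# Treating every person as an own cluster mimics ignoring the clustering.
se_naive = cluster_bootstrap_se(incomes, range(len(incomes)))
print(se_clustered, se_naive)
```

Passing a unique identifier per person reproduces the naive individual-level bootstrap, which makes it easy to compare the two standard errors on the same data.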
Further, the Gini coefficient, by construction, is a variable with a relatively small range. Even if inequality may change significantly in the long run, when shorter periods are taken into account and when many data points within the longer period are considered, adjacent Ginis (in time or across countries) may not be significantly different from each other in statistical terms. Therefore, if these values are used as the left-hand-side variable of a regression, there is a serious risk that considerable “noise” enters the estimates.
Also, when using secondary datasets for which no microdata are at hand, researchers have to apply some rule of thumb to decide what can be considered a “real” change over time. There is no agreement in the literature, however, about how over-time changes or cross-country differences of Gini coefficients (normally arrived at from heterogeneous sample designs and greatly varying samples) could be defined as significant in statistical terms. Bootstrap (or, better, linearization) estimates of confidence intervals of the Gini would suggest that differences of roughly ±1 Gini point in EU-SILC samples could be registered as “significant,” but little is known about how this could be applied to changes over time, given the lack of sufficiently detailed information about sample designs.
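When standard errors are available for two Gini estimates, the question of whether they differ “in statistical terms” can be approximated with a simple normal test. The sketch below is an assumption-laden illustration rather than a method proposed in the literature cited here: it treats the two estimates as independent and approximately normal, which will often not hold for adjacent years of a panel survey, and the function name and the standard errors in the example are hypothetical.

```python
import math

def gini_difference_test(g1, se1, g2, se2, critical_z=1.96):
    """Normal-approximation test for the difference of two Gini estimates.

    Assumes independent samples; returns the z statistic and whether it
    exceeds the chosen critical value (1.96 for a two-sided 5% test).
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (g2 - g1) / se_diff
    return z, abs(z) > critical_z

# Hypothetical estimates on the 0-1 scale with a standard error of 0.005 each.
print(gini_difference_test(0.30, 0.005, 0.31, 0.005))  # one Gini point apart
print(gini_difference_test(0.30, 0.005, 0.32, 0.005))  # two Gini points apart
```

With these hypothetical standard errors, a one-point difference is not distinguishable from zero while a two-point difference is, which illustrates why adjacent Ginis often fail to differ significantly.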
Atkinson (2008) proposed a simple metric for assessing changes in percentiles (relative to the median) over a period of decades. He requires a 5% change to be “registered,” a 10% change to qualify as “significant,” and a 20% change to qualify as “large.” The bottom decile falling from 50% of the median to 47.5% or lower would thus “register” as a change, would be considered “significant” if it fell below 45%, and would be considered “large” if it fell below 40%.
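The thresholds described above translate directly into a small classification rule. The sketch below is a hypothetical helper, not code from Atkinson (2008); it assumes that the percentile is expressed as a percentage of the median and that a change exactly at a threshold counts as reaching it.

```python
def classify_change(start, end):
    """Classify the proportional change of a percentile (in % of the median).

    Thresholds follow the description in the text: 5% registers, 10% is
    significant, 20% is large. Treating the boundaries as inclusive is an
    assumption of this sketch.
    """
    change = abs(end - start) / start
    if change >= 0.20:
        return "large"
    if change >= 0.10:
        return "significant"
    if change >= 0.05:
        return "registered"
    return "not registered"

# The bottom decile starting at 50% of the median, as in the text.
for end in (49.0, 47.5, 44.0, 39.0):
    print(end, classify_change(50.0, end))
```

The rule is symmetric, so a rise of the same proportional size is classified in the same way as a fall.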
Breaks in series pose a serious challenge for cross-country comparisons as well as for intertemporal tracking of inequality, as already noted (Atkinson and Brandolini, 2001). A break in a series may provide an obvious basis for suspicion if accompanied by a sudden change in the level of inequality that subsequently does not continue in the same direction. However, in other cases one must rely on expert judgements as to whether such breaks have in fact masked an underlying change in inequality.
One way of constructing long-term data series of inequality is to link successive data series stemming from different data sources or definitions, using information on the overlaps of these series (Atkinson and Morelli, 2014; Forster and Mira d’Ercole, 2012), a method often called “data splicing.”[268] [269] A proper definition of inequality change in empirical studies (in addition to knowledge of sample sizes and sample designs) also has to be based on a careful examination of annual increments of the inequality measure at hand, on the length of the data spells, and on many other “accidental” factors. As Toth (2014) stresses, a year-to-year difference of up to 1 Gini point can be considered as no change, especially if the variation in subsequent years goes in different directions. However, consistent year-to-year changes, even small ones (say, half a point from one year to the next), may accumulate into a change of five points or more in the Gini over 10 years, which is a substantial change indeed.
Such longer-run consistency of increments over time may also change the interpretation of short-term comparisons. Consider the long-term fluctuation of the Belgian or the Irish Gini series (resulting in longer periods of “no change” in inequalities) and compare those with the very small but consistent year-to-year increments of Ginis in Sweden or Finland, and it becomes clear how important it is to pay attention to even small and insignificant Gini changes (Toth, 2014). Nevertheless, when the Gini index is used as the left-hand variable in regressions, spell contexts (as defined above) cannot always be taken into account, and the actual interpretation of the parameter estimates depends heavily on statistical inference. Careful and balanced evaluation: this is the main lesson that can be drawn and the only suggestion that can be given at this stage.
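Both ideas in this passage, linking series through their overlap and judging change over whole spells rather than single years, can be illustrated with a short sketch. The additive adjustment used for splicing is only one simple variant chosen for illustration; the studies cited above may apply different linking rules, and all function names and figures below are hypothetical.

```python
def splice(old_series, new_series):
    """Link two {year: Gini} series using their overlapping years.

    The older series is shifted by the average gap observed in the overlap
    (an additive adjustment, assumed here for simplicity); the newer series
    is kept as it is.
    """
    overlap = sorted(set(old_series) & set(new_series))
    if not overlap:
        raise ValueError("the two series must share at least one year")
    shift = sum(new_series[y] - old_series[y] for y in overlap) / len(overlap)
    spliced = {y: v + shift for y, v in old_series.items() if y not in new_series}
    spliced.update(new_series)
    return dict(sorted(spliced.items()))

def summarize_spell(gini_points):
    """Summarize a spell of consecutive annual Ginis (in Gini points)."""
    steps = [b - a for a, b in zip(gini_points, gini_points[1:])]
    return {
        "largest_annual_step": max(abs(s) for s in steps),
        "cumulative_change": gini_points[-1] - gini_points[0],
        "same_direction": all(s > 0 for s in steps) or all(s < 0 for s in steps),
    }

# Hypothetical series that overlap in 2002.
print(splice({2000: 30.0, 2001: 31.0, 2002: 32.0}, {2002: 34.0, 2003: 35.0}))

# Small but consistent increments versus fluctuation around a stable level.
steady = [25.0 + 0.5 * i for i in range(11)]
fluctuating = [28.0, 28.8, 28.1, 28.9, 28.0]
print(summarize_spell(steady))
print(summarize_spell(fluctuating))
```

In the steady series no annual step exceeds half a point, yet the spell accumulates a five-point rise; in the fluctuating series the annual steps are larger but cancel out, which is the contrast drawn in the text.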