
CHALLENGES AND LIMITATIONS

24.4.1 Reconciling Simulated Income with Recorded Income and Macro Statistics

A common problem when using micro-data from surveys for the analysis of policies and income distribution is that aggregate values (e.g., gross earnings or income taxes) do not match estimates from national accounts or other sources of macroeconomic statistics.

This problem also applies to microsimulation studies based on survey data, with one exception. Tax-benefit model calculations of benefit entitlements may match administrative totals better than information on recorded receipt in the data, if there is a problem of underreporting of these sources of income in the survey.

Chapter 11 considers the reconciliation of household surveys and national accounts. Here, we focus on a somewhat different issue, also related to the plausibility and usability of empirical findings. This is that the simulated income distribution is not identical to the income distribution that is measured by directly using the underlying survey (or register) micro-data. Typically, measures of income inequality in microsimulated estimates, using the same micro-data and the relevant policy year, are lower. Adjustments in the simulations for the non-take-up of benefits and for tax evasion go some way to reducing the discrepancy, and these issues are discussed in Section 24.4.2. However, they appear not to be the full explanation, and it is clear that the contributory factors differ across countries. Indeed, in some countries for particular datasets and policy years, the differences are small: for example, Figari et al. (2012a) show this to be the case for four EU countries, using data from the EU Statistics on Income and Living Conditions (EU-SILC) and EUROMOD. However, this is by no means always or even often the case, and reconciling simulated and recorded estimates is an important component of both the process of building a tax-benefit model and validating the content of micro-data from surveys.

As alluded to above, there is evidence that some surveys underreport recipients of some major cash benefits, when compared with administrative statistics. If the reason for this is failure to report these sources of income by recipients, then simulated benefits may perform better, generally leading to higher incomes at the bottom of the distribution and suggesting that the survey overestimates income inequality. An illustration from the UK is provided in Box 24.2.

shortfall is larger. The entitlement here mainly depends on being in low paid work over the year, allowing families to meet the eligibility criteria for the working tax credit for short periods, which is not captured by the simulations based on current income and circumstances. For the other two payments shown in the table, EUROMOD over- rather than underestimates recipiency. The overestimation of Child Tax Credit recipients is to some extent explained by the administrative statistics not containing some long-term recipients of income support, whose child payments are still waiting to be migrated to the tax credit system. Most simulated and nonsimulated benefits are included in the means-test for Council Tax Benefit: its overestimation is expected to the extent that some nonsimulated benefits are underreported and tax credits are undersimulated.

Clearly, simulating receipt is not a solution in itself, and a comprehensive reconciliation needs other benefit-specific factors to be taken into account.

Numbers of recipients of selected UK benefits in the 2009-2010 tax year: estimates from Family Resources Survey (FRS), EUROMOD, and administrative statistics (thousands)

Source: EUROMOD version F6.20 with adjustments for non-take-up, using Family Resources Survey 2009/10 updated to 2010-2011 incomes.

Shortfalls in the reported receipt of means-tested welfare benefits compared with administrative information are also found in US surveys on a larger scale (Meyer et al., 2009). Wheaton (2007) uses microsimulation to calculate entitlement and then to calibrate the numbers of recipients so that they match administrative statistics. The result is a large increase in the estimated extent of poverty reduction due to the programs in question.

However, as illustrated in Box 24.2, underreporting of benefit income may not be the only source of the problem. If part of the reason for the shortfall in the survey is that benefit recipients are more likely to be nonrespondents, then microsimulation of eligibility and entitlement is unlikely to solve the problem on its own, and benefit recipiency estimates will still not match administrative information. In this case recalculation of the survey weights, including controls for characteristics that are correlated with benefit receipt and also underrepresented in the survey, may in principle provide a solution, if such characteristics can be identified and external information is available to control the process. This is not often the case.
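The reweighting idea can be illustrated with a minimal post-stratification sketch. It assumes a single categorical control variable (here a hypothetical housing-tenure group) and an external population total for each category; the function name, the grouping variable, and all numbers are illustrative assumptions rather than features of any model discussed in this chapter.

```python
def post_stratify(weights, groups, control_totals):
    """Scale survey weights within each group so that the weighted count
    of each group equals its external control total.

    weights        -- original survey weights, one per sample unit
    groups         -- group label of each unit (hypothetical control variable)
    control_totals -- dict mapping each group label to its external total
    """
    # Current weighted size of each group in the survey.
    current = {}
    for w, g in zip(weights, groups):
        current[g] = current.get(g, 0.0) + w
    # Apply one adjustment factor per group.
    return [w * control_totals[g] / current[g] for w, g in zip(weights, groups)]


# Illustrative data: renters are assumed to be underrepresented.
weights = [100.0, 100.0, 200.0, 100.0]
groups = ["renter", "owner", "owner", "renter"]
control_totals = {"renter": 300.0, "owner": 300.0}
new_weights = post_stratify(weights, groups, control_totals)
```

In practice several control variables are usually involved, which requires iterative or calibration methods rather than a single scaling factor per group, and the approach only helps if the chosen characteristics really are correlated with benefit receipt.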

There are many possible reasons for discrepancies in each simulated income component. Here we discuss income tax as an important example. First, survey estimates of income tax may not relate to the current year or may include only withholding taxes. Second, survey gross incomes (and hence taxes) may have been imputed from net income (see also Section 24.2.1), but their quality and consistency with calculations in the tax-benefit model are usually difficult to establish due to detailed documentation not being made available. We might also expect some discrepancies when the values are compared with fiscal data. Such comparisons need to take national specifics into account, including the nature of the tax structure and administration, as well as the questions asked in the survey. The nature of the comparison and the conclusions that are drawn also depend on whether fiscal data are available at the micro level and whether they can be matched to the survey.

In addition, the fiscal data may not provide a fully reliable benchmark, especially if they are based on samples of administrative data or if the administrative process that generates them is not comprehensive or consistent. We provide a case study in Appendix B based on a published table of fiscal statistics for the UK.

Microsimulation estimates of income taxes may be over- or underestimated relative to what is shown by fiscal data. For example, income tax may be underestimated because the market incomes that make up the tax base are underreported or the survey does not adequately represent high-income taxpayers. In this case estimates of income distribution are sometimes adjusted by inflating incomes at the top of the distribution, informed by fiscal data. This is the case for the official estimates of poverty and income distribution produced by the UK Department of Work and Pensions (DWP, 2013), though the same adjustment is not (to our knowledge) applied in UK tax-benefit models. In contrast, the French model TAXIPP merges micro-data and statistics from many sources for its input database. This includes information on top incomes specifically used to correctly capture the very top of the distribution and particularly the taxes paid by that section of the population (Bozio et al., 2012).

Income tax may be overestimated because of tax evasion that has not been modeled (see Section 24.4.2) or because it is not possible to model or measure the size of some tax reliefs and common avoidance measures. It may also be under- or overestimated in line with other simulated income components that are taxable. Combinations of these factors may occur, and indeed it is possible for the simulated tax aggregate to match well that from fiscal data but for the distribution of tax paid to be very different; see Appendix B for an example of this. In addition, estimates of gross income and tax liability from fiscal data may be subject to error due to tax evasion.

Time periods for income assessments are also important.

In surveys that collect current income (as in the UK), which mainly use a reference time period of a month, the simulation of income tax must assume that the same monthly income was received all year and will not identify cases with tax liability for part of the year. However, the survey response for those with part year incomes will, at least in principle, indicate the correspondingly lower or higher tax payments, already adjusted for part-year incomes. The UK is unusual in collecting short-period current income. Most income surveys ask about annual income (in the previous year), which is the appropriate reference time period for the calculation of tax liabilities. However, it must also be used to simulate the income assessment of social assistance and other means-tested benefits for which the relevant period is generally much shorter than 1 year. This leads to fewer households being simulated to receive these benefits than shown in the data.
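The effect of the reference period on simulated income tax can be shown with a small worked example. The tax schedule (a flat rate above a single allowance), the income level, and the number of months worked are all hypothetical values chosen for illustration; they do not represent the UK system or any survey.

```python
def annual_tax(annual_income, allowance=10_000.0, rate=0.20):
    """Hypothetical schedule: a flat rate on annual income above an allowance."""
    return max(annual_income - allowance, 0.0) * rate


monthly_income = 2_000.0  # income recorded for the survey reference month
months_worked = 6         # true months in work, not observed by the simulation

# The simulation has to assume the monthly income was received all year.
simulated_tax = annual_tax(monthly_income * 12)

# Liability on the income actually received over the year.
actual_tax = annual_tax(monthly_income * months_worked)

overstatement = simulated_tax - actual_tax
```

Under these assumed values the simulation annualizes income to 24,000 and computes a liability of 2,800, whereas the part-year income of 12,000 implies a liability of 400, so the simulated tax is overstated by 2,400 for this case.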

Generally, simulations are only as good as the underlying micro-data and, in the cases where they are necessary, as good as the imputations and adjustments that must be carried out in the absence of all the necessary information. This in turn depends on the specifics of the national benefit and tax systems as well as the quality of the data. In some circumstances it might be appropriate to calibrate and reweight to try and adjust the baseline simulated distribution of income and its components to match that given by the data directly. Generally, however, such an approach will distort the estimates of change due to a policy reform. A better approach is to try and understand the source of each problem and to make adjustments that can be applied in a consistent way, and with transparent assumptions, across policy scenarios. This highlights the importance not only of validation and adjustment but also of documenting the process so that users of the models and readers of model applications can make their own assessment, based on the research questions at hand.

24.4.2 Modeling Non-Take-Up and Noncompliance

One particular challenge arises with benefit non-take-up and tax noncompliance. There is no natural data source with explicit information about these phenomena, and modeling each is highly context-specific. Accounting for take-up and noncompliance behavior in tax-benefit models is important because it affects estimates of fiscal aggregates (i.e., total benefit expenditures and tax revenues), but even more importantly, it can affect various parts of the income distribution in a different way. Furthermore, take-up and compliance behavior are likely to be affected by tax-benefit policy reforms and, hence, are themselves endogenous factors in the analysis. Even if microsimulation models commonly assume full take-up and compliance, this has an important implication for cross-national comparisons as results are unlikely to be consistent, as long as the prevalence and patterns of non-take-up and noncompliance vary across countries.

Benefit non-take-up refers to the situation in which those eligible for a given benefit do not successfully claim it for various reasons. This could simply be due to people not being aware of their entitlement (or even the existence of a particular form of public support), being put off by a complex or time-consuming claiming process, or related to social stigma, such as not wanting to appear vulnerable and dependent on others’ support. In an economic context, these factors can be summarized as implied costs related to take-up (Hernandez et al., 2007). Another likely key determinant is the size of the entitlement (Blundell et al., 1988), both in absolute terms and relative to other income sources and wealth of the claimant. Benefit take-up tends to be higher for universal benefits because the claiming process is simpler and the associated social stigma lower. Arguably, people are most likely to claim contributory benefits (e.g., for old age and maternity) because these are directly linked to their own previous contributions and, hence, entitlement is perceived to be more justified, while take-up of means-tested benefits tends to be lower. Therefore, assuming full take-up can distort comparisons between various benefits and make some benefits seem more effective than in fact they are. It also matters how extensive and long-established the benefit scheme is, because the benefit’s scale and longevity contribute to the spread of knowledge among the population. A related phenomenon is benefit leakage, meaning that a benefit is received by those who should not be eligible. This could either indicate an unintentional error on behalf of the benefit administrator or claimant, or benefit fraud.


Studies estimating the scale and determinants of benefit take-up require information on eligibility for a given benefit and actual benefit awards. Because benefit eligibility is not directly observed (for a wider population), it must be inferred from relevant individual and household characteristics on the basis of benefit rules, and as such, it constitutes a microsimulation exercise in itself. Depending on the nature of the rules, especially when means-testing is involved, there can be complex interactions with other tax-benefit instruments, as well as with tax compliance. It is difficult to overemphasize the importance of data quality in this context, and the most precise estimates can presumably be obtained with administrative data providing information as close as possible to that used by the welfare agencies, as well as actual benefit receipt (e.g., Bargain et al., 2012). For this to cover all potentially eligible people and not just claimants, it implies that agencies rely (mainly) on information from existing registries (e.g., tax records) rather than data collection from the claimants. Even then, there can still be some scope for simulation error if the claiming process involves factors such as discretion on behalf of officials awarding benefits. For example, in some countries, local social welfare offices are given a considerable level of discretion in deciding who is in greater need and, hence, more qualified for public support. On the other hand, there could also be errors made by the program administrators in the assessment of eligibility, resulting in incorrect approval or rejection of the claim.

This type of administrative data, if it exists, is usually not accessible, and most empirical studies have relied on survey data instead. There are, however, additional challenges with survey data due to potential measurement error in the observed benefit receipts and other characteristics affecting the eligibility and the entitlement calculation (see Section 24.4.1). For example, survey respondents may have simply forgotten the receipt of a particular benefit, associated it with an incorrect period or benefit type, or intentionally left it unreported (e.g., because of social stigma). Often, there is also a time delay between becoming entitled and receiving a first payment. Therefore, a careful assessment and cleaning of benefit data are usually required (e.g., Hancock and Barker, 2005; Matsaganis et al., 2010). Similarly, individual and household characteristics relevant for determining benefit eligibility and entitlement might be reported with error, especially other income sources and/or assets in the case of means-tested benefits. There have been only a few attempts to model the various errors explicitly (Duclos, 1995, 1997; Hernandez and Pudney, 2007; Zantomio et al., 2010).

The modeling of benefit take-up becomes even more complicated when considering the receipt of multiple benefits (e.g., Dorsett and Heady, 1991; Hancock et al., 2004), interactions with labor supply (e.g., Bingley and Walker, 1997, 2001; Keane and Moffitt, 1998; Moffitt, 1983) or dynamics in take-up behavior (e.g., Anderson and Meyer, 1997; Blank and Ruggles, 1996). Analyses combining several of these aspects are rare (e.g., Chan, 2013), and avoiding behavioral responses in other dimensions, such as labor supply, is one reason why many of the recent advances in take-up modeling have concentrated on take-up among the retired or others unable to work (e.g., Hernandez and Pudney, 2007; Pudney et al., 2006; Zantomio et al., 2010). Much of the applied research has been done for the UK and US (see above), but, among others, there are also studies for Canada (Whelan, 2010), Finland (Bargain et al., 2012), Germany (Bruckmeier and Wiemers, 2012; Riphahn, 2001), Greece and Spain (Matsaganis et al., 2010). For recent reviews, see Hernanz et al. (2004) and Currie (2004).

Despite general progress with modeling take-up, it remains a challenge to deal with in microsimulation models due to the data requirements and complexities involved. Ideally, tax-benefit models should treat take-up endogenously in simulations, because policy reforms can change take-up behavior (e.g., Zantomio et al., 2010). Such attempts remain scarce (see Pudney et al., 2006). A second best approach is to predict the probability of take-up conditional on personal characteristics that are not affected by policy changes and hence remain constant in policy simulations. To predict take-up on the basis of previously estimated statistical models, the same explanatory variables need to be present in the data used for the tax-benefit model. Furthermore, take-up is highly circumstantial, and a prediction model developed for one benefit in one country is unlikely to perform satisfactorily for other benefits or countries. A simpler approach commonly used to account for incomplete benefit take-up in tax-benefit models is to assign take-up randomly among the group of eligible units for a given benefit such that the aggregate take-up rate matches that in official statistics or previous studies (e.g., Hancock and Pudney, 2014; Redmond et al., 1998; Sutherland et al., 2008). This is obviously a rather crude approach because some people are more likely to claim than others, and, hence, it may not be sufficient to align aggregate benefit expenditure with official statistics, particularly if take-up is correlated with the level of entitlement. Another option is to link benefit entitlement to the observed receipt, which, however, seriously limits the scope for simulations.
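The random-assignment approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in any of the models cited: units are unweighted, a unit is treated as eligible when its simulated entitlement is positive, and the function name, the target rate, and the entitlement amounts are hypothetical.

```python
import random


def assign_random_take_up(entitlements, target_rate, seed=0):
    """Randomly select claimants among eligible units so that the share of
    eligible units claiming is as close to target_rate as rounding allows.

    Returns the benefit received by each unit: the full simulated
    entitlement for selected claimants and zero for everyone else.
    """
    rng = random.Random(seed)  # fixed seed keeps baseline and reform runs comparable
    eligible = [i for i, amount in enumerate(entitlements) if amount > 0]
    n_claimants = round(target_rate * len(eligible))
    claimants = set(rng.sample(eligible, n_claimants))
    return [amount if i in claimants else 0.0
            for i, amount in enumerate(entitlements)]


# Hypothetical simulated entitlements; zeros are ineligible units.
entitlements = [0.0, 120.0, 80.0, 0.0, 200.0, 50.0, 90.0, 60.0, 30.0, 150.0]
received = assign_random_take_up(entitlements, target_rate=0.75)
```

Because every eligible unit has the same chance of being selected, the caseload matches the target but aggregate expenditure need not, which is the weakness noted above when take-up is correlated with the size of the entitlement. A weighted survey would require selecting units so that the weighted, rather than unweighted, take-up rate hits the target.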

Tax noncompliance (or tax evasion) is the other side of the coin and refers to intentional effort to lower tax liability in unlawful ways. In the context of tax-benefit models, this primarily concerns income tax and payroll tax evasion, in the form of underreporting taxable income or overreporting (income tax) deductions. Compared to benefit non-take-up, this is an even more challenging issue for several reasons. First, take-up is binary by nature (i.e., an eligible person either claims or not), but tax compliance is often partial. Second, there is no single data source that would allow the precise measurement of tax evasion. Although tax records contain income reported to the tax authority, “true” income remains unobserved. Third, evading taxes may also affect how related incomes are reported to surveys. These constraints point towards the need to combine and utilize multiple data sources to study tax evasion and help to explain why hard empirical evidence at the individual level is very scarce.

Studies estimating the extent and determinants of tax noncompliance by individuals have mainly relied on audited tax records (e.g., Clotfelter, 1983; Erard, 1993, 1997; Erard and Ho, 2001; Feinstein, 1991; Martinez-Vazquez and Rider, 2005). Although tax audits are designed to detect tax noncompliance, these are not often carried out randomly and target those more likely to evade on the basis of initial screening. Repeated and extensive random tax audits, from which insights into tax evasion can be inferred for a broader population, have been primarily carried out in the US. However, even audits are unable to detect all noncompliance, especially income underreporting in which cash transactions are involved, and usually have very limited information on individual characteristics.

Surveys offer a much richer set of information on individuals but usually lack a good measure of noncompliance. Some surveys include explicit questions on compliance (e.g., Forest and Sheffrin, 2002), but given its sensitivity, the reliability of such self-reported data is unclear (Elffers et al., 1992). On the other hand, studies such as Pissarides and Weber (1989), Lyssiotou et al. (2004), and Hurst et al. (2014) have relied on indirect methods, employing econometric models that contrast surveyed income and consumption. These, however, are inevitably cruder and allow for a less detailed analysis of compliance.

Finally, laboratory experiments are common in tax compliance research (Alm et al., 1992, 2009, 2012; Laury and Wallace, 2005). Although experiments allow one potential determinant to be isolated from the rest and for clearer conclusions to be drawn about causality, it is unclear how well conditions in the laboratory reflect actual behavior, not least as the subjects are typically students without substantial experience paying taxes.

Overall, there is substantial evidence on factors influencing people’s decision to evade taxes. There are also studies showing that tax noncompliance is more prevalent for income sources that are less easily tracked by the tax authority (see Klepper and Nagin, 1989; Kleven et al., 2011). For example, the extent of underreporting income from self-employment is notably higher compared to wages and salaries because the latter are usually subject to third-party reporting (i.e., by employers), which reduces opportunities for evasion (though it does not necessarily eliminate these). Fewer studies have focused on the distributional implications of tax noncompliance (e.g., Doerrenberg and Duncan, 2013; Johns and Slemrod, 2010), some in combination with microsimulation modeling (Benedek and Lelkes, 2011; Leventi et al., 2013). For reviews of theoretical and empirical literature on tax evasion, see Andreoni et al. (1998), Slemrod (2007) and Alm (2012).

However, given the highly specific datasets that are often involved in the study of tax compliance, it is not straightforward to utilize previous findings in tax-benefit models, nor is it easy to provide one’s own estimates with the type of data commonly used for microsimulation. This helps to explain why attempts to account for tax noncompliance in tax-benefit models seem to remain very limited (e.g., Ceriani et al., 2013; Matsaganis and Leventi, 2013). On the other hand, this may also reflect the fact that microsimulation studies lack details on such adjustments. Therefore, the first step towards improving the modeling of tax noncompliance (as well as benefit take-up) is increasing transparency about how this is handled (if at all) in existing models and studies.

24.4.3 Assessing the Reliability of Microsimulation Estimates

The overall credibility of a microsimulation model in simulating the effects of a given tax-benefit policy encompasses different aspects, some of which are interrelated, and include the application of “sound principles of inference in the estimation, testing and validation” (Klevmarken, 2002).

First, the reliability of a microsimulation model is closely tied to its validation and transparency, which are indicated by the extent to which solid documentation exists for the internal features of the model and the validation of the results against external statistics. Unfortunately, a high level of transparency does not characterize many of the microsimulation models used in the academic and policy literature, which tend to be “black boxes.” Good practice is to provide a detailed description of all tax-benefit components simulated, including details of assumptions used, as well as information about the input data and related transformations or imputations. Documented validation of the output against external statistics on benefit recipients and taxpayers and total expenditure/revenue is also an important component of the informed use of microsimulation models.

Nevertheless, such validation is not a comprehensive assessment for three reasons. First, as illustrated in Section 24.4.1, microsimulation estimates and the information available in official statistics may not be comparable conceptually. Second, in some countries, limited external information is available, and in all it is rarely available without a time delay. Third, although it is possible to validate results for existing and past systems, it is usually not possible to find independent estimates of the effects of policy reforms. A correct baseline does not ensure that the model or its input data can correctly estimate the effect of a reform.

In addition, as mentioned by Wolf (2004), a persistent failure of most microsimulation applications is the lack of recognition of the degree of statistical uncertainty associated with the results, some of which is inherent in the sampling process that underlies the input micro-data and some of which is propagated from simulation errors and estimated parameters. The accuracy of the underlying data, the correct and detailed representation of the tax-benefit rules, and the actual implementation of the policy parameters in the simulation code determine the point-estimate of the simulated policy. Nevertheless, the correct interpretation of the results should take into account their statistical inference (an aspect often neglected in the microsimulation literature), which also depends on the nature of the model and whether it is purely deterministic or also involves probabilistic or econometric specifications.

To start with, simulations are subject to the same degree of sampling error, measurement error, and misreporting as any other analysis based on survey data. On the one hand, as discussed in Section 24.4.1, simulations can improve the accuracy of results by simulating the exact rules rather than relying on observed values that might be misreported. On the other hand, the simulation process can introduce other sources of errors due to, for example, approximations in the simulation of tax-benefit rules, adjustments for noncompliance or non-take-up, updating of monetary parameters and sociodemographic characteristics to the simulation year, or ignoring behavioral responses or market adjustments.

In the case of simulation of the first-order effects of policy changes, Goedeme et al. (2013) argue that the lack of attention to the statistical significance of the results is undesirable and unjustified due to the availability of standard routines embedded in most standard statistical software. Moreover, when comparing the statistics related to different scenarios, they show the importance of taking into account not only the sampling variance of the separate point estimates but also the covariance between simulated and baseline statistics which are based on the same underlying sample. This can lead to a generally high degree of precision for estimates of the effects of a reform on a particular statistic of interest.
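The role of the covariance term can be illustrated with a small numerical sketch. It compares the standard error of a change in mean income when the baseline and reform estimates are wrongly treated as independent with the standard error obtained when their covariance is included. The sketch assumes simple random sampling without weights, which is far simpler than the survey designs such studies have to handle, and the function name and incomes are hypothetical.

```python
import math
import statistics


def se_of_mean_change(baseline, reform):
    """Return (naive_se, paired_se) for the change in mean income.

    naive_se  -- ignores the covariance: sqrt((Var(B) + Var(R)) / n)
    paired_se -- includes it: sqrt((Var(B) + Var(R) - 2 Cov(B, R)) / n)
    """
    n = len(baseline)
    var_b = statistics.variance(baseline)
    var_r = statistics.variance(reform)
    mean_b = statistics.fmean(baseline)
    mean_r = statistics.fmean(reform)
    # Sample covariance between baseline and reform incomes of the same units.
    cov = sum((b - mean_b) * (r - mean_r)
              for b, r in zip(baseline, reform)) / (n - 1)
    naive_se = math.sqrt((var_b + var_r) / n)
    paired_se = math.sqrt(max(var_b + var_r - 2 * cov, 0.0) / n)
    return naive_se, paired_se


# Hypothetical disposable incomes of the same five households.
baseline = [100.0, 200.0, 300.0, 400.0, 500.0]
reform = [110.0, 205.0, 310.0, 405.0, 515.0]
naive_se, paired_se = se_of_mean_change(baseline, reform)
```

Because both scenarios are simulated on the same households, incomes are highly correlated across scenarios and the paired standard error is much smaller than the naive one, which is why estimated reform effects can be precise even when the level estimates are not.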

The situation is much less straightforward in the case of more complex simulations involving revenue-neutral reforms or behavioral reactions that introduce additional sources of uncertainty due to the use of estimated wage rates for constructing the budget set and the preference parameters estimated using econometric models. Despite the growing literature on estimating the labor supply effects of policy changes (see Section 24.3.3), there are only a few examples of studies focusing on the analytical properties of the sampling distribution of the microsimulation outcomes that are affected by simulation uncertainty and estimation uncertainty. The former stems from the simulated choice set that can be different from the one that an agent would choose in reality. The estimation variability comes from the sampling variability of the estimated parameters of the labor supply model (Aaberge et al., 2000). Pudney and Sutherland (1994) derived the asymptotic sampling properties of the most important statistics usually reported in microsimulation studies, taking into account the additional uncertainty introduced by the imposition of revenue neutrality in the construction of the confidence intervals. Pudney and Sutherland (1996) augmented the previous analysis, deriving analytically the asymptotically valid confidence intervals of a number of statistics, allowing for errors associated with sampling variability, econometric estimation of parameters of a multinomial logit model of female labor supply, and stochastic simulation in the calculations. They concluded that sampling error is the largest source of uncertainty, but parameter estimation errors may add further uncertainty that undermines the practical use of such behavioral models.

The complexity of the analytical solution associated with very detailed microsimulation models, rather complex policy simulation and sophisticated econometric models, has led to the use of more tractable empirical approaches. Creedy et al. (2007) opted for a simulation approach to approximate the sampling distribution of statistics of interest based on the sampling distribution of the estimated parameters. The approach relies on a number of draws from the parameter distribution of the underlying behavioral model. Moreover, they suggest a simpler and more practical approach in which the functional form of the sampling distribution is assumed to be normal, requiring a small number of draws from the parameter distribution and leading to generally accurate results.
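The general idea of drawing from the parameter distribution can be conveyed with a deliberately stylized sketch; it is not the implementation of Creedy et al. (2007). It assumes a behavioral model with a single estimated parameter whose sampling distribution is normal, and a hypothetical outcome function mapping that parameter to a simulated statistic. The function names, the parameter values, and the revenue formula are all illustrative assumptions.

```python
import random
import statistics


def simulate_outcome_draws(estimate, std_error, outcome, n_draws=1000, seed=0):
    """Approximate the sampling distribution of a simulated statistic.

    Draws the behavioral parameter from N(estimate, std_error), evaluates
    the statistic for each draw, and returns the mean and standard
    deviation of the resulting values.
    """
    rng = random.Random(seed)
    values = [outcome(rng.gauss(estimate, std_error)) for _ in range(n_draws)]
    return statistics.fmean(values), statistics.stdev(values)


def revenue_after_reform(elasticity):
    """Hypothetical outcome: revenue falls as labor supply responds more
    strongly to an assumed 10 percent cut in net wages."""
    static_revenue = 1_000.0
    return static_revenue * (1.0 - 0.10 * elasticity)


mean_revenue, sd_revenue = simulate_outcome_draws(
    estimate=0.3, std_error=0.05, outcome=revenue_after_reform)
```

A realistic application would draw a whole parameter vector from its estimated joint distribution and rerun the full behavioral simulation for each draw, which is what makes the number of draws required a practical concern.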

Furthermore, to avoid having to assume the normal distribution for stochastic terms, and exploiting the increasingly available computer power, assessing the statistical reliability of the estimates now commonly relies on resampling methods such as the bootstrap, which allows one to obtain a set of replicated econometric estimates used in one or more simulation runs. The variance of the replicated estimates is then used to capture the variability of the statistics of interest. Although the additional uncertainty added by behavioral modeling is not found to be critical for most analysis (e.g., Bargain et al., 2014), there are reasons for concern when the estimates refer to specific small demographic groups, and further developments in this research area are needed.
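The resampling logic can be sketched as follows. For brevity the sketch bootstraps a simple statistic computed directly on the sample; in the applications described above, each replication would also re-estimate the econometric model and rerun the simulation. The function name, the number of replications, and the data are hypothetical, and equal-probability resampling of units is assumed, which ignores survey weights and design.

```python
import random
import statistics


def bootstrap_se(sample, statistic, n_replications=500, seed=0):
    """Bootstrap standard error of `statistic`.

    Resamples units with replacement, recomputes the statistic on each
    replicate sample, and returns the standard deviation of the replicates.
    """
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_replications):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return statistics.stdev(replicates)


# Hypothetical simulated income changes for a small demographic group.
income_changes = [12.0, -3.0, 25.0, 0.0, 8.0, 40.0, -10.0, 5.0]
se_mean_change = bootstrap_se(income_changes, statistics.fmean)
```

With only a handful of observations the replicated statistics vary widely, which mirrors the concern about small demographic groups raised above.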


Source: Atkinson, Anthony, and Bourguignon, François. Handbook of Income Distribution. Volume 2B. North Holland, 2014. 2366 p.