The Global Informal Workforce is a fresh look at the informal economy around the world and its impact on the macroeconomy. The book covers interactions between the informal economy, labor and product markets, gender equality, fiscal institutions and outcomes, social protection, and financial inclusion. Informality is a widespread and persistent phenomenon that affects how fast economies can grow, develop, and provide decent economic opportunities for their populations. The COVID-19 pandemic has helped to uncover the vulnerabilities of the informal workforce.


The shadow economy is, by nature, difficult to measure, because agents engaged in shadow economy activities try to remain undetected. The demand for information about the extent of the shadow economy and its development over time is motivated by its political and economic relevance. Moreover, a measure of total economic activity, including official and unofficial production of goods and services, is essential in the design of economic policies that respond to fluctuations and economic development over time and across space. Furthermore, the size of the shadow economy is a core input for estimating the extent of tax evasion and thus deciding how best to control it.

The shadow economy is known by different names, such as the hidden economy, the gray economy, the black economy or lack economy, the cash economy, or the informal economy. All these synonyms refer to some type of shadow economy activity. We use the following definition: the shadow economy includes all economic activities that are hidden from official authorities for monetary, regulatory, and institutional reasons. Monetary reasons include avoiding paying taxes and all social security contributions; regulatory reasons include avoiding governmental bureaucracy or the burden of the regulatory framework; and institutional reasons include corruption, the quality of political institutions, and weak rule of law. For our study, the shadow economy reflects mostly legal economic and productive activities that, if recorded, would contribute to national GDP; therefore, the definition of the shadow economy in this chapter tries to exclude illegal or criminal activities, do-it-yourself activities, or other household activities.1

Empirical research into the size and development of the global shadow economy has grown rapidly (Gerxhani 2003; Feld and Schneider 2010; Schneider 2011, 2015, 2017; Schneider and Williams 2013; Hassan and Schneider 2016; Williams and Schneider 2016). This chapter (1) analyzes the growth of knowledge about the shadow economy in a review covering the past 20 years, concentrating mainly on knowledge about established or new estimation methods; (2) defines or categorizes the shadow economy and new measures of indicator variables, such as the light intensity approach; and (3) presents estimates of the size of the shadow economy for 158 countries over 25 years. We have three concrete goals:

  • 1. To extensively evaluate and discuss the latest developments regarding estimation methods, such as the System of National Accounts (SNA) approach and new micro and macro methods, and the crucial evolution of the macro methodologies—namely the currency demand approach (CDA) and the multiple indicators, multiple causes (MIMIC) model—in tackling the problem of double counting.

  • 2. To present shadow economy estimates for 158 countries from 1991 to 2015 while addressing early criticism. In particular, when using the MIMIC approach, GDP per capita, growth rate of GDP, or first differences in GDP are often used as cause as well as indicator variables. Instead of GDP, we use light intensity as an indicator variable, then run a variety of robustness tests to further assess the validity of our results.2 In addition to MIMIC, we use a fully independent method, predictive mean matching (PMM) by Rubin (1987), which overcomes these calibration problems. This is one of the first attempts both to include light intensity as an indicator variable within MIMIC and to use a fully alternative method, such as PMM.3

  • 3. To compare the results of the different estimation methods, showing their strengths and weaknesses, and critically evaluate them.

This chapter is organized as follows. First, we set out theoretical considerations and discuss the most important cause variables. Next, we discuss methods to estimate the size of the shadow economy. We go on to address the macro methods’ shortcomings, introduce the use of night lights (that is, light intensity) as a proxy for the size of an economy, and discuss additional robustness tests. We also cover the econometric results of the MIMIC estimations of the size of the shadow economy for 158 countries and critically evaluate them. Finally, we compare the MIMIC results with micro survey results and SNA discrepancy method results before summarizing our findings and providing a conclusion.

Theoretical Considerations

Individuals are rational calculators who weigh costs and benefits when considering breaking the law. Their decisions to partially or completely participate in the shadow economy are choices overshadowed by uncertainty because they involve a trade-off between gains—if the activities are not discovered—and losses, if the activities are discovered and penalized.

Shadow economic activities, SE, thus depend negatively on the probability of detection, p, and potential fines, f, and positively on the opportunity costs of remaining formal, denoted as B. Opportunity costs are positively determined by the burden of taxation, T, and by high labor costs, W, which arise from labor market regulations (individual income generated in the shadow economy is usually categorized as labor income rather than capital income). Hence, the higher the tax burden and labor costs, the more incentives individuals have to avoid these costs by working in the shadow economy. The probability of detection, p, itself depends on enforcement actions, A, taken by the tax authority and on the facilitating activities, F, individuals undertake to reduce the detection of shadow economic activities. Such calculations suggest the following structural equation:

$$
SE = SE\left[\, p(A, F),\; f,\; B(T, W) \,\right], \qquad \frac{\partial SE}{\partial p} < 0,\quad \frac{\partial SE}{\partial f} < 0,\quad \frac{\partial SE}{\partial B} > 0 .
$$


Hence, shadow economic activities may be defined as those economic activities and income earned that circumvent government regulation, taxation, or observation. More narrowly, the shadow economy includes monetary and non-monetary transactions of a legal nature, hence, all productive economic activities that would generally be taxable were they reported to the state (tax) authorities. Such activities are deliberately concealed from public authorities to avoid payment of income, value added, or other taxes and social security contributions or to avoid compliance with certain legal labor market standards, such as minimum wages, maximum working hours, or safety standards and administrative procedures.

The shadow economy thus focuses on productive economic activities that would normally be included in national accounts but that remain underground because of tax or regulatory burdens.4 Although such legal activities would contribute to a country’s value added, they are not captured in national accounts because they are produced in illicit ways. Informal household economic activities such as do-it-yourself projects and neighborly help are typically excluded from analyses of the shadow economy.5

What are the determinants of the shadow economy? The size of the shadow economy depends on various elements. The literature highlights specific causes and indicators.6 Table 1.1 presents the main causes and indicators that determine the shadow economy.

Table 1.1.

Main Causes and Indicators That Determine the Shadow Economy

Source: Schneider 2017.

To address criticism of the use of official GDP, this chapter relies on data on light intensity from outer space as a proxy for the “true” economic growth achieved by countries. This approach has also been used successfully by Medina, Jonelis, and Cangul (2017) in the context of sub-Saharan African countries.

Estimation Methods

We now describe the methods used to measure the shadow economy,7 highlighting their advantages and drawbacks.8 These approaches, including the model-based approach, can be divided into direct or indirect. We then discuss the MIMIC approach in depth, including its shortcomings and a way to overcome them: a structured, hybrid model–based estimation approach combining the CDA and MIMIC models.

Direct Approaches to Estimation

Three direct and micro methods of measuring the shadow economy9 are briefly presented and critically evaluated: the SNA discrepancy method, representative surveys, and surveys of company managers.10

SNA Discrepancy Method

Gyomai and van de Ven (2014) describe this method in detail, starting with a classification for measuring the nonobserved economy:

  • 1. Underground hidden production. Activities that are legal and create value but are deliberately concealed from public authorities.

  • 2. Illegal production. Productive activities that generate goods and services forbidden by law or unlawful when unauthorized.

  • 3. Informal sector production. Productive activities conducted by unincorporated enterprises in the household sector or by other units that are unregistered or have fewer employees than a specified size and that have some market production.

  • 4. Production of households for own (final) use. Productive activities that result in goods or services consumed or capitalized by the households that produced them.

  • 5. Statistical “underground.” All productive activities that should be accounted for in basic data collection programs but are missed when statistical systems are deficient.

Gyomai and van de Ven (2014) define each of these nonobserved categories precisely, with the goal of reaching exhaustive estimates.

Hidden Activities

As stated in SNA 2008 § 6.40, certain activities may clearly fall within the production boundary of the SNA and also be legal but are deliberately concealed from public authorities to avoid (1) paying income, value added, or other taxes; (2) paying social security contributions; (3) having to meet certain legal standards, such as minimum wages, maximum hours, and safety or health standards; and (4) complying with certain administrative procedures, such as completing statistical questionnaires or other administrative forms.

Illegal Activities

SNA 2008 § 6.43 describes two kinds of illegal production: (1) the production of goods or services whose sale, distribution, or possession is forbidden by law; and (2) production activities that are usually legal but become illegal when carried out by unauthorized producers, for example, unlicensed medical practitioners.

SNA 2008 § 6.45 indicates that the production boundary encompasses both kinds of illegal production, provided they are genuine production processes whose outputs consist of goods or services for which there is an effective market demand.

With this classification, Gyomai and van de Ven (2014) provide a comprehensive and useful categorization of shadow economy and underground activities. This estimation method is applied by National Statistical Offices and is explained in detail in their handbook for measuring the nonobserved economy (Organisation for Economic Co-operation and Development [OECD] 2010). The authors argue that the nonobserved economy is estimated at three stages during the integrated production of national accounts. First, data sources with identified biases in reporting or scope are corrected through imputations. Second, upper-bound estimates are used to assess the maximum possible amount of nonobserved economy activity for a given industrial activity or product group on the basis of a wide array of available data. And third, special purpose surveys are conducted for areas where regular surveys provide little guidance, and small-scale models are built to indirectly estimate areas where direct observation and measurement are not feasible.

Figure 1.1 shows how nonobserved economy producers are classified to reach estimates with the SNA method.

Figure 1.1.

Classification of the Nonobserved Economy

Source: Van de Ven 2017.

Classification of the nonobserved economy is a careful procedure that considers all possible situations to achieve an exhaustive estimation. The national accounts method to capture all nonobserved economic activities combines several classifications to yield four nonobserved economy categories:

  • Economic underground: N1 + N6

  • Informal (and own account production): N3 + N4 + N5

  • Statistical underground: N7

  • Illegal: N2
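The grouping above can be sketched in a few lines of Python. This is only an illustration: the function name `aggregate_noe` and the numeric adjustments are hypothetical placeholders and are not taken from Table 1.2.

```python
# Mapping of the four nonobserved economy (NOE) categories to the
# SNA adjustment types N1..N7, as listed in the text.
NOE_CATEGORIES = {
    "economic_underground": ("N1", "N6"),
    "informal": ("N3", "N4", "N5"),
    "statistical_underground": ("N7",),
    "illegal": ("N2",),
}


def aggregate_noe(adjustments):
    """Sum N1..N7 adjustments (percent of GDP) into the four NOE categories.

    Missing adjustment types are treated as zero, which is an assumption
    of this sketch rather than a rule stated in the chapter.
    """
    return {
        category: sum(adjustments.get(code, 0.0) for code in codes)
        for category, codes in NOE_CATEGORIES.items()
    }


# Hypothetical adjustments in percent of GDP (illustrative only).
example = {"N1": 3.0, "N2": 0.5, "N3": 1.0, "N4": 0.5,
           "N5": 0.5, "N6": 2.0, "N7": 1.5}
totals = aggregate_noe(example)
total_noe = sum(totals.values())
print(totals, total_noe)
```

Summing the four category totals gives the total nonobserved economy adjustment for the country in question.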

Much work has been done on the first three categories, but less so on illegal activities; however, the European Union has shown increased interest in accounting for illegal activities since their inclusion has become mandatory with the introduction of the European System of National and Regional Accounts.

In general, discrepancy analysis is performed at a disaggregated level, and the nature of adjustment allows various nonobserved economy categories to be at least partly identified. The methodological descriptions that countries provide for the SNA reveal that country practices in adjusting for the nonobserved economy are often quite similar.

Still, substantial differences exist between various OECD countries. Table 1.2 presents nonobserved economy adjustments by informality type for 16 OECD countries from 2011 to 2012. The total nonobserved economy as a percentage of GDP varies considerably.11 The adjustments in the individual categories also differ considerably. This method of discrepancy analysis reveals that some countries have large shadow economies, such as Italy with 17.5 percent of official GDP, followed by Mexico with 15.9 percent, the Slovak Republic with 15.6 percent, and Poland with 15.4 percent. The smallest shadow economy here is in Norway, with 1.0 percent.

Table 1.2.

Nonobserved Economy Adjustments, by Informality Type, 2011–12

(Percent of GDP)

Source: Gyomai and van de Ven 2014.
Note: N1 to N7 are classifications in the System of National Accounts; values in parentheses are the percentage of that adjustment type within the total nonobserved economy. Ellipses indicate data not available.

Representative Surveys

Representative surveys are often used to get some micro knowledge about the size of the shadow economy and shadow labor markets.12 These surveys are designed to investigate public perceptions of the shadow economy, actual participation in shadow economy activities, and opinions about shadow practices. As an example, the Lithuanian Free Market Institute and its partner organizations for Belarus, Estonia, Latvia, Poland, and Sweden designed surveys to gauge the public’s experiences with the shadow labor market. The surveys were administered between May 22 and June 15, 2015. The target audience included local residents ages 18 to 75 years, yielding 6,000 respondents across the six countries. The most important results for our purposes are presented in Tables 1.3 and 1.4.13

Table 1.3.

Undeclared Working Hours as a Proportion of Normal Working Hours, 2015

Source: Zukauskas and Schneider 2016.
Note: The values for the experience of friends or relatives in the shadow labor market and average weekly undeclared hours are from a survey, whereas normal average weekly working hours are from the Eurostat database for 2014. In the absence of such data for Belarus, it was estimated as an average of normal working hours for Central and Eastern European countries that belong to the European Union.

Table 1.4.

Aggregated Shadow Wages as a Proportion of GDP, 2015

Source: Zukauskas and Schneider 2016.
Note: Undeclared hours worked per year are calculated as follows: Shadow Frequency/100 × Average Undeclared Weekly Hours Worked by Persons Who Performed Shadow Activities × 52 × Total Population of Individuals Ages 18–74 Years. The values for shadow frequency, average undeclared weekly hours, and average undeclared hourly wage are from a survey, whereas the population of individuals ages 18–74 years and GDP at current prices are from the Eurostat database for 2014.

Table 1.5.

Size of the Shadow Economy in Baltic Countries, 2009–15

(Percent of GDP)

Source: Putnins and Sauka 2015.

Table 1.3 shows undeclared working hours as a proportion of normal working hours in 2015. Undeclared hours, as a share of normal working hours on the basis of a weekly calculation, vary between 4.2 percent in Sweden and 20.7 percent in Poland. This variation is not unexpected, because the shadow economy in Sweden is much smaller than the one in Poland. If one considers the average weekly undeclared hours worked by respondents with shadow experience, the range is much narrower, from 16.8 hours in Lithuania to 25.5 hours in Poland.

Table 1.4 shows the extent of aggregated shadow wages as a proportion of GDP. Sweden has by far the lowest share, with shadow employment wages of 1.7 percent of GDP; Belarus has the largest, with 32.8 percent, followed by Poland with 24.0 percent.
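The calculation described in the note to Table 1.4 can be sketched as follows. The first function follows the formula in the note. The second, which multiplies hours by the average undeclared hourly wage and divides by GDP, is our assumption about the final step. All input values are hypothetical and do not come from the survey.

```python
def undeclared_hours_per_year(shadow_frequency_pct, weekly_hours, population_18_74):
    """Formula from the note to Table 1.4.

    shadow_frequency_pct: share of respondents with shadow activity, in percent
    weekly_hours: average undeclared weekly hours of those respondents
    population_18_74: total population ages 18-74 years
    """
    return shadow_frequency_pct / 100 * weekly_hours * 52 * population_18_74


def shadow_wages_share_of_gdp(hours_per_year, hourly_wage, gdp):
    """Assumed final step: undeclared wage bill as a percentage of GDP."""
    return hours_per_year * hourly_wage / gdp * 100


# Hypothetical inputs (illustrative only, not survey values).
hours = undeclared_hours_per_year(10.0, 20.0, 1_000_000)
share = shadow_wages_share_of_gdp(hours, hourly_wage=5.0, gdp=20_000_000_000)
print(hours, share)
```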

Surveys of Company Managers

Putnins and Sauka (2015) and, in a similar way, Reilly and Krstic (2018) use surveys of company managers as a micro approach to measuring the size of the shadow economy. Both studies combine misreported business income and misreported wages as a percentage of GDP. The method produces detailed information on the structure of the shadow economy, especially in the service and manufacturing sectors. Researchers survey company managers on the premise that these respondents are most likely to know how much business income and wages go unreported because of their unique position in dealing with observed and nonobserved income.

Putnins and Sauka (2015) and Reilly and Krstic (2018) use a range of survey design features to maximize the truthfulness of responses. Their method combines estimates of misreported business income, unregistered or hidden employees, and unreported wages to calculate a total estimate of the size of the shadow economy as a percentage of GDP. This approach is different from most other studies of the shadow economy, which largely focus on either macroeconomic indicators or surveys of households. Putnins and Sauka have developed first results for Estonia, Latvia, and Lithuania. All three countries, as Table 1.5 shows, demonstrate a decline in the size of the shadow economy from 2009 to 2015. The largest shadow economy is in Latvia, with a 27.8 percent average from 2009 to 2015, followed by Estonia with 17.4 percent, and Lithuania with 16.4 percent.

Indirect Approaches to Estimation

Indirect approaches, alternatively called indicator approaches, are mostly macroeconomic in nature. They include (1) the discrepancy between national expenditure and income statistics, (2) the discrepancy between the official and actual labor forces, (3) the “electricity consumption” approach, (4) the “monetary transaction” approach, and (5) the “currency demand” approach. The MIMIC approach is also discussed extensively here.

Discrepancy between National Expenditure and Income Statistics

If those who work in the shadow economy hid their incomes for tax purposes but not their expenditure, then the difference between national income and national expenditure estimates could be used to approximate the size of the shadow economy. This approach assumes that all components on the expenditure side are measured without error and constructed so that they are statistically independent from income factors.14

Discrepancy between Official and Actual Labor Force

If total labor force participation is assumed to be constant, a decline in official labor force participation can be interpreted as an increase in the importance of the shadow economy. However, fluctuation in the participation rate might have many other explanations, such as position in the business cycle, difficulty in finding a job, and education and retirement decisions, so these estimates represent weak indicators of the size of the shadow economy.15

Electricity Consumption Approach

Kaufmann and Kaliberda (1996) endorse the idea that electricity consumption is the single best physical indicator of overall (official and unofficial) economic activity. Using findings that indicate that the elasticity of electricity use with respect to overall GDP is close to 1, these authors suggest using the difference between growth of electricity consumption and growth of official GDP as a proxy for growth of the shadow economy. This method is simple and appealing, but it has drawbacks: (1) not all shadow economy activities require a considerable amount of electricity (for example, personal services), and some may use other energy sources (coal, gas), hence only part of the shadow economy growth is captured; and (2) the elasticity of electricity use with respect to overall GDP might vary significantly across countries and over time.16
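A minimal sketch of this proxy, assuming a unit elasticity of electricity use with respect to overall GDP, is given below. The function name and the growth rates are hypothetical and serve only to illustrate the arithmetic.

```python
def shadow_growth_proxy(electricity_growth, official_gdp_growth):
    """Proxy for shadow economy growth, in percentage points.

    Under the unit-elasticity assumption, electricity growth stands in for
    growth of overall (official plus unofficial) activity, so the gap to
    official GDP growth is attributed to the shadow economy.
    """
    return [e - g for e, g in zip(electricity_growth, official_gdp_growth)]


# Hypothetical annual growth rates in percent (illustrative only).
electricity = [3.0, 2.5, 4.0]
official_gdp = [2.0, 2.5, 1.5]
proxy = shadow_growth_proxy(electricity, official_gdp)
print(proxy)
```

A positive value suggests that overall activity grew faster than official GDP in that year; both drawbacks listed above apply directly to this difference.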

Monetary Transaction Approach

Using Fisher’s quantity equation, Money × Velocity = Prices × Transactions, and assuming that there is a constant relationship between the money flows related to transactions and the total (official and unofficial) value added, that is, Prices × Transactions = k (Official GDP + Shadow Economy), one can derive the equation Money × Velocity = k (Official GDP + Shadow Economy). The stock of money and official GDP estimates are known, and money velocity can be estimated. Thus, as Feige (1979) posited, if the size of the shadow economy as a proportion of the official economy is known for a benchmark year, then the shadow economy can be calculated for the rest of the sample. Although theoretically attractive, this method has several weaknesses: (1) the assumption that k would be constant over time seems arbitrary and (2) other factors, such as the development of checks and credit cards, could also affect the desired amount of cash holdings and thus velocity.17
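The benchmark-year logic can be sketched as follows: k is calibrated from the year in which the shadow economy share is known and is then held constant, which is exactly the assumption criticized above. Function names and all numbers are hypothetical.

```python
def calibrate_k(money, velocity, official_gdp, shadow_share):
    """Solve Money x Velocity = k (Official GDP + Shadow Economy) for k in the
    benchmark year, where Shadow Economy = shadow_share x Official GDP."""
    return money * velocity / (official_gdp * (1 + shadow_share))


def shadow_economy(money, velocity, official_gdp, k):
    """Back out the shadow economy for another year, holding k constant."""
    return money * velocity / k - official_gdp


# Hypothetical benchmark year: shadow economy equal to 25 percent of official GDP.
k = calibrate_k(money=100.0, velocity=5.0, official_gdp=400.0, shadow_share=0.25)

# Hypothetical later year with the same (assumed) k.
se_later = shadow_economy(money=120.0, velocity=5.0, official_gdp=450.0, k=k)
print(k, se_later)
```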

Currency Demand Approach

Assuming that informal transactions take the form of cash payments to evade observation by the authorities, an increase in the size of the shadow economy will, consequently, increase demand for currency (Cagan 1958). To isolate this “excess,” Tanzi (1980) suggests using a time-series approach in which currency demand is a function of conventional factors, such as the evolution of income, payment practices, and interest rates, as well as factors causing people to work in the shadow economy, such as the direct and indirect tax burden, government regulation, and the complexity of the tax system. Several problems, however, are associated with this method and its assumptions: (1) the CDA may underestimate the size of the shadow economy because not all transactions use cash as a means of exchange, (2) currency holdings relative to demand deposits may increase because of a slowdown in demand deposits rather than an increase in currency used in informal activities, (3) it seems arbitrary to assume equal velocity of money in both the shadow and the formal economies, and (4) the assumption of no shadow economy in a base year is arguable (Cagan 1958; Gutmann 1977; Tanzi 1980, 1983; Schneider 1997; Johnson, Kaufmann, and Zoido-Lobatón 1998b).
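The mechanics of the approach can be illustrated with a stylized sketch on synthetic data. Everything here is an assumption made for illustration: the log-linear functional form, the made-up coefficients, the choice of the sample-minimum tax burden as the counterfactual, and the velocity value. It is not the specification estimated by Tanzi or in this chapter.

```python
import numpy as np

# Synthetic, purely illustrative data (not from the chapter).
rng = np.random.default_rng(0)
n_years = 40
log_income = np.linspace(4.0, 5.0, n_years)
interest_rate = rng.uniform(0.02, 0.08, n_years)
tax_burden = rng.uniform(0.20, 0.40, n_years)

# Assumed log-linear currency demand; coefficients are made up for the sketch.
log_currency = 1.0 + 0.8 * log_income - 2.0 * interest_rate + 1.5 * tax_burden

# Step 1: estimate the currency demand equation by ordinary least squares.
X = np.column_stack([np.ones(n_years), log_income, interest_rate, tax_burden])
beta, *_ = np.linalg.lstsq(X, log_currency, rcond=None)

# Step 2: predict currency holdings with the tax burden held at its sample
# minimum (a hypothetical counterfactual with less incentive to go informal).
X_low_tax = X.copy()
X_low_tax[:, 3] = tax_burden.min()
currency_fitted = np.exp(X @ beta)
currency_low_tax = np.exp(X_low_tax @ beta)

# Step 3: treat the difference as "excess" currency and convert it into value
# added, assuming equal velocity in the formal and shadow economies.
excess_currency = currency_fitted - currency_low_tax
velocity = 5.0  # hypothetical value
shadow_economy = excess_currency * velocity
```

The sketch makes weaknesses (3) and (4) visible: the result scales one-for-one with the assumed velocity, and it depends on which counterfactual tax burden is taken to imply no excess currency.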

MIMIC Approach

This method explicitly considers several causes, as well as the multiple effects, of the shadow economy. The method uses associations between the observable causes and the effects of an unobserved variable, in this case the shadow economy, to estimate the variable itself (Loayza 1996; also see Vuletin 2008; Schneider 2010, 2015; Feld and Schneider 2010; Slemrod and Weber 2012; Abdih and Medina 2013; and Williams and Schneider 2016).

The Model or Macro MIMIC Approach

The MIMIC model is a special type of structural equation modeling (SEM) that is widely applied in psychometrics and social science research and is based on the statistical theory of unobserved variables developed in the 1970s by Zellner (1970) and Joreskog and Goldberger (1975). The MIMIC model is a theory-based approach to confirm the influence of a set of exogenous causal variables on the latent variable (the shadow economy), as well as the effect of the shadow economy on macroeconomic indicator variables.

First, it is important to establish a theoretical model explaining the relation between the exogenous variables and the latent variable. The MIMIC model is therefore considered to be a confirmatory rather than an exploratory method. The hypothesized path of the relations between the observed variables and the latent shadow economy on the basis of our theoretical considerations is depicted in Figure 1.2.

Figure 1.2.

MIMIC Estimation Procedure

Source: Schneider, Buehn, and Montenegro 2010.
Note: MIMIC = multiple indicators, multiple causes.

Frey and Weck-Hanneman (1984) pioneered the application of the MIMIC model to measuring the size of the shadow economy, covering 17 OECD countries. Following them, scholars such as Buehn, Karmann, and Schneider (2009); Schneider, Buehn, and Montenegro (2010); and Hassan and Schneider (2016) applied the MIMIC model to measure the size of the shadow economy.

Formally, the MIMIC model has two parts: the structural model and the measurement model.

The MIMIC structural and measurement estimation procedures (compare also Figure 1.2) are conducted as follows:

  • 1. Model the shadow economy as an unobservable (latent) variable

  • 2. Describe the relationships between the latent variable and its causes in a structural model: η = Γx + ζ

  • 3. Represent the link between the latent variable and its indicators in the measurement model: y = Λ_y η + ε


  • η: latent variable (shadow economy)

  • x: (q × 1) vector of causes in the structural model

  • y: (p × 1) vector of indicators in the measurement model

  • Γ: (1 × q) coefficient matrix of the causes in the structural equation

  • Λ_y: (p × 1) coefficient matrix in the measurement model

  • ζ : error term in the structural model

  • ε : (p × 1) vector of measurement error in y.

The specification of the structural equation is as follows:

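$$
[\text{Shadow Economy}] = \begin{bmatrix} \gamma_1 & \gamma_2 & \cdots & \gamma_q \end{bmatrix} \times \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_q \end{bmatrix} + [\zeta]
$$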

The specification of the measurement equation is as follows:

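$$
\begin{bmatrix} \text{Employment Quota} \\ \text{Change of Local Currency} \\ \text{Average Working Time} \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix} \times [\text{Shadow Economy}] + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{bmatrix},
$$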

where γi and λi are coefficients to be estimated.

The absolute values are derived in two steps:

  • 1. The shadow economy remains an unobserved phenomenon (latent variable), which is estimated using causes of illicit behavior (such as tax burden and regulation intensity) and indicators reflecting illicit activities (such as currency demand and official work time). This procedure produces only relative estimates of the size of the shadow economy.

  • 2. The CDA is used to calibrate the relative estimates into absolute ones by using absolute values of the CDA as starting values for the shadow economy.
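The two steps can be sketched as follows. The sketch assumes a simple multiplicative benchmarking rule, in which the relative index is rescaled so that it matches an externally given CDA value in a base year; this is only one of several possible calibration variants. The coefficients, cause values, and base-year value are hypothetical.

```python
import numpy as np


def mimic_index(gamma, causes):
    """Relative index eta_t = gamma' x_t (structural equation without the error term)."""
    return np.asarray(causes, dtype=float) @ np.asarray(gamma, dtype=float)


def calibrate(index, base_position, base_value):
    """Rescale the relative index so that it equals base_value in the base year."""
    return index / index[base_position] * base_value


# Hypothetical estimated coefficients for two cause variables.
gamma = [0.5, 0.3]

# Hypothetical cause values; rows are years, columns are causes.
causes = [[2.0, 1.0],
          [2.2, 1.0],
          [2.4, 1.2]]

index = mimic_index(gamma, causes)  # relative values only

# Assumed CDA estimate for the first year: 20 percent of GDP.
shadow_pct_gdp = calibrate(index, base_position=0, base_value=20.0)
print(index, shadow_pct_gdp)
```

The resulting levels inherit any error in the base-year CDA value, which is one reason the choice of benchmarking procedure matters.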

The benchmarking procedure used to derive real-world values of shadow economic activities has been criticized (Breusch 2005a, 2005b). Because the latent variable and its unit of measurement are not observed, SEMs only provide estimated coefficients from which one can calculate an index that shows the dynamics of the unobservable variable. Application of the so-called calibration or benchmarking procedure, regardless of which is used, requires experimentation and a comparison of the calibrated values in a wide academic debate. At this stage of research, it is unfortunately unclear which benchmarking method is the best or most reliable.18

The economic literature using SEMs is well aware of these limitations. It acknowledges that it is not easy to apply this method to an economic data set but also argues that this does not mean one should abandon the SEM approach. For those following an interdisciplinary approach to economics, SEMs are valuable tools for economic analysis, particularly when studying the shadow economy. Moreover, the objections mentioned should be considered incentives for further research rather than a reason to abandon the method.

Identification Problem with MIMIC Estimates

MIMIC estimations produce only relative weights. Another approach is needed to normalize these estimates, and the estimates’ validity depends on the reliability of this second approach. Hence it is difficult to draw statistically confirmed conclusions about causal relations in the real world, as opposed to relations within the model built from these estimates.
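A small numerical check illustrates why only relative weights are identified. Substituting the structural equation into the measurement equation links the indicators to the causes through the product of the loadings and the cause coefficients, and that product is unchanged when the latent variable is rescaled by any constant. The parameter values below are hypothetical.

```python
import numpy as np

# Hypothetical parameters: three indicators (loadings) and two causes.
loadings = np.array([1.0, -0.6, 0.4])  # (p x 1) measurement coefficients
gamma = np.array([0.5, 0.3])           # (1 x q) structural coefficients

# Reduced form: (p x q) matrix linking causes directly to indicators.
reduced_form = np.outer(loadings, gamma)

# Rescale the latent variable by an arbitrary constant c: loadings are
# multiplied by c and structural coefficients are divided by c.
c = 2.5
reduced_form_rescaled = np.outer(loadings * c, gamma / c)

# The observable relation between causes and indicators is identical.
same_fit = np.allclose(reduced_form, reduced_form_rescaled)
print(same_fit)
```

Because the data cannot distinguish between the two parameter sets, the scale of the latent variable has to come from a normalization or from outside information.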

Kirchgaessner (2016, 103) correctly argues:

A necessary condition for testing whether a variable x has a causal impact on a variable y is that the two variables are measured independently. The MIMIC model approach assumes that causal relations exist and, therefore, estimates a linear combination of these (supposedly) causal variables that more or less fits several indicator variables. This linear combination is assumed to be a representation of the unknown variable shadow economy.

This calculation of the shadow economy is not an empirical test of the actual existence of a shadow economy. Neither does the calculation demonstrate that the causal or explanatory variables used have a statistically significant effect on the “true” shadow economy. Kirchgaessner (2016, 103) argues further that “significant test statistics in the structural model only show that the used explanatory (or causal) variables contribute significantly to the variance of the constructed variable, shadow economy. We have to assume that this construction represents the shadow economy to make statements about possible causal relations.” Hence, these causal variables cannot be used again in subsequent studies to identify policy variables that might reduce or increase the shadow economy. If this is done, a statistically significant relation must automatically or trivially result, argue Feld and Schneider (2016, 115).

To overcome this problem, Kirchgaessner (2016) suggests using other macro approaches, such as the measure of electricity consumption, which derives the size of the shadow economy independently from the causes used in the MIMIC model. Then one can check whether a tax increase leads to a rise in the shadow economy. To conclude, caution is warranted when using shadow economy estimates to test the effect of a tax reduction. This is only possible if the shadow economy series is derived from an approach in which the tax variable has not been used for the construction of the shadow economy.

Structured, Hybrid Model–Based Estimation

Dybka and others’ (2017) novel hybrid procedure addresses previous critique of the CDA and MIMIC models by Feige (1996) and Breusch (2016), particularly misspecification in the CDA equations and “vague” transformation of the latent variable obtained through the MIMIC model into interpretable levels and paths of the shadow economy.

Dybka and others’ (2017) proposal is based on a new identification method for the MIMIC model, referred to as “reverse standardization.” Reverse standardization supplies the MIMIC model with panel-structured information on the latent variable’s mean and variance obtained from the CDA estimates, treating this information as given in the restricted full-information maximum likelihood function. This approach does not require the choice of an externally estimated reference point for benchmarking or adopting other ad hoc identifying assumptions (such as unity restriction on a selected parameter in the measurement equation).
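The idea of fixing the scale of the latent variable through an externally supplied mean and variance can be illustrated with a much simpler device than the one Dybka and others use. The sketch below applies post hoc moment matching to an already estimated index; it is a swapped-in simplification and not their restricted full-information maximum likelihood procedure. The index values and the target mean and variance are hypothetical.

```python
import numpy as np


def match_moments(index, target_mean, target_variance):
    """Rescale a relative index so that it has the given mean and variance.

    Post hoc simplification for illustration only; in the hybrid approach the
    CDA moments enter the likelihood function itself.
    """
    index = np.asarray(index, dtype=float)
    standardized = (index - index.mean()) / index.std()
    return target_mean + np.sqrt(target_variance) * standardized


# Hypothetical relative MIMIC index and hypothetical CDA-based moments.
relative_index = np.array([-1.2, -0.3, 0.1, 0.4, 1.0])
levels = match_moments(relative_index, target_mean=20.0, target_variance=4.0)
print(levels)
```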

Furthermore, the proposed estimation procedure directly addresses the numerical problem of negative variances in the MIMIC estimation, largely disregarded in previous off-the-shelf software. The nonnegativity restriction on variances within the MIMIC framework can materially affect the significance, specification decisions, and measurement results. Paying due respect to the (intuitive) constraint on the nonnegativity of variances may lead to a surprising result of flattening the trajectory of the shadow economy.

Also, the variance decomposition of the shadow economy (SE) estimated by the hybrid strategy confirms findings from the previous literature: as much as 97.2 to 98.2 percent of SE variance in the panel is due to the CDA component (between cross-sections), whereas only the small remaining fraction is due to the MIMIC model’s fine-tuning. The latter finding raises a legitimate question about the actual contribution of MIMIC models to shadow economy measurement.

First, Dybka and others (2017) estimate and extend a panel version of the CDA equation using both frequently used and neglected variables (describing the development of an electronic payment system) and abandon the controversial assumption that the share of the shadow economy in the total economy is zero.

Second, Dybka and others (2017) estimate a MIMIC model by maximizing a (full-information) likelihood function reformulated in two ways: (1) instead of anchoring the index of an arbitrary time period and using arbitrary normalizations or other discretionary corrections, they use the means and variance estimated in the CDA model; and (2) they constrain the parameter vector to explicitly assume away the negative variances of structural errors and measurement errors. Their hybrid model proposes a solution to the long-standing problem of identification in the MIMIC model, which, in many ways, outperforms previous approaches to just-identification. Their approach clearly implies a scale and unit of measurement, avoids obscure ad hoc corrections, and paves the way to the construction of a sensible confidence interval. This new method is a promising approach to overcoming the usual critiques of the CDA and the MIMIC model.
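To make the two reformulations concrete, the block below writes out the textbook MIMIC structure together with the two kinds of restrictions described above. It is a notational sketch only: the symbols (\eta for the latent shadow economy, x for causes, y for indicators, \mu_i^{CDA} and \sigma_i^{CDA} for the moments taken from the CDA step) are assumptions introduced here for illustration and are not necessarily the notation of Dybka and others (2017).

```latex
\begin{aligned}
\eta_{it} &= \gamma' x_{it} + \zeta_{it}
  && \text{(structural equation: causes } x_{it}\text{)} \\
y_{it} &= \lambda\, \eta_{it} + \varepsilon_{it}
  && \text{(measurement equation: indicators } y_{it}\text{)} \\
\mathrm{E}[\eta_{it}] &= \mu_i^{\mathrm{CDA}}, \qquad
\operatorname{Var}(\eta_{it}) = \bigl(\sigma_i^{\mathrm{CDA}}\bigr)^2
  && \text{(moments supplied by the CDA estimates)} \\
\operatorname{Var}(\zeta_{it}) &\ge 0, \qquad
\operatorname{Var}(\varepsilon_{j,it}) \ge 0 \ \ \text{for every indicator } j
  && \text{(nonnegativity of variances)}
\end{aligned}
```

In this reading, the third line replaces the usual normalization (for example, fixing one element of \lambda to unity), and the fourth line is imposed as a constraint on the parameter vector during likelihood maximization.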

In Table 1.6, statistical offices’ shadow economy estimates are compared with the MIMIC estimates derived for this chapter. Macro and adjusted MIMIC values are shown. If we compare the results, we see that within each method the size of the shadow economy varies considerably but is on average much smaller than the macro and adjusted values. The adjusted MIMIC values come close to the values of Dybka and others (2017) for Bulgaria and Switzerland. Values from Dybka and others (2017) and those from the statistical offices are in a similar range for Bulgaria, Israel, Mongolia, Sweden, and the United Kingdom, if using the estimation result of the FGLS44-AR variant. For Croatia, Dybka and others (2017) obtain considerably higher values than those provided by the statistical offices. In the case of Moldova it is the opposite. To summarize, the Dybka and others (2017) estimation method is promising and most of the values are considerably lower than those obtained using the traditional macro methods of the CDA and MIMIC.

Table 1.6.

Shadow Economy Estimates from Statistical Offices and from Currency Demand Models, 2009–15

Sources: Dybka and others 2017, 22, Table 7, for FGLS, FGLS44, and FGLS44-AR; Gyomai and van de Ven 2014 for the data of statistical offices; and authors for macro and adjusted MIMIC values.
Note: FGLS = feasible generalized least squares; FGLS44 and FGLS44-AR = specific FGLS estimations; MIMIC = multiple indicators, multiple causes; N/A = not applicable; . . . = not available.

The “Double Counting” Problem

Another problem with macro approaches such as the MIMIC or CDA is that they use causal factors such as tax burden, unemployment, self-employment, and regulation, which are also responsible for people undertaking do-it-yourself activities or asking friends and neighbors for help. Hence, do-it-yourself activities, neighbors’ or friends’ help, and legally bought material for shadow economy activities are included in these macro approaches. This means that in these macro approaches (including the electricity approach) a “total” shadow economy is estimated that includes do-it-yourself activities, neighbors’ help, legally bought material, and smuggling.

In Table 1.7, a decomposition is undertaken for shadow economy activities in Estonia and Germany. Table 1.7 starts with the macro MIMIC estimate, as an average value for 2009 to 2015, of 24.94 percent of GDP for Estonia and 9.37 percent for Germany. First, legally bought material for shadow economy or do-it-yourself activities and friends’ help is deducted. Then illegal activities are deducted. Finally, do-it-yourself activities and neighbors’ help are deducted. These subtractions yield a corrected shadow economy that is roughly two-thirds of the macro size of the shadow economy: 65 percent for Estonia and 64.2 percent for Germany. This correction factor is used to adjust the size of the shadow economy estimated with the MIMIC method. The results for 31 European countries for 2017 are presented in Figure 1.3. With this adjustment, the shadow economy estimated by a macro method appears considerably smaller, which is perhaps a more realistic value of its actual size.
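As a rough illustration of this adjustment, the Python sketch below multiplies the macro MIMIC averages by the remaining shares quoted in the text. The function name and the dictionary layout are illustrative assumptions; the resulting figures are simple products of the quoted numbers and are not taken from Table 1.7.

```python
def adjust_macro_estimate(macro_pct_gdp: float, remaining_share: float) -> float:
    """Scale a macro shadow economy estimate by the share left after deductions."""
    if not 0.0 <= remaining_share <= 1.0:
        raise ValueError("remaining_share must lie between 0 and 1")
    return macro_pct_gdp * remaining_share


# Macro MIMIC averages for 2009-15 (percent of GDP) and remaining shares,
# both as quoted in the text for Estonia and Germany.
cases = {"Estonia": (24.94, 0.65), "Germany": (9.37, 0.642)}

adjusted = {
    country: adjust_macro_estimate(macro, share)
    for country, (macro, share) in cases.items()
}

for country, value in adjusted.items():
    print(f"{country}: {value:.2f} percent of GDP after adjustment")
```

Applying a single multiplicative factor assumes that the deducted components scale proportionally with the macro estimate, which is a simplification.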

Table 1.7.

Decomposition of Shadow Economy Activities in Estonia and Germany, 2009–15


Source: Authors, based on Enste and Schneider (2006) and Buehn and Schneider (2013).
Note: DIY = do-it-yourself.

The total shadow economy is estimated by the multiple indicators, multiple causes model and calibrated by currency demand procedures.

Illegal activities include, for example, smuggling.

DIY activities and neighbors’ help do not include legally bought material, which is included in (2).

Figure 1.3.

The Size of the Shadow Economy in Selected European Countries, 2017

(Percent of GDP)

Source: Authors’ calculations.
Note: MIMIC = multiple indicators, multiple causes.

MIMIC Estimation Results

In Tables 1.8 through 1.10, each of which includes six specifications, MIMIC estimation results are presented for 1991 to 2015.19 Table 1.8 shows the estimation results for the entire sample of 158 countries. All cause variables (trade openness, GDP per capita, unemployment, size of government, fiscal freedom, rule of law, control of corruption, and government stability) have the theoretically expected signs, and most are highly statistically significant. The indicator variables also have the theoretically expected signs and are highly statistically significant. The test statistics are satisfactory.

Table 1.8.

MIMIC Model Estimation Results, All Sample Countries, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. **p < 0.05; ***p < 0.01.

Table 1.9 shows the estimation results for 105 low-income developing countries. Here the cause variable rule of law is not statistically significant in specification 1, nor is control of corruption in specification 2. These variables are significant and show the expected sign in the other specifications. The indicator variable labor force participation rate is again highly statistically significant.

Table 1.9.

MIMIC Model Estimation Results, Low-Income Developing Countries, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. *p < 0.1; **p < 0.05; ***p < 0.01.

Results for 26 advanced economies are presented in Table 1.10. Here trade openness is not statistically significant in all specifications, but most of the other cause variables, except size of government and government stability, have the expected signs and are statistically significant.20 The indicator variables are all statistically significant and have the expected signs.

Table 1.10.

MIMIC Model Estimation Results, Advanced Economies, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. *p < 0.1; **p < 0.05; ***p < 0.01.

Alleviation of Potential Shortcomings

Even though the standard MIMIC model of Schneider (2010) and others has been widely used in the literature, it has also been criticized for (1) the use of GDP (GDP per capita and growth of GDP per capita) as both cause and indicator variables; (2) the reliance on another independent study to calibrate the standardized values into estimates of the size of the shadow economy as a percentage of GDP; and (3) the sensitivity of the estimated coefficients to alternative specifications, the country sample, and the time span chosen.21 Points 2 and 3 will not be discussed in this chapter because they are extensively discussed in Schneider (2016).22

Night Lights (or Light Intensity) Approach

This analysis addresses the first criticism as follows: instead of using GDP per capita and growth of GDP per capita as both cause and indicator variables, we use the night lights approach developed by Henderson, Storeygard, and Weil (2012) to independently capture economic activity.23 They use data on light intensity from outer space as a proxy for “true” economic growth.24 This approach also uses the estimated elasticity of light intensity with respect to economic growth to produce new estimates of national output for countries deemed to have low statistical capacity. Therefore, by using the night lights approach, we address MIMIC criticisms related to the endogeneity of GDP in a novel way, which is totally independent from problematic GDP measures traditionally used (Medina, Jonelis, and Cangul 2017).
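The elasticity idea can be sketched with a simple regression of output growth on light growth. The example below is a minimal illustration on synthetic data, not the procedure of Henderson, Storeygard, and Weil (2012): the value `TRUE_ELASTICITY`, the noise level, and the variable names are all hypothetical and chosen only so the sketch is self-contained.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(0)
TRUE_ELASTICITY = 0.3  # hypothetical value, used only to generate synthetic data

# Synthetic growth in light intensity and in measured output (assumed data).
lights_growth = [random.gauss(0.0, 1.0) for _ in range(200)]
gdp_growth = [TRUE_ELASTICITY * g + random.gauss(0.0, 0.05) for g in lights_growth]

# Estimated elasticity linking light growth to output growth.
psi_hat = ols_slope(lights_growth, gdp_growth)

# Lights-implied growth: a proxy for economic activity built from lights alone.
lights_implied_growth = [psi_hat * g for g in lights_growth]
```

In the MIMIC estimations reported in this chapter, light intensity enters directly as an indicator variable; the regression above only illustrates why lights can stand in for measured GDP.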

In Tables 1.11 through 1.13, each of which includes six alternative specifications, the MIMIC estimation results using light intensity are shown for 1991 to 2015 for different country samples, depending on data availability. Table 1.11 shows the estimation results for all countries and uses light intensity as an indicator variable. All cause variables (trade openness, unemployment, size of government, fiscal freedom, rule of law, control of corruption, and government stability) have the theoretically expected signs, and most are highly statistically significant, except control of corruption. The indicator variables also have the theoretically expected signs and are highly statistically significant. The test statistics are satisfactory.

Table 1.11.

MIMIC Model Estimation Results Using Night Lights Instead of GDP, All Sample Countries, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. *p < 0.1; **p < 0.05; ***p < 0.01.

Table 1.12 shows the estimation results for 103 low-income developing countries. Here, the cause variable unemployment is not statistically significant, nor are rule of law and control of corruption. The indicator variable labor force participation is again statistically significant.

Table 1.12.

MIMIC Model Estimation Results Using Night Lights Instead of GDP, Low-Income Developing Countries, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. *p < 0.1; **p < 0.05; ***p < 0.01.

The results for 24 advanced economies are presented in Table 1.13. Here, trade openness is not statistically significant in all specifications, but most of the other cause variables are statistically significant, except government stability. The indicator variables are all statistically significant and have the expected signs.

Table 1.13.

MIMIC Model Estimation Results Using Night Lights Instead of GDP, Advanced Economies, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. *p < 0.1; **p < 0.05; ***p < 0.01.

Predictive Mean Matching

Predictive mean matching (PMM; Rubin 1987) treats the empirical challenge in the estimation of the size of the shadow economy as a missing data problem: survey-based estimates of the size of the shadow economy are available for several, but not all, countries.25

Missing data can result from three mechanisms: (1) missing completely at random (MCAR), (2) missing at random (MAR), or (3) missing not at random (MNAR) (Little and Rubin 2002). PMM analysis assumes that for the shadow economy, the mechanism is MAR. This means that the probability that an observation is missing can depend on observed covariates of nonmissing units and missing units, but it cannot depend on missing data on the size of the shadow economy. In other words, the probability that a country is missing data on its shadow economy can depend on characteristics relevant for the shadow economy, but the size of the shadow economy itself should not be a factor. This assumption can be challenged because one can argue that a large shadow economy would be difficult to measure, resulting in missing data. Furthermore, a large shadow economy can be associated with institutional weaknesses and associated capacity constraints that would also make it less likely to be measured. However, survey data are available for large informal economies as well, such as Burundi and Niger. Therefore, at least in practice, the MAR assumption is somewhat validated but would have to be checked through sensitivity analyses that operate under MNAR.
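The three mechanisms can be stated compactly. In the block below, R_i is an indicator equal to 1 when the shadow economy estimate for country i is missing, X_i collects the observed covariates, and Y_i is the size of the shadow economy; this notation is an assumption introduced here for illustration and does not appear in the original text.

```latex
\begin{aligned}
\text{MCAR:}\quad & \Pr(R_i = 1 \mid X_i, Y_i) = \Pr(R_i = 1) \\
\text{MAR:}\quad  & \Pr(R_i = 1 \mid X_i, Y_i) = \Pr(R_i = 1 \mid X_i) \\
\text{MNAR:}\quad & \Pr(R_i = 1 \mid X_i, Y_i) \ \text{depends on } Y_i \text{ even after conditioning on } X_i
\end{aligned}
```

Under MAR, two countries with the same covariates are equally likely to lack an estimate regardless of how large their shadow economies actually are, which is what allows observed countries to serve as donors for unobserved ones.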

The objective is to match the countries where data exist to those where data are missing using characteristics that would be relevant to the size of the shadow economy.

One challenge inherent in the empirical problem of estimating the size of the shadow economy is that, for many countries, institutional capacity constraints make it hard to estimate. The shadow economy is complex, encompassing many related factors that in any estimation procedure may produce problems of endogeneity and other empirical challenges. A principal constraint in this exercise is that those countries for which some estimation of the shadow economy is available are not similar to countries where this is missing.

PMM circumvents this challenge somewhat by producing multiple data sets using a Bayesian setup. Therefore, where data for similar countries are lacking, the method is able to compensate by taking advantage of the inherent uncertainty associated with a missing data problem.

The other advantage of PMM is that in its actual estimation step, it is nonparametric. It does not suffer from any problems associated with a regular regression method in which dissimilar countries would be estimated (1) using the same covariates and (2) assuming linear extrapolations across covariate distributions that may be different and far apart from one another. The principle of similarity in PMM prevents this fundamental problem: it matches countries lacking data to countries that have data on the basis of their similarity. But how is this similarity itself estimated? This is the crux of the method. Similar to PMM, propensity score matching is also a promising candidate. The constraint with propensity score matching in this case, however, is that not enough similar observations are matched to run separate regressions or even make nonparametric estimates for each group because of the number of estimations required.

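The similarity principle for PMM is established using a linear regression. Here, we estimate the following simple ordinary least squares model:

$$Y_i = \beta_0 + \beta_1 GE_i + \beta_2 RQ_i + \beta_3 C_i + \beta_4 ROL_i + \beta_5 BF_i + \beta_6 SE_i + \beta_7 HDI_i + \beta_8 E_i + \varepsilon_i,$$
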
where Y is the size of the shadow economy as a percentage of GDP, GE is a government effectiveness index, RQ is a regulatory quality index, C is a corruption index, ROL is a rule of law index, BF is a business freedom index, SE is self-employment levels, HDI is the Human Development Index, and E is an education variable.

The distinctive feature of PMM is that this regression is not used to estimate the size of the shadow economy, but rather as a matching tool. For matching, the following seven stages are computed using the SAS Proc MI procedure:26

  1. A random draw is made from the posterior predictive distribution of the estimated covariate coefficient matrix β, resulting in a new covariate coefficient matrix β*.

  2. Using β*, we predict Y* for all countries.

  3. The algorithm then identifies countries that have an actual Y_i and whose predicted Y* is closest to the predicted Y* of the countries missing the data. This yields matches between Y*_i,obs and Y*_i,miss, the predicted values of the outcome variable for countries that originally had an estimate of the size of the shadow economy and for countries that were originally missing one.

  4. Each country with missing data is assigned to a group of similar countries with data, as identified in the previous step.

  5. In each group, the MI procedure randomly selects a match for the country missing the outcome and assigns the observed outcome from the match as the estimated outcome variable.

  6. Steps 1 to 5 are repeated five times, generating five distinct data sets with imputed values of the shadow economy, mimicking the inherent variability caused by the uncertainty associated with the missing data mechanism.

  7. To produce a final estimate, the five data sets for the size of the shadow economy are averaged.27
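The seven steps can be sketched in a few lines of Python. This is a simplified illustration, not the SAS procedure: it assumes a single composite covariate instead of the eight regressors, approximates the posterior draw by perturbing the OLS estimates with normal noise, and uses a donor pool of size `k`. The function names, the parameter `k`, and all data values are hypothetical.

```python
import random


def fit_ols(x, y):
    """Simple OLS of y on x; returns the pieces needed to perturb the fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    resid = [b - (my + slope * (a - mx)) for a, b in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    return mx, my, sxx, slope, s2, n


def pmm_impute(x_obs, y_obs, x_mis, n_imputations=5, k=3, seed=0):
    """Predictive mean matching with one covariate (illustrative sketch)."""
    rng = random.Random(seed)
    mx, my, sxx, slope, s2, n = fit_ols(x_obs, y_obs)
    data_sets = []
    for _ in range(n_imputations):  # Step 6: repeat to get several data sets
        # Step 1: draw new coefficients (normal approximation, a simplification).
        slope_star = slope + rng.gauss(0.0, (s2 / sxx) ** 0.5)
        level_star = my + rng.gauss(0.0, (s2 / n) ** 0.5)

        def predict(v):
            return level_star + slope_star * (v - mx)

        # Step 2: predicted values for countries with data.
        pred_obs = [predict(v) for v in x_obs]
        imputed = []
        for v in x_mis:
            target = predict(v)  # Step 2: prediction for a country without data
            # Steps 3-4: the k observed countries with the closest predictions.
            donors = sorted(range(len(x_obs)),
                            key=lambda i: abs(pred_obs[i] - target))[:k]
            # Step 5: pick one donor at random and borrow its observed value.
            imputed.append(y_obs[rng.choice(donors)])
        data_sets.append(imputed)
    # Step 7: average the imputed values across data sets.
    return [sum(col) / n_imputations for col in zip(*data_sets)]


# Hypothetical data: a composite governance index and shadow economy sizes.
x_obs = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
y_obs = [45.0, 40.0, 36.0, 30.0, 24.0, 19.0, 13.0, 9.0]
x_mis = [-1.2, 1.8]
estimates = pmm_impute(x_obs, y_obs, x_mis)
```

Because every imputed value is an observed value borrowed from a donor, the estimates always stay within the range of the observed data, which is the property that distinguishes PMM from extrapolating with the regression line.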

The PMM results are consistent with the rankings produced by the MIMIC method (see Table 1.14), with a Spearman’s rank correlation of 61 percent that is statistically significant at the 1 percent level. Furthermore, when the MIMIC and PMM samples are divided into three subgroups of shadow economy sizes, specifically “lower than 20 percent of GDP,” “between 20 and 30 percent of GDP,” and “higher than 30 percent of GDP,” more than 60 percent of countries fall into the same subgroup in both samples.
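Both comparisons can be reproduced with a short script. The sketch below computes Spearman's rank correlation and the share of countries that land in the same size group under two methods. The two series are hypothetical numbers, not values from Table 1.14, and the handling of the group boundaries (exactly 20 or 30 percent) is an assumption.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def size_group(pct_gdp):
    """Three size groups; boundary treatment is an assumption."""
    if pct_gdp < 20:
        return "low"
    if pct_gdp <= 30:
        return "middle"
    return "high"


# Hypothetical shadow economy sizes (percent of GDP) for six countries.
mimic = [8.0, 15.0, 25.0, 28.0, 35.0, 50.0]
pmm = [10.0, 22.0, 18.0, 29.0, 40.0, 45.0]

rho = spearman(mimic, pmm)
agreement = sum(size_group(a) == size_group(b) for a, b in zip(mimic, pmm)) / len(mimic)
```

Rank correlation is the natural measure here because the two methods are being compared on how they order countries, not on whether their levels coincide.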

Table 1.14.

The Size of the Shadow Economy Using the Predictive Mean Matching Method, 1991–2015

(Percent of GDP)

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; PMM = predictive mean matching.

Average from 1991 to 2015.

Average from 1991 to 2015; results from this chapter’s MIMIC estimations.

Additional Robustness Tests

This section further tests the robustness of the results by fully removing the effects of GDP, dropping both GDP per capita as cause and growth of GDP per capita as indicator.

MIMIC estimation results for 1991 to 2015 for different country samples, depending on data availability, are presented in Tables 1.15, 1.16, and 1.17. Results include six alternative specifications per table. These results are consistent with those in the previous sections.

Table 1.15.

MIMIC Model Estimation Results Excluding GDP and GDP per Capita, All Countries, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. *p < 0.1; **p < 0.05; ***p < 0.01.

Table 1.16.

MIMIC Model Estimation Results Excluding GDP and GDP per Capita, Low-Income Developing Countries, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. **p < 0.05; ***p < 0.01.

Table 1.17.

MIMIC Model Estimation Results Excluding GDP and GDP per Capita, Advanced Economies, 1991–2015

Source: Authors.
Note: MIMIC = multiple indicators, multiple causes; RMSEA = root mean square error of approximation. *p < 0.1; **p < 0.05; ***p < 0.01.

Results for 158 Countries Using MIMIC

In Table 1.18, the most important results for the 158 countries are shown.28 The mean size of the shadow economy across the 158 countries is 31.9 percent, and the median is 32.3 percent. The closeness of the two values indicates that the distribution is not strongly skewed. The three largest shadow economies are Georgia with 64.9 percent, Bolivia with 62.3 percent, and Zimbabwe with 60.6 percent. The three smallest shadow economies are Switzerland with 7.2 percent, the United States with 8.3 percent, and Austria with 8.9 percent. The average shadow economy comes close to the values for Equatorial Guinea with 31.8 percent and Suriname with 32.2 percent of official GDP.

Table 1.18.

Summary Statistics of the Shadow Economy, 158 Selected Economies, 1991–2015

Source: Authors.