Thursday, 4 September 2014

Mismeasuring Long Run Growth. The Bias from Spliced National Accounts

Leandro Prados de la Escosura is Professor in Economic History at Universidad Carlos III de Madrid

Last April it was made public that Nigeria’s GDP figures for 2013 had been revised upwards by 89 per cent, as the base year for its calculation was brought forward from 1990 to 2010 (Financial Times, April 7, 2014). As a result, Nigeria became the largest economy in Sub-Saharan Africa. Though spectacular, this is not an exceptional case. Ghana (2010), Argentina (1993) and Italy (1987) also experienced dramatic upward revisions of their GDP.

How should this revision affect GDP time series and, consequently, the country’s relative position? Should the existing historical series be re-scaled in the same proportion? 

Official national accounts are usually available from the mid-twentieth century onwards, but often only for the latest decades. Furthermore, official national accounts are only constructed in a homogeneous way for short periods. Hence, the output of national accounts needs to be spliced with historical national accounts. Thus, when a homogeneous long-run GDP series is required, various sets of national accounts using different benchmark years and often constructed with dissimilar methodologies need to be spliced. The choice among alternative splicing procedures to derive a single GDP series may result in substantial differences in levels and growth rates and, hence, in significant biases in the assessment of economic performance over time.

National accounts rely on complete information on quantities and prices in order to compute GDP for a single benchmark year, which is then extrapolated forward on the basis of limited information for a sample of goods and services. To allow for changes in relative prices and, thus, to prevent forward projections of the current benchmark from becoming unrepresentative, national accountants periodically replace the current benchmark with a new, more recent GDP benchmark. The new benchmark is constructed, in part, with different sources and computation methods. The differences between the ‘new’ and ‘old’ national accounts in the new benchmark year, which are often far from negligible, stem from statistical (sources and estimation procedures) and conceptual (definitions and classifications) bases. Once a benchmark has been introduced, newly available statistical evidence is not taken on board, so as to avoid a discontinuity in the existing series. Thus, the coverage of new economic activities partly explains the discrepancy between the new and old series. As a result, a problem of consistency between the new and old national account series emerges.

Is there a solution to this inconsistency problem? The obvious option would be computing GDP for the years covered by the old benchmark with the same sources and procedures employed in the construction of the new benchmark. However, this option is beyond the resources of an independent researcher. The challenge is, then, establishing the extent to which conceptual and technical innovations in the new benchmark series hint at a measurement error in the old benchmark series. In particular, the question is whether the discrepancy in the overlapping year between the new benchmark (in which GDP is estimated with ‘complete’ information) and the old benchmark series (in which reduced information on quantities and prices is used to project forward the ‘complete’ information estimate from its initial year) results from a measurement error in the old benchmark’s initial year estimate.

A simple solution, widely used by national accountants (and implicitly accepted in international comparisons), is the backward projection, or retropolation, approach, which accepts the reference level provided by the most recent benchmark estimate (Y_T) and re-scales the earlier benchmark series (X_t) with the ratio between the new and the old series for the year (T) at which the two series overlap (Y_T / X_T).

Underlying this procedure is the implicit assumption of an error level in the old benchmark’s series whose relative size is constant over time. In other words, no error is assumed to exist in the old series’ rates of variation, which are hence retained in the spliced series Y^R_t. Official national accountants have favoured this procedure of linking national accounts series on the grounds that it preserves the earlier benchmark’s rates of variation.
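
As a minimal sketch of the backward projection described above, the spliced series can be written as Y^R_t = (Y_T / X_T) × X_t for 0 ≤ t ≤ T. The Python function below applies that re-scaling; the function name and the figures are hypothetical, chosen only for illustration.

```python
def retropolate(old_series, new_level_T):
    """Backward projection: rescale the old series by the overlap-year ratio.

    old_series  -- old-benchmark GDP levels X_0 ... X_T (last item is year T)
    new_level_T -- new-benchmark GDP level Y_T for the overlap year T
    """
    ratio = new_level_T / old_series[-1]       # Y_T / X_T
    return [ratio * x for x in old_series]     # Y^R_t = (Y_T / X_T) * X_t


# Hypothetical figures, not taken from any national accounts.
old = [100.0, 110.0, 121.0]                    # old series grows 10% a year
spliced = retropolate(old, new_level_T=145.2)  # new benchmark is 20% higher in T

# Every level is scaled by the same factor, so growth rates are unchanged.
old_growth = [b / a for a, b in zip(old, old[1:])]
new_growth = [b / a for a, b in zip(spliced, spliced[1:])]
```

In this example the initial level rises from 100 to about 120, while the 10 per cent annual growth of the old series is retained.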

Usually the most recent benchmark provides a higher GDP level for the overlapping year, as its coverage of economic activities is wider. Thus, the backwards projection of the new benchmark GDP level with the available growth rates (computed at the previous benchmark’s relative prices) implies a systematic upwards revision of GDP levels for earlier years. This one-sided upward revision effect on the levels of spliced GDP series is hardly noticeable when discrepancies between the new and old benchmarks are small for the overlapping year and the considered time span is short. However, as the time horizon expands and earlier series are re-scaled again and again to match newer ones, the gap tends to deepen significantly.
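
The cumulative effect of repeated re-scaling can be illustrated with a short sketch. It assumes, purely for illustration, that each new benchmark exceeds the previous series by the same hypothetical 5 per cent in the overlap year; the function name is also hypothetical.

```python
def cumulative_rescaling(overlap_ratios):
    """Factor applied to the earliest series after successive retropolations.

    overlap_ratios -- one Y_T / X_T ratio per benchmark change, oldest first
    """
    factor = 1.0
    for ratio in overlap_ratios:
        factor *= ratio        # each new benchmark re-scales all earlier levels
    return factor


# Hypothetical: four benchmark changes, each 5% above the previous series.
ratios = [1.05, 1.05, 1.05, 1.05]
factor = cumulative_rescaling(ratios)
```

With these assumed ratios the earliest series ends up more than 21 per cent above its originally estimated level, although no single revision exceeded 5 per cent.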

An alternative to the backward projection linkage is provided by the interpolation procedure, which accepts the levels computed directly for each benchmark year as the best possible estimates, on the grounds that they have been obtained with ‘complete’ information on quantities and prices, and distributes the gap or difference between the ‘new’ and ‘old’ benchmark series in the overlapping year T at a growing rate.

Contrary to the retropolation approach, the interpolation procedure assumes that the error is generated between the years 0 and T. Consequently, it modifies the annual rate of variation between benchmarks (usually upwards) while keeping the initial level (that of the old benchmark) unaltered. As a result, the initial level will probably be lower than the one derived from the retropolation approach.
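
A minimal sketch of the interpolation idea follows. It assumes the gap is distributed geometrically, so that Y^I_t = X_t × (Y_T / X_T)^(t/n), where n is the number of years between the benchmark years 0 and T. This functional form is an assumption that is consistent with a gap distributed "at a growing rate"; the function name and figures are hypothetical.

```python
def interpolate_splice(old_series, new_level_T):
    """Spread the overlap-year gap geometrically between year 0 and year T.

    old_series  -- old-benchmark GDP levels X_0 ... X_T (last item is year T)
    new_level_T -- new-benchmark GDP level Y_T for the overlap year T
    """
    n = len(old_series) - 1                    # years between benchmarks 0 and T
    if n < 1:
        raise ValueError("need at least two observations")
    ratio = new_level_T / old_series[-1]       # Y_T / X_T
    # Assumed form: Y^I_t = X_t * (Y_T / X_T) ** (t / n)
    return [x * ratio ** (t / n) for t, x in enumerate(old_series)]


# Hypothetical figures, not taken from any national accounts.
old = [100.0, 110.0, 121.0]                    # old series grows 10% a year
spliced = interpolate_splice(old, new_level_T=145.2)
growth = [b / a for a, b in zip(spliced, spliced[1:])]
```

Here the initial level stays at 100 and the final level matches the new benchmark, so the whole gap is absorbed by annual growth rates above the old series' 10 per cent.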

The choice of linkage procedure makes a significant difference for GDP levels and growth rates. When the levels for earlier years are re-scaled upwards with the retropolation procedure, the country in question becomes retrospectively richer. Alternatively, interpolating between the original benchmarks tends to raise the economy’s rate of growth and, hence, yields a lower initial GDP level. Which method is preferable? A practical answer may be derived from the analysis of the experience of Spain, a country that went through a process of deep structural change during the second half of the twentieth century.

The figure below presents the GDP levels resulting from splicing national accounts through non-linear interpolation relative to the levels derived through retropolation. It shows how the overstatement of GDP levels cumulates over time when the retropolation method is used.

Ratio of spliced interpolated series to retropolated series, 1954-2013 (GDP at current prices). 

Differences between the results of the interpolation and retropolation procedures appear much more dramatic when placed in a long-run perspective, that is, when the spliced national accounts are projected backwards into the nineteenth century with volume indices taken from historical accounts series. This is because most countries grew at a slower pace before 1950, so a country’s per capita GDP level by the mid-twentieth century determines its earlier relative position in country rankings.

Thus, the choice of splicing procedure can result in far from negligible differences in the relative position of a country in terms of per capita income over the long run. As an illustration, I present below Spain’s position relative to France derived with the retropolation and interpolation splicing methods.

Spain’s Real Per Capita GDP (France = 1). Alternative Splicing Results (2011 EKS $)
According to the retropolation splicing procedure, by the mid-nineteenth century real per capita GDP in Spain would have been similar, if not superior, to that of France. With the interpolation splicing procedure, by contrast, Spain’s real per capita GDP would represent about 80 per cent of the French level. When the period 1850-1913 is considered, Spain would match France’s real income per head according to the retropolated series, but reach only four-fifths of it if the interpolated series are employed. These proportions hardly alter if the period under comparison is extended to 1935. It can be concluded that, whatever the measurement error embodied in the interpolation procedure may be, its results appear far more plausible than those resulting from the conventional retropolation approach.

The bottom line is that splicing national accounts must be handled with extreme care, especially when countries have experienced intense growth and deep structural change, as there is a risk of biasing their income levels upwards and, consequently, their growth rates downwards. A systematic revision of national accounts splicing in fast-growing countries over the last half century using the interpolation approach would most probably reduce their initial per capita GDP levels while raising their growth rates, with the result of a more intense and widespread catching up to the Core countries.

This blog post was written by Leandro Prados de la Escosura (Universidad Carlos III and CEPR)


de la Fuente Moreno, A. (2014), “A Mixed Splicing Procedure for Economic Time Series”, Estadística Española 56 (183): 107-121.

Maddison, A. (1991b), “A Revised Estimate of Italian Economic Growth 1861-1989”, Banca Nazionale del Lavoro Quarterly Review 177: 225-241.

Prados de la Escosura, L. (2014), Mismeasuring Long Run Growth. The Bias from Spliced National Accounts, EHES Working Paper 60.


  1. Dear Leandro, dear Kerstin,

    Great post : thanks !

    I am just wondering about the French comparison.
    First, it would be nice to know how the French GDP was computed: interpolated or retropolated? Can we make the comparison if it was not computed using the same method as the Spanish one?

    Second, I always have doubts when constant-through-time-price-level GDP numbers are used to make constant-through-space-price-level GDP comparisons. The French real GDP weights good-level prices in a way that makes sense to compare -- say -- 1900 and 2010 French GDP. The Spanish real GDP weights good-level prices in a way that makes sense to compare -- say -- 1900 and 2010 Spanish GDP. These are different weighting schemes: can their 1900 results really be compared directly without strong assumptions on comparative structural changes in both economies?

  2. A Reply from Leandro Prados:
    Many thanks, Guillaume, for your accurate remarks.
    On the first question, you are right. A precise comparison of two countries’ GDP per head would require that their estimates be obtained through the same procedure. However, in international datasets each country’s estimates have been derived with a different method, rendering comparisons difficult. French GDP per head volume series used in my paper are those of the Maddison Project Database rebased to the 2011 EKS $ benchmark (a similar comparison is carried out in the paper’s appendix using 1990 Geary-Khamis $). Most probably, the spliced French series for the modern national accounts era were derived through “retropolation” and, therefore, the comparison with the “interpolated” series for Spain would lack homogeneity. My point is actually that splicing GDP series through “retropolation” introduces a larger (upward) bias the deeper and faster a country’s structural change (relative price change) has been. In the case of France and Spain, the latter experienced faster structural transformation since the mid-20th century, so the “retropolation” method will result in overstated GDP per head levels for Spain. Since growth is generally less intense across countries before 1950, the upward bias in the GDP per head level for, say, 1950, will percolate through the entire historical series. This is illustrated by the black dotted line in Figure 2 of my post. Thus, even if the comparison is not ideal, it makes more sense when Spain’s position relative to France (or any advanced country) is computed with a Spanish GDP series obtained through “interpolation” than through “retropolation” (actually, the series for Spain in the Maddison and Maddison Project databases are those I constructed using a previous linear interpolation approach). The bottom line is that we need to construct historical GDP series that use modern national accounts series spliced with the non-linear interpolation approach in order to make comparisons meaningful.
My paper is a call for this cooperative effort.

    See the next comment for Leandro's reply to the second question.

  3. Continued reply from Leandro Prados:
    As regards your second question, I entirely agree. I made the same point in an old paper (Prados de la Escosura, 2000). Using per capita income levels obtained through backward projection of PPP-adjusted GDP levels (that is, at “international” prices) for a benchmark year (say, 1990) with volume indices derived at national prices implies a huge index number problem that grows as the time span considered widens. The reason is that this procedure implicitly assumes that the basket of goods and services and the structure of relative prices for the benchmark year remain unaltered over time (something definitely wrong, as we teach our students that long-run growth is about change in relative prices). Recent papers by Johnson et al. (2013), Feenstra et al. (2014) and the World Bank’s volume on the new 2011 ICP Round have addressed the inconsistency of combining levels derived at “international” prices for the benchmark year (say 2011, 2005, or 1990) and volume series derived at domestic prices. The obvious alternative is to construct PPPs for each year or, at least, every few years. See the attempt to produce comparable time series respecting all available ICP Rounds’ PPPs in Penn World Table 8.0 (Feenstra et al. 2014).
    Nonetheless, a short-cut to convert current exchange rates into PPP exchange rates can be used, as I did in my 2000 paper. With this approach it is possible to derive the position of Spain relative to France using current price PPP-adjusted estimates for, say, 1900, with estimates for 1900 Spain obtained alternatively through the “retropolation” and “interpolation” approaches. Spain would have represented 83 per cent of French GDP per head using the “retropolation” approach, and 68 per cent if, alternatively, the “interpolation” approach had been used. These results are similar to those obtained at 1990 G-K $ (86 and 69 per cent) and both lower than those at 2011 EKS $ (94 and 75 per cent). The resulting tentative estimates confirm, in my view, the overstatement of Spain’s relative position resulting from the “retropolation” method.


    Feenstra, R. C., R. Inklaar, and M. Timmer (2013), The Next Generation of the Penn World Table, NBER Working Paper No. 19255. Data set available at

    Johnson, S., W. Larson, C. Papageorgiou, and A. Subramanian (2013), ‘Is Newer Better? Penn World Tables Revisions and Their Impact on Growth Estimates’, Journal of Monetary Economics 60: 255-275.

    Prados de la Escosura, L. (2000), ‘International Comparisons of Real Product, 1820-1990. An Alternative Data Set’, Explorations in Economic History 37: 1-41.

    World Bank (2013), Measuring the Real Size of the World Economy. The Framework, Methodology, and Results of the International Comparisons Programme, Washington D.C.: The World Bank. Data set available at