Wednesday, December 11, 2013

What proxies to use for flow-through effects?

Summary: When assessing altruistic interventions from a long-run perspective, particularly ones that do not act directly on the long run (as averting human extinction does), assessing flow-through effects is essential. In this post I raise some possible metrics to use in assessing flow-through effects. The causal relationships between them are unclear, and exogenous interventions to boost a metric may break its correlations with other outcomes; the metrics are intended as a set of candidates to reference in later discussion. Additional candidates are welcome in the comments.

Total GDP
Overall economic output is a proxy (an imperfect one) for the total productive capacities of humanity. As total GDP increases, the share of income required to pay for fixed cost interventions such as asteroid defense or geoengineering falls. On the other hand, fixed cost harms also become more likely. Increased GDP also pushes forward technological advance, although probably less so than total capacities (other factors influence technological change as well).

Total population
We might distinguish population from GDP for several reasons. Some fields may be especially labor-intensive, suffering from Baumol's cost disease, and scaling better with population than GDP. A larger population may increase the absolute supply of 'genius.' Even if additional people are not currently contributing much to GDP, they increase the potential for gains from development or migration.

GDP/GNI per capita/total of log income
One of the most striking features of the last few centuries is that individual incomes in much of the world have climbed far above subsistence levels. In international analyses we might use GDP per capita to summarize this, but adding up the log incomes of individuals may give a more elegant and relevant measure.
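
As a rough illustration of why total log income can be more informative than GDP per capita, here is a minimal Python sketch. The two four-person income lists are made-up numbers chosen only so that both economies have the same average income; the function names are likewise hypothetical.

```python
import math

def gdp_per_capita(incomes):
    """Mean income: insensitive to how income is distributed."""
    return sum(incomes) / len(incomes)

def total_log_income(incomes):
    """Sum of log incomes: rewards gains to poorer people more heavily."""
    return sum(math.log(x) for x in incomes)

# Hypothetical economies with identical total (and average) income.
equal = [5000, 5000, 5000, 5000]
unequal = [500, 500, 500, 18500]

for name, incomes in [("equal", equal), ("unequal", unequal)]:
    print(f"{name}: mean={gdp_per_capita(incomes):.0f}, "
          f"total log income={total_log_income(incomes):.2f}")
```

Both lists have the same mean, but the sum of logs is lower for the unequal one, because the logarithm is concave: a given amount of income adds more to the total when it goes to someone poorer.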

In the world today there are very strong correlations between national GDP per capita and many other plausible measures of flow-through effects. For example, the Human Development Index combines income per capita with life expectancy and educational attainment, but since the latter two inputs are highly correlated with income per capita, national HDI remains very similar to national per capita income alone (here's a nice plot from Gapminder), and probably even more similar to national total log income.

High per capita wealth is correlated with significant differences in the World Values Survey, as Robin Hanson discusses in this post.

Educational inputs (years of schooling, distribution of schooling)
Measures here would include total and average years of schooling completed, frequency of tertiary education graduates, and similar. In addition to effects on other measures discussed here, increased education is a plausible candidate for increased "wisdom" on the part of the general population and key decisionmakers.

Educational outputs and quality, test scores
In a conversation with GiveWell, Lant Pritchett of the Center for Global Development says that measures of educational inputs (school construction, years of school) fall substantially short, and that actual student learning should be a larger focus. The research of economist Eric Hanushek of Stanford, among others, makes the case that student test scores are powerfully correlated with long-run outcomes for individuals (later income, education, crime) and even more so with economic growth at the national level. And again, this is a promising measure of wisdom.

Life expectancy/healthy life expectancy/biometrics
Life expectancy is an important measure of the welfare of individuals, and healthy life expectancy, adjusted for morbidity, is even better. However, it is less clear how this measure relates to long-run impacts.

Something like "working life expectancy" might better capture potential impacts on total capacities (perhaps supplemented by more accurate measures of the productivity effects of aging). Older populations have been observed to be less violent, both individually and at the national level (although the latter has many confounds). Older individuals have views that are more characteristic of earlier periods (relative to the young), which could be bad if one believes there is ongoing moral progress from generation to generation.

Other biometric measures of health can capture improved human biological capacities, e.g. the literature using human height to track improvements in nutrition, disease prevalence, and standard of living.

International peace and cooperation
War is a major potential source of global catastrophic and even existential risk. While several of the above measures (such as per capita income) have negative correlations with conflict and violence, other measures are available. These could include:
  • The number of nuclear weapons
  • The number of nuclear-armed states
  • Per capita death rates from violence
  • Rate of wars and civil wars
  • The frequency of military high alerts, confrontations, and interventions
  • Extent of economic ties between states
  • Scores on democracy indexes, based on Democratic Peace Theory
  • The quantity and distribution of military spending, the number of states with no armies
  • High-quality forecasts (e.g. using prediction markets, "super-forecasters" and other enhanced forecasting methods) to track changes in the potential for lower-frequency high-impact conflict
  • Homicide rates and disorganized violence

Institutional quality metrics
Measures of institutional quality face confounding with other positive characteristics, and their definitions may be politically biased, but they both have predictive power and plausibility. Better ones are surely possible, but some noteworthy existing ones include:
Measures of cost-effectiveness in government spending, the accuracy of scientific claims on which policy is based, and government forecasting accuracy could be helpful additions.

Environmental quality
Many environmental impacts will appear in healthy life expectancy, GDP, test scores (e.g. lead and mercury poisoning) and similar measures. However, long-term and lagged effects need to be accounted for as well, e.g. the accumulated burden of greenhouse gas emissions with long lifetimes.

Technological improvement
One fairly broad (although imperfect) measure of technological improvement is total factor productivity (TFP), a measure of efficiency of production in terms of labor, capital, materials, and other inputs. Labor productivity is also useful, and individual technologies have many relevant metrics:

  • Computations per dollar
  • Agricultural yields
  • Solar panel cost per kilowatt-hour
  • Telecommunications bandwidth
  • Natural language translation word error rates

General technological advance increases human capacities and wealth, but it also accelerates progress towards potentially dangerous technological transitions. Bringing those transitions forward changes the state society is in when they arrive (so long as society is also changing in other ways), so the net effect is ambiguous. Progress in particular technologies may be better or worse.
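
For readers unfamiliar with total factor productivity, mentioned at the start of this section, one common way to estimate it is as a "Solow residual": the part of output not explained by measured inputs. The Python sketch below assumes a two-input Cobb-Douglas production function (capital and labor only, leaving out materials and other inputs) with a capital share of 0.3. The function name, that share, and all of the numbers are illustrative assumptions, not figures from any dataset.

```python
def solow_residual(output, capital, labor, capital_share=0.3):
    """TFP 'A' implied by Y = A * K**alpha * L**(1 - alpha).

    capital_share (alpha) is an assumed parameter, not an estimate.
    """
    return output / (capital ** capital_share * labor ** (1 - capital_share))

# Hypothetical economy: output rises 10% with inputs unchanged,
# so all of the growth is attributed to TFP.
tfp_before = solow_residual(output=100.0, capital=300.0, labor=50.0)
tfp_after = solow_residual(output=110.0, capital=300.0, labor=50.0)
print(f"TFP growth: {tfp_after / tfp_before - 1:.1%}")

# Doubling output and both inputs leaves TFP unchanged
# (constant returns to scale).
tfp_scaled = solow_residual(output=200.0, capital=600.0, labor=100.0)
print(f"TFP before: {tfp_before:.4f}, after scaling: {tfp_scaled:.4f}")
```

Because it is a residual, this measure absorbs anything the input series miss, which is one reason to treat it as a broad but imperfect proxy.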

Scientific output and quality
There is a field of scientometrics, but it is focused on inputs, such as the number of scientists, R&D spending, papers published, patents issued, and citations. There is no analog to student test scores to clearly assess quality, although there have been some efforts to trace commercial production to scientific effort (e.g. in pharmaceuticals, where the connections are comparatively clear). Recent efforts to improve scientific methodology suggest measures such as the reproducibility of scientific findings, but these do not track the utility of the findings. Nonetheless, better measures could be very helpful indeed in this area.

And more
Again, I invite new candidate proxies for flow-through effects in the comments.


Nick Beckstead said...

Inequality is notably absent from the list. It's something I haven't thought about as much as I'd like. I haven't heard very convincing empirical or conceptual arguments for important long-run effects (though I haven't looked very hard), but the level of emphasis on this one characteristic from common sense gives me pause. Would be interested in your thoughts.

Under education, perhaps there are relevant metrics of numeracy and scientific literacy to consider?


Perhaps religiosity as an (anti)-measure of rational thought?

Are measures of resources spent on rent-seeking accounted for in the measures of institutional quality you cited?

Brian Tomasik said...

This is an important post! I hope more EAs probe which flow-through effects are relevant in which ways for the far future. I think there hasn't been nearly enough discussion of it.

Among the factors listed, the three that seem to me most clearly positive and relevant to the far future are (1) wisdom, (2) international peace and cooperation, and (3) institutional-quality metrics. The signs are less clear for economic growth and especially technology, though I would love to hear more about whether you think the positives outweigh the negatives in more detail. My piece on differential intellectual progress has further discussion.

Isn't asteroid defense basically irrelevant given its tiny probabilities relative to other side-effects of economic growth? Geoengineering may play a bigger role, though of course, many would argue that faster economic growth would just make environmental problems worse.

Within wisdom, I would include progress in the social sciences and philosophy. These could be measured by scientometrics of the type you discuss for natural sciences. Candidates include number of publications, number of web pages discussing those topics, number and length of Wikipedia articles on those topics, etc. Of course, the proof of some of these domains is in the pudding -- e.g., insofar as they improve democracy, transparency, global cooperation, and so on.

One might also add measures of moral progress like equal treatment of minorities, animal welfare, etc.

As far as inequality, I haven't studied the literature extensively, but I have heard arguments about how it erodes many of the other metrics on Carl's list.

Toby Ord said...

Thanks for this post. A good conversation starter.

Regarding Nick's query about inequality, it is present in the use of total log income instead of GDP. This is a pretty standard way to combine inequality in X with progress in X, where you think levelling down is bad. That said, if this situation is sufficiently different from the standard one, then maybe an explicit inequality measure would be good.

Measures of coordination on various scales (international, national, social) would be good. One idea comes from one of Nick Bostrom's definitions for a singleton. Think of the value we would have if we solved all coordination problems at a given level and then look at what fraction of that we are currently achieving. Crime rates could give some kind of social coordination estimate, but they tend not to be comparable statistically between times and places, as the reporting standards keep changing. War statistics are a good example for the international level, as you note. Something to deal with races to the bottom, cartels, rent-seeking, general prisoner's dilemma stuff, etc. would be nice.

I like the idea of some kind of measure of brute global productivity, not taking into account diminishing returns for individuals because it is not being assessed for its impact on individuals, but I wonder whether we could do better than GDP here? It has some crazy pathologies (e.g. whether we pay each other or swap favours changes it a lot in the context of women entering the workforce, and it counts selling assets as income...). Countries do try to do these things to exploit the bugs in the measure so this can't just be factored out.

Even ignoring the loopholes, I'm not sure the industrial capacity of the earth is linear in GDP. I wouldn't be shocked if it were a logarithmic (or exponential!) relationship. For instance we can ask which things are linear with GDP. Are any major things linear with it? Energy production? Materials production? Scientific advancement on some measure?

If not, we might want to use a transformed version of GDP (or a replacement of it with bug-fixes), such as log GDP or exponentiated GDP. This can presumably change the analysis a lot in terms of how this factor compares to the others. In summary, I really worry about people picking the closest proxy to mind (GDP) when trying to estimate this component of future productivity growth.

Carl said...

I think total GDP does have diminishing returns, and often use log GDP as a way to think about that.

The same goes for population, scientific papers published, etc. Proportional shifts are worth looking at.