
David H. Wolpert, Michael H. Price, Stefani A. Crabtree, Timothy A. Kohler, Jürgen Jost, James Evans, Peter F. Stadler, Hajime Shimao, Manfred D. Laubichler
The concept of history unfolding stochastically is not new; in the context of the history of life on Earth, Stephen J. Gould famously asked what would happen if we could “replay the tape”, which implicitly supposes that an underlying stochastic process generated that tape. Similarly, stochastic process modeling of environmental dynamics has been used to infer exogenous perturbations to the dynamics of human social systems. In addition, phylogenetic tree reconstructions of human language dynamics have been used as a “clock” to infer the dynamics of socio-political phenomena. There has also been some work directly applying time-series analysis techniques to socio-political datasets.
These are isolated instances, though, rather than a systematic scientific program. Here we propose something more fundamental: that by grounding our investigations of human social dynamics in stochastic process models, we can not only better investigate the historical record, but also begin to unify the myriad approaches that have been championed for analyzing that record. Such a program would also potentially allow us to detect drivers of the historical processes that generated that record, in particular drivers that had not already been anticipated in social science models. This might allow the data to drive our formulation of social science models, as an adjunct to the more conventional approach under which we analyze datasets only after we first formulate models (e.g., based on intuitive insight and/or on analogizing with models from other scientific fields). Crucially, as we illustrated with the examples above, both the datasets and the computational tools necessary for this vision to become a reality are now coming into being.
It is important to emphasize that we are not claiming to have identified one specific stochastic process that generates the dynamics of history. (Indeed, we expect that it will be most fruitful to view history as multiple, interwoven stochastic processes, all with different characteristics.) We are not even advocating a choice between time-homogeneous and time-inhomogeneous processes. Ultimately, as in all statistical analyses, the choice of model to fit to the data is governed by considerations such as the amount of data, the size of the space of variables, and the types of variables, with cross-validation used to help winnow the options.
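As a rough illustration of how cross-validation can help winnow candidate models, the following sketch compares two hypothetical candidates for a discrete-state time series: an i.i.d. model and a time-homogeneous first-order Markov chain. The synthetic two-state data, the additive smoothing, the blocked fold scheme, and all function names are assumptions made purely for illustration; they are not drawn from any dataset or method discussed here.

```python
import numpy as np

def simulate_chain(P, n, rng):
    """Simulate n steps of a first-order Markov chain with transition matrix P."""
    states = np.zeros(n, dtype=int)
    for t in range(1, n):
        states[t] = rng.choice(len(P), p=P[states[t - 1]])
    return states

def fit_iid(blocks, k, alpha=1.0):
    """Estimate state frequencies (with additive smoothing) from training blocks."""
    counts = np.full(k, alpha)
    for b in blocks:
        counts += np.bincount(b, minlength=k)
    return counts / counts.sum()

def fit_markov(blocks, k, alpha=1.0):
    """Estimate a k x k transition matrix, counting transitions within each block only."""
    counts = np.full((k, k), alpha)
    for b in blocks:
        np.add.at(counts, (b[:-1], b[1:]), 1.0)
    return counts / counts.sum(axis=1, keepdims=True)

def blocked_cv(seq, k, n_folds=5):
    """Held-out log-likelihood of each model, using contiguous (blocked) folds."""
    folds = np.array_split(seq, n_folds)
    scores = {"iid": 0.0, "markov": 0.0}
    for i, test in enumerate(folds):
        train = [f for j, f in enumerate(folds) if j != i]
        p = fit_iid(train, k)
        P = fit_markov(train, k)
        # Score both models on the same observations: test[1:].
        scores["iid"] += np.log(p[test[1:]]).sum()
        scores["markov"] += np.log(P[test[:-1], test[1:]]).sum()
    return scores

rng = np.random.default_rng(0)
P_true = np.array([[0.9, 0.1],
                   [0.2, 0.8]])  # assumed "true" dynamics for the synthetic data
seq = simulate_chain(P_true, 2000, rng)
scores = blocked_cv(seq, k=2)
best = max(scores, key=scores.get)
print(scores, "-> preferred model:", best)
```

Contiguous folds are used here rather than randomly shuffled ones because shuffling would destroy the temporal dependence that the Markov candidate is meant to capture. With real historical data, the candidate set would typically be much richer (for example, time-inhomogeneous or higher-order models), and the same held-out comparison would apply.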
We are also not advocating that one specific state space be used to model the stochastic process(es) of human history. Nor are we prescribing which subsequent analyses should be applied to stochastic process models inferred from historical data, e.g., to uncover possible causal relationships among historical variables.
More generally, we are also not arguing that all of historical analysis should be formulated in terms of stochastic processes. Statics, as opposed to dynamics, is also (obviously) an extremely important aspect of historical analysis, as history is concerned not only with time-series analysis, but also with revealing the internal structures of societies and the patterns of their interactions at any given point in time. Indeed, even in those sciences where all phenomena are based on a single dynamic law, like quantum physics (Schrödinger’s equation), much research focuses on statics rather than dynamics.
Finally, we note that a stochastic process formulation is also central to the other historical sciences, ranging from biology to meteorology to geology. So not only does this perspective allow us to unify the analyses of computational history, it also allows us to align how we investigate human history with how it is done in the other historical sciences.
