Add Simulated GDP, Population, and GDP per Capita Data — add_sim_gdp_pop

add_sim_gdp_pop() adds estimated gross domestic product (GDP), population, and GDP per capita data from recent updates by Anders, Fariss, and Markowitz (and now Barnum) to their original 2020 publication in International Studies Quarterly. The function leans on data available in isard, a spin-off package featuring data that are updated periodically.

Usage

add_sim_gdp_pop(data, keep)

Arguments

data: a data frame with appropriate peacesciencer attributes
keep: an optional parameter, specified as a character vector, indicating which estimates the user wants returned from this function. If not specified, everything from the underlying data is returned.

Value

add_sim_gdp_pop() takes a (dyad-year, leader-year, leader-dyad-year, state-year) data frame and adds information about the simulated GDP, population, and GDP per capita for that state (or pair of states) in a given year.

Details

You can read more about the data in the documentation for isard.

The function leans on attributes of the data that are provided by one of the "create" functions. Make sure a recognized function (or data created by that function) appears at the top of the proverbial pipe. Users will also want to note that the function accesses two different data sets. Which data set it uses depends on whatever peacesciencer understands to be the "master" data set (communicated in the attributes field for system type).

Users primarily working in the Correlates of War system will be a little disappointed that the simulations the authors provide are demarcated in the Gleditsch-Ward system. The overlap is substantial, but the data the authors provide are at the mercy of the Gleditsch-Ward system for describing the universe of cases that could have a GDP, a population, or a GDP per capita. There will be conspicuous missingness for Correlates of War data concerning Serbia (1916, 1917), Morocco (1905-1912), Egypt (1856-1882), Saudi Arabia (1927-1931), and Laos (1953). Interested users may want to explore some imputation procedures, potentially leveraging older versions of the data.

Fariss et al. (2022) provide multiple variations of GDP and GDP per capita in their simulations, but the data I provide follow their suggested defaults. GDP per capita is denominated in constant 2011 international dollars (purchasing power parity, PPP); GDP is expenditure-side real GDP in millions of 2017 international dollars (PPP). The simulated population estimate is in millions of people. The Maddison Project Database is the source of simulations for GDP per capita, while the Penn World Table is the source of simulations for GDP and population. You can use the latter two metrics to create another version of GDP per capita if you like.
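
As a rough illustration of that last point, the sketch below derives an alternative GDP per capita from the two Penn World Table columns. Because pwtrgdp is in millions of dollars and pwtpop is in millions of people, the ratio is already in dollars per person (2017 international dollars here, not the 2011 dollars of mrgdppc). The toy data frame copies two rounded rows from the state-year example output in this documentation, and the column name pwtrgdppc is a hypothetical name chosen for illustration.

```r
# Toy state-year rows (values rounded as printed in the Examples section).
toy <- data.frame(
  ccode   = c(2, 2),
  year    = c(1816, 1817),
  pwtrgdp = c(27951, 28557),  # millions of 2017 international dollars (PPP)
  pwtpop  = c(9.16, 9.41)     # millions of people
)

# The "millions" cancel, so the ratio is dollars per person.
# `pwtrgdppc` is a hypothetical column name, not one supplied by isard.
toy$pwtrgdppc <- toy$pwtrgdp / toy$pwtpop

toy
```

The same division should work on the real output of add_sim_gdp_pop(), for example inside a dplyr::mutate() call, provided both columns were kept.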

The data in isard include simulated standard deviations around the estimate. It's understandable that users are interested in just the point estimate, but the uncertainty around the estimate is also important. You should consider incorporating it into your analyses. Be mindful that the data are fundamentally state-year and that extensions to leader-level data should be understood as approximations for leaders in a given state-year.
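
One hedged way to carry that uncertainty forward is to draw repeated values around each point estimate and re-run an analysis on each draw. The sketch below assumes, purely for illustration, that a normal distribution with the reported estimate as its mean and the reported standard deviation is an adequate approximation; consult the isard documentation for how the simulations were actually generated. The toy values are rounded from the state-year example output, and n_draws and draws are hypothetical names.

```r
set.seed(8675309)  # for reproducibility of the illustration

# Toy state-year rows (values rounded as printed in the Examples section).
toy <- data.frame(
  ccode      = c(2, 2),
  year       = c(1816, 1817),
  mrgdppc    = c(2668, 2656),
  sd_mrgdppc = c(411, 421)
)

n_draws <- 1000  # hypothetical number of draws per state-year

# Assumption: normal approximation around each point estimate.
# Result is a matrix with one row per draw and one column per state-year.
draws <- sapply(seq_len(nrow(toy)), function(i) {
  rnorm(n_draws, mean = toy$mrgdppc[i], sd = toy$sd_mrgdppc[i])
})

# Summarize the spread implied by the reported standard deviations.
apply(draws, 2, quantile, probs = c(0.025, 0.5, 0.975))
```

Each row of draws could then stand in for the point estimate in a separate model fit, with results pooled afterward.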

The keep argument must include one or more of the estimates included in the cw_gdppop or gw_gdppop data in the isard package. Otherwise, the function will return an error that it cannot subset columns that do not exist.
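
A quick pre-check can catch that error before the function is called. The sketch below compares a requested vector against the estimate names visible in the example output of this documentation (mrgdppc, pwtrgdp, pwtpop, and their sd_ counterparts); the authoritative list is the set of columns in cw_gdppop or gw_gdppop. The object names available, wanted, and bad are hypothetical.

```r
# Estimate names as they appear in the example output of this page.
# Assumption: this matches the estimate columns in isard's cw_gdppop / gw_gdppop.
available <- c("mrgdppc", "sd_mrgdppc",
               "pwtrgdp", "sd_pwtrgdp",
               "pwtpop", "sd_pwtpop")

# A hypothetical request containing one name that does not exist.
wanted <- c("pwtrgdp", "pwtpop", "gdppc")

# Names that would trigger the subsetting error.
bad <- setdiff(wanted, available)
bad

# Keep only the names that are actually available.
keep <- intersect(wanted, available)
keep

# With peacesciencer loaded, the call would then look like:
# create_stateyears() %>% add_sim_gdp_pop(keep = keep)
```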

References

Please cite Miller (2022) for peacesciencer. Beyond that, consult the documentation in isard for additional citations (contingent on which GDP, population, or GDP per capita estimate you are using).

Author

Steven V. Miller

Examples


# just call `library(tidyverse)` at the top of your script
library(magrittr)

cow_ddy %>% add_sim_gdp_pop()
#> # A tibble: 2,214,930 × 15
#>    ccode1 ccode2  year mrgdppc1 sd_mrgdppc1 pwtrgdp1 sd_pwtrgdp1 pwtpop1
#>     <dbl>  <dbl> <dbl>    <dbl>       <dbl>    <dbl>       <dbl>   <dbl>
#>  1      2     20  1920    9642.       1468. 1179542.     527271.    107.
#>  2      2     20  1921    9618.       1481. 1170100.     494793.    109.
#>  3      2     20  1922    9837.       1522. 1215550.     491351.    110.
#>  4      2     20  1923   10318.       1636. 1320502.     532126.    112.
#>  5      2     20  1924   10633.       1653. 1348635.     585636.    114.
#>  6      2     20  1925   10871.       1765. 1430219.     610983.    116.
#>  7      2     20  1926   11070.       1720. 1446257.     596361.    117.
#>  8      2     20  1927   11112.       1696. 1477319.     607553.    119.
#>  9      2     20  1928   11204.       1765. 1532243.     663724.    120.
#> 10      2     20  1929   11106.       1744. 1517361.     650671.    122.
#> # ℹ 2,214,920 more rows
#> # ℹ 7 more variables: sd_pwtpop1 <dbl>, mrgdppc2 <dbl>, sd_mrgdppc2 <dbl>,
#> #   pwtrgdp2 <dbl>, sd_pwtrgdp2 <dbl>, pwtpop2 <dbl>, sd_pwtpop2 <dbl>

create_stateyears() %>% add_sim_gdp_pop()
#> Joining with `by = join_by(ccode, year)`
#> # A tibble: 17,511 × 9
#>    ccode cw_name     year mrgdppc sd_mrgdppc pwtrgdp sd_pwtrgdp pwtpop sd_pwtpop
#>    <dbl> <chr>      <dbl>   <dbl>      <dbl>   <dbl>      <dbl>  <dbl>     <dbl>
#>  1     2 United St…  1816   2668.       411.  27951.     12015.   9.16     0.690
#>  2     2 United St…  1817   2656.       421.  28557.     12256.   9.41     0.702
#>  3     2 United St…  1818   2644.       419.  29150.     12349.   9.69     0.711
#>  4     2 United St…  1819   2643.       422.  29640.     12462.   9.95     0.714
#>  5     2 United St…  1820   2657.       415.  30452.     13244.  10.2      0.711
#>  6     2 United St…  1821   2683.       420.  31981.     13850.  10.5      0.743
#>  7     2 United St…  1822   2714.       435.  32809.     14113.  10.8      0.760
#>  8     2 United St…  1823   2749.       423.  34079.     14294.  11.1      0.769
#>  9     2 United St…  1824   2785.       434.  35784.     16068.  11.4      0.798
#> 10     2 United St…  1825   2809.       447.  36887.     15932.  11.8      0.812
#> # ℹ 17,501 more rows

create_stateyears(system = "gw") %>% add_sim_gdp_pop()
#> Joining with `by = join_by(gwcode, year)`
#> # A tibble: 20,652 × 10
#>    gwcode gw_name  microstate  year mrgdppc sd_mrgdppc pwtrgdp sd_pwtrgdp pwtpop
#>     <dbl> <chr>         <dbl> <int>   <dbl>      <dbl>   <dbl>      <dbl>  <dbl>
#>  1      2 United …          0  1816   2668.       411.  27951.     12015.   9.16
#>  2      2 United …          0  1817   2656.       421.  28557.     12256.   9.41
#>  3      2 United …          0  1818   2644.       419.  29150.     12349.   9.69
#>  4      2 United …          0  1819   2643.       422.  29640.     12462.   9.95
#>  5      2 United …          0  1820   2657.       415.  30452.     13244.  10.2 
#>  6      2 United …          0  1821   2683.       420.  31981.     13850.  10.5 
#>  7      2 United …          0  1822   2714.       435.  32809.     14113.  10.8 
#>  8      2 United …          0  1823   2749.       423.  34079.     14294.  11.1 
#>  9      2 United …          0  1824   2785.       434.  35784.     16068.  11.4 
#> 10      2 United …          0  1825   2809.       447.  36887.     15932.  11.8 
#> # ℹ 20,642 more rows
#> # ℹ 1 more variable: sd_pwtpop <dbl>