Add Peace Years to Your Conflict Data — add_peace

add_peace_years() calculates peace years for your ongoing conflicts. The function works for both dyad-year and state-year data generated in peacesciencer. As of the forthcoming v. 0.7.0, add_peace_years() will be superseded by the more generic and versatile add_spells(). Users are free to continue using the function, though I recommend it only for more balanced panels (like state-years or dyad-years) and not for imbalanced panels (like leader-years or leader-dyad-years). As the change in name implies, add_spells() will have greater flexibility with both cross-sectional units and time.

Usage

add_peace_years(data, pad = FALSE)

Arguments

data: a dyad-year data frame (either "directed" or "non-directed") or state-year data frame
pad: an optional parameter, defaults to FALSE. If TRUE, the peace-year calculations fill in cases where panels are unbalanced/have gaps. Think of a state like Germany disappearing for 45 years as illustrative of this.

Value

add_peace_years() takes a dyad-year or state-year data frame and adds peace years for ongoing conflicts. Dyadic conflict data supported include the Correlates of War (CoW) Militarized Interstate Dispute (MID) data set and the Gibler-Miller-Little (GML) corrections to CoW-MID. State-level conflict data supported in this function include the UCDP armed conflict data and the CoW intra-state war data.
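As a rough illustration of what a peace-years column contains, here is a minimal base R sketch for a single cross-sectional unit. The helper `toy_peace_years()` is hypothetical and is not part of peacesciencer or stevemisc. It assumes one common counting convention (the counter is 0 in the first year and in any year right after an event year, and otherwise grows by one), which may differ in detail from what sbtscs() does.

```r
# Toy illustration only: `toy_peace_years()` is a hypothetical helper,
# not a function from peacesciencer or stevemisc.
# Assumed convention: the counter is 0 in the first observed year and in any
# year that immediately follows an event year; otherwise it grows by one.
toy_peace_years <- function(event) {
  spell <- integer(length(event))
  for (i in seq_along(event)) {
    if (i == 1 || event[i - 1] == 1) {
      spell[i] <- 0L
    } else {
      spell[i] <- spell[i - 1] + 1L
    }
  }
  spell
}

# One unit observed over eight years, with an ongoing conflict in 1953-1954.
ongoing <- c(0, 0, 0, 1, 1, 0, 0, 0)
toy <- data.frame(
  year    = 1950:1957,
  ongoing = ongoing,
  spell   = toy_peace_years(ongoing)
)
print(toy)
```

In real dyad-year or state-year data, a counter like this would have to be computed separately within each dyad or state, which is the bookkeeping add_peace_years() handles for you.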

Details

The function internally uses sbtscs() from stevemisc. In the interest of full disclosure, sbtscs() leans heavily on btscs() from DAMisc. I optimized some code for performance.

The underlying function (sbtscs() in stevemisc, by way of btscs() in DAMisc) has important performance issues when your event data are sandwiched by observations without any event data. Here's what I mean. Assume you got the full Gleditsch-Ward state-year data from 1816 to 2020 and then added the UCDP armed conflict data to it. If you want the peace years for this, the function will fail because every year from 1816 to 1945 (along with 2020, as of writing) has no event data. You can force the function to "not fail" by setting pad = TRUE as an argument, but it's not clear this is advisable, for the following reason. Assume you wanted event data in UCDP for just the extrasystemic onsets. The data start in 1946 and, in 1946, the United Kingdom, Netherlands, and France had extrasystemic conflicts. For all years before 1946, the events are imputed as 1 for those countries that had 1s in the first year of observation; everyone else is NA and implicitly assumed to be a zero. For those NAs, the function runs a sequence that results in some wonky spells in 1946 that are not implied by (the absence of) the data. In fact, none of those spells are implied by the absence of data before 1946.

The function works just fine if you truncate your temporal domain to reflect the nature of your event data. Basically, if you want to use this function more generally, filter your dyad-year or state-year data to make sure there are no years without any event data recorded (e.g. why would you have a CoW-MID analysis of dyad-years with observations before 1816?). This is less of a problem when years with all-NAs succeed (and do not precede) the event data. For example, the UCDP conflict data run from 1946 to 2019 (as of writing). Having 2020 observations in there won't compromise the function output when pad = TRUE is included as an argument.
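The truncation step described above can be sketched with a toy state-year panel in base R. Everything here is made up for illustration: the column names (`ccode`, `year`, `event`), the three state codes, and the years. The point is only to show one way to keep the years in which some event data are recorded.

```r
# Toy state-year panel: three hypothetical states observed 1943-1949.
# The event data only begin in 1946, so earlier years are all NA.
toy <- expand.grid(ccode = c(200, 210, 220), year = 1943:1949)
toy$event <- ifelse(toy$year < 1946, NA, 0)
toy$event[toy$ccode == 200 & toy$year == 1946] <- 1

# Keep only the years in which at least one unit has a recorded event value.
years_with_data <- unique(toy$year[!is.na(toy$event)])
trimmed <- toy[toy$year %in% years_with_data, ]

range(trimmed$year)
```

In a peacesciencer pipeline, the analogous move would be a filtering step on `year` (for example with dplyr::filter()) placed before add_peace_years(), using the first and last years your conflict data actually cover.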

Finally, add_peace_years() will only calculate the peace years and will leave the temporal dependence adjustment to the taste of the researcher. I do not recommend manually creating splines or square/cube terms as separate columns, because doing so creates more problems when adjusting for temporal dependence in model predictions. In a regression formula in R, you can specify the Carter and Signorino (2010) approach as ... + gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3) (assuming you ran add_peace_years() on a dyad-year data frame including the Gibler-Miller-Little conflict data). The Beck et al. (1998) cubic splines approach is ... + splines::bs(gmlmidspell, 4). This function includes the spell and three splines (hence the 4 in the command). Either approach makes for easier model predictions, given R's functionality.
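To make the formula-based approach concrete, here is a small sketch on simulated data. The data frame, the coefficients used to simulate onsets, and the seed are all invented for illustration; only the variable name gmlmidspell and the two formula specifications come from the text above. The sketch loads splines and calls bs() directly, which is the same function as splines::bs().

```r
library(splines)  # ships with base R; bs() here is splines::bs()

# Simulated toy data: a spell counter and a binary onset that becomes
# less likely as the spell grows. These numbers are made up.
set.seed(8675309)
n <- 500
toy <- data.frame(gmlmidspell = sample(0:40, n, replace = TRUE))
toy$gmlmidonset <- rbinom(n, 1, plogis(-1 - 0.05 * toy$gmlmidspell))

# Carter and Signorino (2010): cubic polynomial of the spell counter,
# written inside the formula rather than as hand-made columns.
m_poly <- glm(gmlmidonset ~ gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3),
              data = toy, family = binomial(link = "logit"))

# Spline specification from the text, also written inside the formula.
m_bs <- glm(gmlmidonset ~ bs(gmlmidspell, 4),
            data = toy, family = binomial(link = "logit"))

# Because the transformations live in the formula, new data only need the
# raw spell counter; predict() rebuilds the squared, cubed, and spline terms.
newdata <- data.frame(gmlmidspell = 0:10)
p_poly <- predict(m_poly, newdata = newdata, type = "response")
p_bs   <- predict(m_bs, newdata = newdata, type = "response")
```

Had the squared and cubed terms been created by hand as separate columns, `newdata` would need those columns too, and they would have to be kept consistent with the spell counter manually.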

References

Armstrong, Dave. 2016. "DAMisc: Dave Armstrong's Miscellaneous Functions." R package version 1.4-3.

Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42(4): 1260–1288.

Carter, David B. and Curtis S. Signorino. 2010. "Back to the Future: Modeling Time Dependence in Binary Data." Political Analysis 18(3): 271–292.

Miller, Steven V. 2017. "Quickly Create Peace Years for BTSCS Models with sbtscs in stevemisc." https://svmiller.com/blog/2017/06/quickly-create-peace-years-for-btscs-models-with-stevemisc/

Author

Steven V. Miller

Examples


# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
cow_ddy %>%
  add_gml_mids(keep = NULL) %>%
  add_cow_mids(keep = NULL) %>%
  add_contiguity() %>%
  add_cow_majors() %>%
  filter_prd() %>%
  add_peace_years()
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_gml_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(year, dyad)`
#> Joining with `by = join_by(year, dyad)`
#> # A tibble: 245,720 × 19
#>    ccode1 ccode2  year gmlmidonset gmlmidongoing init1 init2 sidea1 sidea2 orig1
#>     <dbl>  <dbl> <dbl>       <dbl>         <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>
#>  1      2     20  1920           0             0    NA    NA     NA     NA    NA
#>  2      2     20  1921           0             0    NA    NA     NA     NA    NA
#>  3      2     20  1922           0             0    NA    NA     NA     NA    NA
#>  4      2     20  1923           0             0    NA    NA     NA     NA    NA
#>  5      2     20  1924           0             0    NA    NA     NA     NA    NA
#>  6      2     20  1925           0             0    NA    NA     NA     NA    NA
#>  7      2     20  1926           0             0    NA    NA     NA     NA    NA
#>  8      2     20  1927           0             0    NA    NA     NA     NA    NA
#>  9      2     20  1928           0             0    NA    NA     NA     NA    NA
#> 10      2     20  1929           0             0    NA    NA     NA     NA    NA
#> # ℹ 245,710 more rows
#> # ℹ 9 more variables: orig2 <dbl>, cowmidonset <dbl>, cowmidongoing <dbl>,
#> #   conttype <dbl>, cowmaj1 <dbl>, cowmaj2 <dbl>, prd <dbl>, cowmidspell <dbl>,
#> #   gmlmidspell <dbl>
# }