add_peace_years() calculates peace years for your ongoing conflicts. The function
works for both dyad-year and state-year data generated in peacesciencer. As of the forthcoming
v. 0.7.0, add_peace_years() will be deprecated in favor of the more generic and versatile add_spells(). Users
are free to continue with the function, though I recommend it only for more balanced panels (like state-year or dyad-year),
and less for imbalanced panels (like leader-years, or leader-dyad-years). As the change in name implies, add_spells() will
have greater flexibility with both cross-sectional units and time.
Arguments
- data
a dyad-year data frame (either "directed" or "non-directed") or state-year data frame
- pad
an optional parameter, defaults to FALSE. If TRUE, the peace-year calculations fill in cases where panels are unbalanced/have gaps. Think of a state like Germany disappearing for 45 years as illustrative of this.
Value
add_peace_years() takes a dyad-year or state-year data frame and adds peace years for ongoing conflicts.
Dyadic conflict data supported include the Correlates of War (CoW) Militarized Interstate Dispute (MID) data set and the
Gibler-Miller-Little (GML) corrections to CoW-MID. State-level conflict data supported in this function include the UCDP
armed conflict data and the CoW intra-state war data.
Details
The function internally uses sbtscs() from stevemisc. In the interest of full disclosure,
sbtscs() leans heavily on btscs() from DAMisc. I optimized some code for performance.
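For intuition, here is a minimal base R sketch of a peace-year counter on toy data. It is not the code in sbtscs() or btscs(). It assumes the common BTSCS convention that the counter starts at 0, grows by one for each year without an event, and resets to 0 in the year after an event. The helper count_peace_years() and the toy data frame are hypothetical names used only for illustration.

```r
# Hypothetical helper for illustration only; not part of peacesciencer,
# stevemisc, or DAMisc. Assumes `event` is a complete 0/1 vector sorted by year.
count_peace_years <- function(event) {
  spell <- integer(length(event))
  if (length(event) < 2) return(spell)
  for (t in 2:length(event)) {
    # Reset after an event year; otherwise add one more year of peace.
    spell[t] <- if (event[t - 1] == 1) 0L else spell[t - 1] + 1L
  }
  spell
}

# Toy state-year panel: two units, six years each (made-up data).
toy <- data.frame(
  ccode   = rep(c(1, 2), each = 6),
  year    = rep(2000:2005, times = 2),
  ongoing = c(0, 0, 1, 1, 0, 0,
              0, 0, 0, 0, 1, 0)
)

# Run the counter separately within each cross-sectional unit.
toy$spell <- ave(toy$ongoing, toy$ccode, FUN = count_peace_years)
toy
```

In this toy panel, the first unit's counter reads 0, 1, 2, 0, 0, 1 and the second unit's reads 0, 1, 2, 3, 4, 0.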
Importantly, the underlying function (sbtscs() in stevemisc, by way of btscs() in DAMisc)
has notable performance issues if you're trying to run it when your event data are sandwiched by observations
without any event data. Here's what I mean. Assume you got the full Gleditsch-Ward state-year data from 1816 to 2020
and then added the UCDP armed conflict data to it. If you want the peace-years for this, the function will fail because
every year from 1816 to 1945 (along with 2020, as of writing) has no event data. You can force the function to "not fail"
by setting pad = TRUE as an argument, but it's not clear this is advisable, for the following reason. Assume you wanted event data
in UCDP for just the extrasystemic onsets. The data start in 1946 and, in 1946, the United Kingdom,
Netherlands, and France had extrasystemic conflicts. For all years before 1946, the events are imputed as 1
for those countries that had 1s in the first year of observation, and everyone else is NA and implicitly assumed to be a zero.
For those NAs, the function runs a sequence resulting in some wonky spells in 1946 that are not implied by (the absence of) the
data. In fact, none of those are implied by the absence of data before 1946.
The function works just fine if you truncate your temporal domain to reflect the nature of your event data. Basically,
if you want to use this function more generally, filter your dyad-year or state-year data to make sure there are no years
without any event data recorded (e.g. why would you have a CoW-MID analysis of dyad-years with observations before 1816?). This
is less of a problem when years with all-NAs succeed (and do not precede) the event data. For example, the UCDP conflict data
run from 1946 to 2019 (as of writing). Having 2020 observations in there won't compromise the function output when pad = TRUE
is included as an argument.
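The truncation advice above can be sketched in base R on a toy state-year panel. This is only an illustration under stated assumptions: the data frame toy, its event column, and the years used are made up, and the same filter would be applied to real dyad-year or state-year data before calling add_peace_years().

```r
# Toy state-year panel (made-up data): three states observed 1940-1950,
# with event data recorded only from 1946 onward.
toy <- expand.grid(ccode = c(200, 210, 220), year = 1940:1950)
toy$event <- ifelse(toy$year < 1946, NA, 0)
toy$event[toy$ccode == 200 & toy$year == 1946] <- 1

# Flag the years in which at least one unit has recorded event data.
has_data <- tapply(!is.na(toy$event), toy$year, any)
first_year <- min(as.integer(names(has_data)[has_data]))

# Drop the leading years that have no event data at all.
trimmed <- subset(toy, year >= first_year)
range(trimmed$year)
```

Deriving the first year from the data, rather than hard-coding it, keeps the filter correct if the event data are later updated to cover a different range.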
Finally, add_peace_years() will only calculate the peace years and will leave the temporal dependence adjustment
to the taste of the researcher. Importantly, I do not recommend manually creating splines or square/cube terms because
doing so creates more problems in adjusting for temporal dependence in model predictions. In a regression formula in R,
you can specify the Carter and Signorino (2010) approach as ... + gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3)
(assuming you ran add_peace_years() on a dyad-year data frame including the Gibler-Miller-Little conflict data).
The Beck et al. (1998) cubic splines approach is ... + splines::bs(gmlmidspell, 4). This function includes
the spell and three splines (hence the 4 in the command). Either approach makes for easier model predictions,
given R's functionality.
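A minimal sketch of both specifications follows, using simulated data in place of real conflict data. Everything in it other than the two formula fragments is an assumption for illustration: the data frame toy, the covariate x, the outcome onset, and the simulated coefficients. It shows why in-formula terms help at prediction time: the new data only need the raw spell variable.

```r
set.seed(8675309)

# Simulated stand-in for a dyad-year data frame (made-up data).
n <- 2000
toy <- data.frame(
  x           = rnorm(n),
  gmlmidspell = sample(0:40, n, replace = TRUE)
)
toy$onset <- rbinom(n, 1, plogis(-1 + 0.5 * toy$x - 0.1 * toy$gmlmidspell))

# Carter and Signorino (2010): cubic polynomial of the spell, built in the formula.
m_cs <- glm(onset ~ x + gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3),
            data = toy, family = binomial(link = "logit"))

# Spline alternative, also built in the formula.
m_bs <- glm(onset ~ x + splines::bs(gmlmidspell, 4),
            data = toy, family = binomial(link = "logit"))

# Predictions need only the raw spell; R rebuilds the transformed terms.
newdat <- data.frame(x = 0, gmlmidspell = c(0, 5, 10, 20))
p_cs <- predict(m_cs, newdata = newdat, type = "response")
p_bs <- predict(m_bs, newdata = newdat, type = "response")
cbind(newdat, p_cs, p_bs)
```

Had the squared, cubed, or spline columns been created by hand, each would have to be recomputed consistently for every row of newdat before calling predict().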
References
Armstrong, Dave. 2016. "DAMisc: Dave Armstrong's Miscellaneous Functions." R package version 1.4-3.
Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42(4): 1260–1288.
Carter, David B. and Curtis S. Signorino. 2010. "Back to the Future: Modeling Time Dependence in Binary Data." Political Analysis 18(3): 271–292.
Miller, Steven V. 2017. "Quickly Create Peace Years for BTSCS Models with sbtscs in stevemisc."
http://svmiller.com/blog/2017/06/quickly-create-peace-years-for-btscs-models-with-stevemisc/
Examples
# \donttest{
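# just call `library(tidyverse)` at the top of your script
library(magrittr)
library(peacesciencer)
# Build directed dyad-years, add GML and CoW MID data, keep politically
# relevant dyads, then calculate peace years.
cow_ddy %>%
  add_gml_mids(keep = NULL) %>%
  add_cow_mids(keep = NULL) %>%
  add_contiguity() %>%
  add_cow_majors() %>%
  filter_prd() %>%
  add_peace_years()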
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_gml_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> Joining with `by = join_by(year, dyad)`
#> Joining with `by = join_by(year, dyad)`
#> # A tibble: 246,302 × 19
#> ccode1 ccode2 year gmlmidonset gmlmidongoing init1 init2 sidea1 sidea2 orig1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 20 1920 0 0 NA NA NA NA NA
#> 2 2 20 1921 0 0 NA NA NA NA NA
#> 3 2 20 1922 0 0 NA NA NA NA NA
#> 4 2 20 1923 0 0 NA NA NA NA NA
#> 5 2 20 1924 0 0 NA NA NA NA NA
#> 6 2 20 1925 0 0 NA NA NA NA NA
#> 7 2 20 1926 0 0 NA NA NA NA NA
#> 8 2 20 1927 0 0 NA NA NA NA NA
#> 9 2 20 1928 0 0 NA NA NA NA NA
#> 10 2 20 1929 0 0 NA NA NA NA NA
#> # ℹ 246,292 more rows
#> # ℹ 9 more variables: orig2 <dbl>, cowmidonset <dbl>, cowmidongoing <dbl>,
#> # conttype <dbl>, cowmaj1 <dbl>, cowmaj2 <dbl>, prd <dbl>, cowmidspell <dbl>,
#> # gmlmidspell <dbl>
# }