add_peace_years() calculates peace years for your ongoing conflicts.
The function works for both dyad-year and state-year data generated in
peacesciencer. As of the forthcoming v. 0.7.0, add_peace_years()
will be superseded by the more generic and versatile add_spells().
Users are free to continue with the function, though I recommend it only for
more balanced panels (like state-year or dyad-year), and less for imbalanced
panels (like leader-years, or leader-dyad-years). As the change in name implies,
add_spells() will have greater flexibility with both cross-sectional
units and time.
Arguments
- data
a dyad-year data frame (either "directed" or "non-directed") or state-year data frame
- pad
an optional parameter, defaults to FALSE. If TRUE, the peace-year calculations fill in cases where panels are unbalanced/have gaps. Think of a state like Germany disappearing for 45 years as illustrative of this.
Value
add_peace_years() takes a dyad-year or state-year data frame and adds
peace years for ongoing conflicts. Dyadic conflict data supported include the
Correlates of War (CoW) Militarized Interstate Dispute (MID) data set and the
Gibler-Miller-Little (GML) corrections to CoW-MID. State-level conflict data
supported in this function include the UCDP armed conflict data and the CoW
intra-state war data.
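To make the idea of a peace-year counter concrete, here is a minimal base R sketch. It is not what the package does internally: count_peace_years() and the toy data frame are hypothetical, and the counting convention (0 in the first observed year, reset to 0 in the year after an event) is an assumption borrowed from the usual BTSCS setup.

```r
# Hypothetical helper: years elapsed since the last event within one unit.
# Assumes `event` is a 0/1 vector already sorted by year, with no NAs.
count_peace_years <- function(event) {
  spell <- integer(length(event))
  counter <- 0L
  for (i in seq_along(event)) {
    spell[i] <- counter                                 # peace years entering this year
    counter <- if (event[i] == 1) 0L else counter + 1L  # reset after an event
  }
  spell
}

# Toy state-year panel (made-up values), sorted by unit and year
toy <- data.frame(
  unit  = rep(c("A", "B"), each = 5),
  year  = rep(2001:2005, times = 2),
  event = c(0, 0, 1, 0, 0,
            1, 1, 0, 0, 0)
)

# Apply the counter separately within each unit
toy$spell <- ave(toy$event, toy$unit, FUN = count_peace_years)
toy
# spell for unit A: 0 1 2 0 1; for unit B: 0 0 0 1 2
```

The counter stays at 0 for as long as an event is ongoing (unit B in 2001 and 2002), which is the behavior you would want when the event indicator marks ongoing conflict rather than onset alone.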
Details
The function internally uses sbtscs() from stevemisc. In the
interest of full disclosure, sbtscs() leans heavily on btscs()
from DAMisc. I optimized some code for performance.
The underlying function (sbtscs() in stevemisc, by way of btscs() in DAMisc)
has important performance issues if you're trying to run it when your event
data are sandwiched by observations without any event data. Here's what I
mean. Assume you got the full Gleditsch-Ward state-year data from 1816 to
2020 and then added the UCDP armed conflict data to it. If you want the
peace-years for this, the function will fail because every year from 1816 to
1945 (along with 2020, as of writing) has no event data. You can force the
function to "not fail" by setting pad = TRUE as an argument, but it's not
clear this is advisable, for the following reason. Assume you wanted event
data in UCDP for just the
extrasystemic onsets. The data start in 1946 and, in 1946, the United Kingdom,
Netherlands, and France had extrasystemic conflicts. For all years before
1946, the events are imputed as 1 for those countries that had 1s in the
first year of observation and everyone else is NA and implicitly assumed to
be a zero. For those NAs, the function runs a sequence resulting in some
wonky spells in 1946 that are not implied by (the absence of) the data. In
fact, none of those are implied by the absence of data before 1946.
The function works just fine if you truncate your temporal domain to reflect
the nature of your event data. Basically, if you want to use this function
more generally, filter your dyad-year or state-year data to make sure there
are no years without any event data recorded (e.g. why would you have a
CoW-MID analysis of dyad-years with observations before 1816?). This is less
a problem when years with all-NAs succeed (and do not precede) the event
data. For example, the UCDP conflict data run from 1946 to 2019 (as of
writing). Having 2020 observations in there won't compromise the function
output when pad = TRUE is included as an argument.
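The truncation advice can be sketched as follows. This assumes peacesciencer and dplyr are installed and that create_stateyears() and add_ucdp_acd() accept the arguments shown in your installed version; the object name ucdp_years is just illustrative.

```r
library(peacesciencer)
library(dplyr)  # for filter(), between(), and the %>% pipe

# Keep only the years the UCDP data actually cover (1946-2019, as of writing)
# before adding the conflict data and computing peace years.
ucdp_years <- create_stateyears(system = "gw") %>%
  filter(between(year, 1946, 2019)) %>%
  add_ucdp_acd() %>%
  add_peace_years()

ucdp_years
```

Filtering first means no state-year enters the peace-year calculation without corresponding event data, so there is nothing for pad = TRUE to impute before 1946.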
Finally, add_peace_years() will only calculate the peace years and
will leave the temporal dependence adjustment to the taste of the researcher.
Importantly, I do not recommend manually creating splines or square/cube
terms because doing so creates more problems in adjusting for temporal
dependence in model predictions. In a regression formula in R, you can specify
the Carter and Signorino (2010) approach as
... + gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3)
(assuming you ran add_peace_years() on a dyad-year data frame including the
Gibler-Miller-Little conflict data). The Beck et al. (1998) cubic splines
approach is ... + splines::bs(gmlmidspell, 4). This function includes the
spell and three splines (hence the 4 in the command). Either approach makes
for easier model predictions, given R's functionality.
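As a rough illustration of why in-formula terms help with predictions, the sketch below fits both specifications to simulated data and then predicts from a new data frame that only contains the spell variable. The toy data frame, its coefficients, and the object names are made up for the example; only the formula syntax comes from the text above.

```r
set.seed(8675309)

# Hypothetical toy data standing in for a dyad-year frame with a spell counter
toy <- data.frame(gmlmidspell = sample(0:40, 500, replace = TRUE))
toy$gmlmidonset <- rbinom(500, 1, plogis(-1 - 0.05 * toy$gmlmidspell))

# Carter and Signorino (2010): cubic polynomial written inside the formula
m_poly <- glm(gmlmidonset ~ gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3),
              data = toy, family = binomial(link = "logit"))

# Beck et al. (1998): spline basis written inside the formula
m_bs <- glm(gmlmidonset ~ splines::bs(gmlmidspell, 4),
            data = toy, family = binomial(link = "logit"))

# New data only need the raw spell variable; predict() rebuilds the
# polynomial and spline terms from the formula.
newdat <- data.frame(gmlmidspell = c(0, 5, 10, 20))
p_poly <- predict(m_poly, newdata = newdat, type = "response")
p_bs   <- predict(m_bs, newdata = newdat, type = "response")
```

Had the squared, cubed, or spline columns been created by hand, the new data frame would need matching hand-built columns for every prediction.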
References
Armstrong, Dave. 2016. "DAMisc: Dave Armstrong's Miscellaneous Functions." R package version 1.4-3.
Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42(4): 1260–1288.
Carter, David B. and Curtis S. Signorino. 2010. "Back to the Future: Modeling Time Dependence in Binary Data." Political Analysis 18(3): 271–292.
Miller, Steven V. 2017. "Quickly Create Peace Years for BTSCS Models with sbtscs in stevemisc." https://svmiller.com/blog/2017/06/quickly-create-peace-years-for-btscs-models-with-stevemisc/
Examples
# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
cow_ddy %>%
add_gml_mids(keep = NULL) %>%
add_cow_mids(keep = NULL) %>%
add_contiguity() %>%
add_cow_majors() %>%
filter_prd() %>%
add_peace_years()
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_gml_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(year, dyad)`
#> Joining with `by = join_by(year, dyad)`
#> # A tibble: 245,720 × 19
#> ccode1 ccode2 year gmlmidonset gmlmidongoing init1 init2 sidea1 sidea2 orig1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 20 1920 0 0 NA NA NA NA NA
#> 2 2 20 1921 0 0 NA NA NA NA NA
#> 3 2 20 1922 0 0 NA NA NA NA NA
#> 4 2 20 1923 0 0 NA NA NA NA NA
#> 5 2 20 1924 0 0 NA NA NA NA NA
#> 6 2 20 1925 0 0 NA NA NA NA NA
#> 7 2 20 1926 0 0 NA NA NA NA NA
#> 8 2 20 1927 0 0 NA NA NA NA NA
#> 9 2 20 1928 0 0 NA NA NA NA NA
#> 10 2 20 1929 0 0 NA NA NA NA NA
#> # ℹ 245,710 more rows
#> # ℹ 9 more variables: orig2 <dbl>, cowmidonset <dbl>, cowmidongoing <dbl>,
#> # conttype <dbl>, cowmaj1 <dbl>, cowmaj2 <dbl>, prd <dbl>, cowmidspell <dbl>,
#> # gmlmidspell <dbl>
# }