Add "Spells" to Data — add_spells • peacesciencer

add_spells() calculates "spells" in your state-year, leader-year, or dyad-year data. The application here is mostly concerned with things like "peace spells" between conflicts in a given cross-sectional unit (e.g. a state or dyad).

Usage

add_spells(data, conflict_event_type = "ongoing", ongo = FALSE)

Arguments

data: an applicable data frame (e.g. leader-year, dyad-year, state-year, as created in peacesciencer)
conflict_event_type: type of event for which spells should be calculated, either "ongoing" or "onset". Default is "ongoing". If "ongoing", the spells are calculated on the presence of an ongoing event. If "onset", spells are calculated on the onset of a conflict event with successive zeros (if observed) calculated as "peace". See Details section for more.
ongo: If TRUE, successive 1s are considered ongoing events and treated as NA after the first 1. If FALSE, successive 1s are all treated as failures. Defaults to FALSE.
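
One way to read the ongo = TRUE description is as a recoding of the event vector: the first 1 in a run is kept and the 1s that follow it become NA. The base-R sketch below illustrates that reading on a toy vector. The helper name mask_ongoing() is hypothetical and this is not the code add_spells() runs internally.

```r
# Hypothetical helper: keep the first 1 in a run, set later 1s in the run to NA.
# Illustrates the documented `ongo = TRUE` behavior; not the package's own code.
mask_ongoing <- function(event) {
  prev <- c(0, head(event, -1))            # value in the preceding time unit
  ifelse(event == 1 & prev == 1, NA_real_, event)
}

x <- c(0, 0, 1, 1, 1, 0, 1)
mask_ongoing(x)
```

With ongo = FALSE, by contrast, every 1 in the vector would be left as is and treated as a failure.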

Value

add_spells() takes a dyad-year, leader-year, or state-year data frame and adds spells for ongoing conflicts. Dyadic conflict data supported include the Correlates of War (CoW) Militarized Interstate Dispute (MID) data set and the Gibler-Miller-Little (GML) corrections to CoW-MID. State-level conflict data supported in this function include the UCDP armed conflict data and the CoW intra-state war data. Leader-year conflict data supported include the GML MID data.

Details

The function internally uses ps_spells() from stevemisc. In the interest of full disclosure, ps_spells() leans heavily on add_duration() from spduration. I optimized some code for performance.

Thinking of an application like peace-years, add_spells() will only calculate the peace years and will leave the temporal dependence adjustment to the taste of the researcher. Importantly, I do not recommend manually creating splines or square/cube terms as separate columns, because doing so makes it harder to adjust for temporal dependence in model predictions. Instead, specify the transformation in the regression formula itself. In R, the Carter and Signorino (2010) approach is ... + gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3) (assuming you ran add_spells() on a dyad-year data frame including the Gibler-Miller-Little conflict data). The Beck et al. (1998) cubic splines approach is ... + splines::bs(gmlmidspell, 4). This function includes the spell and three splines (hence the 4 in the command). Either approach makes for easier model predictions, given R's functionality.
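
A minimal sketch of both formula specifications, fit to simulated data rather than real conflict data. The data frame toy, the outcome onset, and the coefficients used to simulate it are all made up for illustration; only the column name gmlmidspell mirrors what add_spells() would produce. The sketch attaches splines and calls bs() directly.

```r
library(splines)  # ships with R; provides bs()

set.seed(1)
# Hypothetical toy data: a spell counter and a simulated binary outcome.
toy <- data.frame(gmlmidspell = rep(0:40, times = 12))
toy$onset <- rbinom(nrow(toy), 1, plogis(-1 - 0.08 * toy$gmlmidspell))

# Carter and Signorino (2010): cubic polynomial written inside the formula
m_poly <- glm(onset ~ gmlmidspell + I(gmlmidspell^2) + I(gmlmidspell^3),
              data = toy, family = binomial)

# Spline basis written inside the formula
m_bs <- glm(onset ~ bs(gmlmidspell, 4), data = toy, family = binomial)

# The transformations live in the formula, so newdata needs only the raw spell
newdat <- data.frame(gmlmidspell = c(0, 5, 10, 20))
p_poly <- predict(m_poly, newdata = newdat, type = "response")
p_bs   <- predict(m_bs, newdata = newdat, type = "response")
```

Had the squared and cubed terms been stored as their own columns, newdat would need matching columns kept in sync by hand, which is the extra work the formula-based approach avoids.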

Thinking of our dyadic analyses of conflict, I've always understood that something like "peace-years" should be calculated on the ongoing event and not the onset of the event. Think of something like the Iran-Iraq War (MID#2115) as illustrative here. The MID (which became a war) started in 1980 and ended in 1988. There are no other bilateral incidents between Iran and Iraq independent of the war, per Correlates of War coding rules. If peace years are calculated at the "onset" of the event, it would list peace-years between the two countries from 1981 to 1988. I've never understood that to make sense, but still I've seen others insist this is the correct way to do it. add_peace_years() would force the calculation on the ongoing event, which I still maintain is correct. add_spells() will allow you to calculate on onsets, even if ongoing events are the default.
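
The difference can be seen on a toy version of the Iran-Iraq case. The sketch below uses a simplified counter of years since the last year coded 1. The helper years_since_event() and the data frame toy are hypothetical, and the exact values add_spells() returns may differ from this counter's output; the point is only the contrast between the two codings.

```r
# Hypothetical helper: years elapsed since the last year coded 1.
# A simplified counter for illustration, not the code add_spells() runs.
years_since_event <- function(event) {
  spell <- integer(length(event))
  counter <- 0L
  for (i in seq_along(event)) {
    spell[i] <- counter
    counter <- if (event[i] == 1) 0L else counter + 1L
  }
  spell
}

# Toy dyad: one dispute that starts in 1980 and is ongoing through 1988
toy <- data.frame(year = 1978:1990)
toy$onset   <- as.integer(toy$year == 1980)
toy$ongoing <- as.integer(toy$year >= 1980 & toy$year <= 1988)

toy$spell_onset   <- years_since_event(toy$onset)
toy$spell_ongoing <- years_since_event(toy$ongoing)
toy
```

Calculated on the onset, the counter keeps climbing from 1981 through 1988 as if the two states were at peace. Calculated on the ongoing event, it stays at zero for the length of the war.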

The underlying function for add_spells() will stop without a return if there are NAs bracketing observed events. The surest way this will happen is if you're doing something like a dyad-year analysis of inter-state conflicts from 1816 to 2010, but create_dyadyears() created observations from 2011 to 2020 for you as well. Remove those before using this function and confine the temporal domain to just those time-units (e.g. years) for which there is observed event data. See what I do in the example below.
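
If the last year with observed event data is not known in advance, it can be read off the data before trimming. The sketch below does this on a made-up panel. The data frame toy and its values are hypothetical; the column name gmlmidongoing simply mirrors the GML MID column used elsewhere on this page.

```r
# Hypothetical panel: event data observed through 2010,
# but the panel itself runs to 2020, so 2011-2020 are NA.
toy <- data.frame(year = 2005:2020)
toy$gmlmidongoing <- ifelse(toy$year <= 2010, 0, NA)
toy$gmlmidongoing[toy$year == 2007] <- 1

# Last year with any observed event data
last_obs <- max(toy$year[!is.na(toy$gmlmidongoing)])

# Confine the temporal domain before computing spells
trimmed <- subset(toy, year <= last_obs)

# No trailing NAs should remain in the event column
stopifnot(!anyNA(trimmed$gmlmidongoing))
```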

References

Beger, Andreas, Daina Chiba, Daniel W. Hill, Jr, Nils W. Metternich, Shahryar Minhas and Michael D. Ward. 2018. "spduration: Split-Population and Duration (Cure) Regression." R package version 0.17.1.

Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42(4): 1260–1288.

Carter, David B. and Curtis S. Signorino. 2010. "Back to the Future: Modeling Time Dependence in Binary Data." Political Analysis 18(3): 271–292.

Author

Steven V. Miller

Examples


# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
library(peacesciencer) # provides cow_ddy and the add_*() functions used below

aaa <- subset(cow_ddy, year <= 2010)

aaa %>%
  add_gml_mids(keep = NULL) %>%
  add_cow_mids(keep = NULL) %>%
  add_contiguity() %>%
  add_cow_majors() %>%
  filter_prd() %>%
  add_spells()
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_gml_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> Joining with `by = join_by(orig_order)`
#> Joining with `by = join_by(orig_order)`
#> # A tibble: 223,982 × 19
#>    ccode1 ccode2  year gmlmidonset gmlmidongoing init1 init2 sidea1 sidea2 orig1
#>     <dbl>  <dbl> <dbl>       <dbl>         <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>
#>  1      2     20  1920           0             0    NA    NA     NA     NA    NA
#>  2      2     20  1921           0             0    NA    NA     NA     NA    NA
#>  3      2     20  1922           0             0    NA    NA     NA     NA    NA
#>  4      2     20  1923           0             0    NA    NA     NA     NA    NA
#>  5      2     20  1924           0             0    NA    NA     NA     NA    NA
#>  6      2     20  1925           0             0    NA    NA     NA     NA    NA
#>  7      2     20  1926           0             0    NA    NA     NA     NA    NA
#>  8      2     20  1927           0             0    NA    NA     NA     NA    NA
#>  9      2     20  1928           0             0    NA    NA     NA     NA    NA
#> 10      2     20  1929           0             0    NA    NA     NA     NA    NA
#> # ℹ 223,972 more rows
#> # ℹ 9 more variables: orig2 <dbl>, cowmidonset <dbl>, cowmidongoing <dbl>,
#> #   conttype <dbl>, cowmaj1 <dbl>, cowmaj2 <dbl>, prd <dbl>, gmlmidspell <dbl>,
#> #   cowmidspell <dbl>
# }