Add Correlates of War (CoW) Militarized Interstate Dispute (MID) data to dyad-year data frame
Source: R/add_cow_mids.R
add_cow_mids() merges CoW's MID data into a dyad-year data frame.
The version of the CoW-MID data in this package is version 5.0.
Arguments
- data
a dyad-year data frame (either "directed" or "non-directed")
- keep
an optional parameter, specified as a character vector, passed to the function in a select(one_of(.)) wrapper. This allows the user to discard unwanted columns from the directed dispute data so that the output does not consume too much space in memory. Note: the Correlates of War system codes (ccode1, ccode2), the observation year (year), the presence or absence of an ongoing MID (cowmidongoing), and the presence or absence of a unique MID onset (cowmidonset) are always returned, because eliminating those columns would defeat the purpose of the merge. The user is free to keep or discard anything else as they see fit.
If keep is not specified, the output returns every column.
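As a rough illustration of the column-selection idea described for keep, the sketch below applies a select(one_of(.)) call to a small made-up tibble. It is not the package's internal code: the toy data, the always_kept vector, and the trimmed object are assumptions introduced only for this example, and it assumes dplyr is installed.

```r
library(dplyr)

# Toy stand-in for merged dyad-year dispute data; all values are invented.
toy <- tibble(
  ccode1 = c(2, 2),
  ccode2 = c(20, 20),
  year = c(1920, 1921),
  cowmidongoing = c(0, 0),
  cowmidonset = c(0, 0),
  dispnum = c(NA_real_, NA_real_),
  sidea1 = c(NA_real_, NA_real_),
  fatality = c(NA_real_, NA_real_)
)

# Columns the documentation says are always returned (hypothetical helper vector).
always_kept <- c("ccode1", "ccode2", "year", "cowmidongoing", "cowmidonset")

# What a user might pass as `keep`.
keep <- c("dispnum", "sidea1")

# Retain the always-returned columns plus the requested ones; drop the rest.
trimmed <- toy %>% select(one_of(c(always_kept, keep)))

names(trimmed)
```

In this toy case the fatality column is dropped because it appears in neither vector. Note that one_of() is superseded in current tidyselect in favor of any_of() and all_of(), though it still works.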
Value
add_cow_mids() takes a dyad-year data frame and adds dyad-year dispute information from the CoW-MID data.
Details
Dyads are capable of having multiple disputes in a given year, which can create a problem for merging into a complete dyad-year data frame. Consider the case of France and Italy in 1860, which had three separate dispute onsets that year (MID#0112, MID#0113, MID#0306), as illustrative of the problem. This merging process employs several rules to whittle down these duplicate dyad-years for merging into a dyad-year data frame.
The function will also return a message to the user about the case-exclusion rules that went into this process. Users who are interested in implementing their own case-exclusion rules should look up the "whittle" class of functions also provided in this package.
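To make the duplicate dyad-year problem concrete, here is a minimal sketch of sequential case exclusion on invented data. It applies only two of the rules named in the function's message (highest fatality, then highest hostlev) and is not the package's own implementation. The dispute numbers and all variable values are hypothetical, as are the object names toy_disputes, dupes, and whittled. It assumes dplyr is installed.

```r
library(dplyr)

# Three hypothetical dispute rows for a single dyad-year.
# Dispute numbers and values are invented, not taken from the CoW-MID data.
toy_disputes <- tibble(
  ccode1 = 220,
  ccode2 = 325,
  year = 1860,
  dispnum = c(9001, 9002, 9003),
  fatality = c(1, 1, 0),
  hostlev = c(3, 4, 4)
)

# Step 0: find dyad-years that occur more than once.
dupes <- toy_disputes %>%
  count(ccode1, ccode2, year) %>%
  filter(n > 1)

# Step 1: within each dyad-year, retain the rows with the highest fatality.
# Step 2: among those still tied, retain the rows with the highest hostlev.
whittled <- toy_disputes %>%
  group_by(ccode1, ccode2, year) %>%
  filter(fatality == max(fatality)) %>%
  filter(hostlev == max(hostlev)) %>%
  ungroup()

whittled
```

The order of the rules matters. In this toy case the third row ties for the highest hostlev but is already removed by the fatality rule, so only one row survives for the dyad-year.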
References
Palmer, Glenn, Roseanne W. McManus, Vito D'Orazio, Michael R. Kenwick, Mikaela Karstens, Chase Bloch, Nick Dietrich, Kayla Kahn, Kellan Ritter, and Michael J. Soules. 2021. "The MID5 Dataset, 2011–2014: Procedures, coding rules, and description." Conflict Management and Peace Science.
Examples
# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
cow_ddy %>% add_cow_mids()
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> # A tibble: 2,139,270 × 24
#> ccode1 ccode2 year dispnum cowmidongoing cowmidonset sidea1 sidea2 fatality1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 20 1920 NA 0 0 NA NA NA
#> 2 2 20 1921 NA 0 0 NA NA NA
#> 3 2 20 1922 NA 0 0 NA NA NA
#> 4 2 20 1923 NA 0 0 NA NA NA
#> 5 2 20 1924 NA 0 0 NA NA NA
#> 6 2 20 1925 NA 0 0 NA NA NA
#> 7 2 20 1926 NA 0 0 NA NA NA
#> 8 2 20 1927 NA 0 0 NA NA NA
#> 9 2 20 1928 NA 0 0 NA NA NA
#> 10 2 20 1929 NA 0 0 NA NA NA
#> # ℹ 2,139,260 more rows
#> # ℹ 15 more variables: fatality2 <dbl>, fatalpre1 <dbl>, fatalpre2 <dbl>,
#> # hiact1 <dbl>, hiact2 <dbl>, hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>,
#> # orig2 <dbl>, fatality <dbl>, hostlev <dbl>, mindur <dbl>, maxdur <dbl>,
#> # recip <dbl>, stmon <dbl>
# keep just the dispute number and Side A/B identifiers
cow_ddy %>% add_cow_mids(keep=c("dispnum","sidea1", "sidea2"))
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> # A tibble: 2,139,270 × 8
#> ccode1 ccode2 year cowmidonset cowmidongoing dispnum sidea1 sidea2
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 20 1920 0 0 NA NA NA
#> 2 2 20 1921 0 0 NA NA NA
#> 3 2 20 1922 0 0 NA NA NA
#> 4 2 20 1923 0 0 NA NA NA
#> 5 2 20 1924 0 0 NA NA NA
#> 6 2 20 1925 0 0 NA NA NA
#> 7 2 20 1926 0 0 NA NA NA
#> 8 2 20 1927 0 0 NA NA NA
#> 9 2 20 1928 0 0 NA NA NA
#> 10 2 20 1929 0 0 NA NA NA
#> # ℹ 2,139,260 more rows
# }