Add Correlates of War (CoW) Militarized Interstate Dispute (MID) data to a dyad-year data frame

add_cow_mids() merges CoW's MID data into a dyad-year data frame. This package ships version 5.0 of the CoW-MID data.

Usage

add_cow_mids(data, keep)

Arguments

data

a dyad-year data frame (either "directed" or "non-directed")

keep

an optional parameter, specified as a character vector, passed to the function in a select(one_of(.)) wrapper. This allows the user to discard unwanted columns from the directed dispute data so that the output does not consume too much space in memory. Note: the Correlates of War system codes (ccode1, ccode2), the observation year (year), the presence or absence of an ongoing MID (cowmidongoing), and the presence or absence of a unique MID onset (cowmidonset) are always returned. It would be foolish and self-defeating to eliminate those columns. The user is free to keep or discard anything else as they see fit.

If keep is not specified, the output includes every column.

Value

add_cow_mids() takes a dyad-year data frame and adds dyad-year dispute information from the CoW-MID data.

Details

I've planted various flags in the ground about the use of these data versus assorted alternatives.

A dyad can have multiple disputes in a given year, which creates a problem when merging into a complete dyad-year data frame. France and Italy in 1860 illustrate the problem: the dyad had three separate dispute onsets that year (MID#0112, MID#0113, MID#0306). The merging process applies several rules to whittle these duplicate dyad-years down to one row per dyad-year.

The function will also return a message to the user about the case-exclusion rules that went into this process. Users who are interested in implementing their own case-exclusion rules should look up the "whittle" class of functions also provided in this package.
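The duplicate dyad-year problem can be sketched on a toy data frame. This is a minimal base R illustration of the general idea, not the package's implementation: it applies only two tie-breaking rules (highest fatality, then highest hostlev), and the fatality and hostlev values are invented for the example. The dispute numbers come from the France-Italy 1860 case above; the state codes 220 and 325 are assumed to be the CoW codes for France and Italy.

```r
# Toy dispute-year rows for a single dyad-year.
# NOTE: the fatality and hostlev values are made up for illustration;
# they are NOT taken from the CoW-MID data.
toy <- data.frame(
  ccode1   = c(220, 220, 220),
  ccode2   = c(325, 325, 325),
  year     = c(1860, 1860, 1860),
  dispnum  = c(112, 113, 306),
  fatality = c(0, 2, 2),
  hostlev  = c(3, 4, 5)
)

# Three rows share one dyad-year, so a plain merge would duplicate it.
dyad_year <- c("ccode1", "ccode2", "year")
n_duplicates <- sum(duplicated(toy[, dyad_year]))

# Sort so the preferred row comes first within each dyad-year:
# highest fatality first, then highest hostility level.
sorted <- toy[order(toy$ccode1, toy$ccode2, toy$year,
                    -toy$fatality, -toy$hostlev), ]

# Keep only the first (preferred) row per dyad-year.
whittled <- sorted[!duplicated(sorted[, dyad_year]), ]
print(whittled)
```

With these invented values, the two rules are enough to leave a single row for the dyad-year. The function's own rules, listed in the message it prints, go further than this sketch.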

References

Palmer, Glenn, Roseanne W. McManus, Vito D'Orazio, Michael R. Kenwick, Mikaela Karstens, Chase Bloch, Nick Dietrich, Kayla Kahn, Kellan Ritter, and Michael J. Soules. 2021. "The MID5 Dataset, 2011–2014: Procedures, coding rules, and description." Conflict Management and Peace Science.

Author

Steven V. Miller

Examples


# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
cow_ddy %>% add_cow_mids()
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> # A tibble: 2,214,930 × 24
#>    ccode1 ccode2  year dispnum cowmidongoing cowmidonset sidea1 sidea2 fatality1
#>     <dbl>  <dbl> <dbl>   <dbl>         <dbl>       <dbl>  <dbl>  <dbl>     <dbl>
#>  1      2     20  1920      NA             0           0     NA     NA        NA
#>  2      2     20  1921      NA             0           0     NA     NA        NA
#>  3      2     20  1922      NA             0           0     NA     NA        NA
#>  4      2     20  1923      NA             0           0     NA     NA        NA
#>  5      2     20  1924      NA             0           0     NA     NA        NA
#>  6      2     20  1925      NA             0           0     NA     NA        NA
#>  7      2     20  1926      NA             0           0     NA     NA        NA
#>  8      2     20  1927      NA             0           0     NA     NA        NA
#>  9      2     20  1928      NA             0           0     NA     NA        NA
#> 10      2     20  1929      NA             0           0     NA     NA        NA
#> # ℹ 2,214,920 more rows
#> # ℹ 15 more variables: fatality2 <dbl>, fatalpre1 <dbl>, fatalpre2 <dbl>,
#> #   hiact1 <dbl>, hiact2 <dbl>, hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>,
#> #   orig2 <dbl>, fatality <dbl>, hostlev <dbl>, mindur <dbl>, maxdur <dbl>,
#> #   recip <dbl>, stmon <dbl>

# keep just the dispute number and the Side A indicators
cow_ddy %>% add_cow_mids(keep = c("dispnum", "sidea1", "sidea2"))
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> add_cow_mids() IMPORTANT MESSAGE: By default, this function whittles dispute-year data into dyad-year data by first selecting on unique onsets. Thereafter, where duplicates remain, it whittles dispute-year data into dyad-year data in the following order: 1) retaining highest `fatality`, 2) retaining highest `hostlev`, 3) retaining highest estimated `mindur`, 4) retaining highest estimated `maxdur`, 5) retaining reciprocated over non-reciprocated observations, 6) retaining the observation with the lowest start month, and, where duplicates still remained (and they don't), 7) forcibly dropping all duplicates for observations that are otherwise very similar.
#> See: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
#> # A tibble: 2,214,930 × 8
#>    ccode1 ccode2  year cowmidonset cowmidongoing dispnum sidea1 sidea2
#>     <dbl>  <dbl> <dbl>       <dbl>         <dbl>   <dbl>  <dbl>  <dbl>
#>  1      2     20  1920           0             0      NA     NA     NA
#>  2      2     20  1921           0             0      NA     NA     NA
#>  3      2     20  1922           0             0      NA     NA     NA
#>  4      2     20  1923           0             0      NA     NA     NA
#>  5      2     20  1924           0             0      NA     NA     NA
#>  6      2     20  1925           0             0      NA     NA     NA
#>  7      2     20  1926           0             0      NA     NA     NA
#>  8      2     20  1927           0             0      NA     NA     NA
#>  9      2     20  1928           0             0      NA     NA     NA
#> 10      2     20  1929           0             0      NA     NA     NA
#> # ℹ 2,214,920 more rows
# }