Whittle Duplicate Conflict-Years by Lowest Start Month — whittle_conflicts

whittle_conflicts_startmonth() belongs to a class of do-it-yourself functions for coercing (i.e. "whittling") conflict-year data with cross-sectional units into unique conflict-year data by cross-sectional unit. The motivating problem is whittling dyadic dispute-year data into true dyad-year data (as in the Gibler-Miller-Little conflict data). This particular function keeps the observations that have the lowest start month.

Usage

whittle_conflicts_startmonth(data)

wc_stmon(...)

Arguments

data: a data frame with a declared conflict attribute type.
...: optional, only to make the shortcut work

Value

whittle_conflicts_startmonth() takes a dyad-year data frame or leader-dyad-year data frame with a declared conflict attribute type and, grouping by the dyad and year, returns just those observations that have the lowest start month.
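
The grouping-and-filtering logic described here can be sketched in base R on a toy data frame. This is only an illustration under stated assumptions, not the package's implementation: the toy data and the object names (toy, key, min_stmon, whittled) are hypothetical, and only the column names (dispnum, ccode1, ccode2, year, stmon) follow the package data shown in the examples.

```r
# Toy dyadic dispute-year data. All values are hypothetical.
toy <- data.frame(
  dispnum = c(1, 2, 3, 4),
  ccode1  = c(100, 100, 100, 200),
  ccode2  = c(200, 200, 200, 300),
  year    = c(1900, 1900, 1901, 1900),
  stmon   = c(7, 3, 5, 11)
)

# Build one grouping key per dyad-year.
key <- paste(toy$ccode1, toy$ccode2, toy$year, sep = "-")

# For every row, look up the lowest start month within its dyad-year.
min_stmon <- ave(toy$stmon, key, FUN = min)

# Keep only the rows whose start month equals that group minimum.
whittled <- toy[toy$stmon == min_stmon, ]
whittled
```

In this sketch the dyad-year 100-200-1900 has two disputes, and only the one that starts in month 3 survives. Rows that tie on the lowest start month would all be kept, so a rule like this does not by itself guarantee unique dyad-years.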

Details

Dyads are capable of having multiple disputes in a given year, which can create a problem for merging into a complete dyad-year data frame. Consider the case of France and Italy in 1860, which had three separate dispute onsets that year (MID#0112, MID#0113, MID#0306), as illustrative of the problem. The default process in peacesciencer employs several rules to whittle down these duplicate dyad-years for merging into a dyad-year data frame. These are available in add_cow_mids() and add_gml_mids().
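
To see why duplicates complicate a merge, it can help to count rows per dyad-year before whittling. The following base R sketch uses a hypothetical data frame (toy) built around the France-Italy 1860 case. The dispute numbers come from the text above and the 2-20 row from the example output on this page; the state codes 220 and 325 for France and Italy are an assumption on my part, and the data frame is not drawn from the package data.

```r
# Hypothetical illustration of duplicate dyad-years.
# Codes 220 (France) and 325 (Italy) are assumed for illustration.
toy <- data.frame(
  dispnum = c(112, 113, 306, 2968),
  ccode1  = c(220, 220, 220, 2),
  ccode2  = c(325, 325, 325, 20),
  year    = c(1860, 1860, 1860, 1979)
)

# Count how many dispute rows fall in each dyad-year.
key <- paste(toy$ccode1, toy$ccode2, toy$year, sep = "-")
n_per_dyad_year <- table(key)

# Dyad-years with more than one row cannot be merged one-to-one
# into a dyad-year data frame until they are whittled down.
dupes <- n_per_dyad_year[n_per_dyad_year > 1]
dupes
```

Any dyad-year that appears in dupes needs an exclusion rule (or several, applied in sequence) before it can be joined to dyad-year data without multiplying rows.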

This really should be one of the last exclusion rules a researcher uses. There is no substantive reason to assume that a lower start month helps isolate "serious" or "severe" disputes in the presence of duplicates. It is just a way of isolating which duplicated observation happened first when the remaining duplicates are otherwise very similar to each other.

wc_stmon() is a simple, less wordy shortcut for the same function.

References

Miller, Steven V. 2021. "How peacesciencer Coerces Dispute-Year Data into Dyad-Year Data". URL: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html

Author

Steven V. Miller

Examples


# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
gml_dirdisp %>% whittle_conflicts_onsets() %>% whittle_conflicts_startmonth()
#> # A tibble: 9,344 × 39
#>    dispnum ccode1 ccode2  year midongoing midonset sidea1 sidea2 revstate1
#>      <dbl>  <dbl>  <dbl> <dbl>      <dbl>    <dbl>  <dbl>  <dbl>     <dbl>
#>  1    2968      2     20  1979          1        1      0      1         0
#>  2    3900      2     20  1989          1        1      0      1         0
#>  3    3972      2     20  1991          1        1      1      0         1
#>  4    4183      2     20  1997          1        1      0      1         0
#>  5    1665      2     40  1921          1        1      1      0         1
#>  6    1677      2     40  1933          1        1      1      0         1
#>  7    1677      2     40  1934          1        0      1      0         1
#>  8     246      2     40  1960          1        1      1      0         1
#>  9     246      2     40  1961          1        0      1      0         1
#> 10      61      2     40  1962          1        1      1      0         1
#> # ℹ 9,334 more rows
#> # ℹ 30 more variables: revstate2 <dbl>, revtype11 <dbl>, revtype12 <dbl>,
#> #   revtype21 <dbl>, revtype22 <dbl>, fatality1 <dbl>, fatality2 <dbl>,
#> #   fatalpre1 <dbl>, fatalpre2 <dbl>, hiact1 <dbl>, hiact2 <dbl>,
#> #   hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>, orig2 <dbl>, hiact <dbl>,
#> #   hostlev <dbl>, mindur <dbl>, maxdur <dbl>, outcome <dbl>, settle <dbl>,
#> #   fatality <dbl>, fatalpre <dbl>, stmon <dbl>, endmon <dbl>, recip <dbl>, …

cow_mid_dirdisps %>% whittle_conflicts_onsets() %>% whittle_conflicts_startmonth()
#> Joining with `by = join_by(dispnum)`
#> # A tibble: 10,296 × 19
#>    dispnum ccode1 ccode2  year dispongoing disponset sidea1 sidea2 fatality1
#>      <dbl>  <dbl>  <dbl> <dbl>       <dbl>     <dbl>  <dbl>  <dbl>     <dbl>
#>  1    2968      2     20  1979           1         1      0      1         0
#>  2    3900      2     20  1989           1         1      0      1         0
#>  3    3972      2     20  1991           1         1      1      0         0
#>  4    4183      2     20  1997           1         1      0      1         0
#>  5    1665      2     40  1921           1         1      1      0         0
#>  6    1677      2     40  1933           1         1      1      0         0
#>  7    1677      2     40  1934           1         0      1      0         0
#>  8     246      2     40  1960           1         1      0      1         0
#>  9     246      2     40  1961           1         0      0      1         0
#> 10      61      2     40  1962           1         1      1      0         0
#> # ℹ 10,286 more rows
#> # ℹ 10 more variables: fatality2 <dbl>, fatalpre1 <dbl>, fatalpre2 <dbl>,
#> #   hiact1 <dbl>, hiact2 <dbl>, hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>,
#> #   orig2 <dbl>, stmon <dbl>


# }