Whittle Duplicate Conflict-Years by Highest Fatality — whittle_conflicts

whittle_conflicts_fatality() belongs to a class of do-it-yourself functions for coercing (i.e. "whittling") conflict-year data with cross-sectional units to unique conflict-year data by cross-sectional unit. The inspiration here is clearly the problem of whittling dyadic dispute-year data into true dyad-year data (like in the Gibler-Miller-Little conflict data). This particular function keeps, for each cross-sectional unit and year, the observations with the highest observed fatality.

Usage

whittle_conflicts_fatality(data)

wc_fatality(...)

Arguments

data: a data frame with a declared conflict attribute type.
...: optional, only to make the shortcut work

Value

whittle_conflicts_fatality() takes a dyad-year data frame or leader-dyad-year data frame with a declared conflict attribute type and, grouping by the dyad and year, returns just those observations that have the highest observed dispute-level fatality. This will not eliminate all duplicates, far from it, but it's a sensible second cut (after whittling onsets in whittle_conflicts_onsets()) to the extent that dispute-level fatality is a good heuristic for dispute-level severity/importance.
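
The grouping logic described above can be illustrated with a minimal sketch in dplyr. This is not the package's implementation, only an approximation of the idea under stated assumptions: the data frame `toy`, its dispute numbers, and its state codes are all hypothetical, and the column names simply mirror those shown in the Examples output. Note that the tie between two disputes survives, which is why this step does not eliminate all duplicates.

```r
library(dplyr)

# Hypothetical toy data (made-up dispute numbers and state codes).
# The dyad 901-902 has three disputes in 1900; the dyad 901-903 has one.
toy <- tibble(
  dispnum  = c(1, 2, 3, 4),
  ccode1   = c(901, 901, 901, 901),
  ccode2   = c(902, 902, 902, 903),
  year     = c(1900, 1900, 1900, 1900),
  fatality = c(0, 2, 2, 1)
)

# Within each dyad-year, keep only the rows at the maximum fatality level.
kept <- toy %>%
  group_by(ccode1, ccode2, year) %>%
  filter(fatality == max(fatality)) %>%
  ungroup()

# Disputes 2 and 3 tie at the maximum, so the 901-902 dyad-year
# still has two rows after this step.
kept
```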

Details

Dyads are capable of having multiple disputes in a given year, which can create a problem for merging into a complete dyad-year data frame. Consider the case of France and Italy in 1860, which had three separate dispute onsets that year (MID#0112, MID#0113, MID#0306), as illustrative of the problem. The default process in peacesciencer employs several rules to whittle down these duplicate dyad-years for merging into a dyad-year data frame. These are available in add_cow_mids() and add_gml_mids().

As of writing, the Correlates of War and Gibler-Miller-Little conflict data record some -9s for fatalities. In those cases, dispute-level fatality is temporarily recoded to be 0.5 (i.e. fatal, but without too many fatalities). This is a missing data problem that Gibler and Miller correct in a forthcoming publication in the Journal of Conflict Resolution. Until then, this function makes that kind of determination about disputes with missing fatalities.
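
To make the effect of that recode concrete, here is a rough sketch of how treating -9 as 0.5 changes which observation is kept. It is an illustration only and makes several assumptions: the data frame `toy_missing` and its values are hypothetical, the helper column `fatality_cmp` is an invention for this example, and leaving the original -9 in place afterward is one reading of "temporarily" rather than a confirmed detail of the function.

```r
library(dplyr)

# Hypothetical toy data (made-up dispute numbers and state codes).
# 1900: a non-fatal dispute (0) and one with missing fatalities (-9).
# 1901: a dispute with missing fatalities (-9) and one at fatality level 1.
toy_missing <- tibble(
  dispnum  = c(11, 12, 13, 14),
  ccode1   = c(901, 901, 901, 901),
  ccode2   = c(902, 902, 902, 902),
  year     = c(1900, 1900, 1901, 1901),
  fatality = c(0, -9, -9, 1)
)

kept_missing <- toy_missing %>%
  # Comparison-only helper column: -9 is treated as 0.5, so it ranks
  # above 0 (no fatalities) but below 1 (the lowest fatal category here).
  mutate(fatality_cmp = ifelse(fatality == -9, 0.5, fatality)) %>%
  group_by(ccode1, ccode2, year) %>%
  filter(fatality_cmp == max(fatality_cmp)) %>%
  ungroup() %>%
  # Drop the helper so the original coding (including -9) is what remains.
  select(-fatality_cmp)

# Dispute 12 is kept for 1900 (0.5 beats 0); dispute 14 is kept for 1901.
kept_missing
```

Without the recode, a raw comparison would rank -9 below 0 and keep the non-fatal dispute in 1900 instead.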

wc_fatality() is a simple, less wordy, shortcut for the same function.

References

Miller, Steven V. 2021. "How peacesciencer Coerces Dispute-Year Data into Dyad-Year Data". URL: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html

Author

Steven V. Miller

Examples


# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
gml_dirdisp %>% whittle_conflicts_onsets() %>% whittle_conflicts_fatality()
#> # A tibble: 9,504 × 39
#>    dispnum ccode1 ccode2  year midongoing midonset sidea1 sidea2 revstate1
#>      <dbl>  <dbl>  <dbl> <dbl>      <dbl>    <dbl>  <dbl>  <dbl>     <dbl>
#>  1    2968      2     20  1979          1        1      0      1         0
#>  2    3900      2     20  1989          1        1      0      1         0
#>  3    3972      2     20  1991          1        1      1      0         1
#>  4    4183      2     20  1997          1        1      0      1         0
#>  5    1665      2     40  1921          1        1      1      0         1
#>  6    1677      2     40  1933          1        1      1      0         1
#>  7    1677      2     40  1934          1        0      1      0         1
#>  8     246      2     40  1960          1        1      1      0         1
#>  9     246      2     40  1961          1        0      1      0         1
#> 10      61      2     40  1962          1        1      1      0         1
#> # ℹ 9,494 more rows
#> # ℹ 30 more variables: revstate2 <dbl>, revtype11 <dbl>, revtype12 <dbl>,
#> #   revtype21 <dbl>, revtype22 <dbl>, fatality1 <dbl>, fatality2 <dbl>,
#> #   fatalpre1 <dbl>, fatalpre2 <dbl>, hiact1 <dbl>, hiact2 <dbl>,
#> #   hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>, orig2 <dbl>, hiact <dbl>,
#> #   hostlev <dbl>, mindur <dbl>, maxdur <dbl>, outcome <dbl>, settle <dbl>,
#> #   fatality <dbl>, fatalpre <dbl>, stmon <dbl>, endmon <dbl>, recip <dbl>, …

cow_mid_dirdisps %>% whittle_conflicts_onsets() %>% whittle_conflicts_fatality()
#> Joining with `by = join_by(dispnum)`
#> # A tibble: 10,536 × 19
#>    dispnum ccode1 ccode2  year dispongoing disponset sidea1 sidea2 fatality1
#>      <dbl>  <dbl>  <dbl> <dbl>       <dbl>     <dbl>  <dbl>  <dbl>     <dbl>
#>  1    2968      2     20  1979           1         1      0      1         0
#>  2    3900      2     20  1989           1         1      0      1         0
#>  3    3972      2     20  1991           1         1      1      0         0
#>  4    4183      2     20  1997           1         1      0      1         0
#>  5    1665      2     40  1921           1         1      1      0         0
#>  6    1677      2     40  1933           1         1      1      0         0
#>  7    1677      2     40  1934           1         0      1      0         0
#>  8     246      2     40  1960           1         1      0      1         0
#>  9     246      2     40  1961           1         0      0      1         0
#> 10      61      2     40  1962           1         1      1      0         0
#> # ℹ 10,526 more rows
#> # ℹ 10 more variables: fatality2 <dbl>, fatalpre1 <dbl>, fatalpre2 <dbl>,
#> #   hiact1 <dbl>, hiact2 <dbl>, hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>,
#> #   orig2 <dbl>, fatality <dbl>


# }