Whittle Duplicate Conflict-Years by Highest Fatality
Source: R/whittle_conflicts_fatality.R
whittle_conflicts_fatality() is in a class of
do-it-yourself functions for coercing (i.e. "whittling") conflict-year
data with cross-sectional units to unique conflict-year data by
cross-sectional unit. The inspiration here is clearly the problem
of whittling dyadic dispute-year data into true dyad-year data (as in
the Gibler-Miller-Little conflict data). This particular
function will keep the observations with the highest observed fatality.
Arguments
- data
a data frame with a declared conflict attribute type.
- ...
optional, only to make the shortcut work
Value
whittle_conflicts_fatality() takes a dyad-year data frame
or leader-dyad-year data frame with a declared conflict attribute type
and, grouping by the dyad and year, returns just those observations
that have the highest observed dispute-level fatality. This will not
eliminate all duplicates, far from it, but it's a sensible second cut
(after whittling onsets in whittle_conflicts_onsets()) to the extent
that dispute-level fatality is a good heuristic for dispute-level
severity/importance.
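The grouping logic described above can be sketched in base R on a toy data frame. This is only an illustration of "keep the rows with the highest fatality per dyad-year", not the package's own implementation; the data frame `toy`, its column values, and the helper names `key`, `max_fatal`, and `kept` are all invented for the example.

```r
# Toy dispute-year data for one dyad; every value here is invented.
toy <- data.frame(
  dispnum  = c(101, 102, 103, 104),
  ccode1   = c(1, 1, 1, 1),
  ccode2   = c(2, 2, 2, 2),
  year     = c(1900, 1900, 1900, 1901),
  fatality = c(0, 2, 2, 1)
)

# One key per dyad-year (assumed grouping: ccode1, ccode2, year).
key <- paste(toy$ccode1, toy$ccode2, toy$year)

# Highest fatality observed within each dyad-year.
max_fatal <- ave(toy$fatality, key, FUN = max)

# Keep only the rows that match their group's maximum.
kept <- toy[toy$fatality == max_fatal, ]
kept
```

In this toy case the 1900 dyad-year still has two rows after the cut, because two disputes tie on fatality. That is the sense in which this step reduces duplicates without eliminating them.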
Details
Dyads are capable of having multiple disputes in a given year,
which can create a problem for merging into a complete dyad-year
data frame. Consider the case of France and Italy in 1860, which
had three separate dispute onsets that year (MID#0112, MID#0113, MID#0306),
as illustrative of the problem. The default process in peacesciencer
employs several rules to whittle down these duplicate dyad-years for
merging into a dyad-year data frame. These are available in
add_cow_mids() and add_gml_mids().
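A small base R sketch of the merge problem, using invented data rather than the actual conflict data: when a dyad has several disputes in one year, joining dispute records onto a dyad-year frame produces more than one row for that dyad-year. The objects `dyad_years`, `disputes`, and `merged` are hypothetical names for this illustration.

```r
# Hypothetical dyad-year frame: exactly one row per dyad-year.
dyad_years <- data.frame(ccode1 = 1, ccode2 = 2, year = c(1900, 1901))

# Hypothetical dispute records: three separate disputes in 1900.
disputes <- data.frame(
  ccode1  = 1,
  ccode2  = 2,
  year    = c(1900, 1900, 1900),
  dispnum = c(101, 102, 103)
)

# A left join keeps every dyad-year but repeats 1900 once per dispute.
merged <- merge(dyad_years, disputes,
                by = c("ccode1", "ccode2", "year"), all.x = TRUE)

# Count the extra rows that share a dyad-year with an earlier row.
n_dupes <- sum(duplicated(merged[c("ccode1", "ccode2", "year")]))
n_dupes
```

The merged frame is no longer one row per dyad-year, which is why some whittling rule has to be applied before or during the merge.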
As of this writing, the Correlates of War and Gibler-Miller-Little conflict data record some -9s for fatalities. In those cases, dispute-level fatality is temporarily recoded to .5 (i.e. fatal, but without too many fatalities). This is a missing data problem that Gibler and Miller correct in a forthcoming publication in the Journal of Conflict Resolution. Until then, this function makes that kind of determination about disputes with missing fatalities.
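The effect of that recode can be sketched as follows. This is an illustration under the assumption that -9 is the only missing-value code in play, and it is not the package's internal code; the vectors `fatality` and `fatality_tmp` are invented for the example.

```r
# Hypothetical dispute-level fatality codes; -9 marks a missing value.
fatality <- c(-9, 0, 1, 3)

# Temporarily treat -9 as .5 so the dispute ranks above a known
# non-fatal dispute (0) but below any dispute with a known
# fatality level of 1 or more.
fatality_tmp <- ifelse(fatality == -9, 0.5, fatality)
fatality_tmp
```

Under this recode, a dispute with missing fatalities wins the comparison only against disputes coded 0 within the same dyad-year.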
wc_fatality() is a simple, less wordy shortcut for the same function.
References
Miller, Steven V. 2021. "How peacesciencer Coerces Dispute-Year Data into Dyad-Year Data". URL: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html
Examples
# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
gml_dirdisp %>% whittle_conflicts_onsets() %>% whittle_conflicts_fatality()
#> # A tibble: 9,504 × 39
#> dispnum ccode1 ccode2 year midongoing midonset sidea1 sidea2 revstate1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2968 2 20 1979 1 1 0 1 0
#> 2 3900 2 20 1989 1 1 0 1 0
#> 3 3972 2 20 1991 1 1 1 0 1
#> 4 4183 2 20 1997 1 1 0 1 0
#> 5 1665 2 40 1921 1 1 1 0 1
#> 6 1677 2 40 1933 1 1 1 0 1
#> 7 1677 2 40 1934 1 0 1 0 1
#> 8 246 2 40 1960 1 1 1 0 1
#> 9 246 2 40 1961 1 0 1 0 1
#> 10 61 2 40 1962 1 1 1 0 1
#> # ℹ 9,494 more rows
#> # ℹ 30 more variables: revstate2 <dbl>, revtype11 <dbl>, revtype12 <dbl>,
#> # revtype21 <dbl>, revtype22 <dbl>, fatality1 <dbl>, fatality2 <dbl>,
#> # fatalpre1 <dbl>, fatalpre2 <dbl>, hiact1 <dbl>, hiact2 <dbl>,
#> # hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>, orig2 <dbl>, hiact <dbl>,
#> # hostlev <dbl>, mindur <dbl>, maxdur <dbl>, outcome <dbl>, settle <dbl>,
#> # fatality <dbl>, fatalpre <dbl>, stmon <dbl>, endmon <dbl>, recip <dbl>, …
cow_mid_dirdisps %>% whittle_conflicts_onsets() %>% whittle_conflicts_fatality()
#> Joining with `by = join_by(dispnum)`
#> # A tibble: 10,536 × 19
#> dispnum ccode1 ccode2 year dispongoing disponset sidea1 sidea2 fatality1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2968 2 20 1979 1 1 0 1 0
#> 2 3900 2 20 1989 1 1 0 1 0
#> 3 3972 2 20 1991 1 1 1 0 0
#> 4 4183 2 20 1997 1 1 0 1 0
#> 5 1665 2 40 1921 1 1 1 0 0
#> 6 1677 2 40 1933 1 1 1 0 0
#> 7 1677 2 40 1934 1 0 1 0 0
#> 8 246 2 40 1960 1 1 0 1 0
#> 9 246 2 40 1961 1 0 0 1 0
#> 10 61 2 40 1962 1 1 1 0 0
#> # ℹ 10,526 more rows
#> # ℹ 10 more variables: fatality2 <dbl>, fatalpre1 <dbl>, fatalpre2 <dbl>,
#> # hiact1 <dbl>, hiact2 <dbl>, hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>,
#> # orig2 <dbl>, fatality <dbl>
# }