whittle_conflicts_duration() is one of a class of do-it-yourself functions for coercing (i.e. "whittling") conflict-year data with cross-sectional units to unique conflict-year data by cross-sectional unit. The inspiration here is clearly the problem of whittling dyadic dispute-year data into true dyad-year data (like in the Gibler-Miller-Little conflict data). This particular function keeps, within each dyad-year, the observations with the highest estimated duration.

Usage

whittle_conflicts_duration(data, durtype = "mindur")

wc_duration(...)

Arguments

data

a data frame with a declared conflict attribute type.

durtype

a duration on which to filter/whittle the data. Options are "mindur" or "maxdur". The default is "mindur".

...

optional, only to make the shortcut work

Value

whittle_conflicts_duration() takes a dyad-year data frame or leader-dyad-year data frame with a declared conflict attribute type and, grouping by the dyad and year, returns just those observations that have the highest estimated dispute-level duration. This will not eliminate all duplicates, far from it, but it's a sensible cut later in the procedure (after whittling onsets in whittle_conflicts_onsets(), and maybe some other things), to the extent that dispute-level duration is a heuristic for dispute-level severity/importance.
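
The grouping-and-filtering idea described above can be sketched in base R on a toy data frame. This is only an illustration of the general technique, not the package's own code: the data frame `toy`, its dispute numbers, and its durations are all invented, and only the column names (dispnum, ccode1, ccode2, year, mindur) mirror the ones shown in the examples on this page.

```r
# Toy dyad-year data; every value here is invented for illustration.
toy <- data.frame(
  dispnum = c(101, 102, 103, 201),
  ccode1  = c(1, 1, 1, 1),
  ccode2  = c(2, 2, 2, 3),
  year    = c(1900, 1900, 1900, 1900),
  mindur  = c(10, 45, 45, 7)
)

# One grouping key per dyad-year.
key <- paste(toy$ccode1, toy$ccode2, toy$year)

# Within each dyad-year, find the largest minimum duration...
group_max <- ave(toy$mindur, key, FUN = max)

# ...and keep only the rows that match it.
whittled <- toy[toy$mindur == group_max, ]
whittled
```

In this toy case, disputes 102 and 103 tie on duration, so the dyad-year for states 1 and 2 still has two rows after the cut. That is the sense in which a duration filter narrows the duplicates without necessarily eliminating them.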

Details

Dyads are capable of having multiple disputes in a given year, which can create a problem for merging into a complete dyad-year data frame. Consider the case of France and Italy in 1860, which had three separate dispute onsets that year (MID#0112, MID#0113, MID#0306), as illustrative of the problem. The default process in peacesciencer employs several rules to whittle down these duplicate dyad-years for merging into a dyad-year data frame. These are available in add_cow_mids() and add_gml_mids().

Some conflicts are of unknown length and often come with estimates of a minimum duration and a maximum duration. This is what the durtype parameter in this function concerns. In many/most conflicts, certainly thinking of the inter-state dispute data, dates are known with precision (to the day) and the estimate of minimum conflict duration is equal to the estimate of maximum conflict duration. For some conflicts, the estimates will vary. This importantly implies that using this whittle function with the default ("mindur") can produce different results than using it to retain the highest maximum duration ("maxdur"). Use the function with that in mind.
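
To make the difference between the two duration types concrete, here is a small base R sketch under stated assumptions. The helper `keep_longest()` is hypothetical and stands in for the general idea only; it is not the package's implementation. The two disputes and their durations are invented.

```r
# Two hypothetical disputes for one dyad-year; the durations are invented.
# Dispute 301 has precisely known dates, so mindur equals maxdur.
# Dispute 302 has imprecise dates, so its two estimates diverge.
toy_dur <- data.frame(
  dispnum = c(301, 302),
  ccode1  = c(1, 1),
  ccode2  = c(2, 2),
  year    = c(1900, 1900),
  mindur  = c(30, 20),
  maxdur  = c(30, 50)
)

# Hypothetical helper: keep the rows with the largest value of the chosen
# duration column within each dyad-year. Not the package's own code.
keep_longest <- function(data, durtype = "mindur") {
  key <- paste(data$ccode1, data$ccode2, data$year)
  group_max <- ave(data[[durtype]], key, FUN = max)
  data[data[[durtype]] == group_max, ]
}

keep_longest(toy_dur, "mindur")$dispnum
keep_longest(toy_dur, "maxdur")$dispnum
```

Filtering on the minimum duration retains dispute 301, while filtering on the maximum duration retains dispute 302, so the choice of durtype changes which observation survives for the same dyad-year.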

wc_duration() is a simple, less wordy, shortcut for the same function.
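
For readers unfamiliar with why the shortcut takes only `...`, the following is a generic sketch of the argument-forwarding pattern in R. The function names `long_function_name()` and `short_name()` are hypothetical placeholders, and the body of the long function is a stand-in that just reports which duration type it received.

```r
# Hypothetical long-named function standing in for a real one.
long_function_name <- function(data, durtype = "mindur") {
  paste("whittling on", durtype)
}

# The shortcut accepts any arguments and passes them along unchanged.
short_name <- function(...) long_function_name(...)

# Defaults and named arguments behave the same through either name.
short_name(data = NULL)
short_name(data = NULL, durtype = "maxdur")
```

Because the shortcut forwards everything it receives, it never needs updating if the longer function gains or changes arguments.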

References

Miller, Steven V. 2021. "How peacesciencer Coerces Dispute-Year Data into Dyad-Year Data". URL: http://svmiller.com/peacesciencer/articles/coerce-dispute-year-dyad-year.html

Author

Steven V. Miller

Examples


# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
gml_dirdisp %>% whittle_conflicts_onsets() %>% whittle_conflicts_duration()
#> # A tibble: 9,308 × 39
#>    dispnum ccode1 ccode2  year midongoing midonset sidea1 sidea2 revstate1
#>      <dbl>  <dbl>  <dbl> <dbl>      <dbl>    <dbl>  <dbl>  <dbl>     <dbl>
#>  1    2968      2     20  1979          1        1      0      1         0
#>  2    3900      2     20  1989          1        1      0      1         0
#>  3    3972      2     20  1991          1        1      1      0         1
#>  4    4183      2     20  1997          1        1      0      1         0
#>  5    1665      2     40  1921          1        1      1      0         1
#>  6    1677      2     40  1933          1        1      1      0         1
#>  7    1677      2     40  1934          1        0      1      0         1
#>  8     246      2     40  1960          1        1      1      0         1
#>  9     246      2     40  1961          1        0      1      0         1
#> 10      61      2     40  1962          1        1      1      0         1
#> # ℹ 9,298 more rows
#> # ℹ 30 more variables: revstate2 <dbl>, revtype11 <dbl>, revtype12 <dbl>,
#> #   revtype21 <dbl>, revtype22 <dbl>, fatality1 <dbl>, fatality2 <dbl>,
#> #   fatalpre1 <dbl>, fatalpre2 <dbl>, hiact1 <dbl>, hiact2 <dbl>,
#> #   hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>, orig2 <dbl>, hiact <dbl>,
#> #   hostlev <dbl>, mindur <dbl>, maxdur <dbl>, outcome <dbl>, settle <dbl>,
#> #   fatality <dbl>, fatalpre <dbl>, stmon <dbl>, endmon <dbl>, recip <dbl>, …

cow_mid_dirdisps %>% whittle_conflicts_onsets() %>% whittle_conflicts_duration()
#> Joining with `by = join_by(dispnum)`
#> # A tibble: 10,268 × 20
#>    dispnum ccode1 ccode2  year dispongoing disponset sidea1 sidea2 fatality1
#>      <dbl>  <dbl>  <dbl> <dbl>       <dbl>     <dbl>  <dbl>  <dbl>     <dbl>
#>  1    2968      2     20  1979           1         1      0      1         0
#>  2    3900      2     20  1989           1         1      0      1         0
#>  3    3972      2     20  1991           1         1      1      0         0
#>  4    4183      2     20  1997           1         1      0      1         0
#>  5    1665      2     40  1921           1         1      1      0         0
#>  6    1677      2     40  1933           1         1      1      0         0
#>  7    1677      2     40  1934           1         0      1      0         0
#>  8     246      2     40  1960           1         1      0      1         0
#>  9     246      2     40  1961           1         0      0      1         0
#> 10      61      2     40  1962           1         1      1      0         0
#> # ℹ 10,258 more rows
#> # ℹ 11 more variables: fatality2 <dbl>, fatalpre1 <dbl>, fatalpre2 <dbl>,
#> #   hiact1 <dbl>, hiact2 <dbl>, hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>,
#> #   orig2 <dbl>, mindur <dbl>, maxdur <dbl>


# }