Show Duplicate Observations in Your Dyad-Year or State-Year Data Frame
Source: R/show_duplicates.R
show_duplicates() shows which observations are duplicated in data generated by peacesciencer. It is a useful diagnostic tool for users building their own do-it-yourself functions with peacesciencer.
Value
show_duplicates() takes a dyad-year or state-year data frame generated in peacesciencer and shows which observations are duplicated within each unique dyad-year or state-year combination, depending on which type of data was supplied to it.
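To make "duplicated by unique combination of dyad-year" concrete, here is a minimal base R sketch of the idea. It is not the package's implementation: the toy data, the helper column n_in_key, and the approach via ave() are all assumptions made for illustration.

```r
# Toy directed dyad-year data; the values are made up for illustration only.
toy <- data.frame(
  dispnum = c(101, 102, 103, 104),
  ccode1  = c(2, 2, 2, 200),
  ccode2  = c(40, 40, 70, 220),
  year    = c(1983, 1983, 1836, 1900)
)

# Build one key per ccode1-ccode2-year combination.
key <- interaction(toy$ccode1, toy$ccode2, toy$year, drop = TRUE)

# Count how many rows share each key (n_in_key is a hypothetical column name).
toy$n_in_key <- ave(seq_along(key), key, FUN = length)

# Keep only the dyad-years that appear more than once.
dupes <- toy[toy$n_in_key > 1, ]
dupes
```

In this toy case the two rows for the 2-40 dyad in 1983 are kept, and the two dyad-years that occur only once are dropped. A state-year version would use ccode and year as the key instead.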
Details
The function leans on attributes of the data that are provided by the create_dyadyears() or create_stateyears() function. Make sure that function (or data created by that function) appears at the top of the proverbial pipe.
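The following base R sketch illustrates, in general terms, how a function can rely on a data frame attribute to decide which columns identify a unit of observation, and why data that never received the attribute cannot be handled. The attribute name data_type and the helpers tag_data() and id_columns() are hypothetical. The attribute names that peacesciencer actually uses may differ.

```r
# Attach a type label to a data frame (hypothetical attribute name "data_type").
tag_data <- function(df, type) {
  attr(df, "data_type") <- type
  df
}

# Choose the identifying columns from the attribute; fail clearly if it is missing.
id_columns <- function(df) {
  type <- attr(df, "data_type")
  if (is.null(type)) {
    stop("No data type attribute found; start from a tagged data frame.")
  }
  switch(type,
         dyad_year  = c("ccode1", "ccode2", "year"),
         state_year = c("ccode", "year"),
         stop("Unknown data type: ", type))
}

# A tagged state-year data frame resolves to its identifying columns.
sy <- tag_data(data.frame(ccode = 2, year = 1990), "state_year")
id_columns(sy)

# An untagged data frame triggers the error branch.
untagged <- data.frame(ccode = 2, year = 1990)
result <- tryCatch(id_columns(untagged), error = function(e) "error")
result
```

This is why the advice about the top of the pipe matters: a data frame that did not originate from one of the package's creation functions carries no such label for a downstream function to read.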
The data returned will also have a new column called duplicated. An implicit assumption in this function is therefore that the user does not already have a column of interest with this name in the data. If one exists, it will be overwritten.
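If your data already carry a column named duplicated that you want to keep, one possible precaution is to rename it before running the diagnostic. The sketch below uses base R only. The data frame my_data and the replacement name duplicated_orig are hypothetical.

```r
# Hypothetical data frame that already has a user column named `duplicated`.
my_data <- data.frame(
  ccode      = c(2, 2, 20),
  year       = c(1990, 1990, 1990),
  duplicated = c("a", "b", "c")
)

# Move the existing column to a new name so that it cannot be overwritten.
if ("duplicated" %in% names(my_data)) {
  names(my_data)[names(my_data) == "duplicated"] <- "duplicated_orig"
}

names(my_data)
```

Renaming leaves the original values intact under the new name, so they can be compared with the diagnostic output afterwards.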
Examples
# just call `library(tidyverse)` at the top of your script
library(magrittr)
gml_dirdisp %>% show_duplicates()
#> # A tibble: 1,838 × 40
#> dispnum ccode1 ccode2 year midongoing midonset sidea1 sidea2 revstate1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2981 2 40 1983 1 1 1 0 1
#> 2 3058 2 40 1983 1 1 1 0 1
#> 3 1554 2 70 1836 1 1 0 1 0
#> 4 1555 2 70 1836 1 1 1 0 0
#> 5 1556 2 70 1836 1 0 1 0 0
#> 6 1548 2 70 1860 1 0 1 0 0
#> 7 1549 2 70 1860 1 1 1 0 1
#> 8 2347 2 93 1982 1 0 0 1 1
#> 9 2977 2 93 1982 1 1 1 0 1
#> 10 2741 2 95 1988 1 0 1 0 1
#> # ℹ 1,828 more rows
#> # ℹ 31 more variables: revstate2 <dbl>, revtype11 <dbl>, revtype12 <dbl>,
#> # revtype21 <dbl>, revtype22 <dbl>, fatality1 <dbl>, fatality2 <dbl>,
#> # fatalpre1 <dbl>, fatalpre2 <dbl>, hiact1 <dbl>, hiact2 <dbl>,
#> # hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>, orig2 <dbl>, hiact <dbl>,
#> # hostlev <dbl>, mindur <dbl>, maxdur <dbl>, outcome <dbl>, settle <dbl>,
#> # fatality <dbl>, fatalpre <dbl>, stmon <dbl>, endmon <dbl>, recip <dbl>, …
cow_mid_dirdisps %>% show_duplicates()
#> # A tibble: 2,152 × 19
#> dispnum ccode1 ccode2 year dispongoing disponset sidea1 sidea2 fatality1
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2981 2 40 1983 1 1 1 0 0
#> 2 3058 2 40 1983 1 1 1 0 1
#> 3 69 2 42 1916 1 0 1 0 0
#> 4 322 2 42 1916 1 1 1 0 -9
#> 5 1554 2 70 1836 1 1 0 1 0
#> 6 1555 2 70 1836 1 1 1 0 0
#> 7 1548 2 70 1860 1 0 1 0 0
#> 8 1549 2 70 1860 1 1 1 0 -9
#> 9 2 2 200 1902 1 1 1 0 0
#> 10 254 2 200 1902 1 1 0 1 0
#> # ℹ 2,142 more rows
#> # ℹ 10 more variables: fatality2 <dbl>, fatalpre1 <dbl>, fatalpre2 <dbl>,
#> # hiact1 <dbl>, hiact2 <dbl>, hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>,
#> # orig2 <dbl>, duplicated <dbl>