Create Different Kinds of Data in `{peacesciencer}`

This tutorial is a companion to the manuscript, which shows how to create different kinds of data in peacesciencer. However, space considerations (for ideal publication in a peer-reviewed journal) preclude the full “knitting” experience (i.e. giving the user a preview of what the data look like). What follows is a brief guide that expands on the tutorial section of the manuscript for creating different kinds of data in peacesciencer.

This vignette leans on the tidyverse package, which ideally accompanies almost anything you do with peacesciencer. I will also load lubridate explicitly (tidyverse 2.0.0 and later already attach it as a core package, as the startup message below shows). Internal functions in peacesciencer use lubridate, which is a formal dependency of peacesciencer, but users may want to load it themselves for date handling outside of peacesciencer.

library(tidyverse)
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#> ✔ dplyr     1.1.4     ✔ readr     2.1.4
#> ✔ forcats   1.0.0     ✔ stringr   1.5.0
#> ✔ ggplot2   3.5.1     ✔ tibble    3.3.0
#> ✔ lubridate 1.9.4     ✔ tidyr     1.3.0
#> ✔ purrr     1.1.0     
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(peacesciencer)
#> The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
#> which was just loaded, will retire in October 2023.
#> Please refer to R-spatial evolution reports for details, especially
#> https://r-spatial.org/r/2023/05/15/evolution4.html.
#> It may be desirable to make the sf package available;
#> package maintainers should consider adding sf to Suggests:.
#> The sp package is now running under evolution status 2
#>      (status 2 uses the sf package in place of rgdal)
#> {peacesciencer} includes additional remote data for separate download. Please type ?download_extdata() for more information.
#> This message disappears on load when these data are downloaded and in the package's `extdata` directory.
library(lubridate)

packageVersion("peacesciencer")
#> [1] '1.2.0'
packageVersion("isard") # a dependency, but not formally required.
#> [1] '0.1.0'
Sys.Date()
#> [1] "2025-07-17"

State-Year Data

The most basic form of data peacesciencer creates is state-year, by way of create_stateyears(). create_stateyears() has two arguments: system and mry. system takes either “cow” or “gw”, depending on whether the user wants Correlates of War state-years or Gleditsch-Ward state-years. It defaults to “cow” in the absence of a user-specified override, given the prominence of Correlates of War data in the peace science ecosystem. mry takes a logical (TRUE or FALSE), depending on whether the user wants the function to extend to the most recently concluded calendar year (2024). The Correlates of War state system data extend to the end of 2016 while the Gleditsch-Ward state system data extend to the end of 2017. This argument allows the researcher to extend the data a few years, under the (reasonable) assumption that there have been no fundamental changes to the composition of the state system since these data sets were last updated. mry defaults to TRUE in the absence of a user-specified override.

This will create Correlates of War state-year data from 1816 to 2024.

create_stateyears()
#> # A tibble: 17,511 × 3
#>    ccode cw_name                   year
#>  * <dbl> <chr>                    <int>
#>  1     2 United States of America  1816
#>  2     2 United States of America  1817
#>  3     2 United States of America  1818
#>  4     2 United States of America  1819
#>  5     2 United States of America  1820
#>  6     2 United States of America  1821
#>  7     2 United States of America  1822
#>  8     2 United States of America  1823
#>  9     2 United States of America  1824
#> 10     2 United States of America  1825
#> # ℹ 17,501 more rows

This will create Gleditsch-Ward state-year data from 1816 to 2017.

create_stateyears(system = "gw", mry = FALSE)
#> # A tibble: 19,864 × 4
#>    gwcode gw_name                  microstate  year
#>  *  <dbl> <chr>                         <dbl> <int>
#>  1      2 United States of America          0  1816
#>  2      2 United States of America          0  1817
#>  3      2 United States of America          0  1818
#>  4      2 United States of America          0  1819
#>  5      2 United States of America          0  1820
#>  6      2 United States of America          0  1821
#>  7      2 United States of America          0  1822
#>  8      2 United States of America          0  1823
#>  9      2 United States of America          0  1824
#> 10      2 United States of America          0  1825
#> # ℹ 19,854 more rows
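To make the idea of "expanding" a state system into state-years concrete, here is a minimal base R sketch on toy data. It is not the package's internal code: the table toy_states, its codes, names, and years are all hypothetical, and it only illustrates the general step of turning one membership spell per state into one row per state per year.

```r
# Toy state-system table; the codes, names, and years are hypothetical.
toy_states <- data.frame(
  code  = c(1, 2),
  name  = c("Alpha", "Beta"),
  start = c(1990, 1992),
  end   = c(1993, 1994)
)

# Expand each membership spell into one row per year the state is in the system.
toy_stateyears <- do.call(rbind, lapply(seq_len(nrow(toy_states)), function(i) {
  data.frame(
    code = toy_states$code[i],
    name = toy_states$name[i],
    year = seq(toy_states$start[i], toy_states$end[i])
  )
}))

toy_stateyears
```

Alpha contributes four rows (1990 to 1993) and Beta three (1992 to 1994), which is the same shape as the create_stateyears() output above, only much smaller.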

Dyad-Year Data

create_dyadyears() is one of the most useful functions in peacesciencer, transforming the raw Correlates of War state system data (cow_states in peacesciencer) or Gleditsch-Ward state system data (gw_states) into all possible dyad-years. It has three arguments. system and mry operate the same as they do in create_stateyears(). The additional argument, directed, also takes a logical (TRUE or FALSE). The default here is TRUE, returning directed dyad-year data (useful for dyadic conflict analyses where the initiator/target distinction matters). FALSE returns non-directed dyad-year data, useful for cases where the initiator/target distinction does not matter and the researcher cares more about the presence or absence of a conflict. The convention for non-directed dyad-year data is that ccode2 > ccode1, and the underlying code of create_dyadyears() simply takes the directed dyad-year data and chops it in half with that rule.

Here are all Correlates of War dyad-years from 1816 to 2024.

create_dyadyears()
#> Joining with `by = join_by(ccode1, ccode2, year)`
#> # A tibble: 2,214,930 × 3
#>    ccode1 ccode2  year
#>     <dbl>  <dbl> <int>
#>  1      2     20  1920
#>  2      2     20  1921
#>  3      2     20  1922
#>  4      2     20  1923
#>  5      2     20  1924
#>  6      2     20  1925
#>  7      2     20  1926
#>  8      2     20  1927
#>  9      2     20  1928
#> 10      2     20  1929
#> # ℹ 2,214,920 more rows

Here are all Gleditsch-Ward dyad-years with the same temporal domain.

create_dyadyears(system = "gw")
#> Joining with `by = join_by(gwcode1, gwcode2, year, microstate1,
#> microstate2)`
#> # A tibble: 2,599,466 × 5
#>    gwcode1 gwcode2  year microstate1 microstate2
#>      <dbl>   <dbl> <int>       <dbl>       <dbl>
#>  1       2      20  1867           0           0
#>  2       2      20  1868           0           0
#>  3       2      20  1869           0           0
#>  4       2      20  1870           0           0
#>  5       2      20  1871           0           0
#>  6       2      20  1872           0           0
#>  7       2      20  1873           0           0
#>  8       2      20  1874           0           0
#>  9       2      20  1875           0           0
#> 10       2      20  1876           0           0
#> # ℹ 2,599,456 more rows
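The ccode2 > ccode1 convention for non-directed dyads can be sketched on a toy example in base R. This is only an illustration of the rule described above, not the body of create_dyadyears(); the three state codes are picked arbitrarily and there is no year dimension here.

```r
codes <- c(2, 20, 200)  # three state codes, chosen only for illustration

# All ordered pairs, minus each state paired with itself: directed dyads.
directed <- expand.grid(ccode1 = codes, ccode2 = codes)
directed <- directed[directed$ccode1 != directed$ccode2, ]

# Keep only rows where ccode2 > ccode1: non-directed dyads.
nondirected <- directed[directed$ccode2 > directed$ccode1, ]

nrow(directed)     # 6
nrow(nondirected)  # 3
```

Each unordered pair appears twice in the directed data (2-20 and 20-2) and once in the non-directed data (2-20 only), which is why the rule cuts the row count in half. In the package itself, the equivalent request is create_dyadyears(directed = FALSE).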

Major vs. Major Dyad-Years

Consider this section of the vignette as a comparison to the kind of dyad-year data that EUGene would create for a user on request. EUGene would apparently create these as specific dyad-year types, whereas peacesciencer treats them as case exclusions you can do after the fact given other functionality in the package. For example, here are just major vs. major dyads. For simplicity’s sake, these will all be directed dyad-years at their core (and captured with cow_ddy in the package as a shortcut).

cow_ddy %>% add_cow_majors() %>%
  filter(cowmaj1 == 1 & cowmaj2 == 1)
#> # A tibble: 6,476 × 5
#>    ccode1 ccode2  year cowmaj1 cowmaj2
#>     <dbl>  <dbl> <int>   <dbl>   <dbl>
#>  1      2    200  1898       1       1
#>  2      2    200  1899       1       1
#>  3      2    200  1900       1       1
#>  4      2    200  1901       1       1
#>  5      2    200  1902       1       1
#>  6      2    200  1903       1       1
#>  7      2    200  1904       1       1
#>  8      2    200  1905       1       1
#>  9      2    200  1906       1       1
#> 10      2    200  1907       1       1
#> # ℹ 6,466 more rows

Major vs. Any State Dyad-Years

These are all dyad-years where at least one state in the dyad is a major power.

cow_ddy %>% add_cow_majors() %>%
  filter(cowmaj1 == 1 | cowmaj2 == 1)
#> # A tibble: 205,114 × 5
#>    ccode1 ccode2  year cowmaj1 cowmaj2
#>     <dbl>  <dbl> <int>   <dbl>   <dbl>
#>  1      2     20  1920       1       0
#>  2      2     20  1921       1       0
#>  3      2     20  1922       1       0
#>  4      2     20  1923       1       0
#>  5      2     20  1924       1       0
#>  6      2     20  1925       1       0
#>  7      2     20  1926       1       0
#>  8      2     20  1927       1       0
#>  9      2     20  1928       1       0
#> 10      2     20  1929       1       0
#> # ℹ 205,104 more rows

All Contiguous Dyad-Years

These are all dyad-years in which the two states share a land border or are separated by 400 miles of water or less (contiguity types 1 through 5), though the documentation for add_contiguity() cautions that users should be at least a little critical of the contiguity data.

cow_ddy %>% add_contiguity() %>%
  filter(conttype %in% c(1:5))
#> # A tibble: 81,060 × 4
#>    ccode1 ccode2  year conttype
#>     <dbl>  <dbl> <dbl>    <dbl>
#>  1      2     20  1920        1
#>  2      2     20  1921        1
#>  3      2     20  1922        1
#>  4      2     20  1923        1
#>  5      2     20  1924        1
#>  6      2     20  1925        1
#>  7      2     20  1926        1
#>  8      2     20  1927        1
#>  9      2     20  1928        1
#> 10      2     20  1929        1
#> # ℹ 81,050 more rows

All Dyad-Years Within a Set Distance

These are all dyad-years whose minimum distance is at or below some user-specified threshold (in kilometers). This leans on add_minimum_distance(), which has the side effect of truncating the left bound of the temporal domain to 1886 (as of right now). These are all Correlates of War dyad-years from 1886 to 2019 separated by 1,000 kilometers or fewer.

cow_ddy %>% 
  # I recommend `use_extdata = TRUE`, but this is quicker.
  add_minimum_distance(use_extdata = FALSE) %>%
  filter(mindist <= 1000)
#> # A tibble: 167,532 × 4
#>    ccode1 ccode2  year mindist
#>     <dbl>  <dbl> <dbl>   <dbl>
#>  1      2     20  1921       0
#>  2      2     20  1922       0
#>  3      2     20  1923       0
#>  4      2     20  1924       0
#>  5      2     20  1925       0
#>  6      2     20  1926       0
#>  7      2     20  1927       0
#>  8      2     20  1928       0
#>  9      2     20  1929       0
#> 10      2     20  1930       0
#> # ℹ 167,522 more rows

Dyadic Dispute-Year Data

Dyadic dispute-year data come pre-processed in peacesciencer. Another vignette shows how these are transformed to true dyad-year data, but they are also available for analysis as they are. For example, the (directed) dyadic dispute-year Gibler-Miller-Little (GML) MID data are available as gml_dirdisp. Here, we can add information to these dyadic dispute-years to identify contiguity relationships and Correlates of War major power status.

gml_dirdisp %>% add_contiguity() %>% add_cow_majors()
#> # A tibble: 10,276 × 42
#>    dispnum ccode1 ccode2  year midongoing midonset sidea1 sidea2 revstate1
#>      <dbl>  <dbl>  <dbl> <dbl>      <dbl>    <dbl>  <dbl>  <dbl>     <dbl>
#>  1       2      2    200  1902          1        1      1      0         1
#>  2       2    200      2  1902          1        1      0      1         1
#>  3       3    300    345  1913          1        1      1      0         1
#>  4       3    345    300  1913          1        1      0      1         0
#>  5       4    200    339  1946          1        1      0      1         0
#>  6       4    339    200  1946          1        1      1      0         0
#>  7       7    200    651  1951          1        1      1      0         0
#>  8       7    200    651  1952          1        0      1      0         0
#>  9       7    651    200  1951          1        1      0      1         1
#> 10       7    651    200  1952          1        0      0      1         1
#> # ℹ 10,266 more rows
#> # ℹ 33 more variables: revstate2 <dbl>, revtype11 <dbl>, revtype12 <dbl>,
#> #   revtype21 <dbl>, revtype22 <dbl>, fatality1 <dbl>, fatality2 <dbl>,
#> #   fatalpre1 <dbl>, fatalpre2 <dbl>, hiact1 <dbl>, hiact2 <dbl>,
#> #   hostlev1 <dbl>, hostlev2 <dbl>, orig1 <dbl>, orig2 <dbl>, hiact <dbl>,
#> #   hostlev <dbl>, mindur <dbl>, maxdur <dbl>, outcome <dbl>, settle <dbl>,
#> #   fatality <dbl>, fatalpre <dbl>, stmon <dbl>, endmon <dbl>, recip <dbl>, …

Users interested in the Correlates of War MID data will find them available as cow_mid_dirdisps. Future updates may change the object names for better standardization, but this is how it is now.

State-Day Data

peacesciencer comes with a create_statedays() function. This is admittedly more a proof of concept, as it is difficult to think of many daily data sets in peace science, certainly ones with coverage into the 19th century. No matter, create_statedays() will create these data. It has the same system and mry arguments (and the same defaults) as create_stateyears().

Here are all Correlates of War state-days from 1816 to 2024.

create_statedays()
#> # A tibble: 6,345,986 × 3
#>    ccode cw_name                  date      
#>    <dbl> <chr>                    <date>    
#>  1     2 United States of America 1816-01-01
#>  2     2 United States of America 1816-01-02
#>  3     2 United States of America 1816-01-03
#>  4     2 United States of America 1816-01-04
#>  5     2 United States of America 1816-01-05
#>  6     2 United States of America 1816-01-06
#>  7     2 United States of America 1816-01-07
#>  8     2 United States of America 1816-01-08
#>  9     2 United States of America 1816-01-09
#> 10     2 United States of America 1816-01-10
#> # ℹ 6,345,976 more rows

Here are all Gleditsch-Ward state-days with the same temporal domain.

create_statedays(system = "gw")
#> # A tibble: 7,497,730 × 4
#>    gwcode gw_name                  microstate date      
#>     <dbl> <chr>                         <dbl> <date>    
#>  1      2 United States of America          0 1816-01-01
#>  2      2 United States of America          0 1816-01-02
#>  3      2 United States of America          0 1816-01-03
#>  4      2 United States of America          0 1816-01-04
#>  5      2 United States of America          0 1816-01-05
#>  6      2 United States of America          0 1816-01-06
#>  7      2 United States of America          0 1816-01-07
#>  8      2 United States of America          0 1816-01-08
#>  9      2 United States of America          0 1816-01-09
#> 10      2 United States of America          0 1816-01-10
#> # ℹ 7,497,720 more rows

I can conjure an application where a user may want to think of daily conflict episodes within the Gleditsch-Ward domain. The UCDP armed conflict data have more precise dates than, say, the Correlates of War MID data, making such an analysis possible. However, the UCDP data record no conflicts before 1946, and you should reflect that in peacesciencer with something like this. This requires lubridate for the year() function.

create_statedays(system = "gw") %>%
  filter(year(date) >= 1946)
#> # A tibble: 4,540,001 × 4
#>    gwcode gw_name                  microstate date      
#>     <dbl> <chr>                         <dbl> <date>    
#>  1      2 United States of America          0 1946-01-01
#>  2      2 United States of America          0 1946-01-02
#>  3      2 United States of America          0 1946-01-03
#>  4      2 United States of America          0 1946-01-04
#>  5      2 United States of America          0 1946-01-05
#>  6      2 United States of America          0 1946-01-06
#>  7      2 United States of America          0 1946-01-07
#>  8      2 United States of America          0 1946-01-08
#>  9      2 United States of America          0 1946-01-09
#> 10      2 United States of America          0 1946-01-10
#> # ℹ 4,539,991 more rows

State-Month Data

State-months are simple aggregations of state-days. You can accomplish this with a few extra commands after create_statedays().

create_statedays(system = "gw") %>%
  mutate(year = year(date),
         month = month(date)) %>%
  distinct(gwcode, gw_name, year, month)
#> # A tibble: 246,422 × 4
#>    gwcode gw_name                   year month
#>     <dbl> <chr>                    <dbl> <dbl>
#>  1      2 United States of America  1816     1
#>  2      2 United States of America  1816     2
#>  3      2 United States of America  1816     3
#>  4      2 United States of America  1816     4
#>  5      2 United States of America  1816     5
#>  6      2 United States of America  1816     6
#>  7      2 United States of America  1816     7
#>  8      2 United States of America  1816     8
#>  9      2 United States of America  1816     9
#> 10      2 United States of America  1816    10
#> # ℹ 246,412 more rows

State-Quarter Data

There is some assumption worth belaboring about what a “quarter” would look like in a more general context, but it might look something like this. Again, this is an aggregation of create_statedays().

create_statedays(system = "gw") %>%
  mutate(year = year(date),
         month = month(date)) %>%
  filter(month %in% c(1, 4, 7, 10)) %>%
  mutate(quarter = case_when(
    month == 1 ~ "Q1",
    month == 4 ~ "Q2",
    month == 7 ~ "Q3",
    month == 10 ~ "Q4"
  )) %>%
  distinct(gwcode, gw_name, year, quarter)
#> # A tibble: 82,090 × 4
#>    gwcode gw_name                   year quarter
#>     <dbl> <chr>                    <dbl> <chr>  
#>  1      2 United States of America  1816 Q1     
#>  2      2 United States of America  1816 Q2     
#>  3      2 United States of America  1816 Q3     
#>  4      2 United States of America  1816 Q4     
#>  5      2 United States of America  1817 Q1     
#>  6      2 United States of America  1817 Q2     
#>  7      2 United States of America  1817 Q3     
#>  8      2 United States of America  1817 Q4     
#>  9      2 United States of America  1818 Q1     
#> 10      2 United States of America  1818 Q2     
#> # ℹ 82,080 more rows
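One assumption in the approach above is worth spelling out: because it keeps only days in January, April, July, and October, a state that is in the system for only part of a quarter (say, only February and March) would get no row for that quarter. A possible alternative, sketched here in base R on a few hypothetical dates, is to label every date with its calendar quarter first and then take distinct state-year-quarter combinations. The sketch uses base R's quarters(); it is an illustration, not something the package does for you.

```r
# Four arbitrary dates, one in each calendar quarter of 1816.
dates <- as.Date(c("1816-01-15", "1816-05-02", "1816-08-30", "1816-12-31"))

# quarters() labels each Date with its calendar quarter ("Q1" to "Q4").
toy_quarters <- data.frame(date = dates, quarter = quarters(dates))

toy_quarters
```

Applied to state-days, the same labeling could replace the filter() and case_when() steps, with distinct() on the state identifiers, year, and quarter doing the aggregation.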

Leader-Day (Leader-Month, Leader-Year) Data

peacesciencer has leader-level units of analysis as well, which can be easily created with the modified Archigos (archigos) data in peacesciencer. The data are version 4.1.

archigos
#> # A tibble: 3,409 × 11
#>    obsid    gwcode leadid leader yrborn gender startdate  enddate    entry exit 
#>    <chr>     <dbl> <chr>  <chr>   <dbl> <chr>  <date>     <date>     <chr> <chr>
#>  1 USA-1869      2 81dcc… Grant    1822 M      1869-03-04 1877-03-04 Regu… Regu…
#>  2 USA-1877      2 81dcc… Hayes    1822 M      1877-03-04 1881-03-04 Regu… Regu…
#>  3 USA-188…      2 81dcf… Garfi…   1831 M      1881-03-04 1881-09-19 Regu… Irre…
#>  4 USA-188…      2 81dcf… Arthur   1829 M      1881-09-19 1885-03-04 Regu… Regu…
#>  5 USA-1885      2 34fb1… Cleve…   1837 M      1885-03-04 1889-03-04 Regu… Regu…
#>  6 USA-1889      2 81dcf… Harri…   1833 M      1889-03-04 1893-03-04 Regu… Regu…
#>  7 USA-1893      2 34fb1… Cleve…   1837 M      1893-03-04 1897-03-04 Regu… Regu…
#>  8 USA-1897      2 81dcf… McKin…   1843 M      1897-03-04 1901-09-14 Regu… Irre…
#>  9 USA-1901      2 81dd2… Roose…   1858 M      1901-09-14 1909-03-04 Regu… Regu…
#> 10 USA-1909      2 81dd2… Taft     1857 M      1909-03-04 1913-03-04 Regu… Regu…
#> # ℹ 3,399 more rows
#> # ℹ 1 more variable: exitcode <chr>

create_leaderdays() will create leader-day data from archigos.

create_leaderdays()
#> # A tibble: 5,298,380 × 5
#>    obsid    gwcode leader date       yrinoffice
#>    <chr>     <dbl> <chr>  <date>          <dbl>
#>  1 USA-1869      2 Grant  1869-03-04          1
#>  2 USA-1869      2 Grant  1869-03-05          1
#>  3 USA-1869      2 Grant  1869-03-06          1
#>  4 USA-1869      2 Grant  1869-03-07          1
#>  5 USA-1869      2 Grant  1869-03-08          1
#>  6 USA-1869      2 Grant  1869-03-09          1
#>  7 USA-1869      2 Grant  1869-03-10          1
#>  8 USA-1869      2 Grant  1869-03-11          1
#>  9 USA-1869      2 Grant  1869-03-12          1
#> 10 USA-1869      2 Grant  1869-03-13          1
#> # ℹ 5,298,370 more rows

I do want to note one thing about the leader-level functions in this package. Whereas Correlates of War state system membership is often the default system for a lot of functions (prominently create_stateyears() and create_dyadyears()), the Gleditsch-Ward system is the default for the leader-level functions because that is the state system around which the Archigos project created its leader data. Moreover, the leader data aren’t exactly tethered to the Gleditsch-Ward state system for dates either (e.g. there are leader entries for Gleditsch-Ward states that aren’t in the system yet). In a case like this, you can standardize these leader data to either the Correlates of War system or the Gleditsch-Ward system with the standardize argument. By default, the option here is “none” (i.e. return all available leader days recorded in the Archigos data). “cow” or “gw” standardizes the leader data to Correlates of War state system membership or Gleditsch-Ward state system membership, respectively.

create_leaderdays(standardize = "cow")
#> Joining with `by = join_by(gwcode, year)`
#> Joining with `by = join_by(ccode, date)`
#> # A tibble: 4,824,967 × 5
#>    obsid    ccode leader date       yrinoffice
#>    <chr>    <dbl> <chr>  <date>          <dbl>
#>  1 USA-1869     2 Grant  1869-03-04          1
#>  2 USA-1869     2 Grant  1869-03-05          1
#>  3 USA-1869     2 Grant  1869-03-06          1
#>  4 USA-1869     2 Grant  1869-03-07          1
#>  5 USA-1869     2 Grant  1869-03-08          1
#>  6 USA-1869     2 Grant  1869-03-09          1
#>  7 USA-1869     2 Grant  1869-03-10          1
#>  8 USA-1869     2 Grant  1869-03-11          1
#>  9 USA-1869     2 Grant  1869-03-12          1
#> 10 USA-1869     2 Grant  1869-03-13          1
#> # ℹ 4,824,957 more rows

The user may want to think about some additional post-processing on top of this, but this is enough to get started. From there, the same process that creates state-months can create something like leader-months.

create_leaderdays() %>%
  mutate(year = year(date),
         month = month(date)) %>%
  group_by(gwcode, obsid, year, month) %>%
  slice(1)
#> # A tibble: 177,128 × 7
#> # Groups:   gwcode, obsid, year, month [177,128]
#>    obsid    gwcode leader date       yrinoffice  year month
#>    <chr>     <dbl> <chr>  <date>          <dbl> <dbl> <dbl>
#>  1 USA-1869      2 Grant  1869-03-04          1  1869     3
#>  2 USA-1869      2 Grant  1869-04-01          1  1869     4
#>  3 USA-1869      2 Grant  1869-05-01          1  1869     5
#>  4 USA-1869      2 Grant  1869-06-01          1  1869     6
#>  5 USA-1869      2 Grant  1869-07-01          1  1869     7
#>  6 USA-1869      2 Grant  1869-08-01          1  1869     8
#>  7 USA-1869      2 Grant  1869-09-01          1  1869     9
#>  8 USA-1869      2 Grant  1869-10-01          1  1869    10
#>  9 USA-1869      2 Grant  1869-11-01          1  1869    11
#> 10 USA-1869      2 Grant  1869-12-01          1  1869    12
#> # ℹ 177,118 more rows

And here are leader-years, which are pre-packaged as a peacesciencer function. The package also adds some information about leader gender, an approximation of the leader’s age that year (i.e. year - yrborn), and a running count (starting at 1) for the leader’s tenure (in years).

create_leaderyears()
#> # A tibble: 17,686 × 7
#>    obsid    leader gwcode gender leaderage  year yrinoffice
#>    <chr>    <chr>   <dbl> <chr>      <dbl> <dbl>      <dbl>
#>  1 USA-1869 Grant       2 M             47  1869          1
#>  2 USA-1869 Grant       2 M             48  1870          2
#>  3 USA-1869 Grant       2 M             49  1871          3
#>  4 USA-1869 Grant       2 M             50  1872          4
#>  5 USA-1869 Grant       2 M             51  1873          5
#>  6 USA-1869 Grant       2 M             52  1874          6
#>  7 USA-1869 Grant       2 M             53  1875          7
#>  8 USA-1869 Grant       2 M             54  1876          8
#>  9 USA-1869 Grant       2 M             55  1877          9
#> 10 USA-1877 Hayes       2 M             55  1877          1
#> # ℹ 17,676 more rows
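The two derived columns are simple enough to reproduce by hand. The following base R sketch recomputes them for the USA-1869 observation, using only values visible in the output above (Grant's yrborn of 1822 and the years 1869 to 1877). It is meant as a check on what the columns mean, not as the code create_leaderyears() runs internally.

```r
yrborn <- 1822       # Grant's birth year, from the archigos output above
year   <- 1869:1877  # calendar years covered by the USA-1869 observation

toy_leaderyears <- data.frame(
  year       = year,
  leaderage  = year - yrborn,   # approximate age: year - yrborn
  yrinoffice = seq_along(year)  # running count of tenure, starting at 1
)

toy_leaderyears
```

The ages run from 47 to 55 and the tenure count from 1 to 9, matching the first nine rows of the create_leaderyears() output. The age is an approximation because it ignores where the leader's birthday falls within the year.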

Leader Dyad-Year Data

peacesciencer can also create leader dyad-year data by way of create_leaderdyadyears(). You can see some of the underlying code that is creating these data. It’s a lot of code, it would take a lot of time to run from scratch, and the ensuing output is too large to store as an R data object in the package because CRAN hard-caps package size at 5 MB. Instead, users who want these data should run download_extdata() when they first install or update the package. After that, they can run create_leaderdyadyears() to create the full universe of leader dyad-year data.

# create_leaderdyadyears() is effectively doing this.
# Let's do the G-W leader dyad-year data for illustration's sake.
# `download_extdata()` will download these data into the package directory.
# Thus, it is *not* downloading the data fresh each time.

the_url <- "https://svmiller.com/R/peacesciencer/gw_dir_leader_dyad_years.rds"
readRDS(url(the_url)) %>%
  declare_attributes(data_type = "leader_dyad_year", system = "gw")
#> # A tibble: 2,336,990 × 11
#>     year obsid1   obsid2   gwcode1 gwcode2 gender1 gender2 leaderage1 leaderage2
#>    <int> <chr>    <chr>      <dbl>   <dbl> <chr>   <chr>        <dbl>      <dbl>
#>  1  1870 AFG-1868 AUH-1848     700     300 M       M               45         40
#>  2  1870 AFG-1868 BAV-1864     700     245 M       M               45         39
#>  3  1870 AFG-1868 BRA-1840     700     140 M       M               45         45
#>  4  1870 AFG-1868 CHN-1861     700     710 M       M               45         35
#>  5  1870 AFG-1868 COS-1870     700      94 M       M               45         39
#>  6  1870 AFG-1868 ECU-1869     700     130 M       M               45         49
#>  7  1870 AFG-1868 GMY-1858     700     255 M       M               45         73
#>  8  1870 AFG-1868 GRC-1863     700     350 M       M               45         25
#>  9  1870 AFG-1868 IRN-1848     700     630 M       M               45         39
#> 10  1870 AFG-1868 JPN-1868     700     740 M       M               45         18
#> # ℹ 2,336,980 more rows
#> # ℹ 2 more variables: yrinoffice1 <dbl>, yrinoffice2 <dbl>

# ^ compare with:
# download_extdata()
# create_leaderdyadyears()