Create and Extend Strategic (International) Rivalry Data in R

This Functionality is Now in `{peacesciencer}` ⤵️

The processes described here have been included in {peacesciencer}, an R package for the creation of all kinds of peace science data. The strategic_rivalries data frame is still in {stevemisc} as a legacy. A slightly modified version is included in {peacesciencer} as the td_rivalries data. You can add strategic rivalry data to state-year or dyad-year data in {peacesciencer} with the add_strategic_rivalries() function. Please check out the website for {peacesciencer} for updates on its continued development.

Albert Bettannier's (1887) painting stylizes the Franco-German enmity and its focal point at the 'black spot' of Alsace-Lorraine.

This is another one of those things I find myself doing from time to time in my research so it might be advisable to write it down and remember how to do it.

Rivalries are an important concept in the quantitative study of international conflict. This relates to one of the more important takeaways from the data on inter-state disputes: they’re not randomly assigned across dyads. A few dyads are disproportionately responsible for large conflicts and wars. I use the India-Pakistan dyad in my upper-division conflict course as an illustration of this phenomenon. The dyad was created in 1947 following the contentious partition of the British Raj along the ill-defined Radcliffe Line. War immediately followed for the cause of territorial consolidation and relations have been tense ever since. Indeed, India and Pakistan have had four wars since their mutual creation and have been in a MID (often a fatal MID) for around 70% of their existence. The rivalry scholarship in international relations argues that these previous crises lead to the emergence of rivalry relationships in which states view each other as threats and compete against each other. This makes future crises more likely.

There are a number of ways of identifying rivalry relationships. Most classic rivalry scholarship identified rivalries based on past disputes. Diehl and Goertz (2000), for example, develop a rivalry classification that depends on the volume of MIDs in a given window of time. However useful, this “dispute-density” approach uses past dispute history to predict future disputes in a manner not unlike how roll call votes in the past are used to predict roll call votes in the future. Any observed relationship might simply be a tautology.

For that reason, I’ve been drawn to Thompson’s (2000) and later Thompson and Dreyer’s (2012) “diplomatic history” approach. Herein, researchers look into diplomatic relations to see how states interact with each other, whether diplomatic communiqués treat the other side as a threat, and whether military exercises target the other side with that in mind. This approach might be more “perceptive” than the dispute-density approach but it is at least conceptually distinct from the phenomenon we want rivalry to explain (i.e. conflict recurrence).

However, the ensuing data are less of a data set one can download and more a history of rivalry relationships that Thompson and Dreyer summarize in their book. Nevertheless, the appendix of their book, and a few lines of R code, can create some data sets to use in our standard dyad-year modeling approach to conflict onset.

The (Raw) Data
Prep the Data
Create (Non)-Directed Rivalry-Year Data

The (Raw) Data

I scanned Thompson and Dreyer’s (2012) book at the appendix and created a spreadsheet that I make available in my {stevemisc} package as a data object titled strategic_rivalries. Alternatively, a .csv of the data is available here. Let’s load that raw data and take a gander at it. The data are purposely minimal right now because it’s a quick scan of the information from the appendix. The goal is to record it in a spreadsheet and extend it later.

library(tidyverse)
library(stevemisc)
library(knitr) # for kable()
data("strategic_rivalries")

# alternatively:
# strategic_rivalries <- read_csv("https://gist.githubusercontent.com/svmiller/63dace4aa5a00eddce307d964c7bac23/raw/3941ed5654cff77dc4509ff7b81e664cf8b0875e/strategic_rivalries.csv")

strategic_rivalries %>%
  head(5) %>%
    kable(., format="html",
        table.attr='id="stevetable"',
        caption = "The First Five Rows of the Strategic Rivalry Data",
        align=c("c","l","l", "l", "l", "c","c","l","c","c","c"))

The First Five Rows of the Strategic Rivalry Data
rivalryno	rivalryname	sidea	sideb	styear	endyear	region	type1	type2	type3
1	Austria-France	Austria	France	1494	1918	European GPs	spatial	positional	NA
2	Austria-Ottoman Empire	Austria	Ottoman Empire	1494	1908	European GPs	spatial	positional	NA
4	France-Spain	France	Spain	1494	1700	European GPs	positional	spatial	NA
3	Britain-France 1	Great Britain	France	1494	1716	European GPs	spatial	positional	NA
5	Ottoman Empire-Spain	Ottoman Empire	Spain	1494	1585	European GPs	spatial	positional	NA

rivalryno is a unique rivalry number that starts at 1 and ends at 197. rivalryname is the name that Thompson and Dreyer give the rivalry. In most cases, it’s a simple concatenation of the two participants in alphabetical order. Cases where there are multiple rivalries in a given dyad will have numbers next to them (e.g. Britain-Spain 1, Britain-Spain 2). sidea and sideb are the participants in the rivalry. The rivalry data are non-directed but sidea is whatever country comes first alphabetically. styear and endyear are the start year and end year (respectively) for the rivalry. Ongoing rivalries have a right bound of 2010, but could conceivably be extended to the present year in almost every case.

region is where Thompson and Dreyer code the rivalry as occurring. These regions that Thompson and Dreyer describe are multiple and mostly consistent across time and space, but users interested in regional rivalries may want to explore these locations and standardize them further. For example, the “Germany-United States 2” rivalry (1933-1945) and “Russia-United States 2” (2007-present) rivalries are both in “Multiple” regions despite some different focal points between the two whereas the “Russia-United States 1” (1945-1989) rivalry is in the “Global” region. Of note: European great power rivalries have their own region (“European GPs”).

Finally, the type1, type2, and type3 variables describe the nature of the rivalry and what it concerned, in order of importance. Not every rivalry has a second or third dimension, but every rivalry must have a primary dimension coded in the type1 variable. There are four categories of rivalry in the Thompson and Dreyer (2012) data. “Spatial” rivalries are contested over the control of territory, broadly defined. The Armenia-Azerbaijan rivalry (1991-present) is a good example of an exclusively spatial rivalry since most of the relationship concerns Nagorno-Karabakh. “Positional” rivalries are competitions for relative shares of influence in a region. The Iran-Israel rivalry (1979-present) is a good example of an exclusively positional rivalry since the concern largely hinges on Israel’s misgivings about Iran’s aspirations in the region after the overthrow of the Shah in 1979. “Ideological” rivalries are relationships where two sides contest the virtues of competing economic/political systems. The “Costa Rica-Nicaragua 2” (1948-1990) rivalry is the only exclusively ideological rivalry in the data. Therein, democratic Costa Rica and Marxist Nicaragua actively advocated regime change for the other side. “Interventionary” rivalries are new types of rivalries that Thompson and Dreyer introduce to this project. These are relationships in which states intrude into the internal affairs of other states for the sake of leverage in the other state’s decision-making. They are often done without clear spatial, positional, or even ideological reasons. The concept borrows from Cliffe’s (1999) discussion of “mutual intervention” in the Horn of Africa and it should be no surprise that all interventionary rivalries in the data are located in Central Africa or East Africa.

Many rivalries have a second dimension but very few have three dimensions. The West Germany-East Germany rivalry (1949-1973) is an accessible three-dimensional rivalry in this data. Therein, the rivalry was primarily ideological (type1), but had a secondary positional aspect (type2) and a minimal, but still important, spatial/territorial element (type3) on top of that.
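If you want to see how many dimensions each rivalry has, counting the non-missing type columns is one way to do it. What follows is only a minimal sketch on a toy tibble built from the examples mentioned above; `toy_rivalries`, `toy_dims`, and `ndims` are names I made up for illustration, and the same `mutate()` should carry over to the full strategic_rivalries data.

```r
library(dplyr)

# Toy rows taken from the examples discussed above; this is not the full data.
toy_rivalries <- tibble(
  rivalryname = c("Armenia-Azerbaijan", "Austria-France", "West Germany-East Germany"),
  type1 = c("spatial", "spatial", "ideological"),
  type2 = c(NA, "positional", "positional"),
  type3 = c(NA, NA, "spatial")
)

# Count how many of the three type columns are filled in for each rivalry.
toy_dims <- toy_rivalries %>%
  mutate(ndims = (!is.na(type1)) + (!is.na(type2)) + (!is.na(type3)))

# Tabulate rivalries by number of dimensions.
toy_dims %>% count(ndims)
```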

There is more coding necessary to get the most use out of these data, but the raw data can already communicate some basic descriptive statistics about strategic rivalries. For example, here are the 10 longest rivalries in the data. It’s unsurprising that the top seven are European great power rivalries.

strategic_rivalries %>%
  mutate(duration = (endyear - styear)+1) %>%
  arrange(-duration) %>%
  head(10) %>%
  select(rivalryname, styear, endyear, region, type1, duration) %>%
  kable(., format="html",
    table.attr='id="stevetable"',
    caption = "The Ten Longest Rivalries in the History of the World",
    align=c("l","c","c","l","l","c"))

The Ten Longest Rivalries in the History of the World
rivalryname	styear	endyear	region	type1	duration
Austria-France	1494	1918	European GPs	spatial	425
Austria-Ottoman Empire	1494	1908	European GPs	spatial	415
Ottoman Empire-Russia	1668	1918	European GPs	spatial	251
Ottoman Empire-Venice	1494	1717	European GPs	spatial	224
Britain-France 1	1494	1716	European GPs	spatial	223
France-Spain	1494	1700	European GPs	positional	207
France-Prussia	1756	1955	European GPs	spatial	200
Colombia-Venezuela	1831	2010	South America	spatial	180
Britain-Russia	1778	1956	European GPs	positional	179
Bolivia-Chile	1836	2010	South America	spatial	175

We can also do a basic summary of the distribution of rivalries by primary rivalry type. The modal category is clearly spatial rivalry, which accounts for just over 45% of all rivalries by primary type. Positional rivalries over relative shares of influence in a region are the next most common. The least common rivalry type is interventionary. These are rivalries exclusive to Sub-Saharan Africa.

strategic_rivalries %>% group_by(type1) %>% 
  summarize(n = n()) %>% ungroup() %>%
  arrange(-n) %>%
  mutate(percent = paste0(mround(n/sum(n)),"%")) %>%
  kable(., format="html",
        table.attr='id="stevetable"',
        caption = "The Distribution of Rivalries by Primary Rivalry Type, 1494-2010",
        align = c("l","c","c"))

The Distribution of Rivalries by Primary Rivalry Type, 1494-2010
type1	n	percent
spatial	89	45.18%
positional	65	32.99%
ideological	30	15.23%
interventionary	13	6.6%

Prep the Data

A user will need to prep the data a little to get some usable dyad-year data from this raw list of strategic rivalries. Importantly, the data are non-directed but the dyad is “ordered” alphabetically rather than by a numeric coding system (a la Correlates of War [CoW] state system membership). This will make for some headaches in standard dyad-year data because Portugal (ccode: 235) precedes Spain (ccode: 230) in the rivalry list, but will never precede Spain in a non-directed dyad-year design that relies on CoW system membership data.

Fortunately, the {countrycode} package is great for this. We’ll first convert sidea and sideb to a ccodea and ccodeb. The package will take care of almost everything here, though there are a few caveats that I’ll highlight in the code below.

require(countrycode)

strategic_rivalries %>%
  mutate(ccodea = countrycode(sidea, "country.name", "cown"),
         ccodeb = countrycode(sideb, "country.name", "cown")) -> strategic_rivalries

# Austria is "Austria" in the rivalry data, but Austria-Hungary before it.
# We'll fix some of this a bit later too.
strategic_rivalries$ccodea[strategic_rivalries$sidea == "Austria"] <- 300
# Prussia doesn't appear as a partial matching term for successor state Germany
strategic_rivalries$ccodea[strategic_rivalries$sidea == "Prussia"] <- 255 
# countrycode instinctively gives Germany's ccode to West Germany
strategic_rivalries$ccodea[strategic_rivalries$sidea == "West Germany"] <- 260 
# Ottoman Empire doesn't appear as a matching term for successor state Turkey
strategic_rivalries$ccodea[strategic_rivalries$sidea == "Ottoman Empire"] <- 640
# Silly error, but countrycode doesn't distinguish between the Vietnams
strategic_rivalries$ccodea[strategic_rivalries$sidea == "North Vietnam"] <- 816


strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Ottoman Empire"] <- 640
# Note: I'm creating this since Venice never appears in the CoW data. I won't ever use it.
# You probably won't either.
strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Venice"] <- 324 
strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Prussia"] <- 255
# countrycode always struggles with Serbia as successor state to Yugoslavia.
strategic_rivalries$ccodeb[strategic_rivalries$sideb == "Serbia"] <- 345 
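The same kind of manual override can also be written as a named lookup vector, which keeps all the exceptions in one place. This is a sketch under assumptions, not a replacement for the code above: `toy`, `toy_fixed`, and `fixes` are hypothetical names, and the `ccodea` column here is a hand-typed stand-in for what countrycode() might return (missing codes for the Ottoman Empire and Prussia, Germany's code for West Germany).

```r
library(dplyr)

# Hand-typed stand-in for countrycode() output; the NA values and the 255 for
# West Germany mimic the problems described in the comments above.
toy <- tibble(
  sidea  = c("France", "Ottoman Empire", "West Germany", "Prussia"),
  ccodea = c(220, NA, 255, NA)
)

# Manual overrides, keyed by the name as it appears in the rivalry data.
fixes <- c("Ottoman Empire" = 640, "West Germany" = 260, "Prussia" = 255)

# Use the override where one exists; otherwise keep the converted code.
toy_fixed <- toy %>%
  mutate(ccodea = ifelse(sidea %in% names(fixes), unname(fixes[sidea]), ccodea))

# Any rows returned here would still need a manual code.
toy_fixed %>% filter(is.na(ccodea))
```

Filtering for missing codes at the end is a cheap check that nothing slipped through the conversion.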

Next, we’ll create a ccode1 and ccode2 variable that makes these non-directed. The lower country code will always appear first.

strategic_rivalries %>%
  mutate(ccode1 = ifelse(ccodeb > ccodea, ccodea, ccodeb),
         ccode2 = ifelse(ccodeb > ccodea, ccodeb, ccodea)) -> strategic_rivalries
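For what it's worth, `pmin()` and `pmax()` should give the same ordering as the two `ifelse()` calls. Here is a small sketch on toy rows (the tibble and the `toy_nd` name are made up for illustration) that uses the Portugal-Spain case from above to check the result.

```r
library(dplyr)

# Toy rows: both dyads are alphabetical by name but not ordered by ccode.
toy <- tibble(
  sidea  = c("Portugal", "Austria"),
  sideb  = c("Spain", "France"),
  ccodea = c(235, 300),
  ccodeb = c(230, 220)
)

# pmin()/pmax() take the row-wise minimum and maximum of the two codes.
toy_nd <- toy %>%
  mutate(ccode1 = pmin(ccodea, ccodeb),
         ccode2 = pmax(ccodea, ccodeb))

# The lower code should now always be in ccode1.
toy_nd %>% select(sidea, sideb, ccode1, ccode2)
```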

Create (Non)-Directed Rivalry-Year Data

The process of extending these rivalry data into rivalry-year data is effectively identical to what I showed in my guide on how to create country-year, non-directed dyad-year, and directed dyad-year data. People who have read that guide will see what is happening; rowwise() and unnest() are doing all the heavy lifting here. Do note we need a quick fix for that one Austrian rivalry that actually extends past 1918.

# NRY: non-directed rivalry-years

strategic_rivalries %>%
  # We don't need these two columns and they'll only get in the way.
  select(-ccodea, -ccodeb) %>%
  # Prepare the pipe to think rowwise. If you don't, the next mutate command will fail.
  rowwise() %>%
  # Create a list in a tibble that we're going to expand soon.
  mutate(year = list(seq(styear, endyear))) %>%
  # Unnest the list, which will expand the data.
  unnest(cols = c(year)) %>%
  # Minor note: ccode change for Austria, post-1918 for rivalryno 79.
  mutate(ccode1 = ifelse(ccode1 == 300 & year >= 1919, 305, ccode1)) -> NRY
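To see what the expansion does, it may help to run the same steps on two rivalries only. This is a toy sketch (`toy` and `toy_years` are illustrative names), using the start and end years from the first table above; each rivalry should contribute (endyear - styear) + 1 rows.

```r
library(dplyr)
library(tidyr)

# Two rivalries from the raw data, with their start and end years.
toy <- tibble(
  rivalryname = c("France-Spain", "Ottoman Empire-Spain"),
  styear  = c(1494, 1494),
  endyear = c(1700, 1585)
)

toy_years <- toy %>%
  # Work one row at a time so seq() sees a single start and end year.
  rowwise() %>%
  # Store each rivalry's sequence of years as a list-column.
  mutate(year = list(seq(styear, endyear))) %>%
  ungroup() %>%
  # Expand the list-column to one row per rivalry-year.
  unnest(cols = c(year))

# Number of rivalry-years per rivalry.
toy_years %>% count(rivalryname)
```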

This post doesn’t also create non-directed dyad-year data, but merging non-directed rivalry-year data into non-directed dyad-year data is easy since both share the keys ccode1, ccode2, and year. It would look like this.

# Assume an object (NDY) has complete non-directed dyad-year data.
# See: svmiller.com/blog/2019/01/create-country-year-dyad-year-from-country-data/

NRY %>%
  # Let's just select stuff we may want since we don't want too huge a data frame.
  select(ccode1, ccode2, year, type1:type3) %>%
  # Simple mutate: every row means there's an ongoing rivalry. Duh.
  mutate(ongorivalry = 1) %>%
  # And left_join...
  left_join(NDY, ., by = c("ccode1", "ccode2", "year")) %>%
  # if ongorivalry is NA, it's actually zero
  mutate(ongorivalry = ifelse(is.na(ongorivalry), 0, ongorivalry)) -> NDY
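Since NDY isn't created in this post, here is a self-contained toy version of the merge. Everything in it is a stand-in: `toy_NDY` is a hypothetical four-year slice of dyad-year data for Britain (200) and Russia (365), and `toy_NRY` covers the last two years of the Britain-Russia rivalry, which ends in 1956 in the data above. I use `replace_na()` from {tidyr} here in place of the `ifelse()` call, which should be equivalent.

```r
library(dplyr)
library(tidyr)

# Hypothetical slice of non-directed dyad-year data.
toy_NDY <- tibble(ccode1 = 200, ccode2 = 365, year = 1955:1958)

# Rivalry-years for the same dyad; the rivalry ends in 1956.
toy_NRY <- tibble(ccode1 = 200, ccode2 = 365, year = 1955:1956,
                  type1 = "positional")

toy_merged <- toy_NRY %>%
  # Every rivalry-year row is, by construction, an ongoing rivalry.
  mutate(ongorivalry = 1) %>%
  # Keep all dyad-years and attach rivalry information where it exists.
  left_join(toy_NDY, ., by = c("ccode1", "ccode2", "year")) %>%
  # Dyad-years with no match had no ongoing rivalry.
  mutate(ongorivalry = replace_na(ongorivalry, 0))

toy_merged
```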

That’s it. Doing this takes a table of 197 rivalries in Thompson and Dreyer’s appendix, entered into a spreadsheet in about 30 minutes (if I recall that effort correctly), saved as an R data set, and extends it into rivalry-year data to be quickly merged into dyad-year data. Just a few lines of R code from {tidyverse} with some light maintenance from the {countrycode} package are all you need.
