Add fractionalization/polarization estimates from CREG to a data frame
Source: R/add_creg_fractionalization.R (add_creg_fractionalization.Rd)
add_creg_fractionalization() allows you to add information about the
fractionalization/polarization of a state's ethnic and religious groups to
your data.
Value
add_creg_fractionalization() takes a dyad-year, leader-year,
leader-dyad-year, or state-year data frame, whether the primary state
identifiers are from the Correlates of War system or the Gleditsch-Ward
system, and returns information about the fractionalization and
polarization of the state(s) in a given year. The function returns four
additional columns when the data are state-year (or leader-year) and eight
additional columns when the data are dyad-year (or leader-dyad-year).
The columns returned are the fractionalization of ethnic groups, the
polarization of ethnic groups, the fractionalization of religious groups,
and the polarization of religious groups. When the data are dyad-year
(or leader-dyad-year), the return doubles because it provides information
for both states in the dyad.
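To make the column doubling concrete, here is a minimal base R sketch, not the package's own code. It assumes a toy state-year table with made-up values and a hypothetical helper, suffix_side(), and merges the four estimates once for each side of the dyad.

```r
# Toy state-year estimates; every value here is made up for illustration.
state_est <- data.frame(
  ccode   = c(2, 20),
  year    = c(1950, 1950),
  ethfrac = c(0.5, 0.6),
  ethpol  = c(0.7, 0.8),
  relfrac = c(0.4, 0.3),
  relpol  = c(0.6, 0.5)
)

# A one-row dyad-year frame.
dyads <- data.frame(ccode1 = 2, ccode2 = 20, year = 1950)

# Hypothetical helper: append a side suffix (1 or 2) to every column but year.
suffix_side <- function(df, side) {
  to_rename <- setdiff(names(df), "year")
  names(df)[names(df) %in% to_rename] <- paste0(to_rename, side)
  df
}

# Merge once per side of the dyad, so four estimates become eight columns.
out <- merge(dyads, suffix_side(state_est, 1),
             by = c("ccode1", "year"), all.x = TRUE)
out <- merge(out, suffix_side(state_est, 2),
             by = c("ccode2", "year"), all.x = TRUE)

# Put the identifiers first, as in a typical dyad-year frame.
id_cols <- c("ccode1", "ccode2", "year")
out <- out[, c(id_cols, setdiff(names(out), id_cols))]
```

The result has the three identifier columns plus eight estimate columns, which matches the 11-column dyad-year output shown in the examples.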
Details
Please see the information for the underlying data creg, and the associated
R script in the data-raw directory, to see how these data are generated.
The creg data have a few duplicate state-years. When standardizing to true
CoW codes, the duplicates concern Serbia/Yugoslavia in 1991 and 1992 as well
as Russia/the Soviet Union in 1991. When standardizing to true
Gleditsch-Ward codes, the duplicates concern Serbia/Yugoslavia in 1991 and
Russia/the Soviet Union in 1991. In those cases, the function groups by
state-year and arranges the rows to keep the more fractionalized/polarized
estimate under the (reasonable, I think) assumption that these are estimates
prior to the dissolution of those states. If this is problematic, feel free
to consult the underlying data and merge those in manually.
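The duplicate handling described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the function's actual code: the toy values are invented, and sorting on ethfrac alone is an assumption, since the documentation does not say which column determines the "more fractionalized/polarized" row.

```r
# Toy data with one duplicated state-year (ccode 345 in 1991).
# All estimates are invented for illustration.
creg_toy <- data.frame(
  ccode   = c(345, 345, 365),
  year    = c(1991, 1991, 1991),
  ethfrac = c(0.80, 0.55, 0.50)
)

# Arrange so the highest ethfrac comes first within each ccode-year.
# (Assumption: ethfrac is the sorting column.)
ordered <- creg_toy[order(creg_toy$ccode, creg_toy$year, -creg_toy$ethfrac), ]

# Keep only the first row of each ccode-year group.
dedup <- ordered[!duplicated(ordered[, c("ccode", "year")]), ]
```

If you prefer the other estimate for those state-years, reversing the sort direction in a manual merge of this kind is one way to get it.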
The underlying data have both Gleditsch-Ward codes and Correlates of War
codes. The merge the function makes depends on what you declare as the
"master" system at the top of the pipe (i.e. in create_dyadyears() or
create_stateyears()). If, for example, you run
create_stateyears(system="cow") and follow it with add_gwcode_to_cow(), the
merge will be on the Correlates of War codes and not the Gleditsch-Ward
codes. Consult the function's source to see how this is achieved.
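One way such a system-dependent merge could work is sketched below in base R. It assumes the declared system travels with the data as an attribute, called ps_system here as an assumption; pick_key() is a hypothetical helper and the toy values are invented. Check the function's source for the actual mechanics.

```r
# Hypothetical helper: choose the merge key from the declared system.
pick_key <- function(data) {
  system <- attr(data, "ps_system")  # attribute name is an assumption
  if (is.null(system)) stop("No declared system found on the data.")
  switch(system,
         cow = "ccode",
         gw  = "gwcode",
         stop("Unknown system: ", system))
}

# Toy state-year frame carrying both codes, declared as CoW at the top.
sy <- data.frame(ccode = 255, gwcode = 260, year = 1950)
attr(sy, "ps_system") <- "cow"

# Toy estimates that also carry both codes; the value is made up.
est <- data.frame(ccode = 255, gwcode = 260, year = 1950, ethfrac = 0.1)

# Merge on the declared system's code only, even though both codes exist.
key <- pick_key(sy)
merged <- merge(sy, est[, c(key, "year", "ethfrac")],
                by = c(key, "year"), all.x = TRUE)
```

Here the key resolves to ccode because the frame was declared as CoW, so the presence of a gwcode column does not change which identifier drives the merge.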
Be mindful that the data are fundamentally state-year and that extensions to leader-level data should be understood as approximations for leaders in a given state-year.
References
Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat and Romain Wacziarg. 2003. "Fractionalization". Journal of Economic Growth 8: 155-194.
Montalvo, Jose G. and Marta Reynal-Querol. 2005. "Ethnic Polarization, Potential Conflict, and Civil Wars". American Economic Review 95(3): 796-816.
Nardulli, Peter F., Cara J. Wong, Ajay Singh, Buddy Peyton, and Joseph Bajjalieh. 2012. The Composition of Religious and Ethnic Groups (CREG) Project. Cline Center for Democracy.
Examples
# \donttest{
# just call `library(tidyverse)` at the top of your script
library(magrittr)
cow_ddy %>% add_creg_fractionalization()
#> # A tibble: 2,139,270 × 11
#> ccode1 ccode2 year ethfrac1 ethpol1 relfrac1 relpol1 ethfrac2 ethpol2
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 20 1920 NA NA NA NA NA NA
#> 2 2 20 1921 NA NA NA NA NA NA
#> 3 2 20 1922 NA NA NA NA NA NA
#> 4 2 20 1923 NA NA NA NA NA NA
#> 5 2 20 1924 NA NA NA NA NA NA
#> 6 2 20 1925 NA NA NA NA NA NA
#> 7 2 20 1926 NA NA NA NA NA NA
#> 8 2 20 1927 NA NA NA NA NA NA
#> 9 2 20 1928 NA NA NA NA NA NA
#> 10 2 20 1929 NA NA NA NA NA NA
#> # ℹ 2,139,260 more rows
#> # ℹ 2 more variables: relfrac2 <dbl>, relpol2 <dbl>
create_stateyears() %>% add_creg_fractionalization()
#> Joining with `by = join_by(ccode, year)`
#> # A tibble: 17,121 × 7
#> ccode statenme year ethfrac ethpol relfrac relpol
#> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 United States of America 1816 NA NA NA NA
#> 2 2 United States of America 1817 NA NA NA NA
#> 3 2 United States of America 1818 NA NA NA NA
#> 4 2 United States of America 1819 NA NA NA NA
#> 5 2 United States of America 1820 NA NA NA NA
#> 6 2 United States of America 1821 NA NA NA NA
#> 7 2 United States of America 1822 NA NA NA NA
#> 8 2 United States of America 1823 NA NA NA NA
#> 9 2 United States of America 1824 NA NA NA NA
#> 10 2 United States of America 1825 NA NA NA NA
#> # ℹ 17,111 more rows
create_stateyears(system = "gw") %>% add_creg_fractionalization()
#> Joining with `by = join_by(gwcode, year)`
#> # A tibble: 18,637 × 7
#> gwcode statename year ethfrac ethpol relfrac relpol
#> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2 United States of America 1816 NA NA NA NA
#> 2 2 United States of America 1817 NA NA NA NA
#> 3 2 United States of America 1818 NA NA NA NA
#> 4 2 United States of America 1819 NA NA NA NA
#> 5 2 United States of America 1820 NA NA NA NA
#> 6 2 United States of America 1821 NA NA NA NA
#> 7 2 United States of America 1822 NA NA NA NA
#> 8 2 United States of America 1823 NA NA NA NA
#> 9 2 United States of America 1824 NA NA NA NA
#> 10 2 United States of America 1825 NA NA NA NA
#> # ℹ 18,627 more rows
# }