Comparative Public Health: The Political Economy of Human Misery and Well-Being
Source: R/rd-GHR04.R
GHR04.Rd
This is a data set for replicating Ghobarah et al. (2004), a reduced form of what they make available on Dataverse for replication. Variables have been renamed for legibility.
Format
A data frame with 182 observations on the following 15 variables.
country: a character vector denoting a country name
iso3c: a three-character ISO code for the country
pubhlthexppgdp: a numeric vector for public health expenditures as a percentage of GDP
totexphlth: a numeric vector for total expenditures on health
hale: a numeric vector for health adjusted life expectancy (in years)
log_gdppc: a numeric vector for (log-transformed) GDP per capita
gini: a numeric vector for income inequality
log_educ: a numeric vector for (log-transformed) educational attainment
log_vanhanen: a numeric vector for (log-transformed) racial-linguistic-religious heterogeneity
rivalry: a dummy variable indicating the presence of an enduring international rivalry for the country
polity: a numeric vector communicating a Polity score, as a measure of the democratic nature of the country's regime
prvhlthexpgdp: a numeric vector for private spending on health as a percentage of GDP
urban_growth: a numeric vector for the pace of urbanization
cwdeaths: a numeric vector for civil war deaths
contig_cw: a dummy variable communicating whether there is a civil war in a geographically contiguous territory
Source
Ghobarah, Hazem Adam, Paul Huth, and Bruce Russett. 2004. "Comparative Public Health: The Political Economy of Human Misery and Well-Being." International Studies Quarterly 48: 73-94.
Details
The three-character ISO code is the only addition to the data. I added it because the country names in the authors' data are not tidy and may lead users astray when they search for a specific observation. The ISO code for Yugoslavia (Serbia and Montenegro) around this time was "SCG".
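Looking up an observation by its ISO code rather than its country name can be sketched as follows. This is a minimal illustration, not part of the original documentation: the `toy` data frame, its values, and the helper `find_by_iso()` are hypothetical stand-ins, and only the column names `country`, `iso3c`, and `hale` come from the format description above.

```r
# Toy stand-in for the data. The rows and values are hypothetical and
# exist only to make the sketch runnable.
toy <- data.frame(
  country = c("Yugoslavia (Serbia/Montenegro)", "United States of America", "Korea, Rep."),
  iso3c   = c("SCG", "USA", "KOR"),
  hale    = c(60.1, 65.2, 62.3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: match on the ISO code, so the caller does not have
# to guess how a country name is spelled. %in% treats a missing code as
# "no match" instead of returning NA rows.
find_by_iso <- function(data, code) {
  data[data$iso3c %in% code, , drop = FALSE]
}

scg <- find_by_iso(toy, "SCG")
print(scg)
```

With the real data loaded, the same helper would be called on that data frame in place of `toy`.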
The data the authors make available come with no .do file to indicate exactly what they used. Some forensic work based on the descriptive statistics they mention led to this reduced form of their data, which almost perfectly replicates their results. The differences are typically in the hundredths, and often in the thousandths, and should be considered "good enough" for replication purposes. The descriptive statistics correspond with what the authors report in their analyses for all variables except the Polity variable. I have no way of knowing how they obtained the median they report. It should be 6, not 7.
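The kind of check described above, comparing a variable's descriptive statistics against the published ones, might look like the sketch below. The `toy` values and the helper `describe_var()` are hypothetical and say nothing about the actual data; only the column name `polity` comes from the format description.

```r
# Toy stand-in with hypothetical Polity scores, including one missing value.
toy <- data.frame(polity = c(-9, -3, 2, 5, 10, NA))

# Hypothetical helper: summary statistics that ignore missing values,
# for comparison against a published descriptive-statistics table.
describe_var <- function(x) {
  c(
    n      = sum(!is.na(x)),
    mean   = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE)
  )
}

polity_stats <- describe_var(toy$polity)
print(round(polity_stats, 3))
```

Running the helper over each column in turn gives a table that can be checked against the reported figures line by line.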
The only real confusion on my end is why I ended up with one more observation than they report in Tables 1 and 3, and two more observations than they report in Table 2. This suggests that one (or more) of the variables they use has an NA, but I have no way of knowing which one it could be.
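One way to investigate a discrepancy in observation counts is to tally missing values per variable and count the rows that are complete for a given set of model variables. The sketch below assumes a hypothetical selection in `model_vars`, since the exact variables behind each table are not stated here, and the `toy` values are made up for illustration.

```r
# Toy stand-in with hypothetical values and a few missing entries.
toy <- data.frame(
  hale      = c(55.0, 61.2, NA, 48.3),
  log_gdppc = c(7.1, 8.4, 9.0, NA),
  gini      = c(0.41, 0.35, 0.38, 0.52),
  polity    = c(-3, 8, 10, 2)
)

# Hypothetical choice of variables for one model; not the authors' specification.
model_vars <- c("hale", "log_gdppc", "gini")

# Missing values per variable.
na_counts <- colSums(is.na(toy[model_vars]))

# Rows a regression would keep after listwise deletion on these variables.
n_complete <- sum(complete.cases(toy[model_vars]))

print(na_counts)
print(n_complete)
```

Repeating this with different variable sets shows which combination, if any, reproduces the number of observations reported for a table.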