Attitudes Toward National Spending in the General Social Survey (2018)

This is a toy data set that collects attitudes toward national spending on various things in the General Social Survey for 2018. I use these data for in-class illustrations of ordinal variables and ordinal models.

Usage

gss_spending

Format

A data frame with 2348 observations on the following 33 variables.

year: a numeric constant for the GSS survey year (2018)
id: a unique identifier for the survey respondent
age: a numeric vector for the age of the respondent (min: 18, max: 89)
sex: a numeric vector for the respondent's sex (1 = female, 0 = male)
educ: a numeric vector for the highest year of school completed (min: 0, max: 20)
degree: a numeric vector for the respondent's highest degree (0 = did not graduate high school, 1 = high school, 2 = junior college, 3 = bachelor's degree, 4 = graduate degree)
race: a numeric vector for the respondent's race (1 = white, 2 = black, 3 = other)
rincom16: a numeric vector for the respondent's yearly income (min: 1 (under $1,000), max: 26 ($170,000 or over))
partyid: a numeric vector for the respondent's party identification on the familiar seven-point scale. NOTE: Democrat-to-Republican partisanship in this variable goes from 0 to 6. 7 = supporters of other parties. You may want to recode the 7s (e.g. to missing) if you want an interval-level measure of partisanship.
polviews: a numeric vector for the respondent's ideology (min: 1 (extremely liberal), max: 7 (extremely conservative))
xnorcsiz: a numeric vector for the NORC size code. This measures the kind of area in which the respondent took the survey (i.e. lives). 1 = city, greater than 250k residents. 2 = city, between 50k-250k residents. 3 = suburbs of a large city. 4 = suburbs of a medium-sized city. 5 = unincorporated area of a large city. 6 = unincorporated area of a medium city. 7 = city, between 10k-50k residents. 8 = town, greater than 2,500 residents. 9 = smaller areas. 10 = open country.
news: a numeric vector for how often the respondent reads the newspapers. 1 = everyday. 2 = a few times a week. 3 = once a week. 4 = less than once a week. 5 = never.
wrkstat: a numeric vector for the respondent's work status. 1 = working full-time. 2 = working part-time. 3 = temporarily not working. 4 = unemployed/laid off. 5 = retired. 6 = in school. 7 = house-keeping work. 8 = other.
natspac: a numeric vector for attitudes toward spending on the space program. See details below for this variable and all other variables beginning with nat.
natenvir: a numeric vector for attitudes toward spending on improving/protecting the environment.
natheal: a numeric vector for attitudes toward spending on improving/protecting the nation's health.
natcity: a numeric vector for attitudes toward spending on solving the problems of the big cities.
natcrime: a numeric vector for attitudes toward spending on halting the "rising crime rate." This question is subtly hilarious.
natdrug: a numeric vector for attitudes toward spending on dealing with drug addiction.
nateduc: a numeric vector for attitudes toward spending on improving the nation's education system.
natrace: a numeric vector for attitudes toward spending on improving the condition of black people.
natarms: a numeric vector for attitudes toward spending on the military/armaments/defense.
nataid: a numeric vector for attitudes toward spending on foreign aid.
natfare: a numeric vector for attitudes toward spending on welfare.
natroad: a numeric vector for attitudes toward spending on highways and bridges.
natsoc: a numeric vector for attitudes toward spending on social security.
natmass: a numeric vector for attitudes toward spending on mass transportation.
natpark: a numeric vector for attitudes toward spending on parks and recreation.
natchld: a numeric vector for attitudes toward spending on assistance for child care.
natsci: a numeric vector for attitudes toward spending on scientific research.
natenrgy: a numeric vector for attitudes toward spending on alternative sources of energy.
sumnat: a numeric vector for the sum total of responses to all the aforementioned spending variables (i.e. those that begin with nat). This creates an interval-ish measure with a nice and mostly normal distribution.
sumnatsoc: a numeric vector for the sum of all responses toward various "social" prompts (i.e. natenvir, natheal, natdrug, nateduc, natrace, natfare, natroad, natmass, natpark, natsoc, natchld). This creates an interval-ish measure with a mostly normal (but slightly left-skewed) distribution.
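
The note on partyid above suggests a recode before treating it as an interval-level measure. Here is a minimal sketch of one way to do that, assuming you simply want the "other party" respondents (7) set to missing. The toy data frame and the pid7 column name are hypothetical and just for illustration; swap in gss_spending for real use.

```r
# Toy stand-in for gss_spending$partyid (values are made up for illustration).
toy <- data.frame(partyid = c(0, 3, 6, 7, NA, 2))

# Keep the 0-6 Democrat-to-Republican scale; set 7 (other party) to missing.
# pid7 is a hypothetical name for the recoded column.
toy$pid7 <- ifelse(toy$partyid == 7, NA, toy$partyid)

toy
```

Whether dropping the 7s is the right call depends on what you want the measure to do; it is just the simplest option.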

Source

General Social Survey, 2018

Details

For all the variables beginning with nat, note that I rescaled the original data so that -1 = respondent thinks country is spending too much on this topic, 0 = respondent thinks country is spending "about (the) right" amount, and 1 = respondent thinks country is spending too little on this topic. I do this so that higher values on each nat variable indicate more support for spending (i.e. the respondent thinks the country spends too little on a given prompt). I think this is more intuitive.
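
A rough sketch of that rescaling, assuming the raw GSS coding is 1 = too little, 2 = about right, and 3 = too much. The toy values, the sum_toy column, and the choice to leave rows with any missing item as NA are all assumptions for illustration. This is not necessarily how sumnat or sumnatsoc were built.

```r
# Toy responses in what is assumed to be the raw GSS coding:
# 1 = too little, 2 = about right, 3 = too much (values are made up).
raw <- data.frame(natspac  = c(1, 2, 3, NA),
                  natenvir = c(1, 1, 2, 3))

# Flip and center so that -1 = too much, 0 = about right, 1 = too little.
rescaled <- as.data.frame(lapply(raw, function(x) 2 - x))

# A sumnat-style additive index over the rescaled items.
# With the default na.rm = FALSE, a row with any missing item stays NA.
rescaled$sum_toy <- rowSums(rescaled[, c("natspac", "natenvir")])

rescaled
```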

Also, the natspac, natenvir, natheal, natcity, natcrime, natdrug, nateduc, natrace, natarms, nataid, and natfare variables have "alternate" prompts in later GSS waves in which a subset of respondents gets a slightly different prompt. For example, one set of respondents for natcity gets a prompt of "Solving the problems of the big cities" (the legacy prompt) whereas another set of respondents gets a prompt of "Assistance to big cities" (typically noted as "version y" in the GSS). I combine both prompts into a single variable, which would perhaps be a problem if I were interested in publishing analyses on these data. I don't think it's a huge problem for what I want the data to do, but FYI.
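
For the curious, combining the two prompt versions can be sketched as below. The column names (natcity for the legacy prompt, natcityy for version y) and the toy values are hypothetical, and the sketch assumes each respondent answered at most one version.

```r
# Hypothetical columns: natcity (legacy prompt) and natcityy (version y).
# Each respondent sees only one version, so the other is NA. Values are made up.
toy <- data.frame(natcity  = c(1, NA, -1, NA),
                  natcityy = c(NA, 0, NA, NA))

# Take the legacy response where present, otherwise the version-y response.
toy$natcity_combined <- ifelse(is.na(toy$natcity), toy$natcityy, toy$natcity)

toy
```

If you care about prompt effects, keep the two columns separate (or add an indicator for which version the respondent got) rather than collapsing them like this.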