This toy data set is loosely modified from Wave I of the NESARC data set. Here, my main interest is the number of drinks consumed on a usual day drinking alcohol in the past 12 months, according to respondents in the nationally representative survey of 43,093 Americans.
A data frame with 43093 observations on the following 8 variables.
a numeric vector and sequence from 1 to the number of rows in the data
a numeric vector for the ethnicity/race. 1 = White, not Hispanic. 2 = Black, not Hispanic. 3 = AI/AN. 4 = Asian, Native Hawaiian, Pacific Islander. 5 = Hispanic or Latino.
a numeric vector for the Census region. 1 = Northeast. 2 = Midwest. 3 = South. 4 = West
a numeric vector for age in years
a numeric vector for sex. 1 = female. 0 = male
a numeric vector for marital status. 1 = married. 2 = living with someone as married. 3 = widowed. 4 = divorced. 5 = separated. 6 = never married
a numeric vector for education level, recoded from
s1q6a in the original data. 1 = did not make it to/finish high school. 2 = high school graduate or equivalency. 3 = some college, but no four-year degree. 4 = four-year college degree or more.
a numeric vector for the number of drinks of any alcohol consumed on days drinking alcohol in the past 12 months. This variable is ``as-is'' from the original data set.
National Epidemiologic Survey on Alcohol and Related Conditions (NESARC)—Wave 1 (2001–2002)
You will not want to use the
s2aq8b variable without recoding it first.
Those who cannot recall how much they typically drink (i.e. true ``don't knows''
or missing info) are coded as 99. Non-drinkers are coded as
NA in the
variable and should be recoded as 0. Any value between 1 and 98 in the variable represents the,
for lack of better term, ``true'' number of alcoholic drinks a respondent says s/he typically
consumes on a day drinking alcohol in the past 12 months, though this is evidently preposterous
as a count variable. A person drinking 42 alcoholic drinks a day would not be alive to tell you
they did this. The researcher may want to employ some sensible right censoring here.