Compare Linear Smoother to LOESS Smoother for Your OLS Model
Source: R/linloess_plot.R
linloess_plot() provides a visual diagnostic of the linearity assumption of the OLS model. Given an OLS model fit by lm() in base R, the function extracts the model frame and creates a faceted scatterplot. For each facet, a linear smoother and a LOESS smoother are estimated over the points. Users who run this function can assess how much the linear smoother and the LOESS smoother diverge. The more they diverge, the less plausible it is that the OLS model is a good fit as specified. The plot can also point to potential outliers that may need further consideration.
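The workflow described above can be approximated by hand. What follows is a minimal sketch, not the function's actual source: it assumes one facet per right-hand-side variable with residuals on the y-axis, and the names `mf`, `ivs`, and `long` are hypothetical helpers introduced only for illustration.

```r
library(ggplot2)

M1 <- lm(mpg ~ wt + hp + disp, data = mtcars)

# Extract the model frame; its first column is the dependent variable.
mf  <- model.frame(M1)
ivs <- names(mf)[-1]

# Stack the predictors into long format: one row per observation per predictor.
long <- do.call(rbind, lapply(ivs, function(v) {
  data.frame(var = v, x = mf[[v]], y = unname(resid(M1)))
}))

# Overlay a linear smoother and a dashed LOESS smoother in each facet.
p <- ggplot(long, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  geom_smooth(method = "loess", formula = y ~ x, se = TRUE,
              span = 0.75, linetype = "dashed", color = "black") +
  facet_wrap(~var, scales = "free_x")
```

Printing `p` draws the facets. Where the dashed LOESS line bends away from the straight line, the linear specification for that variable deserves a closer look.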
Arguments
- mod
a fitted OLS model
- resid
logical, defaults to TRUE. If FALSE, the y-axis on these plots is the raw values of the dependent variable. If TRUE, the y-axis is the model's residuals. Either works well for the matter at hand, provided you treat the output as illustrative or suggestive.
- smoother
defaults to "loess", and is passed to the 'method' argument for the non-linear smoother.
- se
logical, defaults to TRUE. If TRUE, gives standard error estimates with the assorted smoothers.
- span
a numeric, defaults to .75. An adjustment to the smoother. Higher values permit smoother lines and might be warranted in the presence of sparse pockets of the data.
- ...
optional parameters, passed to the scatterplot (geom_point()) component of this function. Useful if you want to make the smoothers more legible against the points.
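The arguments above can be combined as in the following sketch. It assumes that linloess_plot() comes from the stevemisc package, which this page does not state, and the object names are arbitrary.

```r
library(stevemisc)  # assumption: the package that exports linloess_plot()

M1 <- lm(mpg ~ ., data = mtcars)

# Raw values of the dependent variable on the y-axis instead of residuals.
p_raw <- linloess_plot(M1, resid = FALSE)

# No standard error bands, and a wider span for a smoother LOESS line.
p_wide <- linloess_plot(M1, se = FALSE, span = 1)

# Extra arguments go to geom_point(); muted points make the smoothers stand out.
p_muted <- linloess_plot(M1, color = "gray60", alpha = 0.6)
```

Each call returns a plot object, so nothing is drawn until the object is printed.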
Value
linloess_plot() returns a faceted scatterplot as a ggplot2 object. The linear smoother is in solid blue (with blue standard error bands) and the LOESS smoother is a dashed black line (with gray/default standard error bands). You can add cosmetic features to it after the fact. The function may emit warnings related to the LOESS smoother, depending on your data. I think these are fine to the extent that this is really just a visual aid and an informal diagnostic for the linearity assumption.
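Because the return value is a ggplot2 object, the usual `+` layering applies. The sketch below assumes the function is exported by the stevemisc package (not stated on this page); the title text is made up. In ggplot2, smoother warnings are typically raised when the plot is drawn rather than when it is created, so the suppression wraps the print step.

```r
library(ggplot2)
library(stevemisc)  # assumption: the package that exports linloess_plot()

M1 <- lm(mpg ~ ., data = mtcars)
p  <- linloess_plot(M1)

# Add cosmetic features after the fact.
p2 <- p +
  theme_minimal() +
  labs(title = "Linear vs. LOESS smoothers", x = NULL)

# Quiet the LOESS warnings and the geom_smooth() formula messages when drawing.
suppressMessages(suppressWarnings(print(p2)))
```

Suppressing the warnings only hides them. If a facet looks odd, rerunning without the wrappers shows which smoother fits were flagged.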
Details
This function makes an implicit assumption that there is no variable in the regression formula with the name ".y" or ".resid".
It may be in your interest (for the sake of rudimentary diagnostic checks) to disable the standard error bands for particularly ill-fitting linear models.
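The naming assumption can be checked before calling the function. This is a small sketch in base R; `reserved` and `clash` are hypothetical names, and the check only looks at column names of the model frame.

```r
M1 <- lm(mpg ~ ., data = mtcars)

# Names the function is documented to assume are absent from the formula.
reserved <- c(".y", ".resid")

# Any overlap with the model frame's columns signals a potential conflict.
clash <- intersect(names(model.frame(M1)), reserved)

if (length(clash) > 0) {
  stop("Rename these variables before plotting: ",
       paste(clash, collapse = ", "))
}
```

For `mtcars` the check passes silently, since none of its columns use those names.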
Examples
M1 <- lm(mpg ~ ., data=mtcars)
linloess_plot(M1)
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
#> Warning: pseudoinverse used at 4
#> Warning: neighborhood radius 2
#> Warning: reciprocal condition number 1.8444e-17
#> Warning: pseudoinverse used at 4
#> Warning: neighborhood radius 2
#> Warning: reciprocal condition number 1.8444e-17
#> Warning: pseudoinverse used at 3.98
#> Warning: neighborhood radius 4.02
#> Warning: reciprocal condition number 6.1406e-17
#> Warning: There are other near singularities as well. 16.16
#> Warning: pseudoinverse used at 3.98
#> Warning: neighborhood radius 4.02
#> Warning: reciprocal condition number 6.1406e-17
#> Warning: There are other near singularities as well. 16.16
#> Warning: pseudoinverse used at 2.99
#> Warning: neighborhood radius 1.01
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 4.0401
#> Warning: pseudoinverse used at 2.99
#> Warning: neighborhood radius 1.01
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 4.0401
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
linloess_plot(M1, color="black", pch=21)
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
#> Warning: pseudoinverse used at 4
#> Warning: neighborhood radius 2
#> Warning: reciprocal condition number 1.8444e-17
#> Warning: pseudoinverse used at 4
#> Warning: neighborhood radius 2
#> Warning: reciprocal condition number 1.8444e-17
#> Warning: pseudoinverse used at 3.98
#> Warning: neighborhood radius 4.02
#> Warning: reciprocal condition number 6.1406e-17
#> Warning: There are other near singularities as well. 16.16
#> Warning: pseudoinverse used at 3.98
#> Warning: neighborhood radius 4.02
#> Warning: reciprocal condition number 6.1406e-17
#> Warning: There are other near singularities as well. 16.16
#> Warning: pseudoinverse used at 2.99
#> Warning: neighborhood radius 1.01
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 4.0401
#> Warning: pseudoinverse used at 2.99
#> Warning: neighborhood radius 1.01
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 4.0401
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01
#> Warning: pseudoinverse used at -0.005
#> Warning: neighborhood radius 1.005
#> Warning: reciprocal condition number 0
#> Warning: There are other near singularities as well. 1.01