4.6 Results
4.6.1 summary() function
The summary() function provides a nice summary of a model object. You could also use the str() function to see the details of what is included in the model object.
The following examples display the summary for the models created above.
Displaying the summary of the linear model from above.
summary(mod)
Call:
lm(formula = Reaction ~ Days + Subject, data = sleepstudy)

Residuals:
     Min       1Q   Median       3Q      Max
-100.540  -16.389   -0.341   15.215  131.159

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  295.0310    10.4471  28.240  < 2e-16 ***
Days          10.4673     0.8042  13.015  < 2e-16 ***
Subject309  -126.9008    13.8597  -9.156 2.35e-16 ***
Subject310  -111.1326    13.8597  -8.018 2.07e-13 ***
Subject330   -38.9124    13.8597  -2.808 0.005609 **
Subject331   -32.6978    13.8597  -2.359 0.019514 *
Subject332   -34.8318    13.8597  -2.513 0.012949 *
Subject333   -25.9755    13.8597  -1.874 0.062718 .
Subject334   -46.8318    13.8597  -3.379 0.000913 ***
Subject335   -92.0638    13.8597  -6.643 4.51e-10 ***
Subject337    33.5872    13.8597   2.423 0.016486 *
Subject349   -66.2994    13.8597  -4.784 3.87e-06 ***
Subject350   -28.5311    13.8597  -2.059 0.041147 *
Subject351   -52.0361    13.8597  -3.754 0.000242 ***
Subject352    -4.7123    13.8597  -0.340 0.734300
Subject369   -36.0992    13.8597  -2.605 0.010059 *
Subject370   -50.4321    13.8597  -3.639 0.000369 ***
Subject371   -47.1498    13.8597  -3.402 0.000844 ***
Subject372   -24.2477    13.8597  -1.750 0.082108 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 30.99 on 161 degrees of freedom
Multiple R-squared:  0.7277,	Adjusted R-squared:  0.6973
F-statistic: 23.91 on 18 and 161 DF,  p-value: < 2.2e-16
The summary display starts with the call to lm() that generated the model object.
The residual summary is the five-number summary of the residuals. This can be used as a quick check for skewed residuals.
The coefficients summary shows the estimated value, standard error, test statistic, and p-value for each coefficient. The p-values are from Wald tests of each coefficient being equal to zero. For OLS models this is equivalent to an F-test of nested models, with the variable of interest removed in the nested model.
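The equivalence between the coefficient test and the nested-model F-test can be checked directly. The following is a minimal sketch on the built-in mtcars data, used here as a stand-in so the code runs without extra packages; it is not the sleepstudy model from this chapter, and the names fit_full and fit_reduced are hypothetical.

```r
# Stand-in OLS models on built-in data (assumption: not the chapter's model)
fit_full    <- lm(mpg ~ wt + hp, data = mtcars)
fit_reduced <- lm(mpg ~ wt, data = mtcars)      # same model with hp removed

# t statistic and p-value for hp from the coefficient table
coef_table <- coef(summary(fit_full))
t_hp <- coef_table["hp", "t value"]
p_hp <- coef_table["hp", "Pr(>|t|)"]

# F-test comparing the nested models
f_test <- anova(fit_reduced, fit_full)
f_hp   <- f_test$F[2]
p_f    <- f_test$`Pr(>F)`[2]

# For a single coefficient, the squared t statistic equals the F statistic
# and the two p-values agree
all.equal(t_hp^2, f_hp)
all.equal(p_hp, p_f)
```

This holds for a single numeric regressor or a single dummy variable. Removing a factor with several levels drops several coefficients at once, so the F-test then has more than one numerator degree of freedom and no longer matches any single row of the coefficient table.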
The display ends with information on the model fit. This is the residual standard error, the R squared and adjusted R squared of the model, and the F-test of the significance of the model versus the null model.
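The pieces of the display described above can also be retrieved as numbers from the summary object. The sketch below again uses the built-in mtcars data as a stand-in for the chapter's model; the object names are hypothetical.

```r
# Stand-in OLS model on built-in data (assumption: not the chapter's model)
fit <- lm(mpg ~ wt + hp, data = mtcars)
s   <- summary(fit)

# Five-number summary of the residuals (Min, 1Q, Median, 3Q, Max)
resid_summary <- quantile(residuals(fit))

# Model fit information
sigma_hat <- s$sigma           # residual standard error
r2        <- s$r.squared       # multiple R squared
adj_r2    <- s$adj.r.squared   # adjusted R squared
f         <- s$fstatistic      # named vector: value, numdf, dendf

# p-value of the overall F-test against the intercept-only model
p_model <- pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE)
```

A median far from zero, or first and third quartiles of very different magnitude, in resid_summary would suggest skewed residuals.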
Displaying the summary of the GLM model from above.
summary(modglm)
Call:
glm(formula = volunteer ~ sex + extraversion * neuroticism, family = binomial,
    data = Cowles)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.4749  -1.0602  -0.8934   1.2609   1.9978

Coefficients:
                          Estimate Std. Error z value Pr(>|z|)
(Intercept)              -2.358207   0.501320  -4.704 2.55e-06 ***
sexmale                  -0.247152   0.111631  -2.214  0.02683 *
extraversion              0.166816   0.037719   4.423 9.75e-06 ***
neuroticism               0.110777   0.037648   2.942  0.00326 **
extraversion:neuroticism -0.008552   0.002934  -2.915  0.00355 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1933.5  on 1420  degrees of freedom
Residual deviance: 1897.4  on 1416  degrees of freedom
AIC: 1907.4

Number of Fisher Scoring iterations: 4
The summary display for glm models includes similar call, residuals, and coefficient sections. The glm model fit summary includes dispersion, deviance, and iteration information.
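The dispersion, deviance, and iteration information can be read off the summary object. The following sketch fits a small logistic regression to the built-in mtcars data as a stand-in for the Cowles model; fit_glm is a hypothetical name.

```r
# Stand-in logistic regression on built-in data (assumption: not the Cowles model)
fit_glm <- glm(vs ~ mpg, family = binomial, data = mtcars)
s <- summary(fit_glm)

s$dispersion      # fixed at 1 for the binomial family
s$null.deviance   # deviance of the intercept-only model
s$df.null         # its degrees of freedom
s$deviance        # residual deviance of the fitted model
s$df.residual     # its degrees of freedom
s$aic             # Akaike information criterion
s$iter            # number of Fisher scoring iterations

# For a 0/1 response, AIC equals the residual deviance plus twice the
# number of estimated coefficients
all.equal(s$aic, s$deviance + 2 * length(coef(fit_glm)))
```

The same relationship can be seen in the display above: 1897.4 + 2 × 5 = 1907.4.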
4.6.2 confint() function
The confint() function can be applied to all of the above model object types. The confint() function will calculate a profiled confidence interval when it is appropriate.
The following example displays the confidence intervals for the models created above.
Displaying the confidence intervals for both of the models.
confint(mod)
                  2.5 %    97.5 %
(Intercept)  274.399941 315.66215
Days           8.879103  12.05547
Subject309  -154.271100 -99.53060
Subject310  -138.502810 -83.76231
Subject330   -66.282660 -11.54216
Subject331   -60.068030  -5.32753
Subject332   -62.202010  -7.46151
Subject333   -53.345770   1.39473
Subject334   -74.202030 -19.46153
Subject335  -119.434040 -64.69354
Subject337     6.216930  60.95743
Subject349   -93.669610 -38.92911
Subject350   -55.901400  -1.16090
Subject351   -79.406330 -24.66583
Subject352   -32.082540  22.65796
Subject369   -63.469440  -8.72894
Subject370   -77.802310 -23.06181
Subject371   -74.520040 -19.77954
Subject372   -51.617950   3.12255
confint(modglm)
Waiting for profiling to be done...
                               2.5 %       97.5 %
(Intercept)              -3.35652914 -1.389154923
sexmale                  -0.46642058 -0.028694911
extraversion              0.09374678  0.241771712
neuroticism               0.03744357  0.185227757
extraversion:neuroticism -0.01434742 -0.002833714
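To see what profiling changes, the profiled intervals for a glm can be compared with Wald intervals, which confint.default() computes from the estimates and their standard errors. The sketch below uses a stand-in logistic regression on the built-in mtcars data; the object names are hypothetical.

```r
# Stand-in logistic regression on built-in data (assumption: not the Cowles model)
fit_glm <- glm(vs ~ mpg, family = binomial, data = mtcars)

# Profiled intervals (the "Waiting for profiling" message is suppressed)
ci_profile <- suppressMessages(confint(fit_glm))

# Wald intervals: estimate +/- normal quantile * standard error
ci_wald <- confint.default(fit_glm)

# A single coefficient at a different confidence level
ci_90 <- suppressMessages(confint(fit_glm, parm = "mpg", level = 0.90))

ci_profile
ci_wald
ci_90
```

Wald intervals are always symmetric around the estimate, while profiled intervals need not be. The two usually differ most in small samples.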
4.6.3 predict() function
The predict() function is used to determine the predicted values for a particular set of values of the regressors. The predict() function can also return the confidence interval or prediction interval with the predictions for OLS models.
Syntax and use of the predict() function
predict(modelObj, newObs, interval = intervaltype, level = level, type = type)
The modelObj is an object returned from a regression function.
The newObs parameter is optional. If it is not provided, the predictions will be for the observed values the model was fit to. The form of newObs is a data.frame with the same columns as used in modelObj.
The intervaltype parameter is available for OLS models. It can be set to "none", "confidence", or "prediction". The default is "none", which returns no interval; the other two values return a confidence interval or a prediction interval.
The level parameter is the confidence or prediction level.
The type parameter is used with generalized linear models. The default value is "link", for predictions on the linear predictor scale. It can be set to "response" for predictions on the scale of the response variable.
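The difference between the two interval types can be illustrated with a short sketch. It uses the built-in mtcars data as a stand-in for the chapter's OLS model; fit, new_obs, conf_int, and pred_int are hypothetical names.

```r
# Stand-in OLS model on built-in data (assumption: not the chapter's model)
fit <- lm(mpg ~ wt, data = mtcars)

# New observations must supply every regressor used in the model
new_obs <- data.frame(wt = c(2.5, 3.5))

# Interval for the mean response at these regressor values
conf_int <- predict(fit, new_obs, interval = "confidence", level = 0.95)

# Interval for a single new observation at these regressor values
pred_int <- predict(fit, new_obs, interval = "prediction", level = 0.95)

conf_int
pred_int
```

Both calls return the same fit column. The prediction interval is wider because it also accounts for the residual variation of an individual observation around the mean.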
The following example makes predictions for each of the models from above.
Predicting new observations.
Predicting subject 331 at 10 days and subject 372 at 8 days.
newObs <- data.frame(Days = c(10, 8),
                     Subject = c("331", "372")
                     )
predict(mod, newObs, interval = "prediction")
       fit      lwr      upr
1 367.0061 302.2256 431.7867
2 354.5216 290.0925 418.9508
Predicting a male with an 8 for extraversion and 15 for neuroticism.
newObsGlm <- data.frame(sex = c("male"),
                        extraversion = c(8),
                        neuroticism = c(15)
                        )
predict(modglm, newObsGlm, type = "response")
        1
0.3462704
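The response-scale prediction can be reproduced by hand from the coefficients printed in the summary(modglm) display, which shows how the "link" and "response" scales relate. This is a sketch only: the coefficient values are copied from that display, so they are rounded, and the names b, x, eta, and p are hypothetical.

```r
# Coefficients copied (rounded) from the summary(modglm) display
b <- c(intercept    = -2.358207,
       sexmale      = -0.247152,
       extraversion =  0.166816,
       neuroticism  =  0.110777,
       interaction  = -0.008552)

# Regressor values for a male with extraversion 8 and neuroticism 15
x <- c(1, 1, 8, 15, 8 * 15)

# Linear predictor: what type = "link" would return
eta <- sum(b * x)

# Inverse logit of the linear predictor: what type = "response" returns
p <- plogis(eta)
p
```

The result is approximately 0.346, agreeing with the predict() output up to the rounding of the printed coefficients.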
4.6.4 Extractors
Extractor functions are the preferred method for retrieving information on the model. Some commonly used extractor functions are listed below.
fitted()
The fitted() function returns the predicted values for the observations in the data set used to fit the model.
residuals()
The residuals() function returns the residual values from the fitted model.
hatvalues()
The hatvalues() function returns the hat values, which are leverage measures, that result from fitting the model.
Influence measures
The cooks.distance() and influence() functions return the Cook's distance or a set of influence measures that result from fitting the model.
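A short sketch of these extractors in use follows. It relies on the built-in mtcars data as a stand-in for the chapter's models, and all object names are hypothetical.

```r
# Stand-in OLS model on built-in data (assumption: not the chapter's model)
fit <- lm(mpg ~ wt + hp, data = mtcars)

fit_vals   <- fitted(fit)          # predicted value for each observation
res_vals   <- residuals(fit)       # observed minus fitted
hat_vals   <- hatvalues(fit)       # leverage of each observation
cooks_vals <- cooks.distance(fit)  # Cook's distance of each observation
infl       <- influence(fit)       # list of basic influence quantities

# Fitted values plus residuals recover the observed response
all.equal(unname(fit_vals + res_vals), mtcars$mpg)

# For OLS, the hat values sum to the number of estimated coefficients
all.equal(sum(hat_vals), length(coef(fit)))

# Observations with the largest Cook's distance
head(sort(cooks_vals, decreasing = TRUE), 3)
```

Using extractors rather than reaching into the object with $ keeps the code working across model classes, since each class supplies its own method for the same function.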