This analysis is preparation for interview questions on linear regression. The focus is on the procedure (and how to carry it out in R) and on interpreting the results.
suppressMessages(library(car))
prestige <- carData::Prestige
m1 <- lm(prestige ~ education + income + women, data = prestige)
summary(m1)
Call:
lm(formula = prestige ~ education + income + women, data = prestige)
Residuals:
     Min       1Q   Median       3Q      Max
-19.8246  -5.3332  -0.1364   5.1587  17.5045
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.7943342  3.2390886  -2.098   0.0385 *
education    4.1866373  0.3887013  10.771  < 2e-16 ***
income       0.0013136  0.0002778   4.729 7.58e-06 ***
women       -0.0089052  0.0304071  -0.293   0.7702
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.846 on 98 degrees of freedom
Multiple R-squared: 0.7982, Adjusted R-squared: 0.792
F-statistic: 129.2 on 3 and 98 DF, p-value: < 2.2e-16
T-test
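Each row of the coefficient table is a t-test of the null hypothesis that the coefficient is zero, given the other predictors in the model. As a minimal sketch, the `women` row can be reproduced by hand from the printed estimate and standard error. The variable names are illustrative, and the inputs are the rounded values from the summary(m1) output above, so the results should agree with the table only up to rounding.

```r
# Values copied from the `women` row of summary(m1) above
estimate  <- -0.0089052
std_error <- 0.0304071
df_resid  <- 98          # residual degrees of freedom reported by summary(m1)

# t statistic: estimate divided by its standard error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with df_resid degrees of freedom
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

print(round(t_value, 3))
print(round(p_value, 4))
```

With the fitted model available, coef(summary(m1)) returns the same coefficient table as a numeric matrix, which avoids retyping values.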
F-test
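The F-statistic on the last line of a summary output tests all slopes jointly against the intercept-only model. As a rough check, assuming only the rounded R-squared and degrees of freedom printed for m1 above, it can be recovered from R-squared. The variable names are illustrative.

```r
# Values copied from the last lines of summary(m1) above
r_squared <- 0.7982
k         <- 3    # number of predictors (numerator df)
df_resid  <- 98   # residual (denominator) df

# Overall F: explained variation per predictor divided by
# unexplained variation per residual degree of freedom
f_overall <- (r_squared / k) / ((1 - r_squared) / df_resid)
p_overall <- pf(f_overall, df1 = k, df2 = df_resid, lower.tail = FALSE)

print(round(f_overall, 1))
print(p_overall)
```

Because R-squared is rounded to four digits in the printout, the result matches the reported F-statistic only to about one decimal place.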
m11 <- lm(prestige ~ education + income, data = prestige)
summary(m11)
Call:
lm(formula = prestige ~ education + income, data = prestige)
Residuals:
     Min       1Q   Median       3Q      Max
-19.4040  -5.3308   0.0154   4.9803  17.6889
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -6.8477787  3.2189771  -2.127   0.0359 *
education    4.1374444  0.3489120  11.858  < 2e-16 ***
income       0.0013612  0.0002242   6.071 2.36e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.81 on 99 degrees of freedom
Multiple R-squared: 0.798, Adjusted R-squared: 0.7939
F-statistic: 195.6 on 2 and 99 DF, p-value: < 2.2e-16
m10 <- lm(prestige ~ education, data = prestige)
summary(m10)
Call:
lm(formula = prestige ~ education, data = prestige)
Residuals:
     Min       1Q   Median       3Q      Max
-26.0397  -6.5228   0.6611   6.7430  18.1636
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -10.732      3.677  -2.919  0.00434 **
education      5.361      0.332  16.148  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 9.103 on 100 degrees of freedom
Multiple R-squared: 0.7228, Adjusted R-squared: 0.72
F-statistic: 260.8 on 1 and 100 DF, p-value: < 2.2e-16
anova(m1, m11)
Analysis of Variance Table
Model 1: prestige ~ education + income + women
Model 2: prestige ~ education + income
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     98 6033.6
2     99 6038.9 -1   -5.2806 0.0858 0.7702
anova(m1, m10)
Analysis of Variance Table
Model 1: prestige ~ education + income + women
Model 2: prestige ~ education
  Res.Df    RSS Df Sum of Sq    F    Pr(>F)
1     98 6033.6
2    100 8287.0 -2   -2253.4 18.3 1.765e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
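The two anova() comparisons above are partial F-tests: the increase in RSS from dropping terms, per dropped coefficient, is compared with the residual mean square of the full model. A minimal sketch under that reading, using the rounded values printed in the two tables; `partial_f` is a hypothetical helper, not a function from any package.

```r
# Partial F statistic for a reduced model nested in the full model.
# ss_diff:  increase in RSS when the terms are dropped (|Sum of Sq| in the table)
# q:        number of coefficients dropped
# rss_full: residual sum of squares of the full model
# df_full:  residual degrees of freedom of the full model
partial_f <- function(ss_diff, q, rss_full, df_full) {
  f <- (ss_diff / q) / (rss_full / df_full)
  p <- pf(f, df1 = q, df2 = df_full, lower.tail = FALSE)
  c(F = f, p = p)
}

rss_full <- 6033.6  # RSS of m1
df_full  <- 98      # residual df of m1

# m1 vs m11: drops `women` (1 coefficient)
f_women <- partial_f(5.2806, 1, rss_full, df_full)

# m1 vs m10: drops `income` and `women` (2 coefficients)
f_income_women <- partial_f(2253.4, 2, rss_full, df_full)

print(signif(f_women, 4))
print(signif(f_income_women, 4))
```

For the single-coefficient comparison, the F statistic is the square of the `women` t value from summary(m1): (-0.293)^2 is approximately 0.0858, which is why both tests report p = 0.7702.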
Likelihood ratio test (less used)
lmtest::lrtest(m1, m11) # not significant: dropping women does not worsen the fit
Likelihood ratio test
Model 1: prestige ~ education + income + women
Model 2: prestige ~ education + income
  #Df  LogLik Df  Chisq Pr(>Chisq)
1   5 -352.82
2   4 -352.86 -1 0.0892     0.7652
lmtest::lrtest(m1, m10) # significant: income and women jointly improve the fit
Likelihood ratio test
Model 1: prestige ~ education + income + women
Model 2: prestige ~ education
  #Df  LogLik Df Chisq Pr(>Chisq)
1   5 -352.82
2   3 -369.00 -2 32.37  9.355e-08 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
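For linear models with Gaussian errors, the log-likelihood depends on the fit only through the RSS, so the likelihood ratio statistic reduces to n * log(RSS_reduced / RSS_full). A minimal sketch using the rounded RSS values from the anova tables above; `lr_stat` is a hypothetical helper, and n = 102 is inferred from the 98 residual degrees of freedom plus the 4 estimated coefficients in m1.

```r
# Likelihood ratio statistic for nested Gaussian linear models, from RSS only.
# df_diff: number of coefficients dropped in the reduced model
lr_stat <- function(rss_reduced, rss_full, n, df_diff) {
  chisq <- n * log(rss_reduced / rss_full)
  p <- pchisq(chisq, df = df_diff, lower.tail = FALSE)
  c(Chisq = chisq, p = p)
}

n        <- 102     # observations: 98 residual df + 4 coefficients in m1
rss_full <- 6033.6  # RSS of m1

# m1 vs m11: drops `women`
lr_women <- lr_stat(6038.9, rss_full, n, df_diff = 1)

# m1 vs m10: drops `income` and `women`
lr_income_women <- lr_stat(8287.0, rss_full, n, df_diff = 2)

print(signif(lr_women, 3))
print(signif(lr_income_women, 4))
```

The results should match the lrtest() output only up to the rounding of the printed RSS values. The #Df column in that output counts the regression coefficients plus the error variance, which is why m1 shows 5 rather than 4.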