11 Linear models: the role of covariates
This section provides a hands-on introduction to linear (and generalized linear) models.
Task: Fit a linear model to compare abundance between the two groups. You can use the functions lm or glm, for instance.
11.1 Fitting a linear model
Let us compare the two groups with a linear model. We use log10 abundances because they are closer to the Gaussian assumption than the absolute count data. Fit a linear model with Gaussian errors as follows:
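The data frame `df` is prepared elsewhere in the tutorial, so the following is only a minimal, self-contained sketch on simulated data. The column names `Log10_Abundance` and `Group` and the group labels `AAM` and `AFR` follow the code used later in this section; the simulated values, the sample sizes, and the object names `fit_lm` and `fit_glm` are assumptions made for illustration.

```r
# Simulated stand-in for the tutorial's `df` (values are made up for illustration)
set.seed(1)
df <- data.frame(
  Group = factor(rep(c("AAM", "AFR"), each = 50), levels = c("AAM", "AFR")),
  Log10_Abundance = c(rnorm(50, mean = 0.65, sd = 0.3),
                      rnorm(50, mean = 0.85, sd = 0.3))
)

# Ordinary least squares fit
fit_lm <- lm(Log10_Abundance ~ Group, data = df)

# The same model expressed as a GLM with Gaussian errors (identity link)
fit_glm <- glm(Log10_Abundance ~ Group, data = df, family = "gaussian")

# Both approaches give the same coefficient estimates
print(coef(fit_lm))
print(coef(fit_glm))
```

For a Gaussian response the two functions are interchangeable; `glm` becomes necessary only when another error family (for instance Poisson for counts) is wanted.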
11.2 Interpreting linear model output
Investigate the model coefficients:
| | Estimate | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | 0.64825 | 0.02877 | 22.53405 | 0 |
| GroupAFR | 0.20313 | 0.04308 | 4.71530 | 0 |
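A table like the one above can be extracted from a fitted model with `summary()`. The sketch below uses simulated data (an assumption, since the real `df` comes from earlier in the tutorial; `sim`, `fit`, and `coefs` are illustrative names) and also checks numerically how the coefficients in this subsection are read: the intercept is the mean of the reference group, the group coefficient is the difference between the group means, and its p-value matches a two-sample t-test with equal variances.

```r
# Simulated two-group data (values are made up for illustration)
set.seed(2)
sim <- data.frame(
  Group = factor(rep(c("AAM", "AFR"), times = c(40, 60)), levels = c("AAM", "AFR")),
  Log10_Abundance = c(rnorm(40, mean = 0.65, sd = 0.3),
                      rnorm(60, mean = 0.85, sd = 0.3))
)

fit <- lm(Log10_Abundance ~ Group, data = sim)

# Coefficient table: estimate, standard error, t value, and p-value per term
coefs <- summary(fit)$coefficients
print(round(coefs, 5))      # very small p-values are displayed as 0 after rounding
print(coefs[, "Pr(>|t|)"])  # unrounded p-values

# Group means for comparison with the coefficients
mean_aam <- mean(sim$Log10_Abundance[sim$Group == "AAM"])
mean_afr <- mean(sim$Log10_Abundance[sim$Group == "AFR"])

# Intercept = mean of the reference level (the first factor level, AAM)
print(all.equal(coefs["(Intercept)", "Estimate"], mean_aam))

# Group coefficient = difference between the group means
print(all.equal(coefs["GroupAFR", "Estimate"], mean_afr - mean_aam))

# p-value of the group term = two-sample t-test assuming equal variances
tt <- t.test(Log10_Abundance ~ Group, data = sim, var.equal = TRUE)
print(all.equal(coefs["GroupAFR", "Pr(>|t|)"], tt$p.value))
```

Note that `t.test` defaults to the Welch test (`var.equal = FALSE`), which generally gives a slightly different p-value than the linear model.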
The intercept equals the mean of the first (reference) group, AAM:
print(mean(subset(df, Group == "AAM")$Log10_Abundance))
## [1] 0.6482493
The group term equals the difference between the group means:
print(mean(subset(df, Group == "AFR")$Log10_Abundance) -
      mean(subset(df, Group == "AAM")$Log10_Abundance))
## [1] 0.2031287
Note that the significance reported by the linear model (with default settings) equals that of a t-test assuming equal variances:
## [1] 4.284318e-06
11.3 Covariate testing
Task: Investigate how sex and BMI affect the results.
An important advantage of linear and generalized linear models over a plain t-test is that they can incorporate additional variables, such as potential confounders (for example age, BMI, or gender):
# Add covariates:
df$sex <- meta(d)$sex
df$bmi_group <- meta(d)$bmi_group
# Fit the model:
res <- glm(Log10_Abundance ~ Group + sex + bmi_group, data = df, family = "gaussian")
We can even include interaction terms:
res <- glm(Log10_Abundance ~ Group * sex * bmi_group, data = df, family = "gaussian")
kable(coefficients(res))
| | x |
|---|---|
| (Intercept) | 0.8736029 | 
| GroupAFR | 0.0261450 | 
| sexmale | 0.0022871 | 
| bmi_groupoverweight | -0.3160401 | 
| bmi_groupobese | -0.2117135 | 
| GroupAFR:sexmale | -0.2180267 | 
| GroupAFR:bmi_groupoverweight | 0.5360971 | 
| GroupAFR:bmi_groupobese | 0.2396655 | 
| sexmale:bmi_groupoverweight | 0.0133203 | 
| sexmale:bmi_groupobese | NA | 
| GroupAFR:sexmale:bmi_groupoverweight | NA | 
| GroupAFR:sexmale:bmi_groupobese | NA | 
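The `NA` entries in the table above are how R reports coefficients that could not be estimated. This typically happens when some combinations of factor levels do not occur in the data, so the corresponding model terms carry no information. The sketch below reproduces the effect on simulated data in which one combination is removed on purpose; the data values, the BMI level `lean`, and the names `sim`, `fit_full`, `fit_add`, and `na_terms` are assumptions for illustration, not taken from the tutorial data.

```r
set.seed(3)

# Full factorial design with 10 replicates per cell
sim <- expand.grid(
  Group = c("AAM", "AFR"),
  sex = c("female", "male"),
  bmi_group = c("lean", "overweight", "obese"),
  rep_id = 1:10,
  stringsAsFactors = TRUE
)

# Remove one combination on purpose: no obese males in this simulated data
sim <- subset(sim, !(sex == "male" & bmi_group == "obese"))
sim$Log10_Abundance <- rnorm(nrow(sim), mean = 0.7, sd = 0.3)

# Cross-tabulate the covariates to spot empty cells before fitting
print(table(sim$sex, sim$bmi_group))

# Model with all interactions; terms involving the empty cell are not estimable
fit_full <- glm(Log10_Abundance ~ Group * sex * bmi_group,
                data = sim, family = "gaussian")
print(coef(fit_full))

# Names of the coefficients reported as NA
na_terms <- names(coef(fit_full))[is.na(coef(fit_full))]
print(na_terms)

# Additive model without interactions: every coefficient is estimable here
fit_add <- glm(Log10_Abundance ~ Group + sex + bmi_group,
               data = sim, family = "gaussian")
print(coef(fit_add))
```

Cross-tabulating the covariates before fitting is a cheap way to decide whether a full interaction model is supported by the data, or whether a simpler model with fewer interaction terms is more appropriate.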
For more examples on using and analysing linear models, see the statmethods tutorials on regression and ANOVA. Try to adapt those examples to our microbiome example data sets.