11 Linear models: the role of covariates

This section provides a hands-on introduction to linear (and generalized linear) models.

Task: Fit a linear model to compare abundance between the two groups. You can use the functions lm or glm, for instance.

11.1 Fitting a linear model

Let us compare two groups with a linear model. We use log10-transformed abundances, since these are closer to the Gaussian assumption than the absolute count data. Fit a linear model with Gaussian variation as follows:
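A minimal sketch of such a fit, using simulated data in place of the tutorial's data set. The data frame `df`, the column names `Group` and `Log10_Abundance`, the reference level `AAM`, and all simulated values are assumptions made for illustration only.

```r
# Simulated stand-in data (names and values are assumptions, not the tutorial's data)
set.seed(1)
n <- 100
df <- data.frame(Group = factor(rep(c("AAM", "AFR"), each = n)))

# Hypothetical absolute abundances, strictly positive so that log10 is defined
abundance <- rlnorm(2 * n, meanlog = ifelse(df$Group == "AFR", 2.0, 1.5), sdlog = 0.7)
df$Log10_Abundance <- log10(abundance)

# Gaussian linear model: log10 abundance explained by group
fit <- lm(Log10_Abundance ~ Group, data = df)

# The same model expressed as a generalized linear model with a Gaussian family
fit_glm <- glm(Log10_Abundance ~ Group, data = df, family = gaussian())

print(coef(fit))
print(coef(fit_glm))
```

With a Gaussian family and the default identity link, lm and glm give the same coefficient estimates, so either function can be used for this task.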

11.2 Interpreting linear model output

Investigate the model coefficients:

             Estimate  Std. Error   t value  Pr(>|t|)
(Intercept)   0.64825     0.02877  22.53405         0
GroupAFR      0.20313     0.04308   4.71530         0
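A table of this form can be extracted from a fitted model with summary(). The sketch below uses simulated data; the data frame `df`, its column names, and the reference level `AAM` are assumptions. Rounding to five decimals, as done here, is presumably why very small p-values are displayed as 0 in the table above.

```r
# Simulated stand-in data (names and values are assumptions)
set.seed(1)
df <- data.frame(Group = factor(rep(c("AAM", "AFR"), each = 50)))
df$Log10_Abundance <- rnorm(100, mean = ifelse(df$Group == "AFR", 0.85, 0.65), sd = 0.3)

fit <- lm(Log10_Abundance ~ Group, data = df)

# Coefficient matrix: estimate, standard error, t statistic and p-value per term
coefs <- summary(fit)$coefficients
print(round(coefs, 5))
```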

The intercept equals the mean in the first group:

## [1] 0.6482493

The group term equals the difference between the group means:

## [1] 0.2031287
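Both identities can be checked directly against the group means. This is a sketch on simulated data; `df`, its column names, and the reference level `AAM` are assumptions.

```r
# Simulated stand-in data (names and values are assumptions)
set.seed(1)
df <- data.frame(Group = factor(rep(c("AAM", "AFR"), each = 50)))
df$Log10_Abundance <- rnorm(100, mean = ifelse(df$Group == "AFR", 0.85, 0.65), sd = 0.3)

fit <- lm(Log10_Abundance ~ Group, data = df)

# Group means computed directly from the data
mean_ref <- mean(df$Log10_Abundance[df$Group == "AAM"])  # first (reference) group
mean_afr <- mean(df$Log10_Abundance[df$Group == "AFR"])

# Intercept versus mean of the reference group
print(c(intercept = unname(coef(fit)["(Intercept)"]), mean_ref = mean_ref))

# Group term versus difference between the group means
print(c(group_term = unname(coef(fit)["GroupAFR"]), mean_diff = mean_afr - mean_ref))
```

This holds because R uses treatment contrasts by default: the first factor level is the reference, and every other level is expressed as an offset from it.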

Note that the default significance test for the group term in the linear model is equivalent to a t-test assuming equal variances.

## [1] 4.284318e-06
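The equivalence can be verified by comparing the p-value of the group term with that of a two-sample t-test with pooled variance. This is a sketch on simulated data; `df` and its column names are assumptions.

```r
# Simulated stand-in data (names and values are assumptions)
set.seed(1)
df <- data.frame(Group = factor(rep(c("AAM", "AFR"), each = 50)))
df$Log10_Abundance <- rnorm(100, mean = ifelse(df$Group == "AFR", 0.85, 0.65), sd = 0.3)

# p-value of the group term in the linear model
fit <- lm(Log10_Abundance ~ Group, data = df)
p_lm <- summary(fit)$coefficients["GroupAFR", "Pr(>|t|)"]

# p-value of the two-sample t-test with equal (pooled) variances
p_t <- t.test(Log10_Abundance ~ Group, data = df, var.equal = TRUE)$p.value

print(c(linear_model = p_lm, t_test = p_t))
```

Note that t.test uses the Welch test (var.equal = FALSE) by default, which generally gives a slightly different p-value.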

11.3 Covariate testing

Task: Investigate how sex and BMI affect the results.

An important advantage of linear and generalized linear models, compared to a plain t-test, is that they allow incorporating additional variables, such as potential confounders (age, BMI, sex, ...):
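A minimal sketch of adding covariates to the model formula, on simulated data. The data frame `df`, the variables `sex` and `bmi_group` with their level names, and the simulated effect sizes are assumptions chosen to resemble the variables discussed in this section.

```r
# Simulated stand-in data (names, levels and values are assumptions)
set.seed(1)
n <- 200
df <- data.frame(
  Group = factor(sample(c("AAM", "AFR"), n, replace = TRUE)),
  sex = factor(sample(c("female", "male"), n, replace = TRUE)),
  bmi_group = factor(sample(c("lean", "overweight", "obese"), n, replace = TRUE),
                     levels = c("lean", "overweight", "obese"))
)
df$Log10_Abundance <- 0.65 + 0.2 * (df$Group == "AFR") + rnorm(n, sd = 0.3)

# Group effect adjusted for sex and BMI group (additive model, no interactions)
fit_adj <- lm(Log10_Abundance ~ Group + sex + bmi_group, data = df)
print(round(summary(fit_adj)$coefficients, 5))
```

The GroupAFR estimate is now the group difference at fixed sex and BMI group, rather than the raw difference between the group means.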

We can even include interaction terms:

                                         Estimate
(Intercept)                             0.8736029
GroupAFR                                0.0261450
sexmale                                 0.0022871
bmi_groupoverweight                    -0.3160401
bmi_groupobese                         -0.2117135
GroupAFR:sexmale                       -0.2180267
GroupAFR:bmi_groupoverweight            0.5360971
GroupAFR:bmi_groupobese                 0.2396655
sexmale:bmi_groupoverweight             0.0133203
sexmale:bmi_groupobese                         NA
GroupAFR:sexmale:bmi_groupoverweight           NA
GroupAFR:sexmale:bmi_groupobese                NA
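In an R formula, the `*` operator expands to all main effects and their interactions. NA estimates like those in the table above typically arise when some combinations of factor levels have no observations, so the corresponding terms cannot be estimated. The sketch below reproduces this on simulated data by removing one combination on purpose; `df`, the variable names, the level names, and the removed combination are assumptions.

```r
# Simulated stand-in data (names, levels and values are assumptions)
set.seed(1)
n <- 300
df <- data.frame(
  Group = factor(sample(c("AAM", "AFR"), n, replace = TRUE)),
  sex = factor(sample(c("female", "male"), n, replace = TRUE)),
  bmi_group = factor(sample(c("lean", "overweight", "obese"), n, replace = TRUE),
                     levels = c("lean", "overweight", "obese"))
)
df$Log10_Abundance <- 0.65 + 0.2 * (df$Group == "AFR") + rnorm(n, sd = 0.3)

# Drop all obese males to mimic a combination that is missing from the data
df <- df[!(df$sex == "male" & df$bmi_group == "obese"), ]

# Full factorial model: main effects plus all two-way and three-way interactions
fit_int <- lm(Log10_Abundance ~ Group * sex * bmi_group, data = df)
print(coef(fit_int))

# Cell counts show which combinations have no observations
print(table(df$sex, df$bmi_group))
```

Terms that involve the empty combination are reported as NA, while the remaining coefficients are still estimated.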

For more examples on using and analysing linear models, see the statmethods tutorials on regression and ANOVA. Try to adapt those examples to our microbiome example data sets.