---
title: "Presenting Model Output"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Presenting Model Output}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = "center"
)
```
# Overview
```{r echo=T, message=FALSE, warning=FALSE}
library(thekidsbiostats)
```
We are constantly working on the appearance of the reports we generate for our colleagues and collaborators. This includes managing `.html` and `.docx` output for different use cases, all while navigating the move from [R Markdown](https://rmarkdown.rstudio.com/) to [Quarto](https://quarto.org/). At this stage, we have all but abandoned rendering directly to `.pdf`: we find `.docx` friendlier (it supports tracked changes with collaborators and certain output-manipulating functions) and it is readily 'pdf-able'. More on all that in a future post.
Our primary report output format is `.html`, and we have been looking for ways to harness certain Quarto features to improve both the aesthetics of our reports and our use of page real estate. Often, we present output for a range of models separated by narration. It is important that every model is presented with sufficient context (as opposed to output just dumped on a page), and we have settled on what we think is a nice way to ensure that relevant information (like model diagnostics) is readily available (and digestible) to the reader, rather than being tucked away at the end or, worse, left out altogether.
Below, we show how we currently present output from a linear regression in our `.html` reports.
# Example Usage
To demonstrate, we will use a slightly modified version of the birth weight dataset from the `MASS` package (`MASS::birthwt`; read more with `?MASS::birthwt`).
## Data
```{r}
dat_bwt <- MASS::birthwt
```
The dataset has `r nrow(dat_bwt)` observations (rows) and in the code below we just tidy up some variables prior to running the model.
```{r echo=TRUE}
# Recode the 0/1 indicators as labelled factors ("No"/"Yes");
# dat_bwt was already loaded from MASS::birthwt in the chunk above
dat_bwt <- dat_bwt %>%
  tibble() %>%
  mutate(smoke = factor(case_when(smoke == 1 ~ "Yes",
                                  smoke == 0 ~ "No",
                                  TRUE ~ NA_character_)),
         ht = factor(case_when(ht == 1 ~ "Yes",
                               ht == 0 ~ "No",
                               TRUE ~ NA_character_)))
```
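
The same recoding can also be written without `case_when`. The chunk below is a minimal base R sketch, assuming only that `smoke` and `ht` are coded 0/1 as described in `?MASS::birthwt`. The name `dat_alt` is purely illustrative and is not used elsewhere in this vignette.

```{r echo=TRUE, eval=FALSE}
# Base R alternative: map the 0/1 codes directly onto labelled factor levels.
# `dat_alt` is a hypothetical name used only for this illustration.
dat_alt <- MASS::birthwt
dat_alt$smoke <- factor(dat_alt$smoke, levels = c(0, 1), labels = c("No", "Yes"))
dat_alt$ht    <- factor(dat_alt$ht,    levels = c(0, 1), labels = c("No", "Yes"))

# Any value other than 0 or 1 would become NA, as in the case_when() version
table(dat_alt$smoke, useNA = "ifany")
```

Either approach gives factors with `"No"` as the reference level, which is what the model coefficients are compared against.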
## Building the model
We want to implement the following linear regression (`lm`) model:
+ **Outcome:** infant birth weight (`bwt`).
+ **Exposure of interest:** maternal smoking status during pregnancy (`smoke`).
+ **Covariates:** maternal age (`age`) and history of hypertension (`ht`).
This model can be fitted with the `thekids_model` function, using the code below:
```{r echo=TRUE}
mod_bwt <- dat_bwt %>%
  thekids_model(y = "bwt", x = "smoke", formula = "age + ht")
```
Here, we have passed (*piped*, with `%>%`) the data into the first argument, specified the outcome variable as `y` and the exposure of interest as `x`, and supplied the remainder of the model's `formula` (that is, the covariates we want to include in our model).
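
To make the relationship between these three arguments and an ordinary model formula concrete, the sketch below pastes the pieces together by hand. It only illustrates how `y`, `x` and `formula` relate to one another; the resulting formula is an assumption, not a description of what `thekids_model` does internally, and `model_formula` is an illustrative name.

```{r echo=TRUE, eval=FALSE}
# Illustrative only: combine outcome, exposure and covariates into one formula.
# This is an assumption about the equivalent formula, not package internals.
y <- "bwt"
x <- "smoke"
covariates <- "age + ht"

model_formula <- as.formula(paste(y, "~", x, "+", covariates))
model_formula

# The variables referenced by the formula, in order of appearance
all.vars(model_formula)
```

Placing the exposure first means its coefficient appears directly after the intercept in the fitted model.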
Alternatively, we could fit our model in the 'normal' way and then pass it into our output processing function `thekids_model_output` (the workhorse of the function above), which would look like the following:
```{r echo=TRUE, eval=F}
my_model <- lm(bwt ~ smoke + age + ht, data = dat_bwt)
thekids_model_output(my_model, by = "smoke") # still requires specifying the exposure of interest
```
The object returned in each case (`mod_bwt` above, or the value returned by `thekids_model_output`) is a list that contains outputs relating to our selected model type.
Below, we will see how the output would appear in an `.html` report.
## Model output
### Descriptive statistics
The table below shows summary statistics for all variables in the model by the primary exposure variable (maternal smoking status in pregnancy).
```{r echo=TRUE}
mod_bwt$mod_desc %>%
  thekids_table(colour = "DarkTeal",
                padding.left = 10, padding.right = 10)
```
The figure below shows the distribution of the primary outcome variable (infant birth weight) by the primary exposure variable (maternal smoking status in pregnancy).
```{r echo=TRUE, fig.height=5, fig.width=5, fig.align='center'}
mod_bwt$mod_desc_plot
```
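
For readers who want to see roughly what sits behind these descriptives, here is a minimal base R sketch covering only the outcome variable. It is not the code that produces `mod_desc` or `mod_desc_plot`; the names `dat` and `desc` are illustrative.

```{r echo=TRUE, eval=FALSE}
# Illustrative base R version of the outcome-by-exposure descriptives
dat <- MASS::birthwt
dat$smoke <- factor(dat$smoke, levels = c(0, 1), labels = c("No", "Yes"))

# Group size, mean and SD of birth weight (grams) by smoking status
desc <- aggregate(bwt ~ smoke, data = dat,
                  FUN = function(v) c(n = length(v), mean = mean(v), sd = sd(v)))
desc

# Side-by-side boxplots of the outcome by the exposure
boxplot(bwt ~ smoke, data = dat,
        xlab = "Maternal smoking in pregnancy", ylab = "Birth weight (g)")
```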
### Model diagnostics
The four-panel plot below shows diagnostic plots that can aid in determining whether the required assumptions of the model are met. **Based on the diagnostics below,** the model fit is deemed to be good.
```{r echo=TRUE, fig.height=6, fig.width=6, fig.align='center'}
mod_bwt$mod_diag
```
Broadly, what we are looking for in these plots (and why) are:
| | |
|---|---|
| **Linear relationship check**<br>relatively even bands of points around a flat line at 0 as we move from left to right. | **Normal distribution of errors check**<br>the points to fall close to the diagonal line. |
| **Influential observations check**<br>the points funnel close to 0 as we move from left to right with no extreme values in the top right or bottom right corners. | **Homoscedasticity check**<br>checking residuals are relatively evenly spread across the range of predictions. |
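
A comparable set of checks is available in base R through the `plot` method for `lm` objects. The sketch below refits the same model with `lm` and draws the four default diagnostic panels. It is offered as a rough stand-in, since the panels in `mod_diag` may be drawn and arranged differently.

```{r echo=TRUE, eval=FALSE}
# Refit the model with base R so the example is self-contained
dat <- MASS::birthwt
dat$smoke <- factor(dat$smoke, levels = c(0, 1), labels = c("No", "Yes"))
dat$ht    <- factor(dat$ht,    levels = c(0, 1), labels = c("No", "Yes"))
fit <- lm(bwt ~ smoke + age + ht, data = dat)

# Default plot.lm panels: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage
op <- par(mfrow = c(2, 2))  # arrange the four panels in a 2 x 2 grid
plot(fit)
par(op)                     # restore the previous graphics settings
```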
### Model output
The table below shows the model output for the linear regression model, including the beta coefficient (and 95% confidence interval) and p-values associated with each variable in the model.
```{r echo=TRUE}
mod_bwt$mod_output %>%
  thekids_table(colour = "DarkTeal",
                padding.left = 10, padding.right = 10)
```
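
The quantities in this table can be reproduced from any `lm` fit. The following sketch assembles the estimates, 95% confidence intervals and p-values with base R. The object names (`fit`, `coef_tab`) are illustrative, and the formatting will differ from `mod_output`.

```{r echo=TRUE, eval=FALSE}
# Refit the model with base R so the example is self-contained
dat <- MASS::birthwt
dat$smoke <- factor(dat$smoke, levels = c(0, 1), labels = c("No", "Yes"))
dat$ht    <- factor(dat$ht,    levels = c(0, 1), labels = c("No", "Yes"))
fit <- lm(bwt ~ smoke + age + ht, data = dat)

# Coefficients, 95% confidence limits and p-values in one table
ci <- confint(fit, level = 0.95)
coef_tab <- data.frame(
  estimate = coef(fit),
  lower    = ci[, 1],
  upper    = ci[, 2],
  p_value  = coef(summary(fit))[, "Pr(>|t|)"]
)
round(coef_tab, 3)
```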
### Model effect
The figure and table below show the predicted values (also known as *estimated marginal means*) of the outcome variable (infant birth weight) for each level of the exposure of interest (i.e., *no maternal smoking in pregnancy* or *maternal smoking in pregnancy*), along with 95% confidence intervals. See the table footnote for the values at which the other variables in the model are held.
```{r echo=TRUE, fig.height=4.5, fig.width=5, fig.align='center'}
mod_bwt$model %>%
  ggeffects::predict_response("smoke") %>%
  plot()
```
```{r echo=TRUE}
mod_bwt$model %>%
  ggeffects::predict_response("smoke") %>%
  ggeffects::print_html()
```
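
To show what a predicted value of this kind means, the sketch below computes one by hand with `predict`. It assumes maternal age is held at its sample mean and hypertension at its reference level (`"No"`). Those held values are an assumption made for illustration, so check the table footnote for the values actually used in the output above.

```{r echo=TRUE, eval=FALSE}
# Refit the model with base R so the example is self-contained
dat <- MASS::birthwt
dat$smoke <- factor(dat$smoke, levels = c(0, 1), labels = c("No", "Yes"))
dat$ht    <- factor(dat$ht,    levels = c(0, 1), labels = c("No", "Yes"))
fit <- lm(bwt ~ smoke + age + ht, data = dat)

# One row per exposure level; covariates held fixed (assumed values)
new_dat <- data.frame(
  smoke = factor(c("No", "Yes"), levels = levels(dat$smoke)),
  age   = mean(dat$age),                        # numeric covariate at its mean
  ht    = factor("No", levels = levels(dat$ht)) # factor at its reference level
)

# Predicted mean birth weight with 95% confidence limits
pred <- predict(fit, newdata = new_dat, interval = "confidence", level = 0.95)
cbind(new_dat, round(pred, 1))
```

Because the model has no interaction terms, the difference between the two predicted means equals the `smoke` coefficient whatever values the covariates are held at.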
### Interpretation of model
For the example given above:
- There is evidence to suggest that smoking during pregnancy is associated with reduced birth weight.
- After adjusting for maternal age and history of hypertension, maternal smoking during pregnancy is associated with a `r round(-coef(mod_bwt$model)[2], 1)` g lower birth weight (95% CI: `r round(-confint(mod_bwt$model)[2, 2], 1)`, `r round(-confint(mod_bwt$model)[2, 1], 1)`).