---
title: "Automating Publication-Ready Tables in R"
description: |
Automatically Creating Report and Publication reday summary tables.
author: "Habtamu Bizuayehu"
website: "https://habtamubizuayehu.com/"
orcid: "https://orcid.org/0000-0002-1360-4909"
LinkedIn: "https://www.linkedin.com/in/habtamu-bizuayehu-94285980/"
date: 2025-05-25
image: "Table_1.png"
highlight-style: github
project:
type: website
output-dir: docs
categories: [Table automation, Data reporting, Flextable]
format:
html:
toc: true
toc-depth: 2
number-sections: false
code-fold: false
code-tools: true
code-summary: "Show the code"
theme: united
self-contained: true
knitr:
opts_chunk:
warning: false
message: false
editor: visual
---
# 📋 Overview: Creating a Publication-Ready Table in R with `flextable`
This section presents a walkthrough of generating a publication-ready table. The aim is to summarize and format data in a professional way. The key steps include:
- Cleaning and preparing data for consistency.
- Creating summary tables for different variables.
- Combining and styling tables for clear visual presentation.
- Automating the process using `flextable`.
The code below demonstrates how to achieve these steps. I will demonstrate how to generate a **clean, publication-ready summary table** using R. The approach combines data manipulation with `dplyr`, summary tabulation with `janitor`, and elegant styling via the `flextable` package.
This process is especially useful for public health datasets with categorical variables like age, gender, or service dates and includes options for handling missing data, calculating row percentages, column percentages and customizing formatting such as font type, table width, and color.
By automating these steps, we can save time, reduce manual formatting errors, and ensure consistency across tables.
🔹 Step 1: Load Required Libraries
```{r, message=FALSE, warning=FALSE}
# Clear environment and console
rm(list = ls())
cat("\014")
# Load required packages
#gc() #garbage collector to clean up the workspace.
options(repos = c(CRAN = "https://cran.rstudio.com"))
# List of required packages for Date and Time Data Analysis
packages <- c(
"lubridate", "dplyr", "stringr", "zoo", "tsibble", "ggplot2", "plotly",
"readr", "forecast", "prettydoc", "janitor", "knitr", "rmarkdown", "yaml",
"flextable", "scales", "gt", "reactable", "tidyverse", "tidyr", "officer"
)
# Install and load packages with descriptions
for (pkg in packages) {
if (!require(pkg, character.only = TRUE)) {
install.packages(pkg)
}
}
# Install quarto if not already installed
if (!requireNamespace("quarto", quietly = TRUE)) install.packages("quarto")
# **Load required libraries**
library(lubridate) # Simplifies the manipulation of dates and times in R (e.g., formatting, extracting components)
library(dplyr) # Provides tools for data manipulation, helpful for filtering and summarizing date-based data
library(stringr) # Simplifies string manipulation
library(tidyr) # Reshaping and tidying data
library(zoo) # For working with time series data, including rolling calculations and handling missing data in time series
library(tsibble) # Provides tools for handling time series data, including date-time indexes and features for forecasting
library(readr) # For reading date-time data from CSV
library(haven) # data from other statistical software formats (SPSS, SAS, Stata)
library(forecast) # Useful for time series forecasting, especially when working with seasonal or trend-based data
library(ggplot2) # For visualizing date-time trends in data (e.g., time series plots)
library(janitor) # Tabulation, cleaning column names, adding totals and proportions
library (plotly) # Interactive visualizations
# Documentation/reporting
library(prettydoc) # Pretty document templates
library(flexdashboard)# Interactive dashboards
library(quarto) # For rendering and publishing documents with the Quarto framework
library(yaml) # YAML document processing
# Tabulation
library(flextable) # for creating styled, publication-ready tables
library(gt) # Grammar of tables
library(reactable) # Interactive tables
library(scales) # for number formatting (e.g., commas)
library(officer) # For exporting tables/reports to Word or PowerPoint
```
```{r}
setwd("C:/Users/User/Desktop/Materials_ Course and proposals/Course Related/DataCamp/Data/")
load("Vaccination_uptake.RDATA")
```
🔹 Step 2: Convert Grouping Variables to Character
Before we tabulate values, we need to ensure all grouping variables have a consistent format. The table-generating function later in this workflow expects character values. If a mix of numeric and character types is used, the output may fail or be misaligned.
```{r, eval=FALSE}
vacc_1524 <- vacc_1524 %>% mutate(across(c(age_group, gender, marital_status, vacc_service_year, vacc_quarter, vacc_weekday), as.character))
```
💡 *Why this matters*: If one variable like `gender` is stored as a character and another like `age_group` is numeric, the table creation function won't process them uniformly. This step ensures compatibility and consistency.
🔹 Step 3: Define a Table Generation Function
This section defines a reusable function that creates a summary table for any categorical variable of interest.
```{r, eval=FALSE}
create_tab <- function(data, var, label) {
# Step 1: Generate counts for non-missing values
tab_main <- data %>%
filter(!is.na({{ var }})) %>% # Exclude missing values
tabyl({{ var }}, vacc_group3) %>% # Cross-tabulate (var × vaccine groups)
mutate(Category = label, .before = 1) %>% # Add category label (e.g., "Gender")
rename(Label = {{ var }}) %>% # Rename variable column for consistency
mutate(Label = as.character(Label)) # Ensure labels are text
# Step 2: Check for and append missing values (if any)
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
... # Similar for other vaccine groups
)
tab_final <- bind_rows(tab_main, missing_row) # Combine with main table
} else {
tab_final <- tab_main
}
return(tab_final)
}
```
Let's break it down:
- **`data`**: The full dataset (`vacc_1524`) you are analyzing.
- **`var`**: A single categorical variable (e.g., `age_group`) whose values will appear as rows.
- **`label`**: A string label that describes the variable for the final table (e.g., `"Age"`).
Inside the function:
- We filter out missing values temporarily to generate the main table using tabyl().
- We count and append missing entries separately, if any.
- We return a single tibble containing counts across vaccine groups by value of the selected variable.
This abstraction allows us to apply one function to multiple variables.
🔹 Step 4: Generate Summary Tables for Each Variable
The code below creates summary tables for several key variables: age, gender, marital status, service year, quarter, and weekday.
```{r eval=FALSE}
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
```
Each result is a well-structured table of vaccine uptake by subgroup.
🔹 Step 5: Combine All Summary Tables
Here, we merge the results into one combined table that can be styled and displayed together.
```{r eval=FALSE}
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
```
We are now working with one long-format table that contains all the variables of interest.
🔹 Step 6: Calculate Row-Wise Percentages and Format Counts
Next, I add percentages for each vaccine type within a row (e.g., among all women aged 20–29, what percent received respiratory vaccines?). I also format the counts with commas for easier reading.
```{r eval=FALSE}
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total, 1), ")"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
```
👉 Note: We skip calculating percentages for “Missing” rows to avoid misleading results.
🔹 Step 7: Show Category Name Once per Block
To improve readability, I only show the category name (like "Age" or "Gender") once per group.
```{r eval=FALSE}
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
```
🔹 Step 8: Count Total Vaccine Recipients (for Header Labels)
We calculate the total number of people who received each type of vaccine to include in the header row.
```{r eval=FALSE}
totals <- vacc_1524 %>% count(vacc_group3) %>% deframe()
```
🔹 Step 9: Style the Table Using `{flextable}`
Now comes the fun part! We format and stylize the table with headers, font choices, alternating colors, and footnotes. This creates a visually appealing table suitable for publication.
```{r eval=FALSE}
# ft_vacc_summary <- tab_combined %>% flextable() %>% set_header_labels(...) %>% add_header_row(...) %>% add_header_row(...) %>% theme_box() %>% set_table_properties(layout = "autofit", width = 1) %>% width(...) %>% align(align = "justify", part = "all") %>% border_remove() %>% set_caption(...) %>% fontsize(...) %>% font(part = "all", fontname = "Times New Roman") %>% bg(part = "header", bg = "lightblue") %>% bg(j = 1:5, i = seq(1, nrow(tab_combined), 2), bg = "lightgray") %>% bg(j = 1:5, i = seq(2, nrow(tab_combined), 2), bg = "red") %>% footnote(...)
```
**Key Steps**:
1. **Set Headers**: Labels columns appropriately such as including multi-row headers that describe each vaccine group with sample size (N)
2. **Add Header Rows**: Includes multi-level headers for clarity.
3. **Themes and Layout**: Applies a boxed theme, adjusts column widths, and aligns text.
4. **Font and Colors**: Uses "Times New Roman" and alternates row colors (e.g., gray and red) for better visuals.
5. **Add Captions, Superscripts and Footnotes**: Provides context to the table.
🔹 Step 10: Display the Table
🔹 Step 11: Clean Up Temporary Objects
To keep your R environment tidy, we remove the temporary tables and variables created along the way.
```{r eval=FALSE}
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday, tab_combined, ft_vacc_summary, totals, create_tab)
```
## **1. Table Without Title and Footer**
This section introduces a basic version of the automated table—minimal and clean without captions or footnotes.
- **Why it matters**: A stripped-down table is great for quick previews or inline summaries.
- **When to use**: Internal drafts, early analysis stages, or dashboards.
- Ensure grouping variables are characters to avoid formatting issues.
- Build one reusable function (`create_tab`) to apply across multiple variables.
- Bind all summary tables into a combined format.
- Add row-wise percentages directly to the table.
```{r}
# ------------1. Table without title and footer -------------------------------------------+
#.......................Ensure grouping variables are characters............................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# ---------------- Table generation function ....................
create_tab <- function(data, var, label) {
data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
}
# .................... Create summary tables ....................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# ......................Combine all tabs ....................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# ....................... Create row % and format as text ....................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
Respiratory = paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")"),
Childhood = paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")"),
Other_grp = paste0(comma(Other), " (", round(100 * Other / Total, 1), ")")
) %>%
ungroup()
# ............................. Remove repeated variable names ....................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# ........................... Get total counts per vaccine group ....................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................... Create flextable ....................
ft_vacc_summary <- tab_combined %>%
select(Category, Label, Respiratory, Childhood, Other_grp) %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row( # Second row
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row( # First row (merged header)
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:4, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1))
# --- Display the table ---
ft_vacc_summary
# --- Clean up environment ---
rm(tab_year, tab_quarter, tab_weekday, tab_combined, totals, create_tab, ft_vacc_summary)
```
## **2 Table With Title and Footer**
This extends the previous section by including a clear **table caption** and an explanatory **footnote**.
🧾 *Why this is important*:
- Captions add context—especially in reports.
- Footnotes help clarify how metrics are calculated (e.g., row percentages).
📘 *Best used when*: Preparing for publication, stakeholder presentations, or final reports.
```{r}
# ------------2. Table with Title and Footer -------------------------------------------
#.......................Ensure grouping variables are characters............................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# ---------------- Table generation function ....................
create_tab <- function(data, var, label) {
data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
}
# .................... Create summary tables ....................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# ......................Combine all tabs ....................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# ....................... Create row % and format as text ....................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
Respiratory = paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")"),
Childhood = paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")"),
Other_grp = paste0(comma(Other), " (", round(100 * Other / Total, 1), ")")
) %>%
ungroup()
# ............................. Remove repeated variable names ....................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# ........................... Get total counts per vaccine group ....................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................... Create flextable ....................
ft_vacc_summary <- tab_combined %>%
select(Category, Label, Respiratory, Childhood, Other_grp) %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row( # Second row
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row( # First row (merged header)
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:4, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption("Table: Vaccination uptake across service year, quarter, and weekday") %>%
footnote(i = 1,
value = as_paragraph(" Vaccination proportions were calculated row-wise (i.e., by row)."),
ref_symbols = "ᵃ", part = "header")
# %>%
# border_inner_h(border = fp_border(color = "black", width = 1)) %>%
# add_header_lines(values = "Table xx: Vaccination uptake across service year, quarter, and weekday.") %>%
# add_footer_lines(values = c(
# "ᵃ Vaccination proportions were calculated per row (row percentage).",
# paste0("ᵇ The vaccine provider type did not include unvaccinated (N = ", format(no_total, big.mark = ","), ")."),
# "ᶜ Vaccination proportions were calculated row-wise (i.e., by row).",
# "GP = General practitioner, PHU = Public Health Unit"
# ))
# Display the table
ft_vacc_summary
# .................... Clean up environment ....................
rm(tab_year, tab_quarter, tab_weekday, tab_combined, ft_vacc_summary, totals, create_tab)
```
## **3. Missing Values with n(%)**
In this version, we **include "Missing" as a separate category** and provide **both counts and percentages**.
🧠 *Why this matters*: Ignoring missing values can skew analysis. This format transparently highlights data completeness.
👍 *Recommended for*: Health datasets, surveys, or administrative data where data quality assessment is key.
```{r}
# ------------3. Missing values with n(%)-------------------------------------------
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function with Missing ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate row-wise percentages ..................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
Respiratory = paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")"),
Childhood = paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")"),
Other_grp = paste0(comma(Other), " (", round(100 * Other / Total, 1), ")")
) %>%
ungroup()
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Get total counts per vaccine group ..................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................. Create flextable ..................
ft_vacc_summary <- tab_combined %>%
select(Category, Label, Respiratory, Childhood, Other_grp) %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption("Table: Vaccination uptake across service year, quarter, and weekday") %>%
footnote(i = 1,
value = as_paragraph("Vaccination proportions were calculated row-wise (i.e., by row)."),
ref_symbols = "ᵃ",
part = "header")
# .................. Show the table ..................
ft_vacc_summary
# .................. Clean up ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, ft_vacc_summary, totals, create_tab)
```
## **4. Missing Values with Only Counts**
This variation includes missing data rows but only shows **raw counts**, not percentages.
- ⚖️ *Why use this*: It prevents misleading interpretations—especially when the denominator for percentage is unclear or inconsistent.
- Use this when you want to **acknowledge missing data** but avoid implying proportional meaning.
```{r}
# ------------4. Missing values with only counts-------------------------------------------
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function with Missing ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate row-wise percentages (skip for Missing) ..................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total, 1), ")"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Get total counts per vaccine group ..................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................. Create flextable ..................
ft_vacc_summary <- tab_combined %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption("Table: Vaccination uptake across service year, quarter, and weekday") %>%
footnote(i = 1,
value = as_paragraph("Vaccination proportions were calculated row-wise (i.e., by row)."),
ref_symbols = "ᵃ",
part = "header")
# .................. Show the table ..................
ft_vacc_summary
# .................. Clean up ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, ft_vacc_summary, totals, create_tab)
```
## **5. Remove All Lines**
This version **removes all border lines** for a minimalist, text-friendly table.
✨ *Why this helps*: It improves visual integration in Word documents or printed reports where borders may clutter formatting.
🔍 *Ideal for*: Inline reporting, Word exports, or journal submissions requiring custom styling.
```{r}
# ------------5. Lines: remove all lines -------------------------------------------
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function with Missing ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate row-wise percentages (skip for Missing) ..................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total, 1), ")"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Get total counts per vaccine group ..................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................. Create flextable ..................
ft_vacc_summary <- tab_combined %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
# border_inner_h(border = fp_border(color = "black", width = 1)) %>%
set_caption("Table: Vaccination uptake across service year, quarter, and weekday") %>%
footnote(i = 1,
value = as_paragraph("Vaccination proportions were calculated row-wise (i.e., by row)."),
ref_symbols = "ᵃ",
part = "header")
# .................. Show the table ..................
ft_vacc_summary
# .................. Clean up ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, ft_vacc_summary, totals, create_tab)
```
## **6 . Only Horizontal Lines**
This style keeps only **horizontal dividers** to improve readability without overwhelming the user.
📏 *Implication*: Helps guide the eye across wide tables while maintaining a clean aesthetic.
✅ *Great for*: Formal documents and slide decks where visual clarity is important.
```{r}
# ------------6. Lines: Only Horizantal Lines -------------------------------------------
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_3group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function with Missing ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate row-wise percentages (skip for Missing) ..................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total, 1), ")"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Get total counts per vaccine group ..................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................. Create flextable ..................
ft_vacc_summary <- tab_combined %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption("Table: Vaccination uptake across service year, quarter, and weekday") %>%
footnote(i = 1,
value = as_paragraph("Vaccination proportions were calculated row-wise (i.e., by row)."),
ref_symbols = "ᵃ",
part = "header")
# .................. Show the table ..................
ft_vacc_summary
# .................. Clean up ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, ft_vacc_summary, totals, create_tab)
```
## **7. All Lines Visible**
This version includes both **horizontal and vertical grid lines**, mimicking Excel-style tables.
📊 *When to use*: When presenting detailed tables where structure must be visually enforced.
📦 *Applicability*: Regulatory reports, appendices, or complex data breakdowns.
```{r}
# ------------6. Lines: All Lines Visible-------------------------------------------
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function with Missing ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate row-wise percentages (skip for Missing) ..................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total, 1), ")"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Get total counts per vaccine group ..................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................. Create flextable ..................
ft_vacc_summary <- tab_combined %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_inner_h(border = fp_border(color = "black", width = 1)) %>%
set_caption("Table: Vaccination uptake across service year, quarter, and weekday") %>%
footnote(i = 1,
value = as_paragraph("Vaccination proportions were calculated row-wise (i.e., by row)."),
ref_symbols = "ᵃ",
part = "header")
# .................. Show the table ..................
ft_vacc_summary
# .................. Clean up ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, ft_vacc_summary, totals, create_tab)
```
## **8. Font Formatting: Colour and Font Options**
Here, I demonstrate how to:
- Set font types (e.g., Times New Roman)
- Control font size
- Apply alternating row colors and header backgrounds
🎨 *Why this is valuable*:
- Enhances visual appeal and professionalism.
- Aligns with publication or organizational branding requirements.
- *Best for*: Reports to stakeholders, executive summaries, or printed materials.
```{r}
# ------------ 7. Font Formating: Colour and Font type options adding in the Table -------------------------------------------
library(dplyr)
library(janitor)
library(tibble)
library(scales)
library(flextable)
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function with Missing ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate row-wise percentages (skip for Missing) ..................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total, 1), ")"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Get total counts per vaccine group ..................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
ft_vacc_summary <- tab_combined %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption(as_paragraph(as_chunk("Table: Vaccination uptake across service year, quarter, and weekday",
props = fp_text(font.size = 14, font.family = "Times New Roman", bold = FALSE)))) %>%
fontsize(part = "header", size = 12) %>%
fontsize(part = "body", size = 11) %>%
fontsize(part = "footer", size = 10) %>%
font(part = "all", fontname = "Times New Roman") %>%
bg(part = "header", bg = "lightblue") %>%
bg(j = 1:5, i = seq(1, nrow(tab_combined), 2), bg = "lightgray") %>%
bg(j = 1:5, i = seq(2, nrow(tab_combined), 2), bg = "red") %>%
footnote(i = 1,
value = as_paragraph("Vaccination proportions were calculated row-wise (i.e., by row)."),
ref_symbols = "ᵃ",
part = "header")
# .................. Show the table ..................
ft_vacc_summary
# .................. Clean up all new objects ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, ft_vacc_summary, totals, create_tab)
```
## **9. Rowwise Tables (Summary)**
This is not a new table—rather, it's a **conceptual summary** of the row-wise percentage approaches applied in earlier sections.
**Importance**:
- Percentages sum to 100% per row, highlighting within-group distributions.
**Implications**:
- Answers: "Within age group X, what % received each vaccine?"
## **10. Columnwise Tables**
This version calculates **percentages down each vaccine group column**, rather than across each row.
**Importance**:
- Percentages sum to 100% per column, showing group contributions to totals.
**Implications**:
- Answers: "Among Respiratory vaccines, what % are from age group X?"
- *Use this when*: You want to know, for example, what percentage of all Respiratory vaccine users were aged below 25 years.
```{r}
# ------------ 10. Columnwise Tables -------------------------------------------
# Prepare Vaccination Uptake Table (All Variables Using Column Percentages)
# --- Convert selected variables to character type for proper label handling ---
vacc_1524 <- vacc_1524 %>%
mutate(across(c(gender, age_3group, marital_2group, year_3group, vacc_quarter, race_grouped), as.character))
# --- Create general table-building function with optional column percentage flag ---
# This function creates a summary count table of vaccination type by a categorical variable.
# If use_col_percentage is TRUE, the summary will support column-based percentage formatting.
create_tab <- function(data, var, label, use_col_percentage = TRUE) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
# Add a row for "Unknown" if missing values are found
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
na_row <- tibble(
Category = label,
Label = "Unknown",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_combined <- bind_rows(tab_main, na_row)
} else {
tab_combined <- tab_main
}
tab_combined %>%
mutate(use_col_percentage = use_col_percentage)
}
# --- Generate summary tables for each variable using column percentages ---
tab_gender <- create_tab(vacc_1524, gender, "Gender")
tab_age <- create_tab(vacc_1524, age_3group, "Age group")
tab_marital <- create_tab(vacc_1524, marital_2group, "Marital statusᵇ")
tab_year <- create_tab(vacc_1524, year_3group, "Year group")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_race <- create_tab(vacc_1524, race_grouped, "Race group")
# --- Combine all variable-level tables into one ---
tab_combined <- bind_rows(
tab_gender, tab_age, tab_marital,
tab_year, tab_quarter, tab_race
)
# --- Calculate percentages and format values into strings with "Number (%)" ---
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(
use_col = TRUE, # column percentages applied uniformly
# Calculate denominators excluding "Unknown"
Total_Resp = sum(`Respiratory vaccines`[Label != "Unknown"], na.rm = TRUE),
Total_Child = sum(`Childhood vaccines`[Label != "Unknown"], na.rm = TRUE),
Total_Other = sum(Other[Label != "Unknown"], na.rm = TRUE),
# Calculate percentages
Resp_prop = ifelse(Label != "Unknown", round(100 * `Respiratory vaccines` / Total_Resp, 1), NA),
Child_prop = ifelse(Label != "Unknown", round(100 * `Childhood vaccines` / Total_Child, 1), NA),
Other_prop = ifelse(Label != "Unknown", round(100 * Other / Total_Other, 1), NA),
# Format final output as "number (percentage)"
Respiratory = ifelse(is.na(Resp_prop),
paste0(comma(`Respiratory vaccines`)),
paste0(comma(`Respiratory vaccines`), " (", Resp_prop, ")")),
Childhood = ifelse(is.na(Child_prop),
paste0(comma(`Childhood vaccines`)),
paste0(comma(`Childhood vaccines`), " (", Child_prop, ")")),
Other_grp = ifelse(is.na(Other_prop),
paste0(comma(Other)),
paste0(comma(Other), " (", Other_prop, ")"))
) %>%
ungroup()
# --- Final clean-up: labels, superscripts, and hiding "0 (0)" from Unknown row ---
tab_combined <- tab_combined %>%
mutate(
Label = ifelse(Label == "Other", "Otherᶜ", Label),
Respiratory = ifelse(Label == "Unknown" & `Respiratory vaccines` == 0, "", Respiratory),
Childhood = ifelse(Label == "Unknown" & `Childhood vaccines` == 0, "", Childhood),
Other_grp = ifelse(Label == "Unknown" & Other == 0, "", Other_grp)
) %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# --- Get total N for each vaccine group to display in header ---
totals <- table(vacc_1524$vacc_group3)
total_resp <- totals["Respiratory vaccines"]
total_child <- totals["Childhood vaccines"]
total_other <- totals["Other"]
# --- Build flextable for formatted presentation ---
table_vacc_summary <- flextable(tab_combined) %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c("", "",
paste0("Respiratory vaccines (N = ", format(total_resp, big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(total_child, big.mark = ","), ")"),
paste0("Other (N = ", format(total_other, big.mark = ","), ")")),
colwidths = c(1, 1, 1, 1, 1),
top = TRUE
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3),
top = TRUE
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption(as_paragraph(as_chunk(
"Table xx: Column-percentage summary of vaccination uptake across all demographic variables, 2015–2024.",
props = fp_text(font.size = 12, font.family = "Times New Roman", bold = FALSE)
))) %>%
add_footer_lines(values = c(
"ᵃ Vaccination proportions were calculated using column percentages for all variables.",
"ᵇ Marital status was grouped into two: partnered (married) and not-partnered (divorced, widowed, single).",
"ᶜ Other race refers to those identifying as native or other racial groups."
)) %>%
fontsize(part = "header", size = 12) %>%
fontsize(part = "body", size = 11) %>%
fontsize(part = "footer", size = 10) %>%
font(part = "all", fontname = "Times New Roman") %>%
color(part = "header", color = "white") %>%
bg(part = "header", bg = "black") %>%
bg(j = 1:5, i = seq(2, nrow(tab_combined), 2), bg = "white") %>%
bold(j = 1, part = "body")
# --- Display final table ---
table_vacc_summary
# --- Clean up temporary variables (optional good practice) ---
rm(
tab_gender, tab_age, tab_marital, tab_year, tab_quarter, tab_race,
tab_combined, create_tab, totals, total_resp, total_child, total_other
)
```
## **11. Columnwise Tables with NA values**
Columnwise percentage table including NA where applicable - All variables use column percentages (i.e., % down each vaccine group column) - NA values are shown only where relevant (e.g., race group for Respiratory vaccines) - "Unknown" variable categories are excluded from percentage calculations to maintain \~100% totals per column - Useful for subgroup summaries where not all variables apply across all vaccine type
```{r}
# ------------ 11. Columnwise Tables with NA values-------------------------------------------
# Prepare Vaccination Uptake Table (All Variables Using Column Percentages)
# ....................................................................................................
# --- Set working directory and load data (if needed) ---
# setwd("C:/Users/User/Desktop/Materials_ Course and proposals/Course Related/DataCamp/Data/")
# load("Vaccination_uptake.RDATA")
# --- Clean and prepare datasets ---
vacc_1524 <- vacc_1524 %>%
mutate(across(c(gender, age_3group, marital_2group, year_3group, vacc_quarter, race_grouped), as.character))
# Subset data for Respiratory vaccines to calculate column-based percentages (used for race)
respiratory_vaccines <- vacc_1524 %>%
filter(vacc_group3 == "Respiratory vaccines") %>%
mutate(across(race_grouped, as.character))
# --- General table builder ---
create_tab <- function(data, var, label, use_col_percentage = TRUE) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
na_row <- tibble(
Category = label,
Label = "Unknown",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_combined <- bind_rows(tab_main, na_row)
} else {
tab_combined <- tab_main
}
tab_combined %>%
mutate(use_col_percentage = use_col_percentage)
}
# --- Create summary tables ---
tab_gender <- create_tab(vacc_1524, gender, "Gender")
tab_age <- create_tab(vacc_1524, age_3group, "Age group")
tab_marital <- create_tab(vacc_1524, marital_2group, "Marital statusᵇ")
tab_year <- create_tab(vacc_1524, year_3group, "Year group")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
# Race group from respiratory-only subset (denominator = respiratory group only)
tab_race <- create_tab(respiratory_vaccines, race_grouped, "Race group")
# --- Combine all tables ---
tab_combined <- bind_rows(
tab_gender, tab_age, tab_marital,
tab_year, tab_quarter, tab_race
)
# --- Calculate and format column percentages ---
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(
use_col = TRUE,
Total_Resp = sum(`Respiratory vaccines`[Label != "Unknown"], na.rm = TRUE),
Total_Child = sum(`Childhood vaccines`[Label != "Unknown"], na.rm = TRUE),
Total_Other = sum(Other[Label != "Unknown"], na.rm = TRUE),
Resp_prop = ifelse(Label != "Unknown", round(100 * `Respiratory vaccines` / Total_Resp, 1), NA),
Child_prop = ifelse(Label != "Unknown", round(100 * `Childhood vaccines` / Total_Child, 1), NA),
Other_prop = ifelse(Label != "Unknown", round(100 * Other / Total_Other, 1), NA),
Respiratory = ifelse(is.na(Resp_prop),
paste0(comma(`Respiratory vaccines`)),
paste0(comma(`Respiratory vaccines`), " (", Resp_prop, ")")),
Childhood = ifelse(is.na(Child_prop),
paste0(comma(`Childhood vaccines`)),
paste0(comma(`Childhood vaccines`), " (", Child_prop, ")")),
Other_grp = ifelse(is.na(Other_prop),
paste0(comma(Other)),
paste0(comma(Other), " (", Other_prop, ")"))
) %>%
ungroup()
# --- Format labels and rows ---
tab_combined <- tab_combined %>%
mutate(
Label = ifelse(Label == "Other", "Otherᶜ", Label),
Respiratory = ifelse(Label == "Unknown" & `Respiratory vaccines` == 0, "", Respiratory),
Childhood = ifelse(Label == "Unknown" & `Childhood vaccines` == 0, "", Childhood),
Other_grp = ifelse(Label == "Unknown" & Other == 0, "", Other_grp)
) %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# --- Prepare column Ns for header ---
totals <- table(vacc_1524$vacc_group3)
total_resp <- totals["Respiratory vaccines"]
total_child <- totals["Childhood vaccines"]
total_other <- totals["Other"]
# --- Build Flextable ---
table_vacc_summary <- flextable(tab_combined) %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c("", "",
paste0("Respiratory vaccines (N = ", format(total_resp, big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(total_child, big.mark = ","), ")"),
paste0("Other (N = ", format(total_other, big.mark = ","), ")")),
colwidths = c(1, 1, 1, 1, 1),
top = TRUE
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3),
top = TRUE
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption(as_paragraph(as_chunk(
"Table xx: Column-percentage summary of vaccination uptake across all demographic variables, 2015–2024.",
props = fp_text(font.size = 12, font.family = "Times New Roman", bold = FALSE)
))) %>%
add_footer_lines(values = c(
"ᵃ Vaccination proportions were calculated using column percentages. Race group used respiratory vaccine subset only.",
"ᵇ Marital status was grouped into partnered (married) and non-partnered (single, divorced, widowed).",
"ᶜ 'Other' race includes native or other less represented racial identities."
)) %>%
fontsize(part = "header", size = 12) %>%
fontsize(part = "body", size = 11) %>%
fontsize(part = "footer", size = 10) %>%
font(part = "all", fontname = "Times New Roman") %>%
color(part = "header", color = "white") %>%
bg(part = "header", bg = "#000000") %>%
bg(j = 1:5, i = seq(2, nrow(tab_combined), 2), bg = "white") %>%
bold(j = 1, part = "body")
# --- Display final table ---
table_vacc_summary
# --- Cleanup temporary objects ---
rm(
tab_gender, tab_age, tab_marital, tab_year, tab_quarter, tab_race,
tab_combined, create_tab, totals, total_resp, total_child, total_other,
respiratory_vaccines
)
```
## **12. Both Row wise and Column wise Tables**
This table presents **both row and column percentages** for vaccination uptake:
- Row percentages show how each category (e.g., age group, gender) distributes across vaccine types.
- Column percentages are applied only to the *race group*, highlighting the demographic makeup within the respiratory vaccine recipients.
This format helps answer questions like: - *"What proportion of women in each age group received respiratory vaccines?"* (row %) - *"What proportion of respiratory vaccine recipients were Aboriginal?"* (column %)
Unknown values are included but excluded from percentage calculations to avoid distorting group totals.
```{r}
# ------------12. Both Row wise and Column wise Tables -------------------------------------------
# ....................................................................................................
# Prepare Vaccination Uptake Table (Row and Column Percentages)
# ....................................................................................................
# --- Step 1: Prepare data for table formatting ---
# Convert grouping variables to character for cleaner label handling
vacc_1524 <- vacc_1524 %>%
mutate(across(c(gender, age_3group, marital_2group, year_3group, vacc_quarter, race_grouped), as.character))
# Subset data for Respiratory vaccines to calculate column-based percentages (used for race)
respiratory_vaccines <- vacc_1524 %>%
filter(vacc_group3 == "Respiratory vaccines") %>%
mutate(across(race_grouped, as.character))
# --- Step 2: Create table-building function ---
# This function tabulates counts by category and optionally adds missing (Unknown) values
create_tab <- function(data, var, label, use_col_percentage = FALSE) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
# Add a row for missing/unknown if needed
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
na_row <- tibble(
Category = label,
Label = "Unknown",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_combined <- bind_rows(tab_main, na_row)
} else {
tab_combined <- tab_main
}
tab_combined %>%
mutate(use_col_percentage = use_col_percentage)
}
# --- Step 3: Generate frequency tables for each variable ---
tab_gender <- create_tab(vacc_1524, gender, "Gender")
tab_age <- create_tab(vacc_1524, age_3group, "Age group")
tab_marital <- create_tab(vacc_1524, marital_2group, "Marital statusᵇ")
tab_year <- create_tab(vacc_1524, year_3group, "Year group")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_race <- create_tab(respiratory_vaccines, race_grouped, "Race group", use_col_percentage = TRUE)
# --- Step 4: Combine all tables ---
tab_combined <- bind_rows(
tab_gender, tab_age, tab_marital,
tab_year, tab_quarter, tab_race
)
# --- Step 5: Calculate percentages and formatted string outputs ---
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(
use_col = first(use_col_percentage),
# Exclude "Unknown" when calculating column totals to ensure proportions sum to 100
Total_Resp = ifelse(use_col, sum(`Respiratory vaccines`[Label != "Unknown"], na.rm = TRUE),
`Respiratory vaccines` + `Childhood vaccines` + Other),
Total_Child = ifelse(use_col, sum(`Childhood vaccines`[Label != "Unknown"], na.rm = TRUE),
`Respiratory vaccines` + `Childhood vaccines` + Other),
Total_Other = ifelse(use_col, sum(Other[Label != "Unknown"], na.rm = TRUE),
`Respiratory vaccines` + `Childhood vaccines` + Other),
# Calculate proportions
Resp_prop = ifelse(Label != "Unknown", round(100 * `Respiratory vaccines` / Total_Resp, 1), NA),
Child_prop = ifelse(Label != "Unknown", round(100 * `Childhood vaccines` / Total_Child, 1), NA),
Other_prop = ifelse(Label != "Unknown", round(100 * Other / Total_Other, 1), NA),
# Format numbers and percentages
Respiratory = ifelse(is.na(Resp_prop),
paste0(comma(`Respiratory vaccines`)),
paste0(comma(`Respiratory vaccines`), " (", Resp_prop, ")")),
Childhood = ifelse(is.na(Child_prop),
paste0(comma(`Childhood vaccines`)),
paste0(comma(`Childhood vaccines`), " (", Child_prop, ")")),
Other_grp = ifelse(is.na(Other_prop),
paste0(comma(Other)),
paste0(comma(Other), " (", Other_prop, ")"))
) %>%
ungroup()
# --- Step 6: Clean labels and fix formatting for final table ---
tab_combined <- tab_combined %>%
mutate(
Label = ifelse(Label == "Other", "Otherᶜ", Label),
Respiratory = ifelse(Label == "Unknown" & `Respiratory vaccines` == 0, "", Respiratory),
Childhood = ifelse(Label == "Unknown" & `Childhood vaccines` == 0, "", Childhood),
Other_grp = ifelse(Label == "Unknown" & Other == 0, "", Other_grp)
) %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# --- Step 7: Calculate group totals for header ---
totals <- table(vacc_1524$vacc_group3)
total_resp <- totals["Respiratory vaccines"]
total_child <- totals["Childhood vaccines"]
total_other <- totals["Other"]
# --- Step 8: Create styled summary table using flextable ---
table_vacc_summary <- flextable(tab_combined) %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c("", "",
paste0("Respiratory vaccines (N = ", format(total_resp, big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(total_child, big.mark = ","), ")"),
paste0("Other (N = ", format(total_other, big.mark = ","), ")")),
colwidths = c(1, 1, 1, 1, 1),
top = TRUE
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3),
top = TRUE
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border(part = "header", i = 1, border.top = fp_border(color = "black", width = 1)) %>%
border(part = "header", i = 3, border.bottom = fp_border(color = "black", width = 1)) %>%
border_inner_h(part = "header", border = fp_border(color = "black", width = 0.8)) %>%
border_inner_h(part = "body", border = fp_border(color = "black", width = 0.8)) %>%
border(part = "body", i = nrow(tab_combined), border.bottom = fp_border(color = "black", width = 1)) %>%
set_caption(as_paragraph(as_chunk(
"Table xx: Vaccination uptake by demographic, temporal, and race variables, 2015–2024.",
props = fp_text(font.size = 12, font.family = "Times New Roman", bold = FALSE)
))) %>%
add_footer_lines(values = c(
"ᵃ Vaccination proportions were calculated per row (row percentage) except for race group which used column percentages.",
"ᵇ Marital status was grouped into two: partnered for those married and non-partnered for those divorced, widowed or single.",
"ᶜ Other race refers to those who identify as native or other racial groups."
)) %>%
fontsize(part = "header", size = 12) %>%
fontsize(part = "body", size = 11) %>%
fontsize(part = "footer", size = 10) %>%
font(part = "all", fontname = "Times New Roman") %>%
color(part = "header", color = "white") %>%
bg(part = "header", bg = "black") %>%
bg(j = 1:5, i = seq(2, nrow(tab_combined), 2), bg = "white") %>%
bold(j = 1, part = "body")
# --- Step 9: Display table ---
table_vacc_summary
# --- Step 10: Clean up temporary objects (optional) ---
rm(
tab_gender, tab_age, tab_marital, tab_year, tab_quarter, tab_race,
tab_combined, create_tab, respiratory_vaccines,
totals, total_resp, total_child, total_other
)
```
## **13. Multiple Response Tables – Rowwise Percentage**
Here we simulate **multiple-response tables** by combining summary outputs from different variables side-by-side.
📚 *What’s useful here?*
- Shows how people responded with combinations of vaccine types (e.g., Both, Influenza, or None), where both here refers vaccination of influenza and other vaccine types.
- 🙌 *Applies to*: Any analysis involving response overlaps (e.g., co-morbidity, multiple treatments, combined exposures).
```{r}
# ------------ 10. Mutiple Response Tables: via appending: Rowwise -------------------------------------------
# .................. Create flextable summary by variable ..................
process_variable <- function(data, group_var, group_label) {
# Base pattern: Both and No
df_base <- data %>%
group_by(across(all_of(group_var)), vacc_pattern) %>%
summarise(count = n(), .groups = "drop") %>%
group_by(across(all_of(group_var))) %>%
mutate(total = sum(count),
proportion = round((count / total) * 100, 1)) %>%
ungroup() %>%
filter(vacc_pattern %in% c("Both", "No")) %>%
rename(value = !!sym(group_var)) %>%
mutate(value = as.character(value))
# Influenza
df_flu <- data %>%
filter(!is.na(!!sym(group_var))) %>%
mutate(flu_yes = recode(flu_yes, "Yes" = "Influenza")) %>%
group_by(across(all_of(group_var)), flu_yes) %>%
summarise(count = n(), .groups = "drop") %>%
group_by(across(all_of(group_var))) %>%
mutate(total = sum(count),
proportion = round((count / total) * 100, 1)) %>%
ungroup() %>%
filter(flu_yes == "Influenza") %>%
rename(vacc_pattern = flu_yes, value = !!sym(group_var)) %>%
mutate(value = as.character(value))
df_combined <- bind_rows(df_base, df_flu) %>%
mutate(vacc_pattern = factor(vacc_pattern, levels = c("Both", "Influenza", "No")))
df_wide <- df_combined %>%
select(value, vacc_pattern, count, proportion) %>%
pivot_wider(
names_from = vacc_pattern,
values_from = c(count, proportion),
values_fn = list(count = sum, proportion = sum),
values_fill = list(count = 0, proportion = 0)
) %>%
mutate(
Both = paste0(comma(`count_Both`), " (", `proportion_Both`, ")"),
`Influenza only` = paste0(comma(`count_Influenza`), " (", `proportion_Influenza`, ")"),
No = paste0(comma(`count_No`), " (", `proportion_No`, ")"),
Variable = group_label
) %>%
select(value, Both, `Influenza only`, No, Variable)
return(df_wide)
}
# Run for selected grouping variables
df_year <- process_variable(vacc_1524, "vacc_service_year", "Service year")
df_age <- process_variable(vacc_1524, "age_group", "Age group")
# Combine and format for flextable
df_combined_long <- bind_rows(df_year, df_age) %>%
rename(Value = value) %>%
select(Variable, Value, Both, `Influenza only`, No) %>%
group_by(Variable) %>%
mutate(Variable = ifelse(row_number() == 1, Variable, "")) %>%
ungroup()
# Create flextable
final_table <- flextable(df_combined_long) %>%
set_header_labels(
Variable = "Variable",
Value = "Value",
Both = "Both (%)",
`Influenza only` = "Influenza (%)",
No = "No (%)"
) %>%
add_header_row(
values = c(" ", " ", "Vaccination patternᵃ"),
colwidths = c(1, 1, 3)
) %>%
align(align = "justify", part = "all") %>%
bold(part = "header") %>%
bold(i = 1, part = "body", j = "Variable") %>%
border_remove() %>%
border_inner_h(border = fp_border(color = "black", width = 1)) %>%
add_header_lines(values = "Table xx: Vaccination uptake pattern, 2015–2024.") %>%
add_footer_lines(values = c(
"ᵃ The pattern reflects those with influenza vaccination history, influenza, or both."
))
# Display table
final_table
# .................. Clean up ..................
rm(df_year, df_age, df_combined_long, process_variable, final_table)
```
## **14. Multiple Response Tables – Columnwise Percentages**
Similar to the rowwise example, this version presents the **column percentages** of multiple responses.
🧾 *Key advantage*: Allows side-by-side comparison of proportions within each vaccine group across multiple demographics.
💼 *Common in*: Survey analysis, patient preference studies, behavioral research.
```{r}
# ------------ 11. Mutiple Response Tables: via appending: Columnwise -------------------------------------------
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function (column-wise %) ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate column-wise percentages within each Category group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(
Total_Resp = sum(`Respiratory vaccines`[Label != "Missing"], na.rm = TRUE),
Total_Child = sum(`Childhood vaccines`[Label != "Missing"], na.rm = TRUE),
Total_Other = sum(Other[Label != "Missing"], na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total_Resp, 1), "%)")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total_Child, 1), "%)")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total_Other, 1), "%)"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Get total counts per vaccine group ..................
totals <- vacc_1524 %>%
count(vacc_group3) %>%
deframe()
# .................. Create flextable ..................
ft_vacc_summary <- tab_combined %>%
flextable() %>%
set_header_labels(
Category = "Variable type",
Label = "Values",
Respiratory = "Number (%)ᵃ",
Childhood = "Number (%)ᵃ",
Other_grp = "Number (%)ᵃ"
) %>%
add_header_row(
values = c(
"", "",
paste0("Respiratory vaccines (N = ", format(totals["Respiratory vaccines"], big.mark = ","), ")"),
paste0("Childhood vaccines (N = ", format(totals["Childhood vaccines"], big.mark = ","), ")"),
paste0("Other (N = ", format(totals["Other"], big.mark = ","), ")")
)
) %>%
add_header_row(
values = c("Description", "Vaccination uptake"),
colwidths = c(2, 3)
) %>%
theme_box() %>%
set_table_properties(layout = "autofit", width = 1) %>%
width(j = 1, width = 1.5) %>%
width(j = 2, width = 2.5) %>%
width(j = 3:5, width = 1.5) %>%
align(align = "justify", part = "all") %>%
border_remove() %>%
border_inner_h(border = fp_border(color = "black", width = 1)) %>%
set_caption("Table: Vaccination uptake (column percentage) across service year, quarter, and weekday") %>%
footnote(i = 1,
value = as_paragraph("Vaccination proportions were calculated column-wise (i.e., within each vaccine group)."),
ref_symbols = "ᵃ",
part = "header")
# .................. Show the table ..................
ft_vacc_summary
# .................. Clean up environment ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, ft_vacc_summary, totals, create_tab)
```
## **15. Interactive and Print-Ready Table**
Finally, we build an **interactive table using the DT package** with export options (CSV, Excel, Print).
🖱️ *Why use this*:
- Offers flexibility for exploratory review or end-user interaction.
- Enables printing or downloading filtered subsets.
🌍 *Perfect for*: Dashboards, websites, or when publishing to HTML via Quarto/GitHub.
```{r, echo=FALSE}
# Load required libraries
library(dplyr)
library(janitor)
library(tibble)
library(scales)
library(DT)
library(htmltools)
```
```{r}
# ------------ 12. Interactive and Print ready: Table -------------------------------------------
# .................. Convert grouping vars to character ..................
vacc_1524 <- vacc_1524 %>%
mutate(across(c(age_group, gender, marital_status,
vacc_service_year, vacc_quarter, vacc_weekday), as.character))
# .................. Table generation function with Missing ..................
create_tab <- function(data, var, label) {
tab_main <- data %>%
filter(!is.na({{ var }})) %>%
tabyl({{ var }}, vacc_group3) %>%
mutate(Category = label, .before = 1) %>%
rename(Label = {{ var }}) %>%
mutate(Label = as.character(Label))
na_count <- sum(is.na(pull(data, {{ var }})))
if (na_count > 0) {
missing_row <- tibble(
Category = label,
Label = "Missing",
`Respiratory vaccines` = sum(data$vacc_group3 == "Respiratory vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
`Childhood vaccines` = sum(data$vacc_group3 == "Childhood vaccines" & is.na(pull(data, {{ var }})), na.rm = TRUE),
Other = sum(data$vacc_group3 == "Other" & is.na(pull(data, {{ var }})), na.rm = TRUE)
)
tab_final <- bind_rows(tab_main, missing_row)
} else {
tab_final <- tab_main
}
return(tab_final)
}
# .................. Create summary tables ..................
tab_age <- create_tab(vacc_1524, age_group, "Age")
tab_sex <- create_tab(vacc_1524, gender, "Gender")
tab_marital_status <- create_tab(vacc_1524, marital_status, "Marital status")
tab_year <- create_tab(vacc_1524, vacc_service_year, "Year")
tab_quarter <- create_tab(vacc_1524, vacc_quarter, "Quarter")
tab_weekday <- create_tab(vacc_1524, vacc_weekday, "Weekday")
# .................. Combine all tables ..................
tab_combined <- bind_rows(tab_sex, tab_age, tab_marital_status, tab_year, tab_quarter, tab_weekday)
# .................. Calculate row-wise percentages (skip for Missing) ..................
tab_combined <- tab_combined %>%
rowwise() %>%
mutate(
Total = sum(`Respiratory vaccines`, `Childhood vaccines`, Other, na.rm = TRUE),
is_missing = trimws(tolower(Label)) == "missing",
Respiratory = ifelse(is_missing,
comma(`Respiratory vaccines`),
paste0(comma(`Respiratory vaccines`), " (", round(100 * `Respiratory vaccines` / Total, 1), ")")),
Childhood = ifelse(is_missing,
comma(`Childhood vaccines`),
paste0(comma(`Childhood vaccines`), " (", round(100 * `Childhood vaccines` / Total, 1), ")")),
Other_grp = ifelse(is_missing,
comma(Other),
paste0(comma(Other), " (", round(100 * Other / Total, 1), ")"))
) %>%
ungroup() %>%
select(Category, Label, Respiratory, Childhood, Other_grp)
# .................. Show Category only once per group ..................
tab_combined <- tab_combined %>%
group_by(Category) %>%
mutate(Category = ifelse(row_number() == 1, Category, "")) %>%
ungroup()
# .................. Create interactive table with DT ..................
datatable(
tab_combined,
rownames = FALSE,
extensions = 'Buttons',
options = list(
dom = 'Bfrtip',
buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
pageLength = 25,
autoWidth = TRUE,
columnDefs = list(list(width = '200px', targets = c(0, 1)))
),
caption = htmltools::tags$caption(
style = 'caption-side: top; font-size: 18px; font-family: "Times New Roman"; font-weight: bold;',
'Table: Vaccination uptake across service year, quarter, and weekday'
),
class = 'stripe hover cell-border'
)
# .................. Clean up all new objects ..................
rm(tab_age, tab_sex, tab_marital_status, tab_year, tab_quarter, tab_weekday,
tab_combined, create_tab)
```
# Brief summary
Automating table generation is crucial for improving reproducibility, consistency, and efficiency in health data reporting. It reduces the need for repetitive formatting and enables streamlined integration with reporting tools such as Quarto and GitHub. This document demonstrates various approaches to creating publication-ready tables using R.
### Row-Wise Tables
This method calculates percentages across rows, meaning each row adds up to 100%. Ideal for summarizing characteristics such as age, gender, or quarter.
### Column-Wise Tables
This variation calculates percentages within each vaccine group (i.e., by column). Each column totals 100%.
### Handling Missing Data
Tables optionally include rows labeled "Missing" to account for unreported values.
### Table Styles, Fonts and Border Options
Apply different styling configurations:
- **No lines** – minimal look for Word docs
- **Horizontal lines only** – readable and clean for print
- **All lines visible** – structured look for Excel-style documents
- The goal is to adapt the visual format to your output medium.
- Font selection (e.g., Times New Roman)
- Font sizes (header, body, footer)
- Color-coding headers and alternating rows
- Setting captions and custom footnotes
<div>
## Interactive Tables
</div>
Interactive tables are created using `DT::datatable()` for HTML outputs. Key features:
- Searchable and filterable
- Buttons for export (CSV, Excel, PDF)
- Caption styling using `htmltools`
```
## Summary Table: Style and Purpose
```
| Style | Use Case | Best For |
|---------------|---------------------|---------------------------|
| Row-wise % | Compare categories | Epidemiological summaries |
| Column-wise % | Within group totals | Vaccine distribution |
| Missing shown | Transparency | Data audits |
| No borders | Word reports | Clean look |
| Full borders | Technical | Detailed analysis |
| Interactive | HTML export | User engagement |
# Key Takeaways
- Automated table generation is efficient, reproducible, and customizable.
- `flextable`, `janitor`, and `dplyr` form a powerful combination for health data summarization.
- Style your tables based on your audience—minimal for reports, interactive for dashboards.
- Table Automation reduces manual error and allows rapid updates to underlying data.
- You can extend this workflow to support additional data types, link it to Shiny dashboards, or convert it for LaTeX outputs.
*This project is hosted on GitHub and supports reproducible outcomes using Quarto.*