Illustration of a spurious differential abundance result due to bias

bias sensitivity R ref:leopold2020host differential abundance

Create illustration of a spurious differential abundance using the Leopold and Busby (2020) dataset.

Michael R. McLaren true
2021-04-26

Setup

# Tools for microbiome data
library(speedyseq)
# Tools for general purpose data manipulation and plotting
library(tidyverse)
# ggplot helpers
library(ggbeeswarm)
library(cowplot)
library(patchwork)
theme_set(theme_cowplot())
# stats helpers
library(broom)

library(metacal); packageVersion("metacal")
[1] '0.2.0.9001'
ps.mock.pseudo <- ps.mock %>%
  transform_sample_counts(function(x) x + 1)
mc_fit <- estimate_bias(ps.mock.pseudo, mock_actual, boot = TRUE) %>% print
A metacal bias fit.

Estimated relative efficiencies:
   Melampsora     Dioszegia     Epicoccum      Fusarium   Penicillium 
    8.7486819     0.2947290     1.0064648     3.0660596     0.2233792 
 Cladosporium   Trichoderma    Alternaria Aureobasidium 
    0.8988898     0.5776648     1.4679841     0.7380894 

Contains 1000 bootstrap replicates.
bias <- coef(mc_fit) %>% print
   Melampsora     Dioszegia     Epicoccum      Fusarium   Penicillium 
    8.7486819     0.2947290     1.0064648     3.0660596     0.2233792 
 Cladosporium   Trichoderma    Alternaria Aureobasidium 
    0.8988898     0.5776648     1.4679841     0.7380894 
mc_fit.summary <- summary(mc_fit)
print(mc_fit.summary)
Summary of a metacal bias fit.

Estimated relative efficiencies:
# A tibble: 9 × 4
  taxon         estimate gm_mean gm_se
  <chr>            <dbl>   <dbl> <dbl>
1 Melampsora       8.75    8.77   1.06
2 Dioszegia        0.295   0.294  1.10
3 Epicoccum        1.01    1.01   1.50
4 Fusarium         3.07    3.06   1.14
5 Penicillium      0.223   0.223  1.09
6 Cladosporium     0.899   0.898  1.13
7 Trichoderma      0.578   0.578  1.09
8 Alternaria       1.47    1.47   1.08
9 Aureobasidium    0.738   0.736  1.12

Geometric standard error estimated from 1000 bootstrap replicates.
coef_tb <- mc_fit.summary$coefficients
coef_tb %>%
  mutate(taxon = fct_reorder(taxon, estimate)) %>%
  ggplot(aes(taxon, estimate, 
      ymin = estimate / gm_se^2, ymax = estimate * gm_se^2)) +
  geom_hline(yintercept = 1, color = "grey") +
  geom_pointrange() +
  scale_y_log10() +
  coord_flip()

ps.pseudo <- transform_sample_counts(ps, function(x) x + 1)
ps.pseudo.cal <- calibrate(ps.pseudo, bias) %>% print
phyloseq-class experiment-level object
otu_table()   OTU Table:          [ 9 taxa and 567 samples ]:
sample_data() Sample Data:        [ 567 samples by 16 sample variables ]:
tax_table()   Taxonomy Table:     [ 9 taxa by 7 taxonomic ranks ]:
refseq()      DNAStringSet:       [ 9 reference sequences ]
taxa are columns

Sample mean efficiency across samples

To estimate the SMRE with formula sum_i A_i E_i, use the calibrated abundances.

ps0 <- ps.pseudo.cal %>%
  prune_taxa(names(bias), .)
otu <- ps0 %>% otu_table %>% orient_taxa(as = "cols") %>%
  transform_sample_counts(close_elts)
sam0 <- sample_data(ps0) %>% 
  as("data.frame") %>% 
  as_tibble(rownames = "Sample")
sme <- otu %>% psmelt %>%
  left_join(bias %>% enframe("OTU", "Efficiency"), by = "OTU") %>%
  group_by(Sample) %>%
  summarize(SMRE = sum(Efficiency * Abundance)) %>%
  left_join(sam0, by = "Sample") %>%
  mutate(
    across(Timepoint, as.factor),
    across(Timepoint, fct_explicit_na, "Other"),
  )

Regression analysis

First let’s pick a subset of samples to work with.

sam %>%
  count(Samp_type, Timepoint)
# A tibble: 4 × 3
  Samp_type  Timepoint     n
  <chr>      <fct>     <int>
1 Experiment 1           250
2 Experiment 2           298
3 Mock       <NA>         10
4 Single     <NA>          9
sam %>%
  filter(Samp_type == "Experiment") %>%
  count(Treatment, Timepoint)
# A tibble: 12 × 3
   Treatment     Timepoint     n
   <chr>         <fct>     <int>
 1 Alternaria    1            52
 2 Alternaria    2            53
 3 Aureobasidium 1            50
 4 Aureobasidium 2            48
 5 Cladosporium  1            49
 6 Cladosporium  2            48
 7 Dioszegia     1            48
 8 Dioszegia     2            49
 9 Fusarium      1            49
10 Fusarium      2            50
11 Negative      1             2
12 Negative      2            50
sam %>%
  filter(Samp_type == "Experiment", Timepoint == 2) %>%
  count(Region, Treatment)
# A tibble: 12 × 3
   Region Treatment         n
   <chr>  <chr>         <int>
 1 East   Alternaria       22
 2 East   Aureobasidium    21
 3 East   Cladosporium     19
 4 East   Dioszegia        18
 5 East   Fusarium         21
 6 East   Negative         21
 7 West   Alternaria       31
 8 West   Aureobasidium    27
 9 West   Cladosporium     29
10 West   Dioszegia        31
11 West   Fusarium         29
12 West   Negative         29
sam %>%
  filter(Samp_type == "Experiment", Treatment != "Negative") %>%
  count(Region, Timepoint)
# A tibble: 4 × 3
  Region Timepoint     n
  <chr>  <fct>     <int>
1 East   1           102
2 East   2           101
3 West   1           146
4 West   2           147
sam %>%
  filter(Samp_type == "Experiment", Treatment != "Negative") %>%
  with_groups(Genotype, summarize, n1 = sum(Timepoint == 1), n2 = sum(Timepoint == 2), tot = n())
# A tibble: 12 × 4
   Genotype    n1    n2   tot
   <chr>    <int> <int> <int>
 1 East-1      20    20    40
 2 East-2      16    17    33
 3 East-3      19    18    37
 4 East-4      25    25    50
 5 East-5      22    21    43
 6 West-1      22    21    43
 7 West-2      22    24    46
 8 West-3      23    23    46
 9 West-4      23    23    46
10 West-5      18    17    35
11 West-6      20    19    39
12 West-7      18    20    38

The “Negative” treatment was not inoculated with commensals; let’s drop that from the analysis. Note that there are 2 more West genotypes adn thus more West than East samples.

Time point

Asks, how did taxa vary in proportion with the addition of Mel? Note, we expect the commensal taxa to typically decrease given that Mel successfully infects most plants.

Want to do linear regression on the timepoint.

starting point - observed and calibrated proportions in the experiment samples; SME in the experiment samples

ps1.obs <- ps.pseudo %>%
  subset_samples(Samp_type == "Experiment" & Treatment != "Negative") %>%
  prune_taxa(mock_taxa, .) %>%
  orient_taxa(as = "cols") %>%
  transform_sample_counts(close_elts)
ps1.cal <- ps1.obs %>% calibrate(bias)
sam1 <- sample_data(ps1.obs) %>% as("data.frame")
sme1 <- sme %>% filter(Sample %in% sample_names(ps1.obs)) %>% rename(Mean_efficiency = SMRE)

Note the different numbers of samples at each timepoint Should consider more sophisticated resgression models.

fit.obs <- lm(log2(otu_table(ps1.obs)) ~ Timepoint, data = sam1)
fit.cal <- lm(log2(otu_table(ps1.cal)) ~ Timepoint, data = sam1)
fit.sme <- lm(log2(Mean_efficiency) ~ Timepoint, data = sme1)

Check that the observed changes - the calibrated changes are off by the expected constant shift,

t(coef(fit.obs) - coef(fit.cal))
              (Intercept) Timepoint2
Melampsora     2.60208447  -1.991951
Aureobasidium -0.96511376  -1.991951
Trichoderma   -1.31867681  -1.991951
Fusarium       1.08940455  -1.991951
Penicillium   -2.68941450  -1.991951
Alternaria     0.02685515  -1.991951
Cladosporium  -0.68076501  -1.991951
Dioszegia     -2.28952043  -1.991951
Epicoccum     -0.51768451  -1.991951
coef(fit.sme)
(Intercept)  Timepoint2 
  0.5269812   1.9919505 
coef(fit.obs) %>% t
              (Intercept) Timepoint2
Melampsora    -10.9999195  10.702851
Aureobasidium  -7.9294352  -3.681906
Trichoderma    -9.4061811  -3.872543
Fusarium       -2.1926596  -5.089495
Penicillium   -10.8867241  -2.555602
Alternaria     -0.9593701  -4.074728
Cladosporium   -6.0294424  -3.585850
Dioszegia      -7.7191545  -5.463284
Epicoccum      -2.3532764  -4.745944
coef(fit.cal) %>% t
              (Intercept) Timepoint2
Melampsora    -13.6020039 12.6948011
Aureobasidium  -6.9643215 -1.6899556
Trichoderma    -8.0875043 -1.8805929
Fusarium       -3.2820642 -3.0975448
Penicillium    -8.1973096 -0.5636511
Alternaria     -0.9862252 -2.0827775
Cladosporium   -5.3486773 -1.5938999
Dioszegia      -5.4296340 -3.4713339
Epicoccum      -1.8355919 -2.7539932

The effect of bias is significant; e.g. Aureobasidium and Trichoderma appear to decrease in proportion by 16X but actually decrease by 4X, and Penicillium appears to decrease by 5.9X but only decreases by 1.5X.

Could use the case of Penicillium as our example.

# Check expected relationship between coefficients on intercept and slope
all.equal(
  coef(fit.obs)[,"Penicillium"],
  coef(fit.cal)[,"Penicillium"] + c(log2(bias["Penicillium"]), 0) - coef(fit.sme)
)
[1] TRUE
tb <- bind_rows(
  Observed = psmelt(ps1.obs),
  Calibrated = psmelt(ps1.cal),
  .id = "Type"
) %>%
  mutate(across(Type, fct_relevel, "Calibrated")) %>%
  rename(Proportion = Abundance)
tb %>%
  filter(OTU == "Penicillium") %>%
  ggplot(aes(y = log2(Proportion), x = Timepoint, color = Type)) +
  geom_quasirandom(alpha = 0.4) +
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.8) +
  scale_color_brewer(type = "qual", palette = 6)

tb %>%
  filter(OTU == "Penicillium") %>%
  ggplot(aes(y = log2(Proportion), x = Timepoint)) +
  geom_quasirandom(alpha = 0.4) +
  stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.8, arrow = grid::arrow(length = unit(0.15, "inches"))
  ) +
  facet_wrap(~Type)

TODO: Add a panel with the sample mean efficiency

sme1 %>%
  ggplot(aes(y = log2(Mean_efficiency), x = Timepoint)) +
  geom_quasirandom(alpha = 0.4) +
  stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.8, arrow = grid::arrow(length = unit(0.15, "inches"))
  )

Question: For the purposes of illustration, might we want to find the host region or host genotype with the largest change in SME? Ok as long as we’re clear what we’re doing. Reducing the number of data points and biological variation might actually make things cleaner as well (or could use boxplots), since the point is not to dwell on all the variation, but rather just the average effect.

for paper, might be clearer to call “Corrected” instead of “Calibrated”.

Can easily split graph by region or host genotype:

tb %>%
  filter(OTU == "Penicillium") %>%
  ggplot(aes(y = log2(Proportion), x = Timepoint, color = Region)) +
  geom_quasirandom(alpha = 0.4) +
  stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.8, arrow = grid::arrow(length = unit(0.15, "inches"))
  ) +
  facet_wrap(~Type)

tb %>%
  filter(OTU == "Penicillium") %>%
  ggplot(aes(y = log2(Proportion), x = Timepoint)) +
  geom_quasirandom(alpha = 0.3) +
  #> stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_summary(fun.data = mean_cl_normal, fun.args = c(conf.int = 0.95), 
    geom = "pointrange") +
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.8, arrow = grid::arrow(length = unit(0.15, "inches"))
  ) +
  facet_grid(Region~Type)

Note that the 95% CIs are small and hard to see.

Want to plot the sme’s with the change in proportion; make sure the y-axes have the same scale. One way to simplify this is to show the mean efficiency relative to the taxon in question; then it will give the difference between calibrated and observed.

sme2 <- sme1 %>%
  mutate(Mean_relative_efficiency = Mean_efficiency / bias["Penicillium"]) %>%
  rename(Proportion = Mean_relative_efficiency) %>%
  mutate(Type = "Mean relative efficiency")
tb1 <- tb %>%
  filter(OTU == "Penicillium") %>%
  bind_rows(sme2) %>%
  mutate(across(Type, fct_relevel, c("Calibrated", "Observed", "Mean relative efficiency")))
lyrs <- list(
  geom_quasirandom(alpha = 0.3),
  #> stat_summary(fun = "mean", geom = "point", size = 3),
  stat_summary(fun.data = mean_cl_normal, fun.args = c(conf.int = 0.95), 
    geom = "pointrange"),
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.8, arrow = grid::arrow(length = unit(0.15, "inches"))
  )
)
tb1 %>%
  ggplot(aes(y = log2(Proportion), x = Timepoint)) +
  lyrs +
  facet_grid(Region~Type)

tb1 %>%
  mutate(across(Type, fct_reorder, Proportion)) %>%
  ggplot(aes(y = log2(Proportion), x = Timepoint, color = Type)) +
  geom_quasirandom(alpha = 0.3) +
  stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.8, arrow = grid::arrow(length = unit(0.15, "inches"))
  ) +
  scale_color_brewer(type = "qual", palette = 2) +
  guides(color = guide_legend(reverse = TRUE)) +
  facet_grid(.~Region) +
  geom_text(data = tibble(Region = "West"), x = 2.7, y = 5,
    color = "black", size = 4.5, hjust = 0, vjust = 1,
    label = "Pathogen growth causes\nmean efficiency to increase,",
  ) +
  geom_text(data = tibble(Region = "West"), x = 2.7, y = -10,
    color = "black", size = 4.5, hjust = 0, vjust = 1,
    label = "which leads to larger-than-\nactual decreases in the\nfocal taxon Penicillium.",
  ) +
  coord_cartesian(xlim = c(1, 2), clip = "off") +
  theme(
    legend.position = "top",
    plot.margin = margin(r = 2.5, unit = "in"),
  ) +
  plot_annotation(
    title = "Effect of bias on Penicillium differential abundance"
  )

Summary:

Alt. plot

tb2 <- tb1 %>%
  mutate(
    log2_proportion = log2(Proportion),
    across(Type, fct_relevel, c("Observed", "Calibrated", "Mean relative efficiency")),
    #> across(Type, fct_recode, Corrected = "Calibrated", Uncorrected = "Observed")
  )

Let’s show the calibrated, observed, and mean efficiency side by side, similar to my initial explanatory plot.

Question: How to get the same y-axis range in each facet? We need to determine what we want the min and max y values for each facet to be. Choose so that the range in each panel equals the max data range for any one panel, and center the data range of each panel within the plot range.

yr <- tb2 %>%
  with_groups(c(Region, Type), summarize, 
    across(log2_proportion, c(min = min, max = max), .names =  "data_{.fn}")
 ) %>%
  mutate(
    data_range = data_max - data_min,
    data_mid = (data_max + data_min) / 2,
    y_min = data_mid - max(data_range) / 2,
    y_max = data_mid + max(data_range) / 2
  ) %>%
  print
# A tibble: 6 × 8
  Region Type      data_min data_max data_range data_mid   y_min y_max
  <chr>  <fct>        <dbl>    <dbl>      <dbl>    <dbl>   <dbl> <dbl>
1 East   Observed    -14.9     -8.66       6.26   -11.8  -16.2   -7.41
2 East   Calibrat…   -12.2     -5.96       6.22    -9.07 -13.5   -4.69
3 East   Mean rel…     2.48     5.27       2.79     3.87  -0.510  8.26
4 West   Observed    -14.9     -6.16       8.77   -10.5  -14.9   -6.16
5 West   Calibrat…   -11.3     -3.58       7.76    -7.46 -11.8   -3.07
6 West   Mean rel…     2.47     5.26       2.79     3.87  -0.515  8.25
stopifnot(yr %>% {sd((.$y_max - .$y_min))} < 1e-14)
yr.long <- yr %>%
  pivot_longer(c(y_min, y_max))
tb2 %>%
  ggplot(aes(y = log2_proportion, x = Timepoint)) +
  geom_quasirandom(alpha = 0.3) +
  geom_blank(data = yr.long, aes(y = value, x = 1)) +
  stat_summary(fun = "mean", geom = "point", size = 3) +
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.9, arrow = grid::arrow(length = unit(0.15, "inches")), color = "red"
  ) +
  scale_color_brewer(type = "qual", palette = 2) +
  guides(color = guide_legend(reverse = TRUE)) +
  facet_wrap(Region~Type, scales = "free_y") +
  coord_cartesian(xlim = c(1, 2), clip = "off") +
  scale_y_continuous(breaks = seq(-20, 10, by = 2)) +
  labs(y = "log2(Proportion)")

text_tb <- yr %>%
  filter(Type %in% c("Calibrated", "Observed")) %>%
  mutate(
    text = case_when(
      Type == "Observed" ~ "=",
      Type == "Calibrated" ~ "–",
    ),
    y = y_min + (y_max - y_min) / 2
  )

#> Label "Calibrated" as "Actual" for purposes of illustration?
lblr <- function(labels) {
  labels %>% pull(1) %>% fct_recode(Actual = "Calibrated") %>% as.character %>% list
}

lyrs <- list(
  geom_quasirandom(alpha = 0.3),
  #> stat_summary(fun = "mean", geom = "point", size = 3),
  stat_summary(fun.data = "mean_cl_boot", geom = "pointrange"),
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    #> arrow = grid::arrow(length = unit(0.15, "inches")),
    size = 0.9, color = "red"
  ),
  scale_color_brewer(type = "qual", palette = 2),
  guides(color = guide_legend(reverse = TRUE)),
  facet_wrap(~Type, scales = "free_y", labeller = lblr),
  #> theme_minimal_hgrid(),
  scale_y_continuous(breaks = seq(-20, 10, by = 2)),
  theme(
    plot.title = element_text(face = "plain"),
    legend.position = "top",
    panel.spacing.x = unit(0.75, "in")
  ),
  labs(y = "log2(Proportion)"),
  # Corrected = Observed - Mean relative efficiency
  coord_cartesian(xlim = c(1, 2), clip = "off")
)
p1 <- tb2 %>%
  filter(Region == "East") %>%
  ggplot(aes(y = log2_proportion, x = Timepoint)) +
  lyrs +
  geom_blank(data = yr.long %>% filter(Region == "East"), aes(y = value, x = factor(1))) +
  labs(title = "Eastern genotypes: Bias causes sign error") +
  geom_text(
    data = text_tb %>% filter(Region == "East"),
    aes(y = y, label = text),
    x = 2.9,
    color = "black", size = 8, hjust = 0.5, vjust = 0.5,
  )
p2 <- tb2 %>%
  filter(Region == "West") %>%
  ggplot(aes(y = log2_proportion, x = Timepoint)) +
  lyrs +
  geom_blank(data = yr.long %>% filter(Region == "West"), aes(y = value, x = factor(1))) +
  labs(title = "Western genotypes: Bias causes magnitude error") +
  geom_text(
    data = text_tb %>% filter(Region == "West"),
    aes(y = y, label = text),
    x = 2.9,
    color = "black", size = 8, hjust = 0.5, vjust = 0.5,
  )
p1 / p2 + 
  plot_annotation(
    #> title = "Effect of bias on Penicillium differential abundance",
    tag_levels = "A"
  )

This ^ is my favorite so far

TODO:

alternate approach using 2-d facet wrap:

symbol_labels <- yr %>%
  filter(Type %in% c("Calibrated", "Observed")) %>%
  mutate(
    text = case_when(
      Type == "Observed" ~ "=",
      Type == "Calibrated" ~ "–",
      #> Type == "Mean relative efficiency" & Region == "East" ~ "Eastern\ngenotypes",
      #> Type == "Mean relative efficiency" & Region == "West" ~ "Western\ngenotypes",
    ),
    y = y_min + (y_max - y_min) / 2
  )
region_labels <- yr %>%
  filter(Type == "Mean relative efficiency") %>%
  mutate(
    text = case_when(
      Region == "East" ~ "Eastern genotypes\n\nBias causes\nsign error",
      Region == "West" ~ "Western genotypes\n\nBias causes\nmagnitude error",
    ),
    y = y_min + (y_max - y_min) / 2
  )

lyrs <- list(
  geom_quasirandom(alpha = 0.3),
  #> geom_blank(data = yr, aes(y = y_min, x = factor(1))),
  #> geom_blank(data = yr, aes(y = y_max, x = factor(1))),
  geom_blank(data = yr.long, aes(y = value, x = 1)),
  stat_summary(fun = "mean", geom = "point", size = 3),
  stat_smooth(aes(x = as.integer(Timepoint)), method = "lm", geom = "line",
    size = 0.9, arrow = grid::arrow(length = unit(0.15, "inches")), color = "red"
  ),
  scale_color_brewer(type = "qual", palette = 2),
  guides(color = guide_legend(reverse = TRUE)),
  #> facet_wrap(~Type, scales = "free_y"),
  #> theme_minimal_hgrid(),
  scale_y_continuous(breaks = seq(-20, 10, by = 2)),
  labs(y = "log2(Proportion)"),
  # Calibrated = Observed - Mean relative efficiency
  geom_text(
    data = symbol_labels,
    aes(y = y, label = text),
    x = 2.9, color = "black", size = 8, hjust = 0.5, vjust = 0.5,
  ),
  geom_text(
    data = region_labels,
    aes(y = y, label = text),
    x = 2.8, color = "black", size = 4.8, hjust = 0, vjust = 0.5,
  ),
  coord_cartesian(xlim = c(1, 2), clip = "off"),
  theme(
    legend.position = "top",
    panel.spacing.x = unit(0.75, "in"),
    plot.margin = margin(r = 1.75, unit = "in")
  )
)

# labeller assuming first col is region, second is type
lblr <- function(labels) {
  labels %>% pull(2) %>% as.character %>% list
}

tb2 %>%
  ggplot(aes(y = log2_proportion, x = Timepoint)) +
  lyrs +
  facet_wrap(Region~Type, scales = "free_y", 
    labeller = lblr
    #> labeller = labeller(Region = NULL, .multi_line = FALSE),
    #> strip.position = "right"
  )

Session info

Click for session info
sessioninfo::session_info()
─ Session info ──────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.1.0 (2021-05-18)
 os       Arch Linux                  
 system   x86_64, linux-gnu           
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/New_York            
 date     2021-07-28                  

─ Packages ──────────────────────────────────────────────────────────────────────
 package          * version    date       lib source                           
 ade4               1.7-17     2021-06-17 [1] CRAN (R 4.1.0)                   
 ape                5.5        2021-04-25 [1] CRAN (R 4.1.0)                   
 assertthat         0.2.1      2019-03-21 [1] CRAN (R 4.0.0)                   
 backports          1.2.1      2020-12-09 [1] CRAN (R 4.0.3)                   
 base64enc          0.1-3      2015-07-28 [1] CRAN (R 4.0.0)                   
 beeswarm           0.4.0      2021-06-01 [1] CRAN (R 4.1.0)                   
 Biobase            2.52.0     2021-05-19 [1] Bioconductor                     
 BiocGenerics       0.38.0     2021-05-19 [1] Bioconductor                     
 biomformat         1.20.0     2021-05-19 [1] Bioconductor                     
 Biostrings         2.60.0     2021-05-19 [1] Bioconductor                     
 bitops             1.0-7      2021-04-24 [1] CRAN (R 4.1.0)                   
 broom            * 0.7.8      2021-06-24 [1] CRAN (R 4.1.0)                   
 bslib              0.2.5.1    2021-05-18 [1] CRAN (R 4.1.0)                   
 cellranger         1.1.0      2016-07-27 [1] CRAN (R 4.0.0)                   
 checkmate          2.0.0      2020-02-06 [1] CRAN (R 4.0.2)                   
 cli                3.0.1      2021-07-17 [1] CRAN (R 4.1.0)                   
 cluster            2.1.2      2021-04-17 [2] CRAN (R 4.1.0)                   
 codetools          0.2-18     2020-11-04 [2] CRAN (R 4.1.0)                   
 colorspace         2.0-2      2021-06-24 [1] CRAN (R 4.1.0)                   
 cowplot          * 1.1.1      2020-12-30 [1] CRAN (R 4.0.4)                   
 crayon             1.4.1      2021-02-08 [1] CRAN (R 4.0.4)                   
 data.table         1.14.0     2021-02-21 [1] CRAN (R 4.0.4)                   
 DBI                1.1.1      2021-01-15 [1] CRAN (R 4.0.4)                   
 dbplyr             2.1.1      2021-04-06 [1] CRAN (R 4.0.5)                   
 digest             0.6.27     2020-10-24 [1] CRAN (R 4.0.3)                   
 distill            1.2        2021-01-13 [1] CRAN (R 4.1.0)                   
 downlit            0.2.1      2020-11-04 [1] CRAN (R 4.0.3)                   
 dplyr            * 1.0.7      2021-06-18 [1] CRAN (R 4.1.0)                   
 ellipsis           0.3.2      2021-04-29 [1] CRAN (R 4.1.0)                   
 evaluate           0.14       2019-05-28 [1] CRAN (R 4.0.0)                   
 fansi              0.5.0      2021-05-25 [1] CRAN (R 4.1.0)                   
 farver             2.1.0      2021-02-28 [1] CRAN (R 4.0.4)                   
 forcats          * 0.5.1      2021-01-27 [1] CRAN (R 4.0.4)                   
 foreach            1.5.1      2020-10-15 [1] CRAN (R 4.0.3)                   
 foreign            0.8-81     2020-12-22 [2] CRAN (R 4.1.0)                   
 Formula            1.2-4      2020-10-16 [1] CRAN (R 4.0.3)                   
 fs                 1.5.0      2020-07-31 [1] CRAN (R 4.0.2)                   
 generics           0.1.0      2020-10-31 [1] CRAN (R 4.0.3)                   
 GenomeInfoDb       1.28.0     2021-05-19 [1] Bioconductor                     
 GenomeInfoDbData   1.2.6      2021-05-31 [1] Bioconductor                     
 ggbeeswarm       * 0.6.0      2017-08-07 [1] CRAN (R 4.0.0)                   
 ggplot2          * 3.3.5      2021-06-25 [1] CRAN (R 4.1.0)                   
 glue               1.4.2      2020-08-27 [1] CRAN (R 4.0.2)                   
 gridExtra          2.3        2017-09-09 [1] CRAN (R 4.0.2)                   
 gtable             0.3.0      2019-03-25 [1] CRAN (R 4.0.0)                   
 haven              2.4.1      2021-04-23 [1] CRAN (R 4.1.0)                   
 here               1.0.1      2020-12-13 [1] CRAN (R 4.0.5)                   
 highr              0.9        2021-04-16 [1] CRAN (R 4.1.0)                   
 Hmisc              4.5-0      2021-02-28 [1] CRAN (R 4.0.4)                   
 hms                1.1.0      2021-05-17 [1] CRAN (R 4.1.0)                   
 htmlTable          2.2.1      2021-05-18 [1] CRAN (R 4.1.0)                   
 htmltools          0.5.1.1    2021-01-22 [1] CRAN (R 4.0.3)                   
 htmlwidgets        1.5.3      2020-12-10 [1] CRAN (R 4.0.3)                   
 httr               1.4.2      2020-07-20 [1] CRAN (R 4.0.2)                   
 igraph             1.2.6      2020-10-06 [1] CRAN (R 4.0.3)                   
 IRanges            2.26.0     2021-05-19 [1] Bioconductor                     
 iterators          1.0.13     2020-10-15 [1] CRAN (R 4.0.3)                   
 jpeg               0.1-9      2021-07-24 [1] CRAN (R 4.1.0)                   
 jquerylib          0.1.4      2021-04-26 [1] CRAN (R 4.1.0)                   
 jsonlite           1.7.2      2020-12-09 [1] CRAN (R 4.0.3)                   
 knitr              1.33       2021-04-24 [1] CRAN (R 4.1.0)                   
 labeling           0.4.2      2020-10-20 [1] CRAN (R 4.0.3)                   
 lattice            0.20-44    2021-05-02 [2] CRAN (R 4.1.0)                   
 latticeExtra       0.6-29     2019-12-19 [1] CRAN (R 4.0.0)                   
 lifecycle          1.0.0      2021-02-15 [1] CRAN (R 4.0.4)                   
 lubridate          1.7.10     2021-02-26 [1] CRAN (R 4.0.4)                   
 magrittr           2.0.1      2020-11-17 [1] CRAN (R 4.0.3)                   
 MASS               7.3-54     2021-05-03 [2] CRAN (R 4.1.0)                   
 Matrix             1.3-3      2021-05-04 [2] CRAN (R 4.1.0)                   
 metacal          * 0.2.0.9001 2021-07-16 [1] Github (mikemc/metacal@a7a87a1)  
 mgcv               1.8-35     2021-04-18 [2] CRAN (R 4.1.0)                   
 modelr             0.1.8      2020-05-19 [1] CRAN (R 4.0.0)                   
 multtest           2.48.0     2021-05-19 [1] Bioconductor                     
 munsell            0.5.0      2018-06-12 [1] CRAN (R 4.0.0)                   
 nlme               3.1-152    2021-02-04 [2] CRAN (R 4.1.0)                   
 nnet               7.3-16     2021-05-03 [2] CRAN (R 4.1.0)                   
 nvimcom          * 0.9-102    2021-07-17 [1] local                            
 patchwork        * 1.1.1      2020-12-17 [1] CRAN (R 4.0.3)                   
 permute            0.9-5      2019-03-12 [1] CRAN (R 4.0.0)                   
 phyloseq         * 1.36.0     2021-05-19 [1] Bioconductor                     
 pillar             1.6.1      2021-05-16 [1] CRAN (R 4.1.0)                   
 pkgconfig          2.0.3      2019-09-22 [1] CRAN (R 4.0.0)                   
 plyr               1.8.6      2020-03-03 [1] CRAN (R 4.0.0)                   
 png                0.1-7      2013-12-03 [1] CRAN (R 4.0.0)                   
 purrr            * 0.3.4      2020-04-17 [1] CRAN (R 4.0.0)                   
 R6                 2.5.0      2020-10-28 [1] CRAN (R 4.0.3)                   
 RColorBrewer       1.1-2      2014-12-07 [1] CRAN (R 4.0.0)                   
 Rcpp               1.0.7      2021-07-07 [1] CRAN (R 4.1.0)                   
 RCurl              1.98-1.3   2021-03-16 [1] CRAN (R 4.0.5)                   
 readr            * 2.0.0      2021-07-20 [1] CRAN (R 4.1.0)                   
 readxl             1.3.1      2019-03-13 [1] CRAN (R 4.0.0)                   
 reprex             2.0.0      2021-04-02 [1] CRAN (R 4.0.5)                   
 reshape2           1.4.4      2020-04-09 [1] CRAN (R 4.0.0)                   
 rhdf5              2.36.0     2021-05-19 [1] Bioconductor                     
 rhdf5filters       1.4.0      2021-05-19 [1] Bioconductor                     
 Rhdf5lib           1.14.0     2021-05-19 [1] Bioconductor                     
 rlang              0.4.11     2021-04-30 [1] CRAN (R 4.1.0)                   
 rmarkdown        * 2.9        2021-06-15 [1] CRAN (R 4.1.0)                   
 rpart              4.1-15     2019-04-12 [2] CRAN (R 4.1.0)                   
 rprojroot          2.0.2      2020-11-15 [1] CRAN (R 4.0.3)                   
 rstudioapi         0.13       2020-11-12 [1] CRAN (R 4.0.3)                   
 rvest              1.0.0      2021-03-09 [1] CRAN (R 4.0.5)                   
 S4Vectors          0.30.0     2021-05-19 [1] Bioconductor                     
 sass               0.4.0      2021-05-12 [1] CRAN (R 4.1.0)                   
 scales             1.1.1      2020-05-11 [1] CRAN (R 4.0.0)                   
 sessioninfo        1.1.1      2018-11-05 [1] CRAN (R 4.0.0)                   
 speedyseq        * 0.5.3.9018 2021-06-29 [1] Github (mikemc/speedyseq@ceb941f)
 stringi            1.7.3      2021-07-16 [1] CRAN (R 4.1.0)                   
 stringr          * 1.4.0      2019-02-10 [1] CRAN (R 4.0.0)                   
 survival           3.2-11     2021-04-26 [2] CRAN (R 4.1.0)                   
 tibble           * 3.1.3      2021-07-23 [1] CRAN (R 4.1.0)                   
 tidyr            * 1.1.3      2021-03-03 [1] CRAN (R 4.0.4)                   
 tidyselect         1.1.1      2021-04-30 [1] CRAN (R 4.1.0)                   
 tidyverse        * 1.3.1      2021-04-15 [1] CRAN (R 4.1.0)                   
 tzdb               0.1.2      2021-07-20 [1] CRAN (R 4.1.0)                   
 useful             1.2.6      2018-10-08 [1] CRAN (R 4.0.0)                   
 utf8               1.2.2      2021-07-24 [1] CRAN (R 4.1.0)                   
 vctrs              0.3.8      2021-04-29 [1] CRAN (R 4.1.0)                   
 vegan              2.5-7      2020-11-28 [1] CRAN (R 4.0.3)                   
 vipor              0.4.5      2017-03-22 [1] CRAN (R 4.0.0)                   
 withr              2.4.2      2021-04-18 [1] CRAN (R 4.0.5)                   
 xfun               0.24       2021-06-15 [1] CRAN (R 4.1.0)                   
 xml2               1.3.2      2020-04-23 [1] CRAN (R 4.0.0)                   
 XVector            0.32.0     2021-05-19 [1] Bioconductor                     
 yaml               2.2.1      2020-02-01 [1] CRAN (R 4.0.0)                   
 zlibbioc           1.38.0     2021-05-19 [1] Bioconductor                     

[1] /home/michael/.local/lib/R/library
[2] /usr/lib/R/library