This function provides an alternative to phyloseq::merge_samples() that better handles sample variables of different types, especially categorical sample variables. It combines the samples in x defined by the sample variable or factor group by summing the abundances in otu_table(x) and combines sample variables by the summary functions in funs. The default summary function, unique_or_na(), collapses the values within a group to a single unique value if it exists and otherwise returns NA. The new (merged) samples are named by the values in group.
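The merging rules described above can be sketched in base R on toy data. This is only an illustration of the idea, not the package's implementation: the data are invented, `unique_or_na_sketch()` is a hypothetical stand-in for the documented default, and samples are assumed to be in rows.

```r
# Toy data: 4 samples (rows) by 2 taxa (columns); all values are made up.
counts <- matrix(
  c(1, 2, 3, 4,
    10, 20, 30, 40),
  nrow = 4,
  dimnames = list(c("s1", "s2", "s3", "s4"), c("taxon1", "taxon2"))
)
group <- c("g1", "g1", "g2", "g2")
status <- c("healthy", "healthy", "obese", "lean")

# Sum abundances within each group (base R rowsum(), used here for illustration).
merged_counts <- rowsum(counts, group)

# Hypothetical stand-in for the documented unique-or-NA behavior:
# return the single unique value if there is one, otherwise a typed NA.
unique_or_na_sketch <- function(x) {
  ux <- unique(x)
  if (length(ux) == 1) ux else x[NA_integer_]
}

# Collapse the categorical variable within each group.
merged_status <- vapply(split(status, group), unique_or_na_sketch, character(1))

print(merged_counts)
print(merged_status)
```

In this sketch, group g1 keeps its shared status while g2, whose samples disagree, is collapsed to NA.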

Usage

merge_samples2(x, group, fun_otu = sum, funs = list(), reorder = FALSE)

# S4 method for phyloseq
merge_samples2(x, group, fun_otu = sum, funs = list(), reorder = FALSE)

# S4 method for otu_table
merge_samples2(x, group, fun_otu = sum, reorder = FALSE)

# S4 method for sample_data
merge_samples2(x, group, funs = list(), reorder = FALSE)

Arguments

x

A phyloseq, otu_table, or sample_data object

group

The name of a sample variable, or a vector of length nsamples(x), defining the sample grouping. A vector must be supplied if x is an otu_table.

fun_otu

Function for combining abundances in the OTU table; default is sum. May also be a formula, which is converted to a function by purrr::as_mapper().

funs

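Named list of merge functions for sample variables, with names matching the sample variables they apply to. Variables without an entry are merged with the default, unique_or_na().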

reorder

Logical specifying whether to reorder the new (merged) samples by name
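As a small illustration of the formula form mentioned for fun_otu, the sketch below shows how purrr::as_mapper() turns a formula into an ordinary function. It assumes the purrr package is installed; the name `mean_fun` and the toy vectors are hypothetical, and the grouping step uses base R tapply() rather than anything from this package.

```r
library(purrr)

# A one-sided formula becomes a function of .x.
mean_fun <- as_mapper(~ mean(.x, na.rm = TRUE))

# The result can be called like any other function.
overall <- mean_fun(c(1, 3, NA))

# Applying it within groups, here with base R tapply() on made-up values.
abund <- c(1, 3, NA, 10)
group <- c("a", "a", "b", "b")
by_group <- tapply(abund, group, mean_fun)

print(overall)
print(by_group)
```

A formula of this kind is convenient when the summary needs an extra argument such as na.rm = TRUE.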

Examples

data(enterotype)

# Merge samples with the same project and clinical status
ps <- enterotype
sample_data(ps) <- sample_data(ps) %>%
  transform(Project.ClinicalStatus = Project:ClinicalStatus)
sample_data(ps) %>% head
#> Sample Data: [ 6 samples by 10 sample variables ]:
#>             Enterotype Sample_ID SeqTech SampleID Project Nationality Gender
#> 1 AM.AD.1   NA         AM.AD.1   Sanger  AM.AD.1  gill06  american    F
#> 2 AM.AD.2   NA         AM.AD.2   Sanger  AM.AD.2  gill06  american    M
#> 3 AM.F10.T1 NA         AM.F10.T1 Sanger  AM.F10.… turnba… american    F
#> 4 AM.F10.T2 3          AM.F10.T2 Sanger  AM.F10.… turnba… american    F
#> 5 DA.AD.1   2          DA.AD.1   Sanger  DA.AD.1  MetaHIT danish      F
#> 6 DA.AD.1T  NA         DA.AD.1T  Sanger  NA       NA      NA          NA
#> # … with 3 more variables: Age <dbl>, ClinicalStatus <fct>,
#> #   Project.ClinicalStatus <fct>
ps0 <- merge_samples2(ps, "Project.ClinicalStatus",
  fun_otu = mean,
  funs = list(Age = mean)
)
sample_data(ps0) %>% head
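To make the effect of funs = list(Age = mean) concrete, here is a base R sketch on a toy sample table. It does not use the enterotype data or the package's code: the table, the grouping vector, and the helpers `default_fun()` and `merge_var()` are all hypothetical, and the fallback simply mimics the documented unique-or-NA behavior.

```r
# Toy sample table (made-up values, not the enterotype data).
sam <- data.frame(
  Age = c(20, 30, 40, 60),
  Gender = c("F", "F", "M", "F"),
  stringsAsFactors = FALSE
)
group <- c("a", "a", "b", "b")

# Hypothetical fallback: single unique value if present, otherwise a typed NA.
default_fun <- function(x) {
  ux <- unique(x)
  if (length(ux) == 1) ux else x[NA_integer_]
}

# Named list of per-variable merge functions, as in the example above.
funs <- list(Age = mean)

# Merge one variable: use its entry in funs if present, else the fallback.
merge_var <- function(name) {
  f <- if (is.null(funs[[name]])) default_fun else funs[[name]]
  unlist(lapply(split(sam[[name]], group), f))
}

merged <- lapply(setNames(names(sam), names(sam)), merge_var)
print(merged)
```

In this sketch Age is averaged within each group, while Gender falls back to the unique-or-NA rule, so the group with mixed values becomes NA.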