It would be useful to be able to subset the collections, at a minimum before running a compute. That would mean applying the same subset to both the data and the metadata, so the two stay consistent.
Also consider whether this utility would fit better in the microbiomedb package than here.
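
To make "coherently" concrete, here is a rough sketch of the behaviour being asked for. It is written in plain Python purely for illustration and is not this package's API: the `Collection` class, the `subset_collection` function, and the sample data are all hypothetical. It assumes both the data and the metadata are keyed by sample ID.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical container (not the package's real class): both tables are
# keyed by sample ID so that one set of IDs can drive the subset.
@dataclass(frozen=True)
class Collection:
    data: Dict[str, Dict[str, float]]    # sample ID -> feature -> value
    metadata: Dict[str, Dict[str, str]]  # sample ID -> variable -> value


def subset_collection(
    collection: Collection,
    keep: Callable[[Dict[str, str]], bool],
) -> Collection:
    """Return a new Collection holding only samples whose metadata passes `keep`.

    The same list of sample IDs is applied to both tables, so data and
    metadata cannot drift apart. The input collection is not modified.
    """
    kept_ids = [
        sample_id
        for sample_id, meta in collection.metadata.items()
        # Skip samples that have metadata but no data row.
        if keep(meta) and sample_id in collection.data
    ]
    return Collection(
        data={sid: collection.data[sid] for sid in kept_ids},
        metadata={sid: collection.metadata[sid] for sid in kept_ids},
    )


# Made-up example values, for illustration only.
example = Collection(
    data={
        "s1": {"taxonA": 0.2, "taxonB": 0.8},
        "s2": {"taxonA": 0.5, "taxonB": 0.5},
        "s3": {"taxonA": 0.9, "taxonB": 0.1},
    },
    metadata={
        "s1": {"body_site": "gut"},
        "s2": {"body_site": "skin"},
        "s3": {"body_site": "gut"},
    },
)

# Subset first, then hand `gut_only` to whatever compute step comes next.
gut_only = subset_collection(example, lambda meta: meta.get("body_site") == "gut")
```

Returning a new object rather than filtering in place keeps the original collection available for other computes. Whether that is the right trade-off here is an open design question.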