Friday, September 18, 2015

Multivariate Repeated Measurements with adonis():

Please check the updated code in the comment by Wallace Beiroz, 26 January 2015 at 13:37!

Lately I had to figure out how to do a repeated measures (or mixed effects) analysis on multivariate (species) data. Here I share code for a computation in R with the adonis function of the vegan package. Credit goes to Gavin Simpson providing most of the important pieces of the below code in R-Help.

The design:
We have multivariate species data, sampled at different sites (n = 6) at 3 points in time (N = 24).

The hypothesis:
H0: Species composition is the same across time. The permutation will produce the H0-population by a restricted randomization of time points.
H1: Species composition differs between time points

The analysis:
With repeated measurements one has to allow for the temporal structure and for the dependency within repeated measures. In a pedantic manner one would account for temporal structure by permuting all time-points in the same way also keeping the ordering of time-points. With 3 time-points this would give you only 3 permutations!

But we may relax this very restrictive assumption and say it doesn't matter that all samples at t1 were taken at one time and the ones of t2, etc. at another. Still there would be the repeated measures and the fact that the samples were taken in a temporal sequence. So how to deal with this?

(1) Having in mind that repeated measures on one sample are not independent, we will not permute across these samples.

(2) Now to the temporal correlation within repeated samples: The ordering of time-points is another type of dependency: the ordering is 1, 2, 3 - we will change this ordering only to 3, 1, 2 or 2, 3, 1, This permutation is referred to as toroidal shift which is keeping the "neighborhood" of values intact.

Eventually, we have 3 permutations per site, with 6 sites - this yields 3^6 = 729 permutations overall. Thus, a minimal p-value of 1/729 = 0.0014 can be obtained. I'd say that's good enough!

### species matrix with 20 species abundances (mean = 50, sd = 10)
### one time variable, with 3 timepoints, which should be tested
### and a factor denoting sites that were repeatedly sampled (site)

## Load packages
require(vegan)

### Data:
sp <- matrix(rnorm(3 * 6 * 20, 50, 10), nrow = 3 * 6, ncol = 20,
dimnames = list(1:18, paste("Sp", 1:20, sep = "")))

time <- as.ordered(rep(1:3, 6))
site <- gl(6, 3)
cbind(site, time, sp)

### add time effect at timepoint 3,
### this will effect will be tested by adonis():
sp_1 <- sp
sp_1[time==3,] <- sp[time==3,] + rnorm(20, 10, 1)
cbind(site, time, sp_1)

### choose which species set to test:
test_sp <- sp_1

### computing the true R2-value

### (btw, using dist() defaults to euclidean distance):
print(fit <- adonis(dist(test_sp) ~ time, permutations=1))

### number of perms
B <- 1999

### setting up frame which will be populated by
### random r2 values:
pop <- rep(NA, B + 1)

### the first entry will be the true r2:
pop[1] <- fit$aov.tab[1, 5]

### set up a "permControl" object:
### we turn off mirroring as time should only flow in one direction
ctrl <- permControl(strata = site, within = Within(type = "series", mirror = FALSE))

### Number of observations:
nobs <- nrow(test_sp)

### check permutation (...rows represent the sample id):
### ..they are ok!
### within in each repeated sample (= sites) timepoints are shuffled,
### with keeping the sequence intact (e.g., for site 1: 1,2,3 - 2,3,1 - 3,2,1)
shuffle(nobs, control = ctrl)

### loop:
### in adonis(...) you need to put permutations = 1, otherwise
### adonis will not run
set.seed(123)
for(i in 2:(B+1)){
idx <- shuffle(nobs, control = ctrl)
fit.rand <- adonis(dist(test_sp) ~ time[idx], permutations = 1)
pop[i] <- fit.rand$aov.tab[1, 5]
}

### get the p-value:
print(pval <- sum(pop >= pop[1]) / (B + 1))
### [1] 0.0035

### the sign. p-value supports the H1 (->there is a time effect).
### ..and the fact that samples are not iid is allowed by
### the customized perms - so this p-value is trustworthy as opposed
### to tests not acknowledging dependency of data points..

### test sp set without an effect:
### replace test_sp with sp set without effect:
test_sp <- sp

### now re-run the script and see the result:
### it is insign. - as expected:

### setting up frame which will be populated by
### random r2 values:
pop <- rep(NA, B + 1)

### computing the true R2-value:
print(fit <- adonis(dist(test_sp) ~ time, permutations = 1))

### the first entry will be the true r2:
pop[1] <- fit$aov.tab[1, 5]

### run the loop:
set.seed(123)
for(i in 2:(B+1)){
idx <- shuffle(nobs, control = ctrl)
fit.rand <- adonis(dist(test_sp) ~ time[idx], permutations = 1)
pop[i] <- fit.rand$aov.tab[1, 5]
}
print(pval <- sum(pop >= pop[1]) / (B + 1))
### [1] 0.701

## make a histogram to see random R2-values and the true one:
hist(pop, xlab = "Population R2")
abline(v = pop[1], col = 2, lty = 3)
text(0.08, 300, paste("true R2,\np = ", pval, sep = ""))

To cite package ‘vegan’ in publications use:

Jari Oksanen, F. Guillaume Blanchet, Roeland Kindt, Pierre Legendre,
Peter R. Minchin, R. B. O'Hara, Gavin L. Simpson, Peter Solymos, M.
Henry H. Stevens and Helene Wagner (2011). vegan: Community Ecology
Package. R package version 2.0-2.
http://CRAN.R-project.org/package=vegan

No comments:

Post a Comment