Projects
- Project report: OrangeTree.pdf
  - sim-bias.pdf
  - sim-ci.pdf
- Project report: min.pdf
- Project report: nmix.pdf (authors: Richard Chandler, Marc Kery, and Hans Skaug)
- Project report: owls.pdf
- Project report: skate.pdf
- Project report: tadpole.pdf (author: Ben Bolker)
- Project report: theta.pdf
  - AD Model Builder is fastest, but requires the most code.
  - JAGS is slower, but not too bad for this relatively simple problem (and produces much wider credible/confidence intervals).
  - A hidden Markov model can be implemented in R, but takes some effort and is quite slow.
- Project report: weeds.pdf
Orange tree growth
The orange tree growth data, originally taken from Draper & Smith (1981) and reproduced in Draper & Smith (1998), p. 559, were used by Pinheiro & Bates (2000, Ch. 8.2) to illustrate how a logistic growth curve model with random effects can be fitted with the S-Plus function nlme. The data contain measurements of trunk circumference (mm) made on seven occasions for each of five orange trees. The data are available in R as "datasets::Orange".
The errors within trees are assumed to be normally distributed and independent; the data can be straightforwardly analyzed either by standard nonlinear regression (assuming each tree follows an independent growth curve) or by nonlinear mixed-effects models (allowing the growth parameters to be random variables from an underlying population distribution).
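The mixed-effects version of the growth curve can be written out explicitly. The block below is a sketch under assumptions: it uses one common three-parameter logistic parameterization and places a single random effect on the asymptote; the symbols are introduced here for illustration and are not taken from the project report.

```latex
y_{ij} = \frac{\phi_1 + b_i}{1 + \exp\!\left[-(t_{ij} - \phi_2)/\phi_3\right]} + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2)
```

Here y_ij is the circumference of tree i on occasion j, t_ij is the tree's age, phi_1 is the population-average asymptote, phi_2 is the age at which half the asymptote is reached, and phi_3 is a scale parameter. In the per-tree regression alternative, each tree would instead get its own unrelated set of curve parameters.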
Mineralization of terbuthylazine
Terbuthylazine is a herbicide used in agriculture. It is an s-triazine, like atrazine, which has been banned in Denmark on suspicion of causing cancer. Terbuthylazine can be bound to the soil, but free terbuthylazine can be washed into the drinking water. Some bacteria can mineralize it. These data are part of a larger experiment to determine the ability of certain bacteria to mineralize terbuthylazine and to estimate the mineralization rate.
This is a fairly straightforward nonlinear least-squares problem, with normally distributed residuals and no random effects or latent variables. The deterministic part of the model is the solution to a set of coupled ordinary differential equations (ODEs) for the concentrations in different compartments. Because the ODEs are linear, the deterministic solution can be found directly in terms of a matrix exponential, for which functions exist in all three of ADMB, BUGS, and R. From there it is simply a matter of defining a normal likelihood, or equivalently a least-squares expression, and minimizing it. The main differences appear in the speed and robustness of the matrix exponential formulations in different software tools.
author: Anders Nielsen
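The matrix-exponential approach described above can be sketched in a few lines. The example is illustrative only: the two-compartment system, the rate `k`, and all function names are hypothetical and are not the model from the project report, and the matrix exponential is a simple scaling-and-squaring Taylor approximation rather than the routine used by ADMB, BUGS, or R.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(a, terms=20):
    """Approximate exp(a) with a truncated Taylor series plus scaling and squaring."""
    n = len(a)
    # Halve the matrix until its largest absolute row sum is at most 0.5,
    # so that the Taylor series converges quickly.
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scaled = [[x / 2.0 ** squarings for x in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # Update term to scaled**k / k! and add it to the running sum.
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    # Undo the scaling: exp(a) = exp(a / 2**s) ** (2**s).
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def solve_linear_ode(a, x0, t):
    """Return x(t) = exp(a * t) x0 for the linear system dx/dt = a x."""
    e = mat_exp([[x * t for x in row] for row in a])
    return [sum(e[i][j] * x0[j] for j in range(len(x0))) for i in range(len(x0))]

def sum_of_squares(k, times, observed):
    """Least-squares criterion for the mineralized fraction in a toy two-compartment model."""
    a = [[-k, 0.0], [k, 0.0]]  # hypothetical: parent compound -> mineralized product at rate k
    return sum((y - solve_linear_ode(a, [1.0, 0.0], t)[1]) ** 2
               for t, y in zip(times, observed))
```

Minimizing `sum_of_squares` over `k` with any one-dimensional optimizer would then give the least-squares estimate of the rate for this toy system.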
Fitting N-mixture models with random observer effects
Most protocols for estimating abundance while taking detectability into account require that individuals can be identified, a condition which often requires capturing and marking animals. This is costly, and therefore the N-mixture, or binomial mixture, model of Royle (2004) is an appealing alternative: this model yields estimates of abundance from spatially and temporally replicated counts of unmarked animals alone. Typical applications of the N-mixture model require treating some effects as random, for instance to account for intrinsic differences in the ability of field ornithologists to detect and identify birds.
This example shows worked BUGS and ADMB solutions for fitting an N-mixture model and demonstrates their use on simulated data. (1) The BUGS and ADMB estimates of the fixed effects and variances are very similar. (2) Differences are more pronounced in the estimates of the random effects: the posterior means from WinBUGS were more accurate than the ADMB estimates (RMSE = 0.22 vs. 0.43). (3) The BUGS estimates of the random effects were also more precise: on average, the posterior standard deviations obtained by BUGS were 47% smaller than the standard deviations estimated by ADMB. (4) ADMB is much faster (4 vs. 12 minutes per estimate).
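To make the structure of the N-mixture likelihood concrete, the sketch below computes the marginal likelihood of the repeated counts at a single site by summing over the unobserved abundance N. It assumes Poisson-distributed abundance and a constant detection probability; the function names and the truncation point `n_max` are illustrative choices, and the random observer effects of the full model (which would enter through the detection probability) are left out.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of abundance n with mean lam."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binomial_pmf(y, n, p):
    """Probability of counting y of n animals when each is detected with probability p."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of the repeated counts at one site.

    The latent abundance N is summed out from max(counts) up to n_max,
    a truncation point that must be large enough for the Poisson tail to be negligible.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        detection = 1.0
        for y in counts:
            detection *= binomial_pmf(y, n, p)
        total += poisson_pmf(n, lam) * detection
    return total
```

With a single visit the sum collapses to a Poisson distribution with mean `lam * p`, which is why replicated counts are needed to separate abundance from detection.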
Owl nestling negotiation
The data for this example, taken from Zuur et al. (2009) and ultimately from Roulin and Bersier (2007), quantify the number of vocalizations (sibling negotiations) by owl chicks in different nests as a function of food treatment (deprived or satiated), the sex of the parent, and arrival time of the parent at the nest.
This problem is a zero-inflated generalized linear mixed model: the number of negotiations is the response variable; food treatment, arrival time, and parental sex are the fixed-effect predictors; and sites are a random effect. Zero-inflation puts the problem beyond standard GLMM implementations. In R, the MCMCglmm package allows for zero-inflation, or one can implement an expectation-maximization algorithm. The problem is relatively straightforward in JAGS or ADMB, and one can also use the glmmADMB package in R.
authors: Ben Bolker, Mollie Brooks, Beth Gardner, Cleridy Lennert, Mihoko Minami
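The zero-inflated count model described above can be illustrated with a small likelihood sketch. It assumes a zero-inflated Poisson response with a log link and a single food-treatment predictor; the function names, argument names, and coefficients are hypothetical, and the log-likelihood is conditional on given nest effects, whereas a full mixed-model fit would integrate or sample over them.

```python
import math

def zi_poisson_logpmf(y, mean, zero_prob):
    """Log probability of count y under a zero-inflated Poisson (0 <= zero_prob < 1)."""
    poisson_log = -mean + y * math.log(mean) - math.lgamma(y + 1)
    if y == 0:
        # A zero arises either from the extra-zero component or from the Poisson part.
        return math.log(zero_prob + (1.0 - zero_prob) * math.exp(poisson_log))
    return math.log(1.0 - zero_prob) + poisson_log

def log_likelihood(counts, food_satiated, nest_effects, beta0, beta_food, zero_prob):
    """Conditional log-likelihood given hypothetical nest-level random effects."""
    total = 0.0
    for y, satiated, u in zip(counts, food_satiated, nest_effects):
        mean = math.exp(beta0 + beta_food * satiated + u)  # log link
        total += zi_poisson_logpmf(y, mean, zero_prob)
    return total
```

Setting `zero_prob` to 0 recovers an ordinary Poisson model, which gives a simple way to check how much the extra-zero component matters.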
Skate mortality: Bayesian state-space model
The goal of the model was to obtain decadal mortality estimates of three different size classes of winter skates (Leucoraja ocellata) on the eastern Scotian Shelf. The time series are largely non-informative for several of the model parameters (catchability, recruitment rate, and stage transition probability), so informative Bayesian priors are used.
The model described here is a Bayesian state-space model implemented in both JAGS and AD Model Builder. The model and alternative formulations are fully described in Swain et al. (2009).
authors: Trevor Davies and Steve Martell
Tadpole mortality as a function of size
The data are originally from Vonesh and Bolker (2005), describing the numbers of reed frog (Hyperolius spinigularis) tadpoles killed by predators as a function of size in a small-scale field trial. Our main interest is in a quantitative description of the "window of vulnerability", defined as the unimodal pattern of proportion killed as a function of size. In various contexts, we can use this description either to describe and test differences among treatments (e.g., does the window of vulnerability differ by predator size, or with tadpoles exposed to different predator cues?) or to project the effects of growth and mortality rates through a life stage. See the reference above and McCoy et al. (2011) for more details and examples.
This basic example is essentially a maximum likelihood estimation problem with a binomial response variable. The data set is small, there are no random effects or latent variables, and the problem is low-dimensional, with only a single predictor and a single response variable and only three parameters in the statistical model used.
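A binomial likelihood with a three-parameter unimodal curve can be sketched as follows. The curve used here is a hypothetical choice for illustration (a power-Ricker shape that peaks at `peak_size` with height `peak_height`); it is an assumption, not necessarily the function used in the project report, and all names are invented for this example.

```python
import math

def vulnerability(size, peak_height, peak_size, shape):
    """Hypothetical unimodal curve: equals peak_height when size == peak_size."""
    scaled = size / peak_size
    return peak_height * (scaled * math.exp(1.0 - scaled)) ** shape

def binomial_nll(params, sizes, killed, exposed):
    """Negative log-likelihood of the numbers killed out of the numbers exposed.

    Requires sizes > 0 and 0 < peak_height < 1 so that every probability lies in (0, 1).
    """
    peak_height, peak_size, shape = params
    nll = 0.0
    for s, k, n in zip(sizes, killed, exposed):
        p = vulnerability(s, peak_height, peak_size, shape)
        nll -= (math.log(math.comb(n, k))
                + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return nll
```

Passing `binomial_nll` to a general-purpose optimizer over the three parameters would give the maximum likelihood estimates.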
Theta-logistic population growth model
The example is a theta-logistic nonlinear state-space population model. Population size is modelled as a nonlinear function of its previous size through a discrete-time theta-logistic process model, N(t+1) = f(N(t)) + process error, where f is the theta-logistic map and the process error is normally distributed; the observation error is also normally distributed. The model is tested on data simulated from the same model. More details are available in Pedersen et al. (2011).
author: Casper W. Berg
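Simulating from a theta-logistic state-space model is straightforward. The sketch below assumes one common formulation in which both the process and the observation equation are written on the log scale; the text above does not spell out the exact equation, so this form, the parameter names, and the default seed are assumptions made for illustration.

```python
import math
import random

def theta_logistic_step(log_n, r0, carrying_capacity, theta):
    """Deterministic one-step update of log population size (assumed log-scale form)."""
    return log_n + r0 * (1.0 - (math.exp(log_n) / carrying_capacity) ** theta)

def simulate(n_steps, log_n0, r0, carrying_capacity, theta, sd_process, sd_obs, seed=1):
    """Simulate latent log population sizes and noisy observations of them."""
    rng = random.Random(seed)
    states, observations = [], []
    log_n = log_n0
    for _ in range(n_steps):
        # Process model: theta-logistic growth plus normal process error.
        log_n = (theta_logistic_step(log_n, r0, carrying_capacity, theta)
                 + rng.gauss(0.0, sd_process))
        states.append(log_n)
        # Observation model: the latent state plus normal observation error.
        observations.append(log_n + rng.gauss(0.0, sd_obs))
    return states, observations
```

With both standard deviations set to zero the trajectory is deterministic and, for moderate growth rates, settles at the log of the carrying capacity, which is a convenient sanity check before fitting.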
Weeds: Modeling weed density over time
The goal of this problem is to model weed density from 12 years of data in the form of an S-shaped curve. The data are simply 12 densities at equispaced index times. The suggested model was a three-parameter logistic function, though an extension to estimate the variance around the model is also of interest.
The problem is relatively difficult, especially in its original presentation, as it is badly scaled and there are nearly flat areas of the sum of squares or likelihood surface. Moreover, the Hessian at the solution is effectively singular, so methods based on Newton's iteration do rather badly, while crude approaches such as Nelder-Mead may do better if they can be scaled appropriately.
authors: John C. Nash, Anders Nielsen, and Ben Bolker
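The effect of scaling can be seen by writing the objective function in two parameterizations. The sketch below assumes one common form of the three-parameter logistic curve; that form, the scale factors, and the function names are illustrative assumptions, and no data from the project are reproduced here.

```python
import math

def logistic3(t, b1, b2, b3):
    """Three-parameter logistic curve (assumed form): b1 / (1 + b2 * exp(-b3 * t))."""
    return b1 / (1.0 + b2 * math.exp(-b3 * t))

def sum_of_squares(params, times, densities):
    """Residual sum of squares in the original, badly scaled parameters."""
    b1, b2, b3 = params
    return sum((y - logistic3(t, b1, b2, b3)) ** 2 for t, y in zip(times, densities))

def sum_of_squares_scaled(scaled_params, times, densities, scale=(100.0, 10.0, 0.1)):
    """Same objective, expressed in rescaled parameters of similar magnitude.

    The scale factors are hypothetical; the idea is that an optimizer working on
    scaled_params takes steps of comparable size in every direction.
    """
    params = [s * x for s, x in zip(scale, scaled_params)]
    return sum_of_squares(params, times, densities)
```

An optimizer such as Nelder-Mead would be run on `sum_of_squares_scaled`, and the solution multiplied by the scale factors to recover the original parameters.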
Wildflowers
These data are from E. Crone and colleagues' long-term study of stages, flowering, and seed pod production of Astragalus scaphoides. The model looks at individual flowering as a function of the previous year's stage and seed production.
This is a binomial generalized linear mixed model for flowering probability with three random effects: an intercept and a size effect that vary across individuals, and an intercept that varies across years.
authors: Elizabeth Crone, Mollie Brooks, and Perry de Valpine
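The random-effects structure can be written compactly. The block below is a sketch under assumptions: the symbols are introduced here, the logit link is assumed, and the three random effects are taken to be independent and normally distributed, which the description above does not state.

```latex
\operatorname{logit}(p_{it}) = \mathbf{x}_{it}^{\top}\boldsymbol{\beta} + a_i + b_i\, s_{it} + c_t,
\qquad a_i \sim N(0, \sigma_a^2), \quad b_i \sim N(0, \sigma_b^2), \quad c_t \sim N(0, \sigma_c^2)
```

Here p_it is the probability that individual i flowers in year t, x_it holds the fixed-effect predictors (the previous year's stage and seed production), s_it is the size covariate, a_i and b_i are the individual-level intercept and size effects, and c_t is the year-level intercept.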