PhD Stats Defence: Point process modelling of presence-only species data: methodological advances
Date and Time
Location
Summerlee Science Complex Room 1511
Details
CANDIDATE: JEFFREY DANIEL
ABSTRACT:
Presence-only datasets commonly arise in ecological and environmental applications. Such data are often modelled as a realization of a spatial point process, and recent work has unified several seemingly disparate presence-only methods under a point process framework. These unifying results have spurred particular interest in the use of regularized point process models in order to improve predictive performance and aid interpretation. This thesis presents several new methods for modelling presence-only species data with spatial point processes in both the regularized and unregularized settings.
In Chapter 2 we present a unified framework for fitting regularized Gibbs point process models in order to accommodate spatial dependence among presence records. Our framework encompasses both penalized pseudolikelihood and a new method based on penalized logistic composite likelihood, which tends to perform better in simulations. We also investigate model selection using composite information criteria and propose a new criterion, cERIC, which outperforms other criteria in a simulation study. We apply our methods in an analysis of the distribution of a rainforest tree species within the Barro Colorado Island census plot.
In Chapter 3 we propose a multivariate Poisson point process model for presence-only multispecies data. We regularize our model with an adaptive sparse group lasso penalty in order to exploit structure in the model coefficients when the species intensities being modelled depend upon overlapping subsets of covariates. We compare our regularized multivariate model with separate regularized univariate Poisson models in a simulation study and in an application modelling the distributions of 18 bumble bee species within Ontario, Canada. In both settings, the multivariate model tends to outperform the separate univariate models.
Finally, in Chapter 4 we return to the unregularized setting and derive local background sampling, an algorithm for fitting Poisson point process models via logistic regression in which background samples are sampled with probability proportional to an initial pilot estimate of intensity. In simulations and in another analysis of the Ontario bumble bee data, local background sampling yields more efficient estimates of the model coefficients than the standard uniform background sampling technique.
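The general idea behind fitting a Poisson point process by logistic regression with non-uniform background points can be sketched as follows. This is an illustrative toy, not the algorithm derived in the thesis, whose details the abstract does not give. It assumes a one-dimensional window [0, 1], a log-linear intensity, and a hypothetical pilot estimate `beta_pilot`; all names and values are made up for illustration. It relies only on the standard identity that, when background points follow a Poisson process with known intensity rho(u), the probability that a point at u is a presence is lambda(u) / (lambda(u) + rho(u)), so the log-intensity coefficients can be recovered by a logistic regression with offset -log rho(u).

```python
import math
import random

def simulate_ipp(log_intensity, upper, rng):
    """Simulate an inhomogeneous Poisson process on [0, 1] by thinning.

    `upper` must bound the intensity exp(log_intensity(u)) on [0, 1].
    """
    points, t = [], rng.expovariate(upper)
    while t < 1.0:
        if rng.random() < math.exp(log_intensity(t)) / upper:
            points.append(t)
        t += rng.expovariate(upper)
    return points

def fit_logistic_offset(u, y, offset, n_iter=25):
    """Newton-Raphson for logit P(y=1) = b0 + b1*u + offset (offset held fixed)."""
    ybar = sum(y) / len(y)
    # Start so the average linear predictor equals the logit of the presence fraction.
    b0 = math.log(ybar / (1.0 - ybar)) - sum(offset) / len(offset)
    b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for ui, yi, oi in zip(u, y, offset):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * ui + oi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * ui
            h00 += w
            h01 += w * ui
            h11 += w * ui * ui
        # Solve the 2x2 Newton system (Hessian * step = gradient) in closed form.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

rng = random.Random(0)
beta_true = (6.0, 1.5)   # assumed true log-intensity: 6 + 1.5 * u
beta_pilot = (6.0, 1.0)  # hypothetical pilot estimate (in practice, from an initial fit)
ratio = 4.0              # background intensity = ratio * pilot intensity (assumption)

def log_lambda(u):
    return beta_true[0] + beta_true[1] * u

def log_rho(u):
    return math.log(ratio) + beta_pilot[0] + beta_pilot[1] * u

# Presence points from the true process; background points concentrated
# where the pilot estimate says the intensity is high.
presence = simulate_ipp(log_lambda, math.exp(7.5), rng)
background = simulate_ipp(log_rho, ratio * math.exp(7.0), rng)

u = presence + background
y = [1.0] * len(presence) + [0.0] * len(background)
offset = [-log_rho(ui) for ui in u]  # corrects for the non-uniform background

b0_hat, b1_hat = fit_logistic_offset(u, y, offset)
print("estimated coefficients:", b0_hat, b1_hat)
```

Setting `beta_pilot = (c, 0.0)` for any constant `c` gives a constant background intensity, which corresponds to uniform background sampling in this toy setting, so the same code can be used to compare the two schemes informally.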
Advisory Committee
- Prof. G. Umphrey, Advisor
- Prof. J. Horrocks, Co-advisor
- Prof. J. Fryxell
Examining Committee
- Prof. P. Kim, Chair
- Prof. G. Umphrey
- Prof. J. Horrocks
- Prof. L. Deeth
- Prof. F. Nathoo (external examiner)