Mark E. Irwin - Research

Monte Carlo Methods

With recent advances in computing power, many problems that until recently were infeasible to solve can now be answered. In many situations, the answers can be obtained through the use of Monte Carlo methods. While the most popular simulation-based approach today is Markov chain Monte Carlo (MCMC), my interests are in the area of sequential importance sampling (SIS), also known as sequential imputation or particle filtering. In SIS, the random variables to be sampled are decomposed into blocks, and each block is sampled sequentially, conditioning on progressively more of the observed data. Whereas MCMC samplers generate equally weighted but correlated samples, SIS samplers generate independent but unequally weighted samples. MCMC and SIS can be seen as complementary procedures. Often in problems where SIS is useful, MCMC will be less efficient, or perhaps infeasible. The opposite also occurs: MCMC will give answers when SIS cannot.

An important issue with SIS is finding a useful decomposition of the probability structure, so that sampling can be done easily and quickly and the importance sampling weights are well behaved. Usually these are competing goals. Often the faster, easier sampling schemes tend to have more poorly behaved importance weights, while the schemes with better-behaved weights are slower and more difficult to implement. The key is to find a balance between the two that minimizes the standard errors of the quantities estimated from the sample.
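The block-by-block sampling and weighting described above can be sketched on a toy problem. The following is a minimal illustration only, not the scheme used in any of the applications on this page: it assumes a hypothetical one-dimensional state-space model (autoregressive state with coefficient `a`, Gaussian observation noise, starting state fixed at 0) and uses the simplest proposal, the state transition itself, so each weight is updated by the likelihood of the new observation. All function and parameter names are made up for the example.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of a Normal(mean, sd^2) distribution at x."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def sis(observations, n_samples=1000, a=0.9, state_sd=1.0, obs_sd=0.5, seed=0):
    """Sequential importance sampling for an assumed toy model:
         x_t = a * x_{t-1} + Normal(0, state_sd^2),  x_0 = 0
         y_t = x_t + Normal(0, obs_sd^2)
    Each block x_t is drawn from the state transition (the proposal), and
    the log weight is incremented by log p(y_t | x_t).
    Returns the sampled paths and their normalized weights."""
    rng = random.Random(seed)
    paths = [[] for _ in range(n_samples)]
    log_w = [0.0] * n_samples
    for y in observations:                     # one block per observation
        for i in range(n_samples):
            prev = paths[i][-1] if paths[i] else 0.0
            x = rng.gauss(a * prev, state_sd)  # sample the next block
            paths[i].append(x)
            log_w[i] += normal_logpdf(y, x, obs_sd)  # weight update
    # Normalize on the log scale to avoid underflow.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return paths, [wi / total for wi in w]


def effective_sample_size(weights):
    """1 / sum(w_i^2) for normalized weights; equals n when all are equal."""
    return 1.0 / sum(w * w for w in weights)


def weighted_mean(values, weights):
    """Importance-weighted estimate of an expectation."""
    return sum(v * w for v, w in zip(values, weights))
```

For example, `paths, weights = sis([0.5, 1.0, 0.2])` followed by `weighted_mean([p[-1] for p in paths], weights)` gives a weighted estimate of the final state. An effective sample size far below the number of samples is one common sign of poorly behaved weights, which is the trade-off discussed above.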

Statistical Genetics

Much of my interest in Monte Carlo methods comes from my work in statistical genetics. A problem of great interest is how to calculate linkage statistics for large pedigrees with a moderate to large number of loci. Exact calculation methods break down in these situations. For example, peeling becomes infeasible with a moderate number of loci, while the hidden Markov model, as implemented in programs such as GENEHUNTER, cannot handle moderate-sized pedigrees. One approach that has been successful in dealing with large simple pedigrees is SIS, which has been implemented in the package SIMPLE.

Danger field evolving over a 5 hour period

Command and Control

The estimation of the level of threat in a battlespace is an important consideration for a battle commander. One possible measure of threat is the danger field, which describes the expected damage due to explosive weapons in the battlespace (Irwin et al., 2002). The figure on the right shows the evolution of the danger field as five tanks move through the battlespace over a five-hour period.

A threat of particular interest to aviators is that posed by mobile anti-aircraft launchers. Knowledge of how these launchers could be deployed is important in the planning of airstrikes into enemy territory. Forecasting launcher locations relies on the historical location patterns and on how and when those patterns change. Changes in launcher locations can be investigated via tests based on summaries of the launcher intensity patterns (Kornak et al., 2006), as estimated by a kernel intensity estimate.
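As a rough illustration of what a kernel intensity estimate computes, the sketch below sums a two-dimensional Gaussian kernel centered at each observed location. It is a generic textbook form under stated assumptions (isotropic Gaussian kernel, a single fixed bandwidth, no edge correction), not necessarily the estimator used in Kornak et al. (2006). The function name and arguments are hypothetical.

```python
import math


def kernel_intensity(point, events, bandwidth):
    """Gaussian kernel estimate of spatial intensity at `point`.

    point:     (x, y) location at which to evaluate the intensity
    events:    list of observed (x, y) locations
    bandwidth: kernel standard deviation (assumed the same in both axes)

    Each event contributes a bivariate Gaussian density that integrates
    to 1, so the estimate integrates to the number of events (ignoring
    edge effects)."""
    x, y = point
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for ex, ey in events:
        d2 = (x - ex) ** 2 + (y - ey) ** 2   # squared distance to event
        total += norm * math.exp(-0.5 * d2 / bandwidth ** 2)
    return total
```

Evaluating this function over a grid for two time periods gives two intensity surfaces whose summaries could then be compared. The bandwidth controls how much the estimate is smoothed and would need to be chosen for the data at hand.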

Hierarchical Modeling

A powerful technique for dealing with many problems in statistics today is the use of hierarchical models. They can be used in a wide range of problems, as they allow complex processes to be described by simpler, more easily understood subprocesses. For example, in pedigree analysis, one level of the hierarchy uses Mendelian laws with models of interference to describe how genetic material is passed through the family. The next level of the hierarchy describes the observed trait data conditional on the pedigree members' genetic makeup. Another example is the danger field mentioned in the previous section. The movement of the tanks is described by a Markov model under the assumption that each tank has three space-time waypoints as part of its path. Then, conditional on the tank positions at each time, there is a model describing the possible attack locations and the potential damage at each location.
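In symbols, the two-level pedigree example can be written as a factorization of the joint distribution. The notation here is introduced only for illustration: Y denotes the observed trait data, G the pedigree members' genetic makeup, and \theta any model parameters.

```latex
% Two-level hierarchy: trait model given genotypes, times inheritance model
p(Y, G \mid \theta) \;=\; p(Y \mid G, \theta)\, p(G \mid \theta)
```

The second factor is the inheritance level (Mendelian laws with interference), and the first is the trait level conditional on genetic makeup. The danger field example has the same form, with tank positions in place of G and attack locations and damage in place of Y.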

One field where hierarchical modeling is particularly useful is that of environmental problems. The hierarchical structure allows for consistent modeling of the space-time structure and the interaction of a wide range of factors. For example, hierarchical modeling has been used for ozone forecasting in a five-state region around Lake Michigan (McMillian et al., 2005). The hierarchical model used involves ozone transport, meteorology (winds, temperature, air pressure, etc.), and an ozone regime-switching scheme which describes whether the region is in a period of high or low ozone levels. Another example is sea surface temperature forecasting in the tropical Pacific Ocean (Berliner et al., 2000). Their model involves observed sea surface temperatures, the Southern Oscillation Index (sea level pressure), and zonal wind data. This model is based on three temperature regimes, and the hierarchical structure allows for regime-based temperature forecasts. To view their forecasts, go to http://www.stat.ohio-state.edu/~sses/collab_enso.php.