
Tools for Assessment of Multiple Scale Land Change Models

Authors: Kristina Helle, Pedro Andrade, and Edzer Pebesma

Introduction

The complex relations between biophysical and anthropogenic factors generate the land change patterns of our environment. In order to study these complex phenomena, we have to rely on simulation models, for example cellular automata or agent-based models. LUCC simulation models usually generate a new map from a real-world map of land cover classes. In the figure below, the left map shows the real data and the right one the simulated results.

Figure 1: Example of real world data (left) and a simulation (right). Source: Pontius (2002)

Considering that “all models are wrong, but some are useful” (Box 1999), model assessment should address the most feasible requirements, such as testing how well the model fits the data and whether it is useful for certain purposes. The fit to data is addressed by goodness-of-fit tests, whereas usability should guide the choice of tests. Goodness-of-fit tests compare the predicted map with the observed map at the predicted time.

Several authors have proposed metrics that inform the scientist about the quality of the results. With these metrics, it is possible to calibrate or to validate the model. In fact, the final objective of these goodness-of-fit methods is to point out how to improve the model.

State of the Art

Costanza (1989) was the first author in the literature to propose a method for comparing the results of a simulation with real-world data in a way that is not exclusively pixel-by-pixel. The author proposes a multi-resolution approach that compares windows of cells of growing size in order to measure the spatial patterns of land use variables. From the fits obtained at the different resolutions, the method calculates a weighted average called F and a p-value-like metric (Figure 2).

Figure 2: Multi-scale goodness-of-fit test. Source: (Costanza 1989)
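The multi-resolution idea can be illustrated with a minimal sketch. It assumes our reading of Costanza (1989): for a window of side w, the fit is 1 minus the summed absolute differences in class counts divided by 2w², averaged over all window positions, and the overall fit weights each window size by exp(-k(w-1)). The function names and the default value of k are our own choices for illustration, and the p-value-like metric is not included.

```python
import math
from collections import Counter


def window_fit(map_a, map_b, w):
    """Average fit over all w-by-w windows slid across two equally sized grids."""
    rows, cols = len(map_a), len(map_a[0])
    fits = []
    for r in range(rows - w + 1):
        for c in range(cols - w + 1):
            cells = [(i, j) for i in range(r, r + w) for j in range(c, c + w)]
            count_a = Counter(map_a[i][j] for i, j in cells)
            count_b = Counter(map_b[i][j] for i, j in cells)
            classes = set(count_a) | set(count_b)
            diff = sum(abs(count_a[k] - count_b[k]) for k in classes)
            # Each misplaced cell is counted twice, hence the factor 2.
            fits.append(1 - diff / (2 * w * w))
    return sum(fits) / len(fits)


def multi_resolution_fit(map_a, map_b, k=0.1):
    """Weighted average of window fits; k controls how fast coarse windows lose weight."""
    max_w = min(len(map_a), len(map_a[0]))
    sizes = range(1, max_w + 1)
    weights = [math.exp(-k * (w - 1)) for w in sizes]
    fits = [window_fit(map_a, map_b, w) for w in sizes]
    return sum(f * wt for f, wt in zip(fits, weights)) / sum(weights)


observed = [[1, 1], [0, 0]]
simulated = [[1, 0], [1, 0]]
fit_fine = window_fit(observed, simulated, 1)    # cell-by-cell comparison
fit_coarse = window_fit(observed, simulated, 2)  # whole map as one window
```

In this toy example half of the cells disagree at the finest resolution, while the single 2-by-2 window agrees completely because both maps contain the same class quantities.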

Pontius (2002) realized that we only need to take into account the cells that have changed, instead of comparing the whole maps. His more flexible approach makes it possible to separate errors of quantity explicitly from errors of location and to use fuzzy classification.
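The separation of quantity and location errors can be sketched as a simple disagreement budget. This is a simplified illustration, not the full set of statistics in Pontius (2002): it assumes crisp (non-fuzzy) classes at a single resolution, and the function name is hypothetical.

```python
from collections import Counter


def disagreement_budget(reference, simulated):
    """Return (quantity, location) disagreement as fractions of all cells."""
    ref_cells = [c for row in reference for c in row]
    sim_cells = [c for row in simulated for c in row]
    n = len(ref_cells)
    # Total disagreement: share of cells whose class differs.
    total = sum(r != s for r, s in zip(ref_cells, sim_cells)) / n
    # Quantity disagreement: mismatch in class totals, regardless of position.
    ref_counts, sim_counts = Counter(ref_cells), Counter(sim_cells)
    classes = set(ref_counts) | set(sim_counts)
    quantity = sum(abs(ref_counts[k] - sim_counts[k]) for k in classes) / (2 * n)
    # What remains is caused by placing the right amounts in the wrong cells.
    location = total - quantity
    return quantity, location


reference = [[1, 1], [0, 0]]
wrong_place = [[1, 0], [1, 0]]   # right amounts, wrong cells
wrong_amount = [[1, 1], [1, 0]]  # one cell too many of class 1
```

Here `disagreement_budget(reference, wrong_place)` attributes all of the error to location, while `disagreement_budget(reference, wrong_amount)` attributes all of it to quantity.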

Land use changes may show varying behaviour at different scales. Therefore it can be useful to find the best resolution for a given purpose. Jantz and Goetz (2005) apply several measures for absolute error rate, number and shape of clusters, amount of edges, and exact position to an urban growth model.

Topics of the proposed Thesis and Questions to be answered in each work package

The following open questions can be investigated in a PhD thesis and a Master's thesis (Supervisors: Edzer Pebesma, Gilberto Câmara - not confirmed).

Goodness of fit tests

“Qualitatively, these goodness-of-fit measures range from completely non-spatial (the amount of change) to non-spatial metrics (edges and clusters), to explicitly spatial.” (Jantz and Goetz, IJGIS 2005)

Which kinds of models are we going to address?

To validate and calibrate models of LUCC, the most basic tools are goodness-of-fit tests for the prediction or the change. Several methods exist that can take into account spatial properties of the maps to be compared (Costanza 1989, Pontius 2002, Manson 2000, Jantz & Goetz 2005).

TerraME lacks procedures to calculate goodness of fit. The idea of this group is twofold. First, to implement in TerraME the goodness-of-fit tests available in the literature. Second, to propose a method that extends these tests to account for errors in the classification.

A first task is to compare those tests and to investigate which of them best fits the needs of agent-based models of deforestation in Amazonia. The best test may vary with the aim of the modelling: a good fit of the averaged deforestation rate needs only a global goodness-of-fit test, typical patterns require comparing geometric properties, and predictions of the actual areas affected additionally require comparison at the pixel level.

The test results depend strongly on the resolution of the model (Jantz & Goetz 2005); therefore a multi-scale model (hierarchically nested or with a resolution that is not constant in space) could be tested in many different ways. The multi-resolution tests of Costanza (1989) may offer a way to aggregate test results across resolutions. Multiple-scale models already developed at INPE (citation?) will serve as use cases and will determine the resolutions to be used. The multi-scale reality needed for comparison could be provided by aggregating data from the finest resolution.
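Aggregation from the finest resolution could look like the following sketch, which turns a categorical grid into coarser cells holding the proportion of one class. It assumes a regular grid whose dimensions are multiples of the aggregation factor; the function name and the example data are hypothetical.

```python
def aggregate_proportions(grid, factor, target_class):
    """Aggregate a categorical grid into coarse cells holding the share of target_class."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Collect the fine cells that fall inside this coarse cell.
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(block.count(target_class) / len(block))
        coarse.append(coarse_row)
    return coarse


fine_map = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
coarse_map = aggregate_proportions(fine_map, 2, target_class=1)
```

Keeping proportions instead of a majority class preserves the class quantities across scales, which matters when results at several resolutions are to be compared.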

Aggregation errors are implicitly addressed by tests at multiple scales, but other errors in the reality maps are not. Therefore tests that can take into account the varying correctness of these maps should be developed. The approach of Pontius (2002), which uses fuzzy classification, may be modified to solve this task.
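One way to let uncertain classifications enter the comparison is to store class memberships per cell and to measure agreement with the minimum operator. The sketch below assumes that each cell is a mapping from class name to a membership between 0 and 1 and that memberships sum to 1; the minimum operator is one common choice, and the function name and data are hypothetical.

```python
def fuzzy_agreement(ref_memberships, sim_memberships):
    """Mean per-cell agreement between two lists of class-membership dicts."""
    total = 0.0
    for ref, sim in zip(ref_memberships, sim_memberships):
        classes = set(ref) | set(sim)
        # A cell agrees to the extent that both maps assign it to the same class.
        total += sum(min(ref.get(k, 0.0), sim.get(k, 0.0)) for k in classes)
    return total / len(ref_memberships)


reference_cells = [
    {"forest": 0.8, "pasture": 0.2},  # classification is fairly sure of forest
    {"forest": 1.0},
]
simulated_cells = [
    {"forest": 0.5, "pasture": 0.5},
    {"pasture": 1.0},
]
agreement = fuzzy_agreement(reference_cells, simulated_cells)
```

With crisp memberships (all 0 or 1) this reduces to the usual share of correctly predicted cells.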

Space time

The methods in the literature compare two static maps to generate a metric. But what about the case where we have more than two time steps? How does the error propagate in time?

Goodness-of-fit tests for change (focusing on change, not on states at certain times), e.g. compare data(t1) ~ data(t2) with data(t1) ~ simulation(t2) ⇒ 2 matrices…; goodness-of-fit test for the change matrices [not about calibration; not about calibration errors]
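The two change matrices mentioned above can be built by cross-tabulating the map at t1 with the observed and with the simulated map at t2. The sketch below only constructs the matrices and their cell-by-cell difference; which test to apply to them is left open, as in the text. All names and the example maps are hypothetical.

```python
from collections import Counter


def transition_counts(before, after):
    """Count cells per (class at t1, class at t2) pair."""
    return Counter(
        (b, a)
        for row_b, row_a in zip(before, after)
        for b, a in zip(row_b, row_a)
    )


def change_matrix_difference(data_t1, data_t2, sim_t2):
    """Simulated minus observed transition counts for every transition seen."""
    observed = transition_counts(data_t1, data_t2)
    simulated = transition_counts(data_t1, sim_t2)
    transitions = set(observed) | set(simulated)
    return {t: simulated[t] - observed[t] for t in transitions}


# "F" = forest, "P" = pasture
data_t1 = [["F", "F"], ["F", "P"]]
data_t2 = [["F", "P"], ["P", "P"]]
sim_t2 = [["P", "F"], ["F", "P"]]
difference = change_matrix_difference(data_t1, data_t2, sim_t2)
```

In this example the simulation converts one forest cell too few to pasture, which shows up as -1 for the transition ("F", "P") and +1 for ("F", "F").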

Errors

The model may also have uncertainty in its results, due to random procedures. So, how can this uncertainty be matched with the uncertainty of the classification of the real-world data?

~~aggregation error may be ignored because it is unbiased misclassification error: hope that it averages out by aggregation~~

~~uncertainty in the models used (cellular automata, agent based modelling) still missing: errors in basic maps~~

The methods in the literature do not take into account the intrinsic classification errors in the real-world data at each time step.

Multi-scale

A current challenge in LUCC is to develop multi-scale models. Human behaviour can only be captured at different levels. So we need procedures to assess goodness of fit at different scales. A goodness-of-fit metric for a hierarchical multi-scale model could be a simple extension of the multi-resolution procedures in the literature. But note that multi-scale models are not constrained to hierarchical models. For example, Moreira et al. (2009) propose a partially hierarchical model, where the scale below is a finer grid of only a sub-area of the upper scale. How can goodness-of-fit metrics be created for this kind of model? What about error propagation between scales?

Our aim is to use these methods to derive and validate multi-scale models with hierarchical structure or varying spatial resolution.

Goodness-of-fit tests (multi-scale): how to weight fairly (as models at fine scales usually perform worse in certain goodness-of-fit tests); two scales in one map / several (hierarchical) maps of different resolution.

Calibration

Up to now, models in TerraME have not been calibrated statistically but by expert advice. This expertise could be used together with Monte Carlo simulations to find the sensitive parameters.
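A Monte Carlo search for sensitive parameters might look like the following sketch. It samples each parameter uniformly within an expert-given range, runs the model, and reports the correlation between each parameter and the model output. The toy model stands in for a real TerraME run and is purely hypothetical, as are the parameter names and ranges; correlation is only one of several possible sensitivity measures.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def toy_model(params):
    """Hypothetical stand-in for a model run that returns one summary value."""
    return 5.0 * params["a"] + 0.1 * params["b"]


def monte_carlo_sensitivity(model, ranges, n_runs=200, seed=42):
    """Correlate each uniformly sampled parameter with the model output."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    samples = [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_runs)
    ]
    outputs = [model(p) for p in samples]
    return {
        name: pearson([p[name] for p in samples], outputs)
        for name in ranges
    }


# Ranges as an expert might specify them (assumed values).
sensitivity = monte_carlo_sensitivity(toy_model, {"a": (0.0, 1.0), "b": (0.0, 1.0)})
```

In the toy model the output reacts strongly to `a` and hardly to `b`, so `a` would be the parameter to calibrate first.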

To complete the design of a model, calibration and final validation are needed. This requires a split of the existing data. The spatio-temporal structure of data and model suggests different designs. They will depend on the stationarity of the process in space and / or time.

Thus, choose the minimum reasonable area / time for calibration.

Validation problem: data that were used in calibration should not be used in validation.

~~random split in space is hard / split in time also has a bias; (more complex calibration designs; ) calibration over different time lags calibration: find the important parameters, sensitivity~~

Calibrating agent-based models

Top-down models are easier to calibrate than bottom-up models as in the former demand (amount of change) and allocation can be separated.

Top-down approaches for LUCC modelling separate the demand from the allocation. Therefore, we have to adjust the quantity and location errors separately. In the case of agent-based models, each agent decides its own demand and its location independently, making it much more difficult to calibrate the model.

Top-down: demand known, location to be found. Agent-based: neither location nor demand known.

Jantz and Goetz (2005)

The geometric properties of the maps (of change) are manifold and should be validated by combining several tests. Manson (2000) applied this approach to agent-based re-forestation models.

Validation

Traditionally, one set of data is used to calibrate a model and another set to validate it. In the case of LUCC models we can have two configurations:

Validate the model with a future time not used in the calibration procedure. If the new time is predicted well, the model can be used for building scenarios of the future, such as “business-as-usual” or increasing demand.
Calibrate the model with a given data set and then run it with the same parameters on a data set from another region. The only question this procedure can answer is whether the two populations behave similarly in these areas, by checking whether there are significant differences.

Implementation

TerraME does not yet include tools for model calibration or goodness-of-fit tests. The latter are required before calibration and validation can be addressed. The first step is to implement the methods of Costanza and Pontius within the TerraME framework. Some LUCC models developed by INPE researchers can be used for testing the goodness-of-fit tests. One example is the adaptation of CLUE, a multi-scale model, to a region in the Brazilian Amazonia (Moreira et al. 2009). Another option is the model proposed by Almeida et al. (2005), which is a cellular automata model of city growth in São Paulo state. Both models have been developed by INPE researchers, although only the first has already been implemented in the TerraME framework.

The results of goodness-of-fit tests may be simple numbers but are often curves, maps, or probability distributions. These results must be communicated, e.g. by visualization, in a way that supports the usability of the models. The most important properties should be easy to access and to interpret. On the other hand, experts should be able to improve the model by thoroughly investigating the errors.

Possible extensions

The thesis could be expanded in several directions.

The first goal of change models is usually prediction of the future, but to understand the process itself it may be useful to explore the other direction in time, predicting the past or the intermediate states. This might require different calibration methods.
The validation methods suggested above are only goodness-of-fit measures. The validity of the rules used in the models and their fitness for certain purposes are not addressed. For a more profound understanding of the processes, this might be an important task.
The problem of separating calibration and validation data raises questions about the stationarity of the processes. Comparison of locally fitted parameters may help to understand the processes, typical trajectories and areas of similar development.
Providing tools for goodness-of-fit or even calibration and validation by a web service will involve more users and help to adjust the tools to their needs.

Mobility Measures

The idea is to have a sandwich PhD. If the person comes from Brazil, he/she goes to Germany, and if the person comes from Germany, he/she goes to Brazil for a period of more or less one year.

References

ALMEIDA, C. M.; MONTEIRO, A. M. V.; CAMARA, G.; SOARES-FILHO, B. S.; CERQUEIRA, G. C.; PENNACHIN, C. L.; BATTY, M. GIS and remote sensing as tools for the simulation of urban land-use change. International Journal of Remote Sensing. Vol. 26, No. 4, 20 February 2005, 759–774

BOX, G. E. P. Statistics as a Catalyst to Learning by Scientific Methods Part II - A Discussion. XLII Annual Fall Technical Conference of the Chemical and Process Industries Division and Statistics Division of the American Society for Quality and the Section on Physical & Engineering Sciences of the American Statistical Association. 1998.

COSTANZA, R. Model Goodness of Fit: A Multiple Resolution Procedure. Ecological Modelling 47: 199-215. 1989.

JANTZ, C. A.; GOETZ, S. J. Analysis of scale dependencies in an urban land-use-change model. International Journal of Geographical Information Science. Vol. 19, No. 2, February 2005, 217–241

MANSON, S. M. Agent-based dynamic spatial simulation of land-use/cover change in the Yucatán peninsula, Mexico. 4th International Conference on Integrating GIS and Environmental Modeling (GIS/EM4): Problems, Prospects and Research Needs. Banff, Alberta, Canada, September 2 - 8, 2000.

MOREIRA, E.; COSTA, S.; AGUIAR, A. P.; CAMARA, G.; CARNEIRO, T. Dynamic coupling of multiscale land change models: interactions and feedbacks across regional and local deforestation models in the Brazilian Amazonia. Ecological Modelling (submitted). 2009

PONTIUS, R. G. Statistical Methods to Partition Effects of Quantity and Location During Comparison of Categorical Maps at Multiple Resolutions. Photogrammetric Engineering & Remote Sensing Vol. 68, No. 10, October 2002, pp. 1041–1049
