Differences

This shows you the differences between two versions of the page.

--- group_modelling:goodnessoffit [2009/03/19 12:22]
pedro
+++ group_modelling:goodnessoffit [2009/03/27 11:50]
inpeifgi
@@ Line 1: / Line 1: @@
-====== Tools for Assessment of Multiple Scale Land Change Models ======
+ Tools for Assessment of Multiple Scale Land Change Models
-Authors: Kristina Helle, Pedro Andrade, and Edzer Pebesma
+Kristina Helle, Pedro Andrade, and Edzer Pebesma
-=====Introduction====
+Introduction
-The complex relations between biophysical and anthropological factors generate the land change patterns of our environment. In order to study this complex phenomena, we have to rely on simulation models, for example cellular automata or agent-based models.
-LUCC simulation models usually generate a new map given a real world map of land cover classes.
+The complex relations between biophysical and anthropological factors generate the land change patterns of our environment. To study these complex phenomena, we have to rely on simulation models, for example cellular automata or agent-based models. LUCC simulation models usually generate a new map given a real-world map of land cover classes. In the figure below, the left map shows the real data and the right one the simulated results.
-In the figure below, the left map shows the real data and the right one the simulated results.
-{{  encontros_e_eventos:simulation-real-world-pontius.jpg  }}
 Figure 1: Example of real world data (left) and a simulation (right). Source: Pontius (2002)
-\\
 Considering that "all models are wrong, but some are useful" (Box 1999), model assessment should address the most feasible requirements, such as testing how well the model fits the data and whether it is useful for certain purposes.
@@ Line 16: / Line 13: @@
 Goodness-of-fit tests compare the predicted map with the reality at the new time.
 Several authors have proposed metrics that inform the scientist about the quality of the results.
 With these metrics, it is possible to calibrate or to validate the model.
 In fact, the final objective of these goodness-of-fit methods is to point out how to improve the model.
@@ Line 27: / Line 24: @@
 \\
+Pontius (2002) realized that we only need to take into account the cells that have changed, instead of comparing the whole maps. His more flexible approach makes it possible to separate errors of quantity and of location explicitly and to use fuzzy classification. Others, like Jantz and Goetz (2005), also addressed geometric properties of the land use patterns, such as the number and shape of clusters and the length of edges. Some of these metrics are already implemented in TerraME but have not been used for testing.
-Pontius (2002) realized that we only need to take into account the cells that have changed, instead of comparing the whole maps. His more flexible approach allows to explicitly separate errors of quantity and of location and to use fuzzy classification.
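The separation of quantity and location errors can be illustrated with a minimal sketch for two-class maps. This is a simplified illustration and not the full procedure of Pontius (2002), which also covers multiple categories, multiple resolutions, and fuzzy membership. The function name `disagreement_components` and the example arrays are assumptions made for this example.

```python
import numpy as np

def disagreement_components(reference, simulated):
    """Split the disagreement between two binary maps into a quantity part
    and a location part (simplified two-class illustration)."""
    reference = np.asarray(reference, dtype=bool)
    simulated = np.asarray(simulated, dtype=bool)
    if reference.shape != simulated.shape:
        raise ValueError("maps must have the same shape")
    # Fraction of cells whose class differs between the two maps.
    total = float(np.mean(reference != simulated))
    # Difference in the overall proportion of the changed class.
    quantity = float(abs(reference.mean() - simulated.mean()))
    # Remaining disagreement is due to cells placed in the wrong location.
    location = total - quantity
    return {"total": total, "quantity": quantity, "location": location}

# Hypothetical maps: same amount of change, partly in the wrong place.
reference = np.array([[1, 1, 0, 0],
                      [1, 0, 0, 0]])
simulated = np.array([[0, 1, 1, 0],
                      [1, 0, 0, 0]])
result = disagreement_components(reference, simulated)
print(result)
```

In this example both maps contain the same number of changed cells, so the whole disagreement is attributed to location and none to quantity.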
+Scale (here in terms of resolution and extent) is an important property of LUCC models. Li (2000) investigated the fractal properties that are typical of many land use (change) patterns. Jantz and Goetz (2005) compared different goodness-of-fit measures at several resolutions for an urban growth model, as land use changes may show varying behaviour at different scales. Extent can also change models considerably, as Kok and Veldkamp (2001) showed for national vs. multinational models.
-Land use changes may show varying behaviour on different scales. Therefore it can be useful to find the best resolution for a given purpose.
+Calibration of cellular automata or agent-based models is not a trivial task, as the influence of the parameters is in most cases non-linear and the number of parameters is often high, making a comprehensive evaluation of all combinations infeasible. Simple approaches like that of Clarke et al. (1998) generate many simulations to be evaluated by the user; they consider interactive visualization an important tool. Multi-resolution search of the parameter space as described in Candau (2002) may help to detect important parameter combinations and subsequently to adjust them with feasible computational effort. Still, users may not find the most influential parameter combinations. This task was addressed by Miller (1998), who used several robust optimization algorithms to investigate the parameter space. Wu (2002) uses the data to fit a prior distribution to the parameters and updates it according to the results of Monte Carlo simulations for calibrating a cellular automata model.
-Goetz and Jantz (2005) apply several measures for absolute error rate, number and shape of clusters, amount of edges, and exact position to a urban growth model.
+The diversity of LUCC models may require different calibration and validation methods. An overview of current multi-agent models is given by Parker et al. (2003).
 =====Topics of the proposed Thesis and Questions to be answered in each work package=====
-The following open questions can be investigated by a PhD and a Master theses (Supervisors: Edzer Pebesma, Gilberto Câmara - not confirmed).
+The following open questions can be investigated by a PhD thesis and a Master's thesis (Supervisors: Prof. Dr. Edzer Pebesma, Prof. Dr. Gilberto Câmara - not confirmed).
 ====Goodness of fit tests====
-“Qualitatively, these goodness-of-fit measures range from completely non-spatial (the amount of change) to non-spatial metrics (edges and clusters), to explicitly spatial.” (Jantz and Goetz, IJGIS 2005)
+“Qualitatively, these goodness-of-fit measures range from completely non-spatial (the amount of change) to non-spatial metrics (edges and clusters), to explicitly spatial” (Jantz & Goetz 2005). A first task is to compare those tests and to investigate which of them best fits the needs of agent-based models of deforestation in Amazonia. The best test may vary depending on the aim of modelling: a good fit of the averaged deforestation rate needs only a global goodness-of-fit test, typical patterns require a comparison of geometric properties, and predictions of the actual areas affected additionally require a comparison at pixel level.
-which kind of models are we going to address?
-To validate and calibrate models of LUCC the most basic tools are goodness-of-fit tests for the prediction or the change.There exist several methods which can take into account spatial properties of the maps to be compared (Costanza 1989, Pontius 2002, Manson 2000, Jantz & Goertz 2005).
-TerraME lacks procedures to calculate goodness of fit. The idea of this group is twofold. First, to implement goodness of fit tests available in the literature in TerraME. Second, to propose a model which extends these models to consider that there are errors on the classification.
-A first task is to compare those tests and to investigate which of them fits best the needs of agent-based models of deforestation in Amazonia. The best test may vary depending on the aim of modelling: a good fit on the averaged deforestation rate needs only a global goodness-of-fit test whereas typical patterns need comparing geometric properties and prediction about the actual areas affected besides need comparison on pixel level.
-The test results depend strongly on the resolution of the model (Jantz & Goetz 2005) therefore a multi-scale model (hierarchically nested or with resolution not constant in space) could be tested in many different ways. The ideas of multi resolution tests from Costanza (1989) may provide a possibility to aggregate test results on
-different resolutions. Multiple-scale models developed at IMPE already **(citation?)
-** will be the usecases and will locate the resolutions to be used. The multi-scaled //reality// which is needed to compare could be provided by aggregation from data at the finest resolution.
-Aggregation errors are implicitly addressed by tests on multiple scales but other errors in the //reality maps// are not. Therefore tests which can take into account their varying correctness should be developed. Pontius (2002) approach using fuzzy classification may be modified to solve this task.
 ===Space time===
-The models in the literature compare two static models to generate a metric. But how about the case where have more than two time steps?
+The methods in the literature compare two static maps for goodness-of-fit testing. They address neither the errors across more than two time steps nor the propagation of errors over time.
-How does the error propagate in time?
-<del>Goodness of fit tests for change (focus on change, not on states at certain times)
-e.g. compare data(t1) ~data(t2); data(t1)~simulation(t2) => 2 matrices...; goodness of fit test for the change matrices
-[not about callibration; not about calibration errors]</del>
 ===Errors===
-the model may also have uncertainty in the results, due to random procedures. So, how to match this uncertaity with the uncertainty of the classification of the real world data?
+The methods in the literature do not take into account the intrinsic classification errors in the data at each time step. Errors can also emerge from uncertainty in the results due to random procedures. How can this uncertainty be separated from the errors of the model itself? The approach of Pontius (2002) using fuzzy classification could be a first attempt at this task.
-<del>aggregation error may be ignored because it is unbiased
-misclassification error: hope that it averages out by aggregation</del>
-<del>uncertainty in the models used (cellular automata, agent based modelling)
-still missing: errors in basic maps</del>
-The models of the literature do not take into account the intrinsic errors of classification in the real world data in each of the times.
 ===Multi-scale===
-A current challenge in LUCC is to develop multi-scale models. Human behaviour can only be captured at different levels. So we need procedures to goodness-of-fit at different scales. A goodness-of-fit metric of a hierarchical multi-scale model could be a simple extension of the multi-resolution procedures of the literature. But note that multi-scale models are not constrained to hierarchical models. For example, Moreira et al. (2009) propose a partially hierarchical model, where the scale below is a finer grid of only a sub-area of the upper scale. How to create goodness-of-fit metrics to this kind of model? How about error propagation between scales?
+A current challenge in LUCC is to develop multi-scale models. Human behaviour can only be captured at different levels. Jantz and Goetz (2005) compared goodness-of-fit tests at different resolutions, but they did not address multi-scale models like the partially hierarchical model of Moreira et al. (2009), where the scale below is a finer grid of only a sub-area of the upper scale.
-Our aim is to use this methods to derive and validate multi-scale models with hierarchical structure or varying spatial resolution.
+{{  encontros_e_eventos:multi-scale-model.jpg?500  }}
+Figure 3: Multi-scale model. Source: Moreira et al. (2009)
-<del>Goodness of fit tests (multiscale): how to weight fair (as models on fine scale are usually worse for certain tests of goodness of fit); two scales in one map / several (hierarchical) maps of different resolution.</del>
+The test results depend strongly on the resolution of the model (Jantz & Goetz 2005); therefore a multi-scale model (hierarchically nested or with a resolution that is not constant in space) could be tested in many different ways. The ideas of multi-resolution tests from Costanza (1989) may provide a possibility to aggregate test results at different resolutions.
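Aggregating test results over several resolutions can be sketched as follows. The sketch assumes the sliding-window formulation commonly attributed to Costanza (1989): for each window size `w`, the fit of a window is one minus the summed class-count differences divided by `2 * w * w`, and the per-resolution fits are combined with weights `exp(-k * (w - 1))`. The function names, the weighting constant `k = 0.1`, and the example maps are assumptions for illustration, not part of TerraME.

```python
import numpy as np

def window_fit(map_a, map_b, w, classes):
    """Mean fit over all w-by-w windows slid across both maps."""
    rows, cols = map_a.shape
    fits = []
    for r in range(rows - w + 1):
        for c in range(cols - w + 1):
            win_a = map_a[r:r + w, c:c + w]
            win_b = map_b[r:r + w, c:c + w]
            # Sum of absolute differences in the cell counts of each class.
            diff = sum(abs(int(np.sum(win_a == label)) - int(np.sum(win_b == label)))
                       for label in classes)
            fits.append(1.0 - diff / (2.0 * w * w))
    return float(np.mean(fits))

def multi_resolution_fit(map_a, map_b, window_sizes, k=0.1):
    """Weighted average of the window fits; k controls how quickly the
    weight of coarser windows decays (k is a user choice)."""
    map_a = np.asarray(map_a)
    map_b = np.asarray(map_b)
    classes = np.union1d(map_a, map_b)
    fits = np.array([window_fit(map_a, map_b, w, classes) for w in window_sizes])
    weights = np.exp(-k * (np.asarray(window_sizes) - 1))
    overall = float(np.sum(fits * weights) / np.sum(weights))
    return overall, fits

# Hypothetical maps: the simulated block is shifted by one column.
reference = np.array([[1, 1, 0, 0],
                      [1, 1, 0, 0],
                      [0, 0, 0, 0],
                      [0, 0, 0, 0]])
simulated = np.array([[0, 1, 1, 0],
                      [0, 1, 1, 0],
                      [0, 0, 0, 0],
                      [0, 0, 0, 0]])
overall, fits = multi_resolution_fit(reference, simulated, window_sizes=[1, 2, 4])
print(overall, fits)
```

A window covering the whole map only compares class totals, so a pure location error no longer lowers the fit at that resolution, while the cell-by-cell comparison (`w = 1`) penalizes it fully.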
+====Calibration and Validation====
+To complete the design of a model, calibration and a final validation are needed. Both procedures require goodness-of-fit tests. Calibration should improve the most sensitive and important parameters. Validation then tests whether the calibrated model overfits the data and to what extent it is valid.
+Up to now, models in TerraME are not calibrated statistically but by expert advice. This expertise could be used together with Monte Carlo simulations to find the sensitive parameters.
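A Monte Carlo search for sensitive parameters could look like the following minimal sketch. It samples parameter sets uniformly within expert-given bounds, scores each set, and ranks the parameters by the absolute linear correlation between their sampled values and the score. The function `toy_score`, the parameter names `growth` and `noise_level`, and the bounds are hypothetical stand-ins for running a real model and applying a goodness-of-fit test. Linear correlation is only a first indicator and will miss non-linear effects and interactions between parameters.

```python
import numpy as np

def monte_carlo_sensitivity(score_fn, bounds, n_samples=500, seed=0):
    """Rank parameters by |correlation| between sampled value and score."""
    rng = np.random.default_rng(seed)
    names = list(bounds)
    low = np.array([bounds[name][0] for name in names], dtype=float)
    high = np.array([bounds[name][1] for name in names], dtype=float)
    # One row per Monte Carlo run, one column per parameter.
    samples = rng.uniform(low, high, size=(n_samples, len(names)))
    scores = np.array([score_fn(dict(zip(names, row))) for row in samples])
    return {name: abs(float(np.corrcoef(samples[:, j], scores)[0, 1]))
            for j, name in enumerate(names)}

def toy_score(params):
    # Hypothetical stand-in for "run the model, then compute goodness of fit".
    return -2.0 * params["growth"] + 0.01 * params["noise_level"]

# Bounds as they might be suggested by expert advice (assumed values).
bounds = {"growth": (0.0, 1.0), "noise_level": (0.0, 1.0)}
sensitivity = monte_carlo_sensitivity(toy_score, bounds)
ranking = sorted(sensitivity, key=sensitivity.get, reverse=True)
print(ranking, sensitivity)
```

The expert knowledge enters through the bounds, which restrict the sampling to plausible parameter ranges and keep the number of required simulations manageable.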
-====Calibration====
-Up to now, models in TerraME are not callibrated statistically but by expert advice. This expertise could be used together with Monte Carlo simulations to find the sensitive parameters.
-To complete the design of a model calibration and final validation is needed. This requires a split of the existing data. The spatio-temporal structure of data and model suggests different designs. They will depend on stationarity of the process in space and / or time.
-This choose minimum reasonable area / time for calibration
-validation problem (should not use data which were used in calibration);
-<del>random split in space is hard / split in time also has a bias; (more complex calibration designs; )
-calibration over different time lags
-calibration: find the important parameters, sensitivity</del>
-===Calibrating agent-based models===
 Top-down models are easier to calibrate than bottom-up models as in the former demand (amount of change) and allocation can be separated.
+Top-down approaches to LUCC modelling separate demand from allocation. Therefore, the quantity and location errors have to be adjusted separately. In agent-based models, each agent decides its own demand and its location independently, which makes the model much more difficult to calibrate. Jantz and Goetz (2005) and Manson (2000) have studied the calibration of bottom-up approaches.
-Top-down approaches for LUCC modelling have a demand separated to the allocation. Therefore, we have to adjust the quantity and location errors separately. In the case of agent-based models, each agent decide its own demand and its location independently, making much more difficult to calibrate the model.
-demand known, location to be found
-agent based: neither location nor demand known
-Jantz and Goetz (2005)
-The geometric properties of the maps (of change) are manifold and should be validated by combining several tests. Manson (2000) applied to agent-based re-forestation models.
-====Validation====
 Traditionally, models use a set of data to calibrate the model and another set to validate the method. In the case of LUCC models
@@ Line 121: / Line 71: @@
 The first step is to implement Costanza's and Pontius's methods within the TerraME framework. There are some LUCC models developed by INPE researchers that can be used for testing the goodness-of-fit methods. One example is the adaptation of CLUE, a multi-scale model, to a region in the Brazilian Amazonia (Moreira et al. 2009). Another option is the model proposed by Almeida et al. (2005), which is a cellular automata model of city growth in Sao Paulo state. Both models have been developed by INPE researchers, although only the first has already been implemented in the TerraME framework.
 The results of goodness of fit tests may be simple numbers but often are curves or maps and probability distributions.
 These results must be communicated e.g. by visualization in a way that supports the usability of the models. The most important properties should have an easy interpretation and access. On the other hand, experts should be able to improve the model by thoroughly investigating the errors.
@@ Line 130: / Line 80: @@
   - The problem of calibration and validation raises questions about the stationarity of the processes. Comparison of locally fitted parameters may help to understand the processes, typical trajectories and areas of similar development.
   - Providing tools for goodness-of-fit or even calibration and validation by a web service will involve more users and help to adjust the tools to their needs.
 =====Mobility Measures=====
-The idea is to have a sandwich PhD. If the person comes from Brazil, he/she goes to Germany, and if the person comes from Germany, he/she goes to Brazil for a period of more or less one year.
+The topic lies at the overlap of the research at INPE (agent-based and cellular automata models of LUCC) and IFGI (statistics, calibration / validation of models). Therefore the theses should be organized as sandwich programmes (exchange: PhD 6-12 months, MSc 2-3 months), starting either at INPE or IFGI.
 ===== References =====
@@ Line 138: / Line 96: @@
 BOX, G. E. P. {{http://ecow.engr.wisc.edu/cgi-bin/get/ie/691/barrios/papers/box-1999.pdf|Statistics as a Catalyst to Learning by Scientific Methods Part II - A Discussion}}. XLII Annual Fall Technical Conference of the Chemical and Process Industries Division and Statistics Division of the American Society for Quality and the Section on Physical & Engineering Sciences of the American Statistical Association. 1998.
+CANDAU, J., 2002, Temporal calibration sensitivity of the SLEUTH urban growth model.
+Masters thesis, Department of Geography, University of California.
+CLARKE, K.; HOPPEN, S. & GAYDOS, L. (1998): {{http://www.ncgia.ucsb.edu/conf/SANTA_FE_CD-ROM/sf_papers/clarke_keith/clarkeetal.html|Methods And Techniques for Rigorous Calibration of a Cellular Automaton Model of Urban Growth}}. (accessed 25.03.09)
 COSTANZA, R. {{encontros_e_eventos:inpeifgi2009:costanza_em_1989.pdf|Model Goodness of Fit: A Multiple Resolution Procedure}}. Ecological Modelling 47: 199-215. 1989.
-JANTZ, C. A.; GOETZ, S. J. {{group_modelling:analysis_of_scale_dependencies_in_an_urban_land-use-change_model.pdf|Analysis of scale dependencies in an urban land-use-change model}}. International Journal of Geographical Information Science. Vol. 19, No. 2, February 2005, 217–241
+JANTZ, C. A.; GOETZ, S. J. {{group_modelling:analysis_of_scale_dependencies_in_an_urban_land-use-change_model.pdf|Analysis of scale dependencies in an urban land-use-change model}}. International Journal of Geographical Information Science. Vol. 19, No. 2, February 2005, 217–241.
+KOK, K. & VELDKAMP, A. Evaluating impact of spatial scales on land use pattern
+analysis in Central America. Agriculture, Ecosystems and Environment 85 (2001) 205–221.
+LI, B.-L. 2000. {{group_modelling:fractal_geometry_applications_in_description_and_analysis_of_patch_patterns_and_patch_dynamics.pdf|Fractal geometry applications in description and analysis of patch patterns and patch dynamics}}. Ecological Modelling 132 (1/2): 33–50.
 MANSON, S. M. {{encontros_e_eventos:inpeifgi2009:agent-based_dynamic_spatial_simulation_of_land-use_cover_change.pdf|Agent-based dynamic spatial simulation of land-use/cover change in the Yucatán peninsula, Mexico}}. 4th International Conference on Integrating GIS and Environmental Modeling (GIS/EM4):
 Problems, Prospects and Research Needs. Banff, Alberta, Canada, September 2 - 8, 2000.
+MILLER, J. H. 1998. {{group_modelling:active_nonlinear_tests_ants_of_complex_simulation_models.pdf|Active nonlinear tests (ANTs) of complex
+simulation models}}. Management Science 44 (6): 820–30.
 E. MOREIRA, S. COSTA, A. P. AGUIAR, G. CAMARA, T. CARNEIRO Dynamic coupling of multiscale land change models: interactions and feedbacks across regional and local deforestation models in the Brazilian Amazonia, Ecological Modelling (//submitted//). 2009
+PARKER, D. C.; MANSON, S. M.; JANSSEN, M. A.; HOFFMANN, M. J. & DEADMAN, P. 2003. {{group_modelling:multi-agent_systems_for_the_simulation_of_land-use_and_land-cover_change_a_review.pdf|Multi-Agent Systems for the Simulation of Land-Use
+and Land-Cover Change: A Review}}. Annals of the Association of American Geographers 93 (2): 314–337.
 PONTIUS, R. G. {{encontros_e_eventos:inpeifgi2009:pontius_2002_pers.pdf|Statistical Methods to Partition Effects of Quantity and Location During Comparison of Categorical Maps at Multiple Resolutions}}. Photogrammetric Engineering & Remote Sensing
 Vol. 68, No. 10, October 2002, pp. 1041–1049
+WU, F., 2002, Calibration of stochastic cellular automata: the application to rural-urban land
+conversions. International Journal of Geographical Information Science, 16,
+pp. 795–818.
+==not directly relevant==
+GILES, R. H., Jr. & TRANI, M. K. 1999. {{http://www.springerlink.com/content/je6g2v0khpl93gpc/fulltext.pdf|Key elements of landscape pattern measures}}. Environmental Management 23 (4): 477–81.
