

Read Papers

Multi-Temporal Analysis

Modeling Cyclic Change

Advances in Conceptual Modeling, 2010.

Hornsby, K., Egenhofer, M.J., Hayes, P.

The most general model of time in a temporal logic represents time as an arbitrary, partially ordered set [11, 12]. The addition of axioms results in more refined models of time [11]. In the linear model, an axiom imposes a total order on time, resulting in the linear advancement of time from the past, through the present, and into the future. The branching model, also known as the “possible futures” model, describes time as linear from the past to the present, where it then divides into several time-lines, each representing a potential sequence of events. Few of these models, however, explicitly treat cycles. Although current information systems are useful for producing a snapshot of a phenomenon at any one time, cyclically varying phenomena require new solutions. Temporal data models are commonly based on the primitive elements of either time points or time intervals. Time points typically describe a precise time when an event occurred, and a linear model based on time points assumes a set of time points that are totally ordered [12]. When precise information on time is unavailable, time intervals become useful constructs. Reasoning about temporal intervals addresses the fact that much of our temporal knowledge is relative, so methods are needed that allow for significant imprecision in reasoning. Neither the linear nor the branching model of time captures the fact that certain events or phenomena may be recurring.

Fast subsequence matching in time-series databases

ACM SIGMOD, 1994.

Faloutsos, C., Ranganathan, M., Manolopoulos, Y.

The F-index works as follows. Given N sequences, all of the same length n, the n-point Discrete Fourier Transform (DFT) is applied to each sequence and only the first few (2-3) coefficients are kept as the features, mapping each sequence to a point in an f-dimensional feature space; these points are organized in the F-index. To answer a query, (a) the n-point DFT is applied to the query sequence Q, keeping the same f features and thus mapping Q into an f-dimensional point qf in feature space; (b) the F-index is used to retrieve all the points within distance ε of qf.
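A minimal sketch of the feature-extraction step described above, not the authors' implementation: each sequence is mapped to its first few DFT coefficients and candidates are those whose feature-space distance to the query is within ε. The spatial index used by the F-index is replaced here by a brute-force scan, and the function names and parameters (f, eps) are only illustrative.

<code python>
import numpy as np

def dft_features(seq, f=3):
    """Map a length-n sequence to its first f DFT coefficients (real and imaginary parts)."""
    coeffs = np.fft.fft(seq) / np.sqrt(len(seq))   # orthonormal scaling preserves Euclidean distance
    kept = coeffs[:f]
    return np.concatenate([kept.real, kept.imag])  # a 2f-dimensional feature point

def candidate_matches(sequences, query, f=3, eps=0.5):
    """Indices of sequences whose feature-space distance to the query is <= eps.

    Dropping coefficients can only underestimate the true distance, so no
    qualifying sequence is missed; false alarms would be filtered afterwards
    against the original sequences.
    """
    qf = dft_features(query, f)
    feats = np.array([dft_features(s, f) for s in sequences])
    dists = np.linalg.norm(feats - qf, axis=1)
    return np.where(dists <= eps)[0]

# usage with made-up data: 100 sequences of length 64
data = np.random.randn(100, 64)
hits = candidate_matches(data, data[0], f=3, eps=0.5)
</code>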

Change Detection

Mining Image Time-Series

IGARSS, 2004.

Heas, P., Datcu, M., Giros, A.

This paper introduces the concept of two different representations for image time series:

  1. “The spatio-temporal representation is simply the time-series of images.”
  2. “The multitemporal feature space representation is a multidimensional space composed by the union of all the time localized feature components and for which the spatial index is hidden.”
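As a concrete illustration of these two representations (not taken from the paper; the array shapes are hypothetical), the sketch below builds the spatio-temporal cube and then flattens the spatial index away to obtain the multitemporal feature space:

<code python>
import numpy as np

# hypothetical image time series: T snapshots, H x W pixels, B feature bands per pixel
T, H, W, B = 8, 64, 64, 4
sits = np.random.rand(T, H, W, B)          # spatio-temporal representation: the time series of images

# multitemporal feature space: one point per pixel, formed by the union of all
# time-localized feature components; the spatial index (row, col) is hidden
multitemporal = sits.transpose(1, 2, 0, 3).reshape(H * W, T * B)
print(multitemporal.shape)                  # (4096, 32): 4096 pixels in a 32-dimensional feature space
</code>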

The proposal is to search for similar trajectories in the time series. The proposed method is based on mutual information, which quantifies the amount of information that one random variable conveys about another. According to the paper, “mutual information evolutions are significant of the degree of change of the multitemporal cluster during time. Set of consecutive nodes can be grouped by similarity according to this measurement.” The detection of similarities in the oriented graphs is done by estimating the probability density function (PDF) of the nodes and measuring the entropy of these functions, also taking the irregular time sampling rate into account.
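As a reminder of the quantity the method relies on, here is a minimal histogram-based estimate of mutual information between two variables; the binning, the use of numpy, and the variable names are assumptions for illustration, not the paper's estimator.

<code python>
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X;Y) = sum p(x,y) * log(p(x,y) / (p(x) p(y))), in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)          # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)          # marginal p(y)
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))

# usage with hypothetical feature values of two consecutive nodes
a = np.random.rand(1000)
b = a + 0.1 * np.random.rand(1000)   # strongly dependent -> high mutual information
print(mutual_information(a, b))
</code>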

Land Cover Change Detection: A Case Study

ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008.

Boriah, S., Kumar, V., Steinbach, M., Potter, C., Klooster, S.

This paper proposes a new technique for change detection in satellite image time series. Change detection has already been studied in many research areas, such as statistics, signal processing, and control theory. However, the techniques employed in these areas are not well suited to the huge amounts of data present in Earth science. In this article the authors analyze a vegetation-related variable, “the enhanced vegetation index (EVI) product, measured by MODIS. EVI is a vegetation index which essentially serves as a measure of the amount and 'greenness' of vegetation at a particular location.”

The proposed technique for change detection is called Recursive Merging, and is based on some premises:

  • Given the large coverage of land cover data sets, it is fairly obvious that only a small fraction of points will actually exhibit a change.
  • Natural seasonal growing cycle is a dominant characteristic of a time series and this intrinsic seasonality should not itself be called a change.
  • The main idea is to exploit seasonality in order to distinguish between points that have had a land cover change and those that have not.

The algorithm runs as follows:

  1. The two most similar consecutive annual cycles are merged, and the distance is stored.
  2. Step 1 is applied recursively until only one annual cycle remains.
  3. The change score for this location is based on whether any of the observed distances are extreme, by calculating the ratio between the minimum and the maximum distances.

If the change score is greater than a certain threshold, the point is then classified as a change.
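A hedged sketch of the recursive merging idea as summarized above, not the authors' implementation: the EVI series is split into annual cycles, the two most similar consecutive cycles are repeatedly merged, and the merge distances are recorded. The Euclidean distance, merging by averaging, and taking the score as max/min (so that larger values indicate change, consistent with the thresholding rule above) are all assumptions.

<code python>
import numpy as np

def change_score(evi, period=23):
    """Recursive-merging change score for one location's EVI series.

    `evi` covers whole years; `period` is the number of observations per
    annual cycle (23 for 16-day MODIS composites).
    """
    cycles = [evi[i:i + period].astype(float)
              for i in range(0, len(evi) - period + 1, period)]
    distances = []
    while len(cycles) > 1:
        # find the two most similar consecutive annual cycles
        d = [np.linalg.norm(cycles[i] - cycles[i + 1]) for i in range(len(cycles) - 1)]
        j = int(np.argmin(d))
        distances.append(d[j])
        # merge them (here: element-wise average) and repeat until one cycle remains
        cycles[j:j + 2] = [(cycles[j] + cycles[j + 1]) / 2.0]
    # one unusually costly merge suggests a candidate land cover change
    return max(distances) / max(min(distances), 1e-12)

# usage: flag a change when the score exceeds a chosen threshold
# is_change = change_score(evi_series) > threshold
</code>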

Definition of land cover change, by the authors: “The land cover change detection problem is essentially one of detecting when the land cover at a given location has been converted from one type to another”.

On Extracting Evolutions from Satellite Image Time Series

IGARSS, 2008.

Julea, A., Méger, N., Trouvé, E., Bolon, P.

The authors use Satellite Image Time Series (SITS) to extract pixel-based evolutions and to detect spatio-temporal patterns in the images. The first technique is based on pixel quantization: pixel values are quantized into a set of equally populated, non-overlapping intervals. This reduces the number of distinct values, so an evolution can be described as a sequence of quantized values over time. If more than one image channel is used, a sequence of values is created for each date; the values of the subsequent snapshots are then appended to this sequence, and the final sequence defines the evolution. Then, “pixels having the same evolution at the same dates are then set to the same color”.
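A minimal sketch of this quantization step, assuming quantile boundaries computed with numpy and a single channel; the number of levels and array shapes are made up for illustration.

<code python>
import numpy as np

def quantize(band, n_levels=4):
    """Map pixel values to n_levels equally populated intervals, labelled 0 .. n_levels-1."""
    # interior quantile boundaries, so each interval holds roughly the same number of pixels
    edges = np.quantile(band, np.linspace(0, 1, n_levels + 1)[1:-1])
    return np.digitize(band, edges)

# hypothetical single-channel series: T images of H x W pixels
T, H, W = 5, 100, 100
series = np.random.rand(T, H, W)
labels = np.stack([quantize(img) for img in series])   # shape (T, H, W)

# the evolution of a pixel is its sequence of quantized values over time;
# pixels with identical sequences at the same dates would get the same colour
evolution = labels[:, 42, 17]
</code>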

The quantization reduces the amount of data and therefore the computational time needed to reach results; however, it may produce only rough results. Another drawback of this technique is that evolutions are classified as similar only if they occurred over the same dates, whereas the same event can sometimes occur in different intervals and should still be classified as the same evolution.

Applying case-based reasoning in the evolution of deforestation patterns in the Brazilian Amazonia

ACM Symposium on Applied Computing, 2008.

Mota, J., Câmara, G., Fonseca, L., Escada, M., Bittencourt, O.

Starting from one initial pattern of deforestation, the authors aim to recover the object's history based on a set of similar cases of deforestation progress. For this purpose they use a case-based reasoning system, which stores previous cases and “continually increases allowing adapting cases with similar characteristics that can be useful to solve a new problem”.

The database of cases was generated in the agrarian settlement project called Vale do Anari, in Rondônia. The main spatial patterns are called Linear, Irregular, and Geometric, and the stages of deforestation were defined as Road, Small Lots, and Large Concentration. This typology was then applied to describe the deforestation evolution in another area.

Data Mining

Maximizing Land Cover Classification Accuracies Produced by Decision Trees at Continental to Global Scales

IEEE Transactions on Geoscience and Remote Sensing, 37(2), 1999.

Friedl, M., Brodley, C., Strahler, A.

Boosting algorithms estimate multiple classifications in an iterative fashion using a base classification algorithm (in this case C5.0). At each iteration, a weight is assigned to each training observation; observations that were misclassified in the previous iteration are assigned a heavier weight in the next iteration, forcing the classification algorithm to concentrate on the observations that are more difficult to classify. The authors conclude, first, that boosting is a useful technique and should be used for land cover classification problems with remotely sensed data at continental to global scales. Second, adding features related to vegetation phenology produced little improvement in classification accuracy. Third, geographic position provides substantial predictive power to the decision tree classification algorithms, which is not surprising when classifying fairly coarse classes of vegetation at continental and global scales; however, geographic position should only be used as a secondary input feature to discriminate between land cover classes that are spectrally similar but geographically distinct.
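The paper boosts C5.0 trees; as an illustrative analogue only (not the authors' setup), the sketch below uses scikit-learn's AdaBoost with CART trees, which follows the same reweighting idea: misclassified training observations receive heavier weights at each iteration. The data and parameters are hypothetical.

<code python>
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# hypothetical training data: per-pixel spectral/temporal features and land cover labels
X = np.random.rand(500, 8)
y = np.random.randint(0, 5, size=500)

# AdaBoost reweights misclassified observations at each iteration, forcing the
# base tree to focus on the harder cases (scikit-learn >= 1.2 uses `estimator`)
clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=5),
                         n_estimators=50)
clf.fit(X, y)
labels = clf.predict(np.random.rand(10, 8))
</code>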

Quantification of live aboveground forest biomass dynamics with Landsat time-series and field inventory data: A comparison of empirical modeling approaches

Remote Sensing of Environment, 2010.

Powell, S., Cohen, W., Healey, S., Kennedy, R., Moisen, G., Pierce, K., Ohmann, J.

The authors modeled live aboveground tree biomass over a 20-year time series of Landsat images (northern Arizona and northern Minnesota) and tested three statistical techniques to derive trajectories of biomass evolution.

  1. Reduced Major Axis regression: an orthogonal regression technique that aims to minimize error in both X and Y directions.
  2. Gradient Nearest Neighbor imputation …
  3. Random Forests regression trees: RF is a non-parametric ensemble modeling approach that constructs numerous small regression trees that vote on predictions, and is considered to be robust to over-fitting.

The following variables were employed to model live aboveground tree biomass:

  • Raw Landsat bands (B1–B5, B7, as surface reflectance)
  • Tasseled Cap indices (brightness (TCB), greenness (TCG), and wetness (TCW))
  • NDVI ((B4 − B3) / (B4 + B3))

The results obtained from the three techniques were compared against a validation data set in terms of root mean square error (RMSE), variance ratio (VR), and bias (the difference between observed and predicted values). According to the authors, RF obtained the best results in terms of RMSE; therefore, if minimizing the prediction error is the main objective, RF is suggested. Another conclusion is that “although there is significant modeling error with biomass prediction, the temporal analysis ensures that at least the models are consistent across the time-series, and, therefore, the relative changes are potentially accurate”.
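A hedged sketch of this evaluation step, using scikit-learn's RandomForestRegressor as a stand-in for the paper's RF models; the data are made up, and the definition of the variance ratio as var(predicted)/var(observed) is an assumption.

<code python>
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# hypothetical data: per-plot predictors (Landsat bands, Tasseled Cap, NDVI) and field biomass
X = np.random.rand(300, 10)
biomass = np.random.rand(300) * 200            # Mg/ha, made up for illustration

rf = RandomForestRegressor(n_estimators=500)
rf.fit(X[:200], biomass[:200])
pred, obs = rf.predict(X[200:]), biomass[200:]

rmse = np.sqrt(np.mean((obs - pred) ** 2))     # root mean square error
bias = np.mean(obs - pred)                     # difference between observed and predicted
vr = np.var(pred) / np.var(obs)                # variance ratio (assumed definition)
</code>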

Feature-selection ability of the decision-tree algorithm and the impact of feature-selection/extraction on decision-tree results based on hyperspectral data

International Journal of Remote Sensing, 2008.

Wang, Y., Li, J.

Decision Trees (DT) were tested as a feature selection algorithm using hyperspectral data. The authors regard feature selection as imperative when dealing with massive amounts of data. According to them, “feature selection results of DT are those features that are used to form splitting rules at internal tree nodes”, i.e. the attributes chosen at the splitting nodes are the most relevant ones. The C4.5 algorithm was employed in this article.
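A small sketch of reading off the features used to form splitting rules at internal tree nodes from a fitted tree; scikit-learn's CART is used here in place of C4.5, so treat it as an analogue, and the sample sizes are hypothetical.

<code python>
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# hypothetical hyperspectral samples: 200 pixels x 50 bands, 4 classes
X = np.random.rand(200, 50)
y = np.random.randint(0, 4, size=200)

tree = DecisionTreeClassifier(max_depth=6).fit(X, y)

# internal nodes store the index of the split feature; leaves are marked with -2
split_features = tree.tree_.feature
selected = sorted(set(split_features[split_features >= 0]))
print("bands selected by the tree:", selected)
</code>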

The number of features a DT selects grows with the number of samples (it “tends to select too many features when the sample size is large”). The authors therefore suggest using another algorithm to select features (mainly with small sample sizes) and allowing repeated use of features to form split nodes.

Another suggestion is to use “feature extraction”, which means creating new features from combinations of already existing features. However, “although extracted features have higher discrimination power, their physical meanings were hard to explain, which lowered the interpretability of classification trees”.
