ŠUMARSKI LIST 3-4/2017, p. 27

Boosting algorithm – Algoritam jačanja klasifikatora
To build a single collective (ensemble) classifier that assigns sample trees and/or sample plots to site quality classes from topographic features (altitude and slope), age and canopy density, we applied the boosting algorithm (Freund and Schapire 1997, Drucker 1997). The algorithm was run in the statistical package SPSS v. 21.0 (IBM 2012).
The boosting method can be used with any type of model and can reduce both the variance and the bias of the predictions, i.e. increase the accuracy of the model. Boosting produces a sequence of components (base models), each of which is built on the entire dataset. Before each successive component is built, the records (cases) are weighted according to the errors (residuals) of the previous component. Cases with large residuals receive relatively higher weights, so the next component focuses on predicting these cases better. Together, the component models form a single ensemble model. The ensemble model provides values for new records using a combination rule that depends on the measurement level of the target (dependent) variable, i.e. continuous or categorical.
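The reweighting idea described above can be illustrated with a minimal sketch. It assumes one well-known boosting scheme (AdaBoost with one-feature threshold rules and a two-class target coded as +1/-1), which is not necessarily the variant implemented in SPSS. The function names and the toy data are hypothetical and serve only as illustration.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best threshold rule."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= t else -sign for row in X]
                # Weighted error: total weight of the misclassified cases.
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best

def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign

def adaboost(X, y, n_components=3):
    """Build a sequence of components, each fitted on the entire (reweighted) dataset."""
    n = len(X)
    w = [1.0 / n] * n                      # start with equal case weights
    ensemble = []
    for _ in range(n_components):
        stump = fit_stump(X, y, w)
        err = max(stump[0], 1e-10)         # avoid division by zero
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)   # weight of this component
        ensemble.append((alpha, stump))
        # Misclassified cases get larger weights, correct ones smaller.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    """Combination rule for a categorical target: weighted vote of the components."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: one predictor, two classes; no single threshold separates them.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, n_components=3)
print([predict(ensemble, row) for row in X])   # [1, 1, -1, -1, 1, 1]
```

On this toy dataset a single threshold rule misclassifies two of the six cases, while the weighted vote of three components classifies all of them correctly, because the second and third components concentrate on the cases the earlier ones got wrong.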
Boosting model measures – Mjere modela jačanja klasifikatora
Accuracy is calculated for the naïve model, for the reference model (i.e. the simple linear model built without boosting or bagging), for the ensemble model and for the base models.
For categorical target variables, the accuracy is the frequency-weighted proportion of correctly classified records (IBM 2012):

$$\text{Accuracy} = \frac{1}{N} \sum_{k=1}^{K} f_k \, I\left(y_k = \hat{y}_k\right)$$

where:
N             Total number of records.
K             Number of records (cases) in the training dataset.
I(π)          Indicator function: for any condition π, I(π) is 1 if π holds, and 0 otherwise.
f_k           Frequency of the k-th record.
y_k           Target value of the k-th record.
ŷ_k^m = T^m(X_k)   Predicted target value of the k-th record from the model of the m-th bootstrap sample.
T^m           Model for the m-th bootstrap sample.
X_k           Predictor values for the k-th record.
For the naïve model, the predicted value ŷ_k is the mode (the most frequent category) of the categorical target variable (IBM 2012).
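The accuracy measure for a categorical target can be sketched in a few lines. This is a minimal illustration, assuming that accuracy is the frequency-weighted share of records whose predicted class equals the observed class and that N is the sum of the record frequencies. The function names and the site quality labels are hypothetical.

```python
from collections import Counter

def weighted_accuracy(y_true, y_pred, freq):
    """Frequency-weighted share of records with y_k equal to the predicted value."""
    n = sum(freq)  # assumption: N is the sum of the frequencies f_k
    correct = sum(f for yt, yp, f in zip(y_true, y_pred, freq) if yt == yp)
    return correct / n

def naive_prediction(y_true, freq):
    """Naive model: always predict the mode (most frequent category) of the target."""
    counts = Counter()
    for yt, f in zip(y_true, freq):
        counts[yt] += f
    return counts.most_common(1)[0][0]

# Hypothetical example: four distinct records with frequencies 1, 3, 1 and 5.
y_true = ["I", "II", "II", "III"]
y_pred = ["I", "II", "III", "III"]
freq = [1, 3, 1, 5]

print(weighted_accuracy(y_true, y_pred, freq))              # 0.9
mode = naive_prediction(y_true, freq)
print(mode)                                                 # III
print(weighted_accuracy(y_true, [mode] * len(y_true), freq))  # 0.5
```

Comparing the accuracy of a fitted model with that of the naive model shows how much the predictors add beyond always predicting the most frequent class.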