yatsm.algorithms.ccdc module

class yatsm.algorithms.ccdc.CCDCesque(test_indices=None, estimator={'object': Lasso(alpha=20), 'fit': {}}, consecutive=5, threshold=2.56, min_obs=None, min_rmse=None, retrain_time=365.25, screening='RLM', screening_crit=400.0, remove_noise=True, green_band=1, swir1_band=4, dynamic_rmse=False, slope_test=False, idx_slope=1, **kwargs)

Bases: yatsm.algorithms.yatsm.YATSM

Initialize a CCDC-like model for a design matrix X and spectral timeseries Y

An unofficial and unvalidated port of the Continuous Change Detection and Classification (CCDC) algorithm by Zhu and Woodcock, 2014.

Parameters:
  • test_indices – Test for changes with these indices of Y. If not provided, all series in Y will be used as test indices
  • estimator – dictionary containing estimation model from scikit-learn used to fit and predict timeseries and, optionally, a dict of options for the estimation model fit method (default: {'object': Lasso(alpha=20), 'fit': {}})
  • consecutive – Consecutive observations to trigger change
  • threshold – Test statistic threshold for change
  • min_obs – Minimum observations in model
  • min_rmse – Minimum RMSE for models during testing
  • retrain_time – Number of days between model fit updates during monitoring period
  • screening – Style of prescreening of the timeseries for noise. Options are ‘RLM’ or ‘LOWESS’ (default: RLM)
  • screening_crit – critical value for multitemporal noise screening (default: 400.0)
  • remove_noise – Remove observation if change is not detected but first observation is above threshold (if it looks like noise) (default: True)
  • green_band – Index of green band in Y for multitemporal masking (default: 1)
  • swir1_band – Index of first SWIR band in Y for multitemporal masking (default: 4)
  • dynamic_rmse – Vary RMSE as a function of day of year (default: False)
  • slope_test – Use an additional slope test to assess the suitability of the training period. True enables the test and uses the threshold parameter as the test criterion. False turns off the test. A float value enables the test and uses that value as the test criterion instead of threshold. (default: False)
  • idx_slope – if slope_test is enabled, provide index of X containing slope term (default: 1)
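
The three accepted forms of slope_test can be read as a small dispatch on the argument. The following is a minimal sketch of that reading, not the library's own code; the helper name resolve_slope_criterion is hypothetical.

```python
def resolve_slope_criterion(slope_test, threshold):
    """Return the slope test criterion, or None if the test is disabled.

    Hypothetical helper illustrating the documented semantics of
    ``slope_test``; it is not part of YATSM.
    """
    # Handle the boolean cases explicitly: float(True) would otherwise
    # silently evaluate to 1.0 and be mistaken for a criterion.
    if slope_test is True:
        return float(threshold)
    if slope_test is False:
        return None
    # Any other numeric value enables the test and overrides ``threshold``
    return float(slope_test)


print(resolve_slope_criterion(True, 2.56))   # 2.56
print(resolve_slope_criterion(False, 2.56))  # None
print(resolve_slope_criterion(4.0, 2.56))    # 4.0
```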
_get_dynamic_rmse()

Return the dynamic RMSE for each model

Dynamic RMSE refers to the Root Mean Squared Error calculated using the self.min_obs observations closest in day of year to the observation self.consecutive steps into the future. The goal is to reduce false positives during seasonal transitions (high variance in the signal) while decreasing omission errors during stable times of year.

Returns:dynamic RMSE of each tested model
Return type:numpy.ndarray
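
As a rough illustration of the idea, the sketch below computes an RMSE from only the residuals nearest in day of year to a target date. It uses plain Python lists rather than NumPy arrays to stay self-contained. The function name, its arguments, and the circular day-of-year distance are assumptions made for this example and may differ from what the method actually does.

```python
import math


def dynamic_rmse(residuals, doys, target_doy, min_obs, ndays=365.25):
    """RMSE of the ``min_obs`` residuals closest in day of year to ``target_doy``.

    Illustrative only: the circular day-of-year distance is an assumption.
    """
    def doy_distance(doy):
        # Circular distance, so that day 360 and day 5 count as close
        diff = abs(doy - target_doy) % ndays
        return min(diff, ndays - diff)

    # Indices of the min_obs observations nearest in day of year
    order = sorted(range(len(doys)), key=lambda i: doy_distance(doys[i]))
    nearest = order[:min_obs]
    return math.sqrt(sum(residuals[i] ** 2 for i in nearest) / len(nearest))


# Residuals are larger near day 100 than near day 200
residuals = [30.0, 40.0, 5.0, 5.0]
doys = [95, 105, 195, 205]
print(dynamic_rmse(residuals, doys, target_doy=100, min_obs=2))  # about 35.36
print(dynamic_rmse(residuals, doys, target_doy=200, min_obs=2))  # 5.0
```

A test statistic scaled by the first value tolerates larger departures than one scaled by the second, which is the intended effect during noisy seasonal transitions.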
_get_model_rmse()

Return the normal RMSE of each fitted model

Returns:RMSE of each tested model
Return type:numpy.ndarray
fit(X, Y, dates)

Fit timeseries model

Parameters:
  • X – design matrix (number of observations x number of features)
  • Y – dependent variable matrix (number of series x number of observations)
  • dates – ordinal dates for each observation in X/Y
Returns:NumPy structured array containing timeseries model attribute information
Return type:numpy.ndarray
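
Note that X and Y are oriented differently: X has one row per observation, while Y has one row per series. The sketch below checks that orientation with plain nested lists. The helper check_fit_inputs is hypothetical, and the column layout of X (intercept, then date) is an assumption suggested by the default idx_slope=1.

```python
def check_fit_inputs(X, Y, dates):
    """Validate the orientation of X, Y and dates (hypothetical helper).

    Returns (n_observations, n_features, n_series).
    """
    n_obs = len(dates)
    if len(X) != n_obs:
        raise ValueError("X must have one row per observation")
    if any(len(series) != n_obs for series in Y):
        raise ValueError("each series in Y must have one value per observation")
    return n_obs, len(X[0]), len(Y)


dates = [735000, 735016, 735032]        # ordinal dates
X = [[1.0, float(d)] for d in dates]    # assumed columns: intercept, date
Y = [
    [410.0, 395.0, 402.0],              # series 0, one value per date
    [1210.0, 1180.0, 1195.0],           # series 1, one value per date
]
print(check_fit_inputs(X, Y, dates))    # (3, 2, 2)
```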

monitor()

Monitor for changes in time series

The test criteria for CCDC can be represented as:

\[\sum_{i=0}^{\color{red}{consec}} I \left( \sqrt{ \sum_{b\in \color{red}{B_{test}}} \left( \frac {\hat\rho_{b,i} - \rho_{b,i}} {{RMSE}_b^{\color{red}{*}}} \right) ^2 } > \color{red}{T_{crit}} \right) > \color{red}{consec}\]

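where the symbols in red are model hyperparameters:

  • consec – the consecutive parameter
  • B_test – the series selected by test_indices
  • T_crit – the threshold parameter
  • RMSE* – the per-band RMSE, which appears to depend on the min_rmse and dynamic_rmse settings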

If remove_noise is True, the first of consecutive observations will be removed if first scaled residual is above threshold but not all consecutive scaled residuals exceed threshold.
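
The test and the noise rule can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the library's implementation: the function name monitor_step and its return labels are hypothetical, and the sketch assumes the test spans exactly consecutive observations.

```python
import math


def monitor_step(residuals, rmse, threshold, consecutive):
    """Classify the next ``consecutive`` observations (hypothetical helper).

    residuals: one list per upcoming observation, holding the residual
               (predicted minus observed) for each test band
    rmse:      per-band RMSE used to scale the residuals
    Returns 'change', 'noise', or 'stable'.
    """
    scores = []
    for obs in residuals[:consecutive]:
        # Magnitude of the RMSE-scaled residual vector across test bands
        scores.append(math.sqrt(sum((r / s) ** 2 for r, s in zip(obs, rmse))))

    exceeded = [score > threshold for score in scores]
    if all(exceeded):
        return "change"  # every consecutive observation exceeds the threshold
    if exceeded[0]:
        return "noise"   # first observation exceeds, but not all of them do
    return "stable"


rmse = [10.0, 20.0]
print(monitor_step([[30, 60], [30, 60], [30, 60]], rmse, 2.56, 3))  # change
print(monitor_step([[30, 60], [1, 1], [1, 1]], rmse, 2.56, 3))      # noise
print(monitor_step([[1, 1], [30, 60], [1, 1]], rmse, 2.56, 3))      # stable
```

With remove_noise enabled, the 'noise' case corresponds to dropping the first of the consecutive observations.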

reset()

Reset state information required for model fittings

train()

Train time series model if stability criteria are met

Stability criteria (Equation 5 in Zhu and Woodcock, 2014) include a test on the change in reflectance over the training period (slope test) and a test on the magnitude of the residuals for the first and last observations in the training period. Training periods with large slopes can indicate that a disturbance process is still in progress. First or last observations with large residuals have high leverage on the estimated regression and should be excluded from the training period.

  1. Slope test:
\[\frac{1}{n}\sum\limits_{b\in B_{test}}\frac{ \left|\beta_{slope,b}(t_{end}-t_{start})\right|} {RMSE_b} > T_{crit}\]
  2. First and last residual tests:
\[ \begin{align}\begin{aligned}\frac{1}{n}\sum\limits_{b\in B_{test}}\frac{ \left|\hat\rho_{b,i=1} - \rho_{b,i=1}\right|} {RMSE_b} > T_{crit}\\\frac{1}{n}\sum\limits_{b\in B_{test}}\frac{ \left|\hat\rho_{b,i=N} - \rho_{b,i=N}\right|} {RMSE_b} > T_{crit}\end{aligned}\end{align} \]
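
The three inequalities above share one form: a band-averaged absolute value scaled by RMSE, compared against the critical value. A minimal sketch follows, assuming n is the number of test bands (the text does not define it). The function name is_stable and its arguments are hypothetical.

```python
def is_stable(slopes, first_resid, last_resid, rmse, t_start, t_end, threshold):
    """Return True if none of the three statistics exceeds ``threshold``.

    Hypothetical helper. Every sequence holds one value per test band.
    """
    n = len(rmse)  # assumed: n is the number of test bands

    def mean_scaled(values):
        # (1 / n) * sum over bands of |value| / RMSE_b
        return sum(abs(v) / s for v, s in zip(values, rmse)) / n

    # Slope test: total change implied by the slope over the training period
    slope_stat = mean_scaled([b * (t_end - t_start) for b in slopes])
    # First and last residual tests
    first_stat = mean_scaled(first_resid)
    last_stat = mean_scaled(last_resid)
    return max(slope_stat, first_stat, last_stat) <= threshold


rmse = [10.0, 20.0]
# Gentle slope and small end residuals: training period accepted
print(is_stable([0.01, 0.02], [5, 10], [5, 10], rmse, 0, 365, 2.56))  # True
# Steep slope: training period rejected
print(is_stable([0.1, 0.2], [5, 10], [5, 10], rmse, 0, 365, 2.56))    # False
# Large residual on the last observation: training period rejected
print(is_stable([0.01, 0.02], [5, 10], [40, 80], rmse, 0, 365, 2.56)) # False
```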
can_monitor

Determine whether the timeseries has enough remaining observations to monitor the next consecutive observations

ndays = 365.25
record_template

YATSM record template for features in X and series in Y

Record template will set px and py if defined as class attributes. Otherwise px and py coordinates will default to 0.

Returns:NumPy structured array containing a template of a YATSM record
Return type:numpy.ndarray
running

Determine if timeseries can run

span_index

Return time span (in index) between start and end of model

span_time

Return time span (in days) between start and end of model
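
The difference between the two span properties can be shown with a short sketch. The function signatures and the start and end index arguments are hypothetical, chosen only to illustrate the two units.

```python
def span_index(start, end):
    """Span of a model measured in observation indices (hypothetical helper)."""
    return end - start


def span_time(dates, start, end):
    """Span of a model measured in days, from ordinal dates (hypothetical helper)."""
    return dates[end] - dates[start]


dates = [735000, 735016, 735048, 735064]  # ordinal dates, unevenly spaced
print(span_index(0, 3))        # 3
print(span_time(dates, 0, 3))  # 64
```

Because observations are unevenly spaced in time, the two spans are not interchangeable.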