yatsm.algorithms.ccdc module
class yatsm.algorithms.ccdc.CCDCesque(test_indices=None, estimator={'object': Lasso(alpha=20), 'fit': {}}, consecutive=5, threshold=2.56, min_obs=None, min_rmse=None, retrain_time=365.25, screening='RLM', screening_crit=400.0, remove_noise=True, green_band=1, swir1_band=4, dynamic_rmse=False, slope_test=False, idx_slope=1, **kwargs)
Bases: yatsm.algorithms.yatsm.YATSM
Initialize a CCDC-like model for data X (features) and Y (spectral time series)
An unofficial and unvalidated port of the Continuous Change Detection and Classification (CCDC) algorithm by Zhu and Woodcock, 2014.
Parameters:
- test_indices – Test for changes with these indices of Y. If not provided, all series in Y will be used as test indices
- estimator – Dictionary containing an estimation model from scikit-learn used to fit and predict timeseries and, optionally, a dict of options for the estimation model's fit method (default: {'object': Lasso(alpha=20), 'fit': {}})
- consecutive – Consecutive observations to trigger change
- threshold – Test statistic threshold for change
- min_obs – Minimum observations in model
- min_rmse – Minimum RMSE for models during testing
- retrain_time – Number of days between model fit updates during monitoring period
- screening – Style of prescreening of the timeseries for noise. Options are 'RLM' or 'LOWESS' (default: RLM)
- screening_crit – Critical value for multitemporal noise screening (default: 400.0)
- remove_noise – Remove observation if change is not detected but first observation is above threshold (if it looks like noise) (default: True)
- green_band – Index of green band in Y for multitemporal masking (default: 1)
- swir1_band – Index of first SWIR band in Y for multitemporal masking (default: 4)
- dynamic_rmse – Vary RMSE as a function of day of year (default: False)
- slope_test – Use an additional slope test to assess the suitability of the training period. True enables the test and uses the threshold parameter as the test criterion. False turns off the test. A float value enables the test and overrides the test criterion threshold. (default: False)
- idx_slope – If slope_test is enabled, provide index of X containing slope term (default: 1)
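The three accepted forms of slope_test can be illustrated with a small sketch. The helper name resolve_slope_criterion is hypothetical and is not part of YATSM; it only restates the rule described above and is not taken from the library's source.

```python
def resolve_slope_criterion(slope_test, threshold):
    """Return the slope-test criterion, or None when the test is off.

    Hypothetical helper mirroring the documented meaning of `slope_test`.
    """
    # Check booleans by identity first: bool is a subclass of int in Python
    if slope_test is True:
        return threshold          # test enabled, reuse `threshold`
    if slope_test is False:
        return None               # test disabled
    return float(slope_test)      # test enabled with its own criterion
```

For example, resolve_slope_criterion(True, 2.56) gives 2.56, while resolve_slope_criterion(4.0, 2.56) gives 4.0.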
_get_dynamic_rmse()
Return the dynamic RMSE for each model
Dynamic RMSE refers to the Root Mean Squared Error calculated using the self.min_obs observations closest in day of year to the observation self.consecutive steps into the future. The goal is to reduce false positives during seasonal transitions (high variance in the signal) while decreasing omission during stable times of year.
Returns: dynamic RMSE of each tested model
Return type: numpy.ndarray
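The idea can be sketched with NumPy as follows. This is not YATSM's implementation: the function name dynamic_rmse_sketch, its arguments, and the use of a circular day-of-year distance (so that late December counts as close to early January) are all assumptions made for illustration.

```python
import numpy as np

def dynamic_rmse_sketch(residuals, doy, target_doy, min_obs, ndays=365.25):
    """RMSE of the `min_obs` residuals closest in day of year to `target_doy`.

    residuals: (n_obs, n_bands) residuals of the current model
    doy: (n_obs,) day of year of each observation
    """
    residuals = np.asarray(residuals, dtype=float)
    doy = np.asarray(doy, dtype=float)
    # Circular distance in day of year (assumption)
    diff = np.abs(doy - target_doy)
    dist = np.minimum(diff, ndays - diff)
    # Indices of the `min_obs` observations nearest to the target day
    closest = np.argsort(dist, kind="stable")[:min_obs]
    # One RMSE per band, computed from the selected observations only
    return np.sqrt(np.mean(residuals[closest] ** 2, axis=0))
```

Because only seasonally similar observations enter the estimate, the RMSE used for scaling grows in noisy seasons and shrinks in quiet ones, which is the behavior the description above aims for.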
_get_model_rmse()
Return the normal RMSE of each fitted model
Returns: RMSE of each tested model
Return type: numpy.ndarray
fit(X, Y, dates)
Fit timeseries model
Returns: NumPy structured array containing timeseries model attribute information
Return type: numpy.ndarray
monitor()
Monitor for changes in time series
The test criteria for CCDC can be represented as:
\[\sum_{i=0}^{\color{red}{consec}} I \left( \sqrt{ \sum_{b\in \color{red}{B_{test}}} \left( \frac {\hat\rho_{b,i} - \rho_{b,i}} {{RMSE}_b^{\color{red}{*}}} \right) ^2 } > \color{red}{T_{crit}} \right) > \color{red}{consec}\]
where the symbols in red are model hyperparameters:
- \(\color{red}{consec}\): consecutive
- \(\color{red}{B_{test}}\): test_indices
- \(\color{red}{T_{crit}}\): threshold
- \({RMSE}_b^{\color{red}{*}}\) depends on:
  - dynamic_rmse:
    - True: _get_dynamic_rmse() is used for RMSE
    - False: _get_model_rmse() is used for RMSE
  - min_rmse:
    - If a float or int is given, override the RMSE estimate if the estimate is smaller than min_rmse
If remove_noise is True, the first of the consecutive observations will be removed if the first scaled residual is above threshold but not all consecutive scaled residuals exceed threshold.
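The test criterion and the noise rule can be sketched for a single monitoring step as follows. This is a simplified illustration under stated assumptions, not the code YATSM runs: the function name monitor_step_sketch and the "change" / "noise" / "stable" labels are hypothetical, and a change is declared when every one of the consecutive scores exceeds the threshold.

```python
import numpy as np

def monitor_step_sketch(residuals, rmse, test_indices, threshold=2.56,
                        min_rmse=None, remove_noise=True):
    """Label the next `consecutive` observations.

    residuals: (consecutive, n_bands) predicted-minus-observed values
    rmse: (n_bands,) RMSE of the current model for each band
    Returns a (label, scores) tuple.
    """
    residuals = np.asarray(residuals, dtype=float)
    rmse = np.asarray(rmse, dtype=float)
    if min_rmse is not None:
        # Use min_rmse wherever the estimated RMSE is smaller
        rmse = np.maximum(rmse, min_rmse)
    # Scale residuals by RMSE, using only the bands in test_indices
    scaled = residuals[:, test_indices] / rmse[test_indices]
    # One score per observation: Euclidean norm across the test bands
    scores = np.sqrt(np.sum(scaled ** 2, axis=1))
    exceeds = scores > threshold
    if exceeds.all():
        return "change", scores
    if remove_noise and exceeds[0]:
        # First observation looks anomalous but the change is not sustained
        return "noise", scores
    return "stable", scores
```

With residuals of 10 in two test bands and an RMSE of 1, each score is about 14.1, well above the default threshold of 2.56, so five such observations in a row would be labeled a change.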
train()
Train time series model if stability criteria are met
Stability criteria (Equation 5 in Zhu and Woodcock, 2014) include a test on the change in reflectance over the training period (slope test) and a test on the magnitude of the residuals for the first and last observations in the training period. Training periods with large slopes can indicate that a disturbance process is still in progress. Large residuals on the first or last observations have high leverage on the estimated regression and should be excluded from the training period.
- Slope test:
\[\frac{1}{n}\sum\limits_{b\in B_{test}}\frac{ \left|\beta_{slope,b}(t_{end}-t_{start})\right|} {RMSE_b} > T_{crit}\]
- First and last residual tests:
\[\begin{aligned}\frac{1}{n}\sum\limits_{b\in B_{test}}\frac{ \left|\hat\rho_{b,i=1} - \rho_{b,i=1}\right|} {RMSE_b} > T_{crit}\\\frac{1}{n}\sum\limits_{b\in B_{test}}\frac{ \left|\hat\rho_{b,i=N} - \rho_{b,i=N}\right|} {RMSE_b} > T_{crit}\end{aligned}\]
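The three tests can be evaluated together in a short NumPy sketch. It assumes that n is the number of test bands, that the inputs are already restricted to those bands, and that exceeding the threshold on any test marks the training period as unstable. The function name is_stable_sketch and its arguments are hypothetical and do not come from YATSM.

```python
import numpy as np

def is_stable_sketch(slopes, residuals, rmse, t_start, t_end, threshold=2.56):
    """Return True if the training period passes all three tests.

    slopes: (n_test_bands,) slope coefficient per test band
    residuals: (n_obs, n_test_bands) residuals over the training period
    rmse: (n_test_bands,) model RMSE per test band
    """
    slopes = np.asarray(slopes, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    rmse = np.asarray(rmse, dtype=float)
    # Mean over test bands of |slope * time span| / RMSE
    slope_score = np.mean(np.abs(slopes * (t_end - t_start)) / rmse)
    # Same scaling applied to the first and last residuals
    first_score = np.mean(np.abs(residuals[0]) / rmse)
    last_score = np.mean(np.abs(residuals[-1]) / rmse)
    # Any score above the threshold fails the stability check
    return bool(max(slope_score, first_score, last_score) <= threshold)
```

For instance, a slope of 0.01 per day over a 365-day training period with an RMSE of 1 gives a slope score of 3.65, which fails the default threshold of 2.56.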
can_monitor
Determine if the timeseries can monitor the next consecutive observations
ndays = 365.25
record_template
YATSM record template for features in X and series in Y
Record template will set px and py if defined as class attributes. Otherwise px and py coordinates will default to 0.
Returns: NumPy structured array containing a template of a YATSM record
Return type: numpy.ndarray
running
Determine if timeseries can run
span_index
Return time span (in index) between start and end of model
span_time
Return time span (in days) between start and end of model