yatsm.algorithms.yatsm module
Yet Another TimeSeries Model baseclass
-
class yatsm.algorithms.yatsm.YATSM(test_indices=None, estimator={'object': Lasso(alpha=20), 'fit': {}}, **kwargs)
Bases: object
Yet Another TimeSeries Model baseclass
Note
When YATSM objects are fit, the intended order of method calls is:
1. Set up the model with setup()
2. Preprocess a time series for one unit area with preprocess()
3. Fit the time series with the YATSM model using fit()
A fitted model can then be used to make predictions with predict() and to report performance scores with score().
Note
Record structured arrays must contain the following:
- start (int): starting dates of timeseries segments
- end (int): ending dates of timeseries segments
- break (int): break dates of timeseries segments
- coef (double (n x p shape)): number of bands x number of features coefficients matrix for predictions
- rmse (double (n length)): Root Mean Squared Error for each band
- px (int): pixel X coordinate
- py (int): pixel Y coordinate
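As an illustration of the record layout in the note above, the following is a minimal sketch of a structured array with those fields. The helper name `make_record_template` is hypothetical, and the concrete dtypes (32-bit integers, 64-bit floats) and the (n_series, n_features) layout of `coef` are assumptions drawn from the wording of the note, not taken from the YATSM source.

```python
import numpy as np


def make_record_template(n_series, n_features, px=0, py=0):
    """Build a one-element structured array with the fields listed above.

    Assumptions (not confirmed by the source): 32-bit ints for dates and
    pixel coordinates, 64-bit floats for "double", and `coef` laid out as
    (n_series, n_features) following the "bands x features" wording.
    """
    dtype = np.dtype([
        ("start", "i4"),
        ("end", "i4"),
        ("break", "i4"),
        ("coef", "f8", (n_series, n_features)),
        ("rmse", "f8", (n_series,)),
        ("px", "i4"),
        ("py", "i4"),
    ])
    record = np.zeros(1, dtype=dtype)
    record["px"] = px
    record["py"] = py
    return record


template = make_record_template(n_series=7, n_features=4, px=10, py=20)
print(template.dtype.names)
print(template["coef"].shape)  # (1, 7, 4): one record, 7 series, 4 features
```

Because `coef` and `rmse` are sub-array fields, indexing them on an array of records yields one extra leading dimension for the record index.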
Parameters:
- test_indices – Test for changes with these indices of Y. If not provided, all series in Y will be used as test indices
- estimator – dictionary containing estimation model from scikit-learn used to fit and predict timeseries and, optionally, a dict of options for the estimation model fit method (default: {'object': Lasso(alpha=20), 'fit': {}})
- kwargs – dictionary of additional keyword arguments (for sub-classes)
Variables:
- record_template (numpy.ndarray) – An empty NumPy structured array that is a template for the model’s record
- models (numpy.ndarray) – prediction model objects
- record (numpy.ndarray) – NumPy structured array containing timeseries model attribute information
- n_record (int) – number of recorded segments in time series model
- n_series (int) – number of bands in Y
- px (int) – pixel X location or index
- n_features (int) – number of coefficients in X design matrix
- py (int) – pixel Y location or index
-
fit(X, Y, dates)
Fit timeseries model
Returns: NumPy structured array containing timeseries model attribute information
Return type: numpy.ndarray
-
fit_models(X, Y, bands=None)
Fit timeseries models for bands within Y for a given X
Updates or initializes fit for self.models
-
predict(X, dates, series=None)
Return a 2D NumPy array of y-hat predictions for a given X
Predictions are made from an ensemble of timeseries models such that predicted values are generated for each date using the model from the timeseries segment that intersects each date.
Parameters:
- X – Design matrix (number of observations x number of features)
- dates – A single ordinal date or a np.ndarray of length X.shape[0] specifying the ordinal dates for each prediction
- series – Return prediction for subset of series within timeseries model. If None is provided, returns predictions from all series
Returns: Prediction for given X (number of series x number of observations)
Return type: numpy.ndarray
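The segment-wise prediction described for predict() can be sketched roughly as follows. This is not the YATSM implementation: the function name `predict_from_segments` is hypothetical, each segment's model is assumed to be a plain linear predictor stored in a `coef` field of shape (n_series, n_features), and dates that fall outside every segment are assumed to yield NaN.

```python
import numpy as np


def predict_from_segments(record, X, dates, series=None):
    """Predict y-hat per date from the segment that contains that date.

    Assumptions: `record` has `start`, `end`, and `coef` fields, `coef` is
    (n_series, n_features), and each model is a plain linear predictor.
    Dates outside every segment are left as NaN.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    # Accept a single ordinal date or one date per row of X
    dates = np.broadcast_to(np.atleast_1d(dates), (X.shape[0],))
    n_series = record["coef"].shape[1]
    series = np.arange(n_series) if series is None else np.atleast_1d(series)

    yhat = np.full((series.size, X.shape[0]), np.nan)
    for segment in record:
        # Observations whose date falls inside this segment
        inside = (dates >= segment["start"]) & (dates <= segment["end"])
        if inside.any():
            coef = segment["coef"][series, :]      # (n_selected, n_features)
            yhat[:, inside] = coef @ X[inside].T   # (n_selected, n_inside)
    return yhat


# Two segments, two series, two features (intercept and one covariate)
dtype = np.dtype([("start", "i4"), ("end", "i4"), ("coef", "f8", (2, 2))])
record = np.zeros(2, dtype=dtype)
record["start"] = [100, 200]
record["end"] = [199, 299]
record["coef"][0] = np.eye(2)
record["coef"][1] = 2 * np.eye(2)

X = np.array([[1.0, 10.0], [1.0, 20.0], [1.0, 30.0]])
dates = np.array([150, 250, 400])
yhat = predict_from_segments(record, X, dates)
print(yhat.shape)  # (2, 3): number of series x number of observations
```

The output follows the documented orientation (number of series x number of observations); the third date lies outside both segments, so its column stays NaN under the assumption stated above.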
-
preprocess(X, Y, dates, min_values=None, max_values=None, mask_band=None, mask_values=None, **kwargs)
Preprocess a unit area of data (e.g., pixel, segment, etc.)
This preprocessing step will remove all observations that either fall outside of the minimum/maximum range of the data or are flagged for masking in the mask_band variable in Y. If min_values or max_values are not specified, this masking step is skipped. Similarly, masking based on a QA/QC or cloud mask will not be performed if mask_band or mask_values are not provided.
Parameters:
- X – design matrix (number of observations x number of features)
- Y – dependent variable matrix (number of series x number of observations)
- dates – ordinal dates for each observation in X/Y
- min_values – Minimum valid value for each variable in Y (optional)
- max_values – Maximum valid value for each variable in Y (optional)
- mask_band – The mask band in Y (optional)
- mask_values – A list or np.ndarray of values in the mask_band to mask (optional)
Returns: X, Y, and dates after being preprocessed and masked
Return type: tuple of (np.ndarray, np.ndarray, np.ndarray)
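The masking logic described for preprocess() can be approximated with a short NumPy sketch. The function name `preprocess_sketch` is hypothetical, and several details are assumptions rather than documented behaviour: `min_values` and `max_values` hold one value per row of Y, `mask_band` is a row index into Y, and the mask band is kept in the returned Y.

```python
import numpy as np


def preprocess_sketch(X, Y, dates, min_values=None, max_values=None,
                      mask_band=None, mask_values=None):
    """Drop observations that are out of range or flagged in the mask band.

    Assumptions: X is (n_obs, n_features), Y is (n_series, n_obs),
    `min_values`/`max_values` hold one value per row of Y, and the mask
    band itself is kept in the returned Y.
    """
    X, Y, dates = np.asarray(X), np.asarray(Y), np.asarray(dates)
    keep = np.ones(Y.shape[1], dtype=bool)

    # Range check runs only when both bounds are given
    if min_values is not None and max_values is not None:
        lo = np.asarray(min_values)[:, np.newaxis]
        hi = np.asarray(max_values)[:, np.newaxis]
        keep &= ((Y >= lo) & (Y <= hi)).all(axis=0)

    # Mask-band check runs only when both the band and its values are given
    if mask_band is not None and mask_values is not None:
        keep &= ~np.isin(Y[mask_band], mask_values)

    return X[keep], Y[:, keep], dates[keep]


# Three series (the last one is the mask band) and four observations
Y = np.array([[100, 200, 20000, 300],
              [150, 250, 350, 450],
              [0, 4, 0, 0]])
X = np.column_stack([np.ones(4), np.arange(4)])
dates = np.array([10, 20, 30, 40])

X_out, Y_out, dates_out = preprocess_sketch(
    X, Y, dates,
    min_values=[0, 0, 0], max_values=[10000, 10000, 255],
    mask_band=2, mask_values=[2, 4],
)
print(dates_out)
```

In this example the second observation is dropped because its mask value (4) is in `mask_values`, and the third is dropped because 20000 exceeds the maximum for the first series, so only the first and last observations remain.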
-
score(X, Y, dates)
Return timeseries model performance scores
Returns: performance summary statistics
Return type: namedtuple
-
setup(df, **config)
Set up model for input dataset and (optionally) return design matrix
Returns: design matrix, if used by algorithm
Return type: numpy.ndarray or None
-
record_template
YATSM record template for features in X and series in Y
Record template will set px and py if defined as class attributes. Otherwise px and py coordinates will default to 0.
Returns: NumPy structured array containing a template of a YATSM record
Return type: numpy.ndarray