yatsm.classifiers.diagnostics module

class yatsm.classifiers.diagnostics.SpatialKFold(y, row, col, n_folds=3, shuffle=False, random_state=None)

Bases: object

Spatial cross validation iterator

Training samples located physically close to test samples are likely to be strongly correlated with them because of spatial autocorrelation. This violation of the independence assumption artificially inflates cross-validated measures of algorithm performance.

Provides training and testing indices to split data into training and testing sets. Splits a “Region of Interest” image into k consecutive folds, where k is n_folds. Each fold is used as the validation set once, while the remaining k - 1 folds form the training set.

Parameters:

y – Labeled features
row – Row (image y coordinate) pixel location of each sample in y
col – Column (image x coordinate) pixel location of each sample in y
n_folds – Number of folds (default: 3)
shuffle – Shuffle the unique training data regions before splitting into batches (default: False)
random_state – Pseudo-random number generator to use for random sampling. If None, defaults to the NumPy RNG for shuffling

shuffle = False
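
The fold construction described above can be sketched roughly as follows. This is not the yatsm implementation. It uses plain Python lists to stay dependency-free, and it assumes that a "region" is a coarse grid block of `block` pixels. The function name `spatial_kfold_sketch` and the `block` parameter are hypothetical and exist only for this illustration. The point it shows is that whole regions, not individual samples, are assigned to folds, so neighbouring samples never end up on both sides of a split.

```python
import random
from collections import defaultdict


def spatial_kfold_sketch(row, col, n_folds=3, block=10,
                         shuffle=False, random_state=None):
    """Yield (train, test) index lists without splitting any region.

    `block` is a hypothetical stand-in for however regions are really
    identified; here a region is a block x block pixel grid cell.
    """
    # Group sample indices by the grid cell (region) they fall in
    regions = defaultdict(list)
    for i, (r, c) in enumerate(zip(row, col)):
        regions[(r // block, c // block)].append(i)

    region_ids = sorted(regions)
    if len(region_ids) < n_folds:
        raise ValueError("need at least as many regions as folds")
    if shuffle:
        # Shuffle the unique regions, not the individual samples
        random.Random(random_state).shuffle(region_ids)

    # Each fold holds out one consecutive run of regions
    for k in range(n_folds):
        start = k * len(region_ids) // n_folds
        stop = (k + 1) * len(region_ids) // n_folds
        held_out = set(region_ids[start:stop])
        test = sorted(i for rid in held_out for i in regions[rid])
        train = sorted(i for rid in region_ids if rid not in held_out
                       for i in regions[rid])
        yield train, test


# Seven samples clustered in three well-separated places
row = [0, 1, 2, 50, 51, 90, 91]
col = [0, 1, 2, 50, 51, 90, 91]
folds = list(spatial_kfold_sketch(row, col, n_folds=3, block=10))
```

With this toy input each cluster becomes its own region, so every fold tests on one cluster and trains on the other two.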

class yatsm.classifiers.diagnostics.SpatialKFold_ROI(roi, n_folds=3, mask_values=[0], shuffle=False, random_state=None)

Bases: object

Spatial cross validation iterator on ROI images

Training samples located physically close to test samples are likely to be strongly correlated with them because of spatial autocorrelation. This violation of the independence assumption artificially inflates cross-validated measures of algorithm performance.

Provides training and testing indices to split data into training and testing sets. Splits a “Region of Interest” image into k consecutive folds, where k is n_folds. Each fold is used as the validation set once, while the remaining k - 1 folds form the training set.

Parameters:

roi – “Region of interest” matrix providing training data samples of some class
n_folds – Number of folds (default: 3)
mask_values – One or more values within roi to exclude from sampling (default: [0])
shuffle – Shuffle the unique training data regions before splitting into batches (default: False)
random_state – Pseudo-random number generator to use for random sampling. If None, defaults to the NumPy RNG for shuffling

shuffle = False
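
As a rough illustration of how an ROI image might be turned into folds, the sketch below masks out ignored values, groups the remaining pixels into regions, and holds out whole regions per fold. It is not the yatsm implementation. It assumes that a region is a 4-connected patch of equal, unmasked values, and the helper names `roi_regions` and `roi_folds` are hypothetical. It uses nested lists in place of an array and omits shuffling for brevity.

```python
def roi_regions(roi, mask_values=(0,)):
    """Return regions as lists of (row, col) pixels.

    Assumption: a region is a 4-connected patch of equal, unmasked values.
    """
    n_rows, n_cols = len(roi), len(roi[0])
    seen = set()
    regions = []
    for r in range(n_rows):
        for c in range(n_cols):
            if (r, c) in seen or roi[r][c] in mask_values:
                continue
            # Flood fill outward from this unvisited labeled pixel
            value, stack, patch = roi[r][c], [(r, c)], []
            seen.add((r, c))
            while stack:
                pr, pc = stack.pop()
                patch.append((pr, pc))
                for nr, nc in ((pr + 1, pc), (pr - 1, pc),
                               (pr, pc + 1), (pr, pc - 1)):
                    if (0 <= nr < n_rows and 0 <= nc < n_cols
                            and (nr, nc) not in seen
                            and roi[nr][nc] == value):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            regions.append(sorted(patch))
    return regions


def roi_folds(roi, n_folds=3, mask_values=(0,)):
    """Yield (train_pixels, test_pixels), holding out whole regions."""
    regions = roi_regions(roi, mask_values)
    for k in range(n_folds):
        start = k * len(regions) // n_folds
        stop = (k + 1) * len(regions) // n_folds
        test = [p for reg in regions[start:stop] for p in reg]
        train = [p for reg in regions[:start] + regions[stop:] for p in reg]
        yield train, test


# 0 is masked; classes 1, 2 and 3 each form one patch
roi = [
    [1, 1, 0, 2],
    [1, 0, 0, 2],
    [0, 0, 3, 0],
]
regions = roi_regions(roi)
folds = list(roi_folds(roi, n_folds=3))
```

Pixels equal to a value in `mask_values` never appear in any fold, which mirrors the "ignore from sampling" behaviour described for that parameter.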

yatsm.classifiers.diagnostics.kfold_scores(X, y, algo, kf_generator)

Performs k-fold cross-validation and reports the mean and standard deviation of the scores

Parameters:

X – Feature input used in classification
y – Labeled examples
algo – Classifier from scikit-learn
kf_generator – Generator of indices used in cross-validation

Returns: mean and standard deviation of the cross-validation scores

Return type: (mean, std)
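
A minimal sketch of the scoring loop this function describes is shown below. It is not the yatsm source. It assumes that `algo` exposes scikit-learn style `fit` and `score` methods and that `kf_generator` yields `(train, test)` index pairs. `MajorityClassifier` and `kfold_scores_sketch` are hypothetical names, and the use of the population standard deviation is also an assumption.

```python
from collections import Counter
from statistics import mean, pstdev


class MajorityClassifier:
    """Toy stand-in for a scikit-learn classifier with fit() and score()."""

    def fit(self, X, y):
        # Remember the most common training label
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def score(self, X, y):
        # Accuracy of always predicting the remembered label
        return sum(1 for label in y if label == self.label_) / len(y)


def kfold_scores_sketch(X, y, algo, kf_generator):
    """Fit and score `algo` once per fold; return (mean, std) of the scores."""
    scores = []
    for train, test in kf_generator:
        algo.fit([X[i] for i in train], [y[i] for i in train])
        scores.append(algo.score([X[i] for i in test], [y[i] for i in test]))
    # Population standard deviation is an assumption of this sketch
    return mean(scores), pstdev(scores)


X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
y = [1, 1, 1, 1, 1, 2]
# Hand-written (train, test) index pairs standing in for a fold iterator
folds = [([2, 3, 4, 5], [0, 1]),
         ([0, 1, 4, 5], [2, 3]),
         ([0, 1, 2, 3], [4, 5])]
mean_score, std_score = kfold_scores_sketch(X, y, MajorityClassifier(), folds)
```

Any iterable of index pairs works as `kf_generator` in this sketch, so a spatial fold iterator can be swapped in for the hand-written `folds` list.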