yatsm.io package

Module contents

YATSM IO module

Todo

Include result file IO abstraction (issue 69)

Contents:

  • helpers: Collection of helper functions that ease common filesystem operations
  • readers: Collection of functions designed to ease common image or timeseries reading tasks
  • stack_line_readers: Two readers of stacked timeseries images that keep file handles open in exchange for fewer repeated, relatively expensive file open calls
yatsm.io.find_stack_images(location, folder_pattern='L*', image_pattern='L*stack', date_index_start=9, date_index_end=16, date_format='%Y%j', ignore=['YATSM'])

Find and identify dates and filenames of Landsat image stacks

Parameters:
  • location – Stacked image dataset location
  • folder_pattern – Filename pattern for stack image folders located within location (default: ‘L*’)
  • image_pattern – Filename pattern for stacked images located within each folder (default: ‘L*stack’)
  • date_index_start – Starting index of image date string within folder name (default: 9)
  • date_index_end – Ending index of image date string within folder name (default: 16)
  • date_format – String format of date within folder names (default: ‘%Y%j’)
  • ignore – List of folder names within location to ignore from search (default: [‘YATSM’])
Returns:

Tuple of lists containing the dates and filenames of all stacked images located

Return type:

tuple
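To illustrate how the date parameters interact, here is a minimal sketch. It assumes that date_index_start and date_index_end are applied to the folder name as a Python slice (end exclusive), which is consistent with the defaults 9 and 16 selecting a 7-character '%Y%j' string. The helper name parse_stack_date and the example folder name are hypothetical and not part of yatsm.

```python
from datetime import datetime


def parse_stack_date(folder_name, date_index_start=9, date_index_end=16,
                     date_format='%Y%j'):
    """Hypothetical helper: extract the acquisition date from a folder name.

    Assumes the indices are applied as a Python slice (end exclusive).
    """
    date_string = folder_name[date_index_start:date_index_end]
    return datetime.strptime(date_string, date_format)


# Made-up Landsat-style folder name: characters 9-15 hold the
# year (2002) followed by the day of year (155)
date = parse_stack_date('LE70130302002155EDC00')
print(date.date())  # 2002-06-04
```

If folder names in a dataset follow a different layout, the two indices and date_format would need to change together so that the slice still matches the format.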

yatsm.io.mkdir_p(d)

Make a directory, ignoring error if it exists (i.e., mkdir -p)

Parameters:
  • d – Directory path to create
Raises:
  • OSError – raised if the directory cannot be created for reasons other than it already existing (errno 17 “EEXIST”)
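The behavior described above matches a common Python idiom, sketched below. This is an illustration of the idiom under that assumption, not necessarily how yatsm implements the function.

```python
import errno
import os
import tempfile


def mkdir_p(d):
    """Sketch of 'mkdir -p': create d, tolerating an existing directory."""
    try:
        os.makedirs(d)
    except OSError as err:
        # Swallow only the "already exists" error, and only when the
        # existing path really is a directory
        if err.errno == errno.EEXIST and os.path.isdir(d):
            return
        raise


target = os.path.join(tempfile.mkdtemp(), 'a', 'b')
mkdir_p(target)
mkdir_p(target)  # the second call is a no-op rather than an error
```

Any other failure, such as a permission error or a regular file occupying the path, still propagates as OSError.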
yatsm.io.get_image_attribute(image_filename)

Use GDAL to open an image and return its dimensions and datatype

Parameters:
  • image_filename – Image filename
Returns:

nrow (int), ncol (int), nband (int), NumPy datatype (type)

Return type:

tuple

yatsm.io.read_image(image_filename, bands=None, dtype=None)

Return raster image bands as a sequence of NumPy arrays

Parameters:
  • image_filename – Image filename
  • bands – A sequence of bands to read from image. If bands is None, the function returns all bands in the raster. Note that band indices are 1-based (default: None)
  • dtype – NumPy datatype to use for image bands. If dtype is None, arrays are kept as the image datatype (default: None)
Returns:

list of NumPy arrays for each band specified

Return type:

list

Raises:
  • IOError – raised if the specified bands are not contained within the raster
  • RuntimeError – raised if GDAL encounters errors
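The handling of the bands argument can be sketched without GDAL. The helper below is hypothetical (resolve_bands is not a yatsm function) and only mirrors the documented rules: None selects every band, indices start at 1, and out-of-range bands raise IOError.

```python
def resolve_bands(bands, nband):
    """Hypothetical helper: turn `bands` into a list of 1-based band indices."""
    if bands is None:
        # No selection means every band in the raster
        return list(range(1, nband + 1))
    bands = list(bands)
    invalid = [b for b in bands if b < 1 or b > nband]
    if invalid:
        raise IOError('Bands {} are not contained within a raster of {} bands'
                      .format(invalid, nband))
    return bands


print(resolve_bands(None, 3))    # [1, 2, 3]
print(resolve_bands([1, 3], 3))  # [1, 3]
```

Note that band 0 is rejected under this scheme, since the first band is band 1.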
yatsm.io.read_pixel_timeseries(files, px, py)

Returns a NumPy array containing timeseries values for one pixel

Parameters:
  • files – List of filenames to read from
  • px – Pixel X location
  • py – Pixel Y location
Returns:

Array (nband x n_images) containing all timeseries data from one pixel

Return type:

np.ndarray
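The (nband x n_images) layout means each row is one band's timeseries and each column is one image. The sketch below shows that arrangement with nested lists to stay dependency-free; the real function returns an np.ndarray, and stack_pixel_values and the sample numbers are made up for illustration.

```python
def stack_pixel_values(per_image_values):
    """Hypothetical helper: arrange per-image band values as
    nband rows x n_images columns."""
    return [list(band_series) for band_series in zip(*per_image_values)]


# Three images, each contributing two band values for the same pixel
per_image_values = [
    [100, 200],  # image 1: band 1, band 2
    [110, 210],  # image 2
    [120, 220],  # image 3
]
timeseries = stack_pixel_values(per_image_values)
print(timeseries)  # [[100, 110, 120], [200, 210, 220]]
```

With this layout, timeseries[0] is the full series for the first band across all images.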

yatsm.io.read_line(line, images, image_IDs, dataset_config, ncol, nband, dtype, read_cache=False, write_cache=False, validate_cache=False)

Reads one line of the dataset from the cache, falling back to the images if required

Parameters:
  • line – line to read in from images
  • images – list of image filenames to read from
  • image_IDs – list of image identifying strings
  • dataset_config – dictionary of dataset configuration options
  • ncol – number of columns
  • nband – number of bands
  • dtype – NumPy datatype
  • read_cache – try to read from cache directory (default: False)
  • write_cache – try to write to cache directory (default: False)
  • validate_cache – validate that cache data come from same images specified in images (default: False)
Returns:

3D array of image data (nband, n_image, n_cols)

Return type:

np.ndarray
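How the three cache flags might interact can be sketched as a generic read-through cache. Everything here is an assumption made for illustration: the in-memory dict standing in for the cache directory, the read_from_images callable, and the rule that validation compares stored image IDs with the requested ones. The actual yatsm cache format and logic may differ.

```python
def read_line_cached(line, image_IDs, cache, read_from_images,
                     read_cache=False, write_cache=False,
                     validate_cache=False):
    """Hypothetical read-through cache keyed by line number."""
    if read_cache and line in cache:
        cached_ids, data = cache[line]
        # Without validation any cached entry is trusted; with it, the
        # entry must have been built from the same images
        if not validate_cache or cached_ids == list(image_IDs):
            return data
    data = read_from_images(line)
    if write_cache:
        cache[line] = (list(image_IDs), data)
    return data


calls = []


def fake_reader(line):
    # Stand-in for reading a line from every image on disk
    calls.append(line)
    return [[line]]


cache = {}
read_line_cached(0, ['a', 'b'], cache, fake_reader,
                 read_cache=True, write_cache=True)
read_line_cached(0, ['a', 'b'], cache, fake_reader,
                 read_cache=True, validate_cache=True)
print(len(calls))  # 1, the second call was served from the cache
read_line_cached(0, ['a', 'c'], cache, fake_reader,
                 read_cache=True, validate_cache=True)
print(len(calls))  # 2, mismatched image IDs forced a fresh read
```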