Regions of interest (ROIs)#

The sdt.roi module provides classes for restricting data (both single molecule and image data) to a region of interest (ROI). One can specify

  • integer pixel-based rectangular ROIs using ROI or

  • arbitrarily shaped ROIs with subpixel accuracy using PathROI or one of its subclasses (RectangleROI, EllipseROI) for convenience.

  • ROIs defined by boolean arrays, which are interpreted as masks (MaskROI).

Note that only the ROI class ensures accurate cropping of images. The PathROI-derived classes crop an image to the bounding box of the path and set any pixels not within the path to 0 (or whatever the fill_value parameter was set to). Due to rounding effects, the actual shape of the resulting images may differ from what one would expect.
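The crop-and-fill behaviour described above can be pictured with a small NumPy sketch. This is not the library's implementation; the disc-shaped mask (standing in for a rasterized path), the variable names, and the way the integer bounding box is derived are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical 10x10 image and a boolean mask standing in for a rasterized path
img = np.arange(100, dtype=float).reshape(10, 10)
yy, xx = np.mgrid[:10, :10]
inside = (xx - 4.5) ** 2 + (yy - 4.5) ** 2 <= 3.0 ** 2  # disc of radius 3

# Integer bounding box of the region (rows and columns that contain any pixel)
rows = np.flatnonzero(inside.any(axis=1))
cols = np.flatnonzero(inside.any(axis=0))
r0, r1 = rows[0], rows[-1] + 1
c0, c1 = cols[0], cols[-1] + 1

# Crop to the bounding box, then overwrite pixels outside the region
fill_value = 0
cropped = img[r0:r1, c0:c1].copy()
cropped[~inside[r0:r1, c0:c1]] = fill_value
```

The cropped array has the shape of the bounding box, not of the region itself, which is why corner pixels of the result carry the fill value.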

All ROI classes can be serialized to YAML using sdt.io.yaml. It is also possible to load ROIs from ImageJ ROI files using load_imagej() and load_imagej_zip().

Examples

Create some data:

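>>> import numpy, pandas
>>> from sdt import io, roi
>>> from sdt.roi import ROI, PathROI, EllipseROI, RectangleROI
>>> img = numpy.zeros((150, 80))
>>> img_seq = io.ImageSequence("images.tif").open()
>>> data = pandas.DataFrame([[10, 10], [30, 30]], columns=["x", "y"])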

Create simple rectangular integer-pixel ROIs:

>>> r = ROI((15, 15), (120, 60))  # Specify top-left and bottom-right corner
>>> r2 = ROI((15, 15), size=(105, 45))  # Specify top-left and size

Create subpixel coordinate ROIs with arbitrary shape:

>>> # vertices of an arbitrary path
>>> pr = PathROI([[15.3, 15.1], [50.5, 15.1], [90.4, 40.7], [30.9, 70.6]])
>>> er = EllipseROI((60, 30), axes=(20, 10)) # elliptical ROI
>>> # rectangular with subpixel accuracy
>>> rr = RectangleROI((15.3, 17.2), size=(100.2, 20.3))

These ROI objects can now be used to select image data and single molecule data:

>>> cropped_img = r(img)
>>> cropped_img.shape
(45, 65)
>>> cropped_seq = r(img_seq)
>>> cropped_seq[0].shape
(45, 65)
>>> r(data)  # New coordinates will be w.r.t. ROI top-left corner
    x   y
1  15  15
>>> r(data, rel_origin=False)  # Don't change coordinate origin
    x   y
1  30  30

Load ROIs from ImageJ ROI files:

>>> ijr = roi.load_imagej("ij.roi")
>>> ijr
<sdt.roi.roi.ROI object at 0x7f9b9ddebf98>
>>> ijr.top_left, ijr.bottom_right
((24, 23), (97, 100))
>>> ijrs = roi.load_imagej_zip("ij.zip")
>>> ijrs
{'ij': <sdt.roi.roi.ROI at 0x7f9b9ddfccc0>,
 'ij2': <sdt.roi.roi.ROI at 0x7f9b9ddfcb38>}

Integer pixel-based rectangular ROIs#

class sdt.roi.ROI(top_left, bottom_right=None, size=None)[source]#

Rectangular region of interest in a picture

This class represents a rectangular region of interest. It can crop images or restrict data (such as feature localization data) to a specified region.

At the moment, this works only for single channel (i. e. grayscale) images.

Examples

Let f be a numpy array representing an image.

>>> f.shape
(128, 128)
>>> r = ROI((0, 0), (64, 64))
>>> f2 = r(f)
>>> f2.shape
(64, 64)

Parameters:
  • top_left (tuple of int) – Coordinates of the top-left corner. Pixels with coordinates greater than or equal to these are included in the ROI.

  • bottom_right (tuple of int or None, optional) – Coordinates of the bottom-right corner. Pixels with coordinates greater than or equal to these are excluded from the ROI. Either this or size needs to be specified.

  • size (tuple of int or None, optional) – Size of the ROI. Specifying size is equivalent to bottom_right=[t+s for t, s in zip(top_left, size)]. Either this or bottom_right needs to be specified.
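A plain-Python sketch of the corner convention may help here. It assumes that the ROI is half-open, i.e. the top-left corner is included and the bottom-right corner is excluded; the contains() helper is hypothetical and not part of sdt.

```python
# Plain-Python sketch of the half-open convention; not sdt's code.
top_left = (15, 15)
size = (105, 45)

# Equivalent bottom-right corner, computed from top-left corner and size
bottom_right = tuple(t + s for t, s in zip(top_left, size))


def contains(point):
    """True if top_left <= point < bottom_right holds for every coordinate."""
    return all(t <= p < b for t, p, b in zip(top_left, point, bottom_right))
```

With these values, bottom_right evaluates to (120, 60), so ROI((15, 15), (120, 60)) and ROI((15, 15), size=(105, 45)) describe the same rectangle.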

top_left#

x and y coordinates of the top-left corner. Data with coordinates greater than or equal to these are included in the ROI.

bottom_right#

x and y coordinates of the bottom-right corner. Data with coordinates greater than or equal to these are excluded from the ROI.

dataframe_mask(data, columns={})[source]#

Get boolean array describing whether localizations lie within ROI

Parameters:
  • data (DataFrame) – Localization data

  • columns (Dict) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

Returns:

Boolean array, one entry per line in data, which is True if the localization lies within the ROI, False otherwise.

Return type:

ndarray
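What such a mask looks like can be sketched with pandas and NumPy. This is only an illustration under the assumption of a half-open rectangle; the table contents and the "mass" column are made up, and the coords list mimics what columns={"coords": ["x", "z"]} would select.

```python
import pandas as pd

# Hypothetical localization table with non-default coordinate columns
data = pd.DataFrame({"x": [10.0, 30.0, 119.9],
                     "z": [10.0, 30.0, 59.9],
                     "mass": [1.0, 2.0, 3.0]})

top_left, bottom_right = (15, 15), (120, 60)
coords = ["x", "z"]  # what columns={"coords": ["x", "z"]} would select

# One boolean per row: inside the half-open rectangle in every coordinate
xy = data[coords].to_numpy()
mask = ((xy >= top_left) & (xy < bottom_right)).all(axis=1)
```

The resulting array can be used directly for boolean indexing, e.g. data[mask].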

__call__(data, rel_origin=True, invert=False, columns={})[source]#

Restrict data to the region of interest.

If the input is localization data, it is filtered depending on whether the coordinates are within the rectangle. If it is image data, it is cropped to the rectangle.

Parameters:
  • data (pandas.DataFrame or pims.FramesSequence or array-like) – Data to be processed. If a pandas.DataFrame, select only those lines with coordinate values within the ROI. Otherwise, crop the image.

  • rel_origin (bool, optional) – If True, the top-left corner coordinates will be subtracted off all feature coordinates, i. e. the top-left corner of the ROI will be the new origin. Only valid if invert is False. Defaults to True.

  • invert (bool, optional) – If True, only datapoints outside the ROI are selected. Works only if data is a pandas.DataFrame. Defaults to False.

  • columns (dict, optional) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

Returns:

Data restricted to the ROI represented by this class.

Return type:

pandas.DataFrame or slicerator.Slicerator or numpy.array

reset_origin(data, columns={})[source]#

Reset coordinates to the original coordinate system

This undoes the effect of the rel_origin parameter to __call__(). The coordinates of the top-left ROI corner are added to the feature coordinates in data.

Parameters:
  • data (pandas.DataFrame) – Localization data, modified in place.

  • columns (dict, optional) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.
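The relation between rel_origin, invert, and reset_origin() can be sketched with pandas as follows. This is a simplified stand-in, not sdt's code: it assumes a half-open rectangle and works on copies, whereas reset_origin() is documented to modify data in place.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"x": [10.0, 30.0], "y": [10.0, 30.0]})
top_left, bottom_right = (15, 15), (120, 60)
coords = ["x", "y"]

xy = data[coords].to_numpy()
inside = ((xy >= top_left) & (xy < bottom_right)).all(axis=1)

# rel_origin=True: keep rows inside the ROI and shift them to the ROI origin
selected = data[inside].copy()
selected[coords] = selected[coords] - np.asarray(top_left)

# invert=True: keep rows outside the ROI; coordinates are left untouched
outside = data[~inside]

# Undo the shift, analogous to reset_origin(): add the top-left corner back
restored = selected.copy()
restored[coords] = restored[coords] + np.asarray(top_left)
```

After the round trip, restored holds the original coordinates of the selected localization again.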

classmethod to_yaml(dumper, data)[source]#

Dump as YAML

Pass this as the representer parameter to yaml.Dumper.add_representer()

classmethod from_yaml(loader, node)[source]#

Construct from YAML

Pass this as the constructor parameter to yaml.Loader.add_constructor()
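How a to_yaml()/from_yaml() pair is registered with PyYAML can be sketched with a stand-in class. BoxROI, its tag, and its mapping layout are hypothetical and do not reflect how sdt stores ROIs; only the yaml.Dumper.add_representer() and yaml.Loader.add_constructor() calls are the ones named above.

```python
import yaml


class BoxROI:
    """Hypothetical stand-in for a ROI class; not part of sdt."""
    yaml_tag = "!BoxROI"

    def __init__(self, top_left, bottom_right):
        self.top_left = tuple(top_left)
        self.bottom_right = tuple(bottom_right)

    @classmethod
    def to_yaml(cls, dumper, data):
        # Represent the instance as a tagged YAML mapping
        m = {"top_left": list(data.top_left),
             "bottom_right": list(data.bottom_right)}
        return dumper.represent_mapping(cls.yaml_tag, m)

    @classmethod
    def from_yaml(cls, loader, node):
        # Rebuild the instance from the tagged mapping
        m = loader.construct_mapping(node, deep=True)
        return cls(m["top_left"], m["bottom_right"])


# Register the class methods with PyYAML
yaml.Dumper.add_representer(BoxROI, BoxROI.to_yaml)
yaml.Loader.add_constructor(BoxROI.yaml_tag, BoxROI.from_yaml)

text = yaml.dump(BoxROI((15, 15), (120, 60)), Dumper=yaml.Dumper)
loaded = yaml.load(text, Loader=yaml.Loader)
```

In practice, sdt.io.yaml is the documented way to serialize ROIs; the sketch only shows what the two class methods are for.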

Arbitrary ROIs with subpixel accuracy#

class sdt.roi.PathROI(path, buffer=0.0, no_image=False)[source]#

Region of interest in a picture determined by a path

This class represents a region of interest that is described by a path. It uses matplotlib.path.Path to this end. It can crop images or restrict data (such as feature localization data) to a specified region.

This works only for paths that do not intersect themselves and for single channel (i. e. grayscale) images.

Parameters:
  • path (list of vertices or matplotlib.path.Path) – Description of the path. Either a list of vertices that will be used to construct a matplotlib.path.Path or a matplotlib.path.Path instance that will be copied.

  • buffer (float, optional) – Add extra space around the path. This, however, does not affect the size of the cropped image, which is just the size of the bounding box of the path, without buffer. Defaults to 0.

  • no_image (bool, optional) – If True, don’t compute the image mask (which is quite time consuming). This implies that this instance only works for DataFrames. Defaults to False.

path#

matplotlib.path.Path outlining the region of interest. Do not modify.

buffer#

Float giving the width of extra space around the path. Does not affect the size of the image, which is just the size of the bounding box of the path, without buffer. Do not modify.

bounding_box#

numpy.ndarray, shape(2, 2), dtype(float) describing the bounding box of the path, enlarged by buffer on each side.

bounding_box_int#

Smallest integer bounding box containing bounding_box

area#

Area of the ROI (without buffer)

image_mask#

Boolean pixel array; rasterized path or None if no_image=True was passed to the constructor.
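The two bounding box attributes can be illustrated with a NumPy sketch. The [[x_min, y_min], [x_max, y_max]] layout of the (2, 2) array and the buffer value are assumptions for illustration; the vertices are those of the PathROI example at the top of this page.

```python
import numpy as np

vertices = np.array([[15.3, 15.1], [50.5, 15.1], [90.4, 40.7], [30.9, 70.6]])
buffer = 1.5  # hypothetical buffer width

# Float bounding box, enlarged by buffer on each side
bbox = np.array([vertices.min(axis=0) - buffer,
                 vertices.max(axis=0) + buffer])

# Smallest integer box containing it: round the lower corner down
# and the upper corner up
bbox_int = np.array([np.floor(bbox[0]), np.ceil(bbox[1])]).astype(int)
```

Rounding outward in this way is one source of the shape differences mentioned in the introduction.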

dataframe_mask(data, columns={})[source]#

Get boolean array describing whether localizations lie within ROI

Parameters:
  • data (DataFrame) – Localization data

  • columns (Dict) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

Returns:

Boolean array, one entry per line in data, which is True if the localization lies within the ROI, False otherwise.

Return type:

ndarray

__call__(data, rel_origin=True, fill_value=0, invert=False, crop=True, columns={})[source]#

Restrict data to the region of interest.

If the input is localization data, it is filtered depending on whether the coordinates are within the path. If it is image data, it is cropped to the bounding rectangle of the path and all pixels not contained in the path are set to fill_value.

Parameters:
  • data (pandas.DataFrame or pims.FramesSequence or array-like) – Data to be processed. If a pandas.DataFrame, select only those lines with coordinate values within the ROI path (+ buffer). Otherwise, slicerator.pipeline is used to crop image data to the bounding rectangle of the path and set all pixels not within the path to fill_value.

  • rel_origin (bool, optional) – If True, the top-left corner coordinates of the path’s bounding rectangle will be subtracted off all feature coordinates, i. e. the top-left corner will be the new origin. If a coordinate of the bounding rectangle is negative, 0 will be used as the origin instead. This is necessary to ensure that localization data to which the ROI is applied is consistent with image data to which the ROI is applied. Only valid if invert is False. Defaults to True.

  • fill_value (number or callable, optional) – Fill value for pixels that are not contained in the path. If callable, it should take the array of pixels within the mask as its argument and return a scalar that is used as the fill value. Not applicable for single molecule data. Defaults to 0.

  • invert (bool, optional) – If True, only datapoints/pixels outside the path are selected. Defaults to False.

  • crop (bool, optional) – If True, crop image data to the (integer) bounding box of the path. Defaults to True.

  • columns (dict, optional) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

Returns:

Data restricted to the ROI represented by this class.

Return type:

pandas.DataFrame or slicerator.Slicerator or numpy.array
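The difference between a numeric and a callable fill_value can be sketched in NumPy. The apply_fill() helper is hypothetical and only mirrors the parameter description above; it does no cropping.

```python
import numpy as np


def apply_fill(img, mask, fill_value=0):
    """Set pixels where mask is False to fill_value (number or callable)."""
    if callable(fill_value):
        # The callable receives only the pixels inside the mask
        fill_value = fill_value(img[mask])
    out = img.copy()
    out[~mask] = fill_value
    return out


img = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[True, False], [True, False]])

zero_filled = apply_fill(img, mask)           # constant fill
mean_filled = apply_fill(img, mask, np.mean)  # fill with mean of inside pixels
```

Passing a callable such as numpy.mean avoids a sharp intensity step at the ROI border, which can matter for subsequent filtering.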

reset_origin(data, columns={})[source]#

Reset coordinates to the original coordinate system

This undoes the effect of the rel_origin parameter to __call__(). The coordinates of the top-left ROI corner are added to the feature coordinates in data.

Parameters:
  • data (pandas.DataFrame) – Localization data, modified in place.

  • columns (dict, optional) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

transform(trafo=None, linear=None, trans=None)[source]#

Create a new PathROI instance with a transformed path

Parameters:
  • trafo (Transform | ndarray | None) –

    Full transform. If given as an array, it has to have the form

    a c e
    b d f
    0 0 1,
    

    where a, b, c, d give the linear part of the transform (see linear below) and e, f give the translation part (see trans below).

  • linear (ndarray | None) – Linear part (rotation, scaling, shear) of the transform, a 2x2 matrix. Only used if trafo is not given.

  • trans (ndarray | None) – Translation, 1D array of length 2. Only used if trafo is not given.

Returns:

ROI with transformed path and same buffer. The image mask is only created if this instance has an image mask.

Return type:

PathROI
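How the linear and trans parts combine into the full matrix, and how that matrix acts on path vertices, can be sketched in NumPy. The example values are arbitrary, and the sketch only covers the matrix arithmetic, not the PathROI machinery.

```python
import numpy as np

linear = np.array([[0.0, -1.0],
                   [1.0, 0.0]])   # 90 degree counterclockwise rotation
trans = np.array([10.0, 5.0])

# Assemble the full 3x3 matrix [[a, c, e], [b, d, f], [0, 0, 1]]
trafo = np.eye(3)
trafo[:2, :2] = linear
trafo[:2, 2] = trans

# Apply it to path vertices using homogeneous coordinates
vertices = np.array([[1.0, 0.0], [0.0, 2.0]])
homog = np.column_stack([vertices, np.ones(len(vertices))])
transformed = (trafo @ homog.T).T[:, :2]
```

Each vertex is first multiplied by the linear part and then shifted by trans, so (1, 0) ends up at (10, 6).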

classmethod to_yaml(dumper, data)[source]#

Dump as YAML

Pass this as the representer parameter to yaml.Dumper.add_representer()

classmethod from_yaml(loader, node)[source]#

Construct from YAML

Pass this as the constructor parameter to yaml.Loader.add_constructor()

class sdt.roi.RectangleROI(top_left, bottom_right=None, size=None, buffer=0.0, no_image=False)[source]#

Rectangular region of interest in a picture

This differs from ROI in that it is derived from PathROI and thus allows for float coordinates. Also, the path can easily be transformed using matplotlib.transforms.

Parameters:
  • top_left (tuple of float) – x and y coordinates of the top-left corner.

  • bottom_right (tuple of float or None, optional) – x and y coordinates of the bottom-right corner. Either this or size needs to be specified.

  • size (tuple of float or None, optional) – Size of the ROI. Specifying size is equivalent to bottom_right=[t+s for t, s in zip(top_left, size)]. Either this or bottom_right needs to be specified.

  • buffer – see PathROI.

  • no_image – see PathROI.

class sdt.roi.EllipseROI(center, axes, angle=0.0, buffer=0.0, no_image=False)[source]#

Elliptical region of interest in a picture

Subclass of PathROI.

Parameters:
  • center (tuple of float) – Coordinates of the center of the ellipse.

  • axes (tuple of float) – Lengths of first and second axis.

  • angle (float, optional) – Angle of rotation (counterclockwise, in radians). Defaults to 0.

  • buffer – see PathROI.

  • no_image – see PathROI.

Boolean mask ROIs#

class sdt.roi.MaskROI(mask, mask_origin=(0, 0), pixel_size=1.0)[source]#

Region of interest defined by a boolean mask array

This class represents a region of interest that is described by an array of boolean values. It can crop images or restrict data (such as feature localization data) to a specified region.

This works only for single channel (i. e. grayscale) images.

Parameters:
  • mask (numpy.ndarray, dtype(bool)) – Set the mask attribute.

  • mask_origin (tuple of float, optional) – Set the mask_origin attribute. Defaults to (0, 0).

  • pixel_size (float, optional) – Set the pixel_size attribute. Defaults to 1.

mask#

Boolean mask array where each True entry represents a pixel with data to be accepted and each False entry represents a pixel with data to be rejected.

mask_origin#

Tuple of coordinates of the origin of the mask. This shifts the mask with respect to the data it is applied to using __call__(). These are real coordinates, not array indices (whose order would be inverted and scaled by pixel_size).

pixel_size#

Size of a pixel. Used to scale the coordinates in DataFrames correctly.

dataframe_mask(data, columns={})[source]#

Get boolean array describing whether localizations lie within mask

Parameters:
  • data (DataFrame) – Localization data

  • columns (Dict) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

Returns:

Boolean array, one entry per line in data, which is True if the localization lies within the image mask, False otherwise.

Return type:

ndarray
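The interplay of mask_origin, pixel_size, and the inverted index order can be sketched as follows. The in_mask() helper is hypothetical, and assigning a point to a pixel by rounding down is an assumption; the library may place pixel boundaries differently.

```python
import numpy as np

mask = np.zeros((4, 6), dtype=bool)  # 4 rows (y), 6 columns (x)
mask[1:3, 2:5] = True
mask_origin = (10.0, 20.0)           # (x, y) of the mask origin, real units
pixel_size = 0.5


def in_mask(x, y):
    """Look up a real-space point in the mask; off-array points are rejected."""
    col = int(np.floor((x - mask_origin[0]) / pixel_size))
    row = int(np.floor((y - mask_origin[1]) / pixel_size))
    if not (0 <= row < mask.shape[0] and 0 <= col < mask.shape[1]):
        return False
    return bool(mask[row, col])  # array index order is (row, col) = (y, x)
```

Note that the x coordinate selects the column and the y coordinate selects the row, which is the index inversion mentioned in the mask_origin description.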

__call__(data, rel_origin=True, fill_value=0, invert=False, columns={})[source]#

Restrict data to the region of interest.

If the input is localization data, it is filtered depending on whether the coordinates are within the mask. If it is image data, all pixels for which the mask evaluates to False are set to fill_value.

Parameters:
  • data (pandas.DataFrame or pims.FramesSequence or array-like) – Data to be processed. If a pandas.DataFrame, select only those lines with coordinate values within the mask. Otherwise, pipeline is used to set all pixels for which the mask evaluates to False to fill_value.

  • rel_origin (bool, optional) – If True, mask_origin will be subtracted off all feature coordinates, i. e. mask_origin will be the new origin. Only used if invert is False. Defaults to True.

  • fill_value (number or callable, optional) – Fill value for pixels that are not contained in the mask. If callable, it should take the array of pixels within the mask as its argument and return a scalar that is used as the fill value. Not applicable for single molecule data. Defaults to 0.

  • invert (bool, optional) – If True, only datapoints/pixels outside the mask are selected. Defaults to False.

  • columns (dict, optional) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

Returns:

Data restricted to the ROI represented by this class.

Return type:

pandas.DataFrame or helper.Slicerator or numpy.array

reset_origin(data, columns={})[source]#

Reset coordinates to the original coordinate system

This undoes the effect of the rel_origin parameter to __call__(). The coordinates of the top-left ROI corner are added to the feature coordinates in data.

Parameters:
  • data (pandas.DataFrame) – Localization data, modified in place.

  • columns (dict, optional) – Override default column names as defined in config.columns. The only relevant name is coords. This means, if your DataFrame has coordinate columns “x” and “z”, set columns={"coords": ["x", "z"]}.

classmethod to_yaml(dumper, data)[source]#

Dump as YAML

Pass this as the representer parameter to yaml.Dumper.add_representer()

classmethod from_yaml(loader, node)[source]#

Construct from YAML

Pass this as the constructor parameter to yaml.Loader.add_constructor()

property area#

Area of True pixels in the mask

ImageJ ROI file loading#

sdt.roi.load_imagej(file_or_data)[source]#

Load ROI from ImageJ ROI file

Parameters:

file_or_data (str or pathlib.Path or bytes or file) – Source data. A str or pathlib.Path has to point to a file that can be opened for binary reading and is seekable. If bytes, this has to be the contents of a ROI file. A file has to be opened to allow mem-mapping (“r+b”).

Returns:

ROI object representing the ROI described in the file

Return type:

roi.ROI or roi.PathROI or roi.RectangleROI or roi.EllipseROI

sdt.roi.load_imagej_zip(file)[source]#

Load ROIs from ImageJ zip file

Parameters:

file (str or pathlib.Path or zipfile.ZipFile) – Name/path of the zip file or ZipFile opened for reading

Returns:

Dictionary that uses the ROI names inside the zip as keys and the return values of the load_imagej() calls as values.

Return type:

dict of ROI objects
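The shape of the returned dictionary can be sketched with the standard library alone. The load_zip() helper and its load_entry parameter are hypothetical, the zip contents are placeholders rather than real ROI data, and deriving the key by dropping the file extension is an assumption based on the example output at the top of this page.

```python
import io
import pathlib
import zipfile


def load_zip(file, load_entry):
    """Map entry names (without extension) to load_entry(raw_bytes)."""
    with zipfile.ZipFile(file) as zf:
        return {pathlib.PurePosixPath(name).stem: load_entry(zf.read(name))
                for name in zf.namelist()}


# Build a small in-memory zip with placeholder contents (not real ROI data)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("ij.roi", b"first")
    zf.writestr("ij2.roi", b"second")
buf.seek(0)

rois = load_zip(buf, load_entry=len)  # len stands in for a real parser
```

Since load_imagej() accepts the contents of a ROI file as bytes, it could take the place of load_entry in such a loop.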