Fluorescent feature localization#

Various algorithms exist for locating fluorescent features with subpixel accuracy. This package includes the following:

  • sdt.loc.daostorm_3d: An implementation of 3D-DAOSTORM [Babc2012], which is a fast Gaussian fitting algorithm based on maximum likelyhood estimation, which does several rounds of feature finding and fitting in order to fit even features which are close together.

  • sdt.loc.cg: An implementation of the feature localization algorithm created by Crocker and Grier [Croc1996], based on the implementation by the Kilfoil group.

All share a similar API modeled after trackpy’s. For each algorithm there is a locate function for locating features in a single image, a batch function for locating all features in a series of images, and locate_roi and batch_roi functions which only locate features in a given roi.PathROI.

A GUI for feature localization can be started by running python -m sdt.gui.locator on a terminal.

Additionally, one can fit the z position from astigmatism (if a zylindrical lense is inserted into the emission path) with help of the sdt.loc.z_fit module.

The get_raw_features() function allows for extracting the pixels around a localized features from an image sequence

Examples

Simulate a single molecule image with two features:

>>> img = sdt.sim.simulate_gauss([120, 100], [[30, 20], [80, 70]],
                                 [1000, 2000], [[1.1], [1.3]]) + 10
>>> img_seq = [img] * 2  # image sequence

Localize features in the image using 3D-DAOSTORM:

>>> daostorm_3d.locate(img, 1., "2d", 900)
      x     y  signal   bg          mass  size
0  80.0  70.0  2000.0  9.0  21237.166338   1.3
1  30.0  20.0  1000.0  9.0   7602.654222   1.1

Localize features in the whole sequence using 3D-DAOSTORM:

>>> daostorm_3d.batch(img_seq, 1., "2d", 900)
      x     y  signal   bg          mass  size  frame
0  80.0  70.0  2000.0  9.0  21237.166338   1.3      0
1  30.0  20.0  1000.0  9.0   7602.654222   1.1      0
2  80.0  70.0  2000.0  9.0  21237.166338   1.3      1
3  30.0  20.0  1000.0  9.0   7602.654222   1.1      1

The same with the Crocker & Grier algorithm:

>>> cg.locate(img, 3, 800, 2000)
      x     y          mass      size           ecc
0  30.0  20.0   7629.111685  1.501792  2.563823e-16
1  80.0  70.0  18272.005359  1.549778  5.062440e-16
>>> cg.batch(img_seq, 3, 800, 2000)
      x     y          mass      size           ecc  frame
0  30.0  20.0   7629.111685  1.501792  2.563823e-16      0
1  80.0  70.0  18272.005359  1.549778  5.062440e-16      0
2  30.0  20.0   7629.111685  1.501792  2.563823e-16      1
3  80.0  70.0  18272.005359  1.549778  5.062440e-16      1

To restrict feature localization to a sub-region of the image, use the locate_roi and batch_roi functions. Note that coordinates are now relative to the bounding box of the ROI.

>>> r = [[0, 20], [30, 0], [50, 20], [30, 40]]  # Diamond containing one feat
>>> daostorm_3d.locate_roi(img, r, 1., "2d", 900)
      x     y  signal   bg         mass  size
0  20.0  10.0  1000.0  9.0  7602.654223   1.1
>>> daostorm_3d.batch_roi(img_seq, r, 1., "2d", 900)
      x     y  signal   bg         mass  size  frame
0  20.0  10.0  1000.0  9.0  7602.654223   1.1      0
1  20.0  10.0  1000.0  9.0  7602.654223   1.1      1

daostorm_3d#

sdt.loc.daostorm_3d.locate(raw_image, radius, model, threshold, z_params=None, find_filter=<sdt.loc.snr_filters.Identity object>, find_filter_opts={}, min_distance=None, size_range=None, engine='numba', max_iterations=20)[source]#

Locate bright, Gaussian-like features in an image

Uses the 3D-DAOSTORM algorithm [Babc2012].

Parameters:
  • raw_image (array-like) – Raw image data

  • radius (float) – This is in units of pixels. Initial guess for the radius of the features.

  • model ({"2dfixed", "2d", "3d", "z"}) – “2dfixed” will do fixed sigma 2d Gaussian fitting. “2d” does variable sigma 2d Gaussian fitting. “3d” means that x, y sigma are independently variables (elliptical Gaussian, this is intened for determining the z position from astigmatism). When using “z”, the z position is varied to fit the signal x and y sigmas.

  • threshold (float) – A number roughly equal to the value of the brightest pixel (minus the CCD baseline) in the dimmest peak to be detected. Local maxima with brightest pixels below this threshold will be discarded.

  • z_params (z_fit.Parameters or str or pandas.DataFrame or None) – Only necessary if the model is “z” (then it cannot be None), otherwise it is ignored. One may pass a z_fit.Parameters instance or a filename to load the parameters from or a pandas.DataFrame with calibration data (z vs. size_x and size_y).

  • find_filter (str or snr_filters.SnrFilter, optional) – Apply a filter to the raw image data for the feature finding step. Fitting is still done on the original image. find_filter can be the name of one of the filters in snr_filters (i. e. “Identity”, “Cg”, or “Gaussian”) or an snr_filters.SnrFilter instance. “Identity” is the identity transform, i. e. does nothing. “Cg” is the Crocker & Grier bandpass filter, see sdt.image.filters.cg(), and “Gaussian” is a Gaussian filter. Defaults to “Identity”.

  • find_filter_opts (dict) – Parameters to be passed to the filter. For “Cg”, this is “feature_radius”, for “Gaussian”, this is “sigma”

  • min_distance (float or None, optional) – Minimum distance between two features. This can be used to suppress detection of bright features as multiple overlapping ones if threshold is rather low. If None, use radius (original 3D-DAOSTORM behavior). Defaults to None.

  • size_range (list of float or None, optional) – [min, max] of the feature sizes both in x and y direction. Features with sizes not in the range will be discarded, neighboring features will be re-fit. If None, use [0.25*radius, inf] (original 3D-DAOSTORM behavior). Defaults to None.

  • engine ({"python", "numba"}, optional) – Which engine to use for calculations. “numba” is much faster than “python”, but requires numba to be installed. Defaults to “numba”

  • max_iterations (int, optional) – Maximum number of iterations for successive peak finding and fitting. Default: 20

Returns:

x and y are the coordinates of the features. mass is the total intensity of the feature, bg the background per pixel. size gives the radii (sigma) of the features. If raw_image has a frame_no attribute, a frame column with this information will also be appended.

Return type:

DataFrame([x, y, z, signal, mass, bg, size])

sdt.loc.daostorm_3d.locate_roi(data, roi, *args, **kwargs)#

Process a ROI in an image or image sequence using locate()

This chooses a region of interest in an image (or image sequence) before calling locate().

Parameters:
  • data (image data) – Passed to locate() as the first argument.

  • roi (roi.PathROI or path) – If this isn’t already a py:class:PathROI, use it to construct one, i. e. it is passed as the first parameter to the PathROI constructor.

  • rel_origin (bool, optional) – If True, the top-left corner coordinates of the path’s bounding rectangle will be subtracted off all feature coordinates, i. e. the top-left corner will be the new origin. Defaults to True. This is a keyword-only argument.

  • *args – Positional arguments passed to locate()

  • **kwargs – Keyword arguments passed to locate()

Returns:

Result of the calls to locate() restricted to the ROI. Coordinates are given with respect to the bounding box of the ROI.

Return type:

pandas.DataFrame

sdt.loc.daostorm_3d.batch(frames, *args, **kwargs)#

Process an image stack using locate()

Apply locate() to each image in frames. The image is passed as the first argument. For details on function parameters, see the locate() documentation.

Parameters:
  • frames (iterable of images) – Iterable of array-like objects that represent image data

  • *args – Positional arguments passed to locate()

  • **kwargs – Keyword arguments passed to locate()

  • num_threads (int) – Number of CPU threads to use. Defaults to the number of CPUs.

Returns:

Concatenation of DataFrames returned by the individual locate() calls. Additionally, there is a “frame” column specifying the frame number.

Return type:

pandas.DataFrame

sdt.loc.daostorm_3d.batch_roi(data, roi, *args, **kwargs)#

Process a ROI in an image or image sequence using batch()

This chooses a region of interest in an image (or image sequence) before calling batch().

Parameters:
  • data (image data) – Passed to batch() as the first argument.

  • roi (roi.PathROI or path) – If this isn’t already a py:class:PathROI, use it to construct one, i. e. it is passed as the first parameter to the PathROI constructor.

  • rel_origin (bool, optional) – If True, the top-left corner coordinates of the path’s bounding rectangle will be subtracted off all feature coordinates, i. e. the top-left corner will be the new origin. Defaults to True. This is a keyword-only argument.

  • *args – Positional arguments passed to batch()

  • **kwargs – Keyword arguments passed to batch()

Returns:

Result of the calls to batch() restricted to the ROI. Coordinates are given with respect to the bounding box of the ROI.

Return type:

pandas.DataFrame

cg#

sdt.loc.cg.locate(raw_image, radius, signal_thresh, mass_thresh, bandpass=True, noise_radius=1)[source]#

Locate bright, Gaussian-like features in an image

This implements an algorithm proposed by Crocker & Grier [Croc1996] and is based on the implementation by the Kilfoil group, see http://people.umass.edu/kilfoil/tools.php

Parameters:
  • raw_image (array-like) – Raw image data

  • radius (int) – This should be a number a little greater than the radius of the peaks.

  • signal_thresh (float) – A number roughly equal to the value of the brightest pixel (minus the CCD baseline) in the dimmest peak to be detected. Local maxima with brightest pixels below this threshold will be discarded.

  • mass_thresh (float) – Use a number roughly equal to the integrated intensity (mass) of the dimmest peak (minus the CCD baseline) that should be detected. If this is too low more background will be detected. If it is too high more peaks will be missed.

  • bandpass (bool, optional) – Set to True to turn on bandpass filtering, false otherwise. Default is True.

  • noise_radius (float, optional) – Noise correlation length in pixels. Defaults to 1.

Returns:

x and y are the coordinates of the features. mass is the total intensity of the feature. size gives the radii of gyration of the features and ecc the eccentricity. If raw_image has a frame_no attribute, a frame column with this information will also be appended.

Return type:

DataFrame([“x”, “y”, “mass”, “size”, “ecc”])

sdt.loc.cg.locate_roi(data, roi, *args, **kwargs)#

Process a ROI in an image or image sequence using locate()

This chooses a region of interest in an image (or image sequence) before calling locate().

Parameters:
  • data (image data) – Passed to locate() as the first argument.

  • roi (path) – This is used by the sdt.roi.PathROI constructor to create the ROI

  • rel_origin (bool, optional) – If True, the top-left corner coordinates of the path’s bounding rectangle will be subtracted off all feature coordinates, i. e. the top-left corner will be the new origin. Defaults to True. This is a keyword-only argument.

  • *args – Positional arguments passed to locate()

  • **kwargs – Keyword arguments passed to locate()

Returns:

Result of the calls to locate() restricted to the ROI. Coordinates are given with respect to the bounding box of the ROI.

Return type:

pandas.DataFrame

sdt.loc.cg.batch(frames, *args, **kwargs)#

Process an image stack using locate()

Apply locate() to each image in frames. The image is passed as the first argument. For details on function parameters, see the locate() documentation.

Parameters:
  • frames (iterable of images) – Iterable of array-like objects that represent image data

  • *args – Positional arguments passed to locate()

  • **kwargs – Keyword arguments passed to locate()

Returns:

Concatenation of DataFrames returned by the individual locate() calls. Additionally, there is a “frame” column specifying the frame number.

Return type:

pandas.DataFrame

sdt.loc.cg.batch_roi(data, roi, *args, **kwargs)#

Process a ROI in an image or image sequence using batch()

This chooses a region of interest in an image (or image sequence) before calling batch().

Parameters:
  • data (image data) – Passed to batch() as the first argument.

  • roi (path) – This is used by the sdt.roi.PathROI constructor to create the ROI

  • rel_origin (bool, optional) – If True, the top-left corner coordinates of the path’s bounding rectangle will be subtracted off all feature coordinates, i. e. the top-left corner will be the new origin. Defaults to True. This is a keyword-only argument.

  • *args – Positional arguments passed to batch()

  • **kwargs – Keyword arguments passed to batch()

Returns:

Result of the calls to batch() restricted to the ROI. Coordinates are given with respect to the bounding box of the ROI.

Return type:

pandas.DataFrame

z position fitting#

By introducing a zylindrical lense into the emission pathway, the point spread function gets distorted depending on the z position of the emitter. Instead of being circular, it becomes elliptic. This can used to deteremine the z position of the emitter, provided the feature fitting algorithm supports fitting elliptical features. Currently, this is true only for sdt.loc.daostorm_3d.

class sdt.loc.z_fit.Fitter(params, resolution=0.001)[source]#

Class for fitting the z position from the elipticity of PSFs

This implements the Zhuang group’s z fitting algorithm [*]. The calibration curves for x and y are calculated from the parameters and the z position is determined by finding the minimum “distance” from the curve.

Parameters:
  • params (Parameters) – Z fit parameters

  • resolution (float, optional) – Resolution, i. e. smallest z change detectable. Defaults to 1e-3.

fit(data)[source]#

Fit the z position

Takes a pandas.DataFrame with size_x and size_y columns and writes the z column.

Parameters:

data (pandas.DataFrame) – Fitting data. There need to be size_x and size_y columns. A z column will be written with fitted z position values.

class sdt.loc.z_fit.Parameters(z_range=(-0.5, 0.5))[source]#

Z position fitting parameters

When imaging with a zylindrical lense in the emission path, round features are deformed into ellipses whose semiaxes extensions are computed as

\[w = w_0 \sqrt{1 + \left(\frac{z - c}{d}\right)^2 + a_1 \left(\frac{z - c}{d}\right)^3 + a_2 \left(\frac{z - c}{d}\right)^4 + \ldots}\]
class Tuple(w0, c, d, a)[source]#

Named tuple of the parameters for one axis

w0, c, d

\(w_0, c, d\) of the calibration curve

Type:

float

a#

Polynomial coefficients \(a_i\) of the calibration curve

Type:

numpy.ndarray

z_range#

Minimum and maximum valid z positions. Defaults to (-0.5, 0.5).

property x#

x calibration curve parameter Tuple

property y#

y calibration curve parameter Tuple

sigma_from_z(z)[source]#

Calculate x and y sigmas corresponding to a z position

Parameters:

z (numpy.ndarray) – Array of z positions

Returns:

First row contains sigmas in x direction, second row is for the y direction.

Return type:

numpy.ndarray, shape=(2, len(z))

exp_factor_from_z(z)[source]#

Calculate the factor in the exponential of the Gaussian

Calculate the factors \(s_x, s_y\) in \(A \exp(-s_x(x-x_c)^2) \exp(-s_y(y-y_c)^2)\). These factors are therefore \(\frac{1}{2\sigma^2}\).

Parameters:

z (numpy.ndarray) – Array of z positions

Returns:

First row contains s in x direction, second row is for the y direction.

Return type:

numpy.ndarray, shape=(2, len(z))

exp_factor_der(z, factor=None)[source]#

Calculate the derivative of the the exponential factor w.r.t. z

The analytical expression for this is

\[\frac{ds}{dz} = -\frac{2 w_0^2 s^2}{d} \left(2\frac{z - c}{d} + 3 a_1 \left(\frac{z - c}{d}\right)^2 + 4 a_2 \left(\frac{z - c}{d}\right)^3 + \ldots \right).\]
Parameters:
  • z (numpy.ndarray) – Array of z positions

  • factor (numpy.ndarray or None, optional) – Result of exp_factor_from_z() call. If None, it will be called in this method. The purpose of this is to speed up computation if the exponential factor has already been calculated.

Returns:

First row contains \(\frac{ds}{dz}\) in x direction, second row is for the y direction.

Return type:

numpy.ndarray, shape=(2, len(z))

save(file)[source]#

Save parameters to a yaml file

Parameters:

file (str or file-like object) – File name or file to write to

classmethod load(file)[source]#

Load parameters from a yaml file

Parameters:

file (str or file-like object) – File name or file to read from

Returns:

Class instance with parameters loaded from file

Return type:

Parameters

classmethod calibrate(loc, guess=Tuple(w0=1.0, c=0.0, d=1.0, a=array([1., 1.])), z_range=(-0.5, 0.5))[source]#

Get parameters from calibration sample

Extract fitting parameters from PSFs where the z position is known.

Parameters:
  • loc (pandas.DataFrame) – Localization data of a calibration sample. z, size_x, and size_y columns need to be present.

  • guess (ParamTuple, optional) – Initial guess for the parameter fitting. The length of the guess.a array also determines the number of polynomial parameters to be fitted. Defaults to (1., 0., 1., np.ones(2)).

  • z_range (tuple of float) – Minimum and maximum valid z positions. Defaults to (-0.5, 0.5).

Returns:

Class instance with parameters from the calibration sample

Return type:

Parameters

Raw pixel extraction#

sdt.loc.get_raw_features(loc_data, img_data, size, columns={})[source]#

Get raw image data surrounding localizations

For each localization, return the raw image data in the proximity to the feature position.

Parameters:
  • loc_data (pandas.DataFrame) – Localization data. Index should be unique.

  • img_data (list-like of numpy.ndarray) – Image sequence

  • size (int) – For each feature, return a square of 2 * size + 1 pixels.

  • columns (dict, optional) – Override default column names as defined in config.columns. Relevant names are coords and time. This means, if your DataFrame has coordinate columns “x” and “z” and the time column “alt_frame”, set columns={"coords": ["x", "z"], "time": "alt_frame"}.

Returns:

Image data for features. Uses the localization data indices as keys (therefore, the index of loc_data has to be unique) and square numpy.ndarrays of 2 * size + 1 pixels as values.

Return type:

OrderedDict

References

[Babc2012] (1,2)

Babcock et al.: “A high-density 3D localization algorithm for stochastic optical reconstruction microscopy”, Opt Nanoscopy, 2012, 1

[Croc1996] (1,2)

Crocker, J. C. & Grier, D. G.: “Methods of digital video microscopy for colloidal studies”, Journal of colloid and interface science, Elsevier, 1996, 179, 298-310