Analyze spatial aspects of data
The sdt.spatial module provides methods for analyzing spatial aspects of single molecule data:
- Check whether features have near neighbors using the has_near_neighbor() function
- In tracking data, interpolate features that have been missed by the localization algorithm with the help of interpolate_coords()
- Calculate the area and center of mass of a polygon using polygon_area() and polygon_center()
- Find the smallest enclosing circle of a set of points via smallest_enclosing_circle()
Examples
To find out whether single molecule features have other features nearby, use the has_near_neighbor() function:
>>> loc = pandas.DataFrame([[10, 10], [10, 11], [20, 20]], columns=["x", "y"])
>>> loc
    x   y
0  10  10
1  10  11
2  20  20
>>> has_near_neighbor(loc, r=2.)
>>> loc
    x   y  has_neighbor
0  10  10             1
1  10  11             1
2  20  20             0
Missing localizations in single molecule tracking data can be interpolated by interpolate_coords():
>>> trc = pandas.DataFrame([[10, 10, 0, 0], [10, 10, 2, 0]],
...                        columns=["x", "y", "frame", "particle"])
>>> trc
    x   y  frame  particle
0  10  10      0         0
1  10  10      2         0
>>> trc_i = interpolate_coords(trc)
>>> trc_i
    x   y  frame  particle  interp
0  10  10      0         0       0
1  10  10      1         0       1
2  10  10      2         0       0
polygon_area() can be used to calculate the area of a polygon:
>>> vertices = [[0, 0], [10, 0], [10, 10], [0, 10]]
>>> polygon_area(vertices)
100.0
Programming reference
- sdt.spatial.has_near_neighbor(data, r, columns={})
Check whether localized features have near neighbors
Given a pandas.DataFrame data with localization data, each data point is checked for other points (in the same frame) that are closer than r. The results are written to a “has_neighbor” column of the data DataFrame.
- Parameters:
data (pandas.DataFrame) – Localization data. The “has_neighbor” column will be appended/overwritten with the results.
r (float) – Maximum distance for data points to be considered near neighbors.
columns (dict, optional) – Override default column names as defined in config.columns. Relevant names are coords and time. For example, if your DataFrame has coordinate columns “x” and “z” and the time column “alt_frame”, set columns={"coords": ["x", "z"], "time": "alt_frame"}.
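The per-frame neighbor check described above can be sketched in plain Python. This is only an illustration of the idea, not the sdt implementation: the function name `flag_near_neighbors` is hypothetical, the input is a list of `(x, y, frame)` tuples rather than a DataFrame, and treating a distance exactly equal to r as "near" is an assumption.

```python
from collections import defaultdict
from math import hypot


def flag_near_neighbors(points, r):
    """Return a 0/1 flag per point; 1 if another point in the same frame
    lies within distance r (hypothetical helper, brute-force search)."""
    # Group point indices by frame so that only same-frame pairs are compared
    by_frame = defaultdict(list)
    for idx, (_, _, frame) in enumerate(points):
        by_frame[frame].append(idx)

    flags = [0] * len(points)
    for indices in by_frame.values():
        for pos, a in enumerate(indices):
            for b in indices[pos + 1:]:
                dist = hypot(points[a][0] - points[b][0],
                             points[a][1] - points[b][1])
                if dist <= r:  # assumption: the boundary counts as near
                    flags[a] = flags[b] = 1
    return flags


# Three points in frame 0, one point in frame 1
points = [(10, 10, 0), (10, 11, 0), (20, 20, 0), (10, 10, 1)]
flags = flag_near_neighbors(points, r=2.0)
print(flags)
```

The last point shares coordinates with the first one but lives in a different frame, so it is not flagged. For large data sets a spatial index would be preferable to the quadratic pair loop used here.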
- sdt.spatial.interpolate_coords(tracks, columns={})
Interpolate coordinates for missing localizations
For each particle in tracks, interpolate coordinates for frames where no localization was detected.
- Parameters:
tracks (pandas.DataFrame) – Tracking data
columns (dict, optional) – Override default column names as defined in config.columns. Relevant names are coords, time, and particle. For example, if your DataFrame has coordinate columns “x” and “z” and the time column “alt_frame”, set columns={"coords": ["x", "z"], "time": "alt_frame"}.
- Returns:
Tracking data with missing frames interpolated. An “interp” column is added: it is False where the localization was originally detected and True where it was added via interpolation by this function.
- Return type:
pandas.DataFrame
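As a rough illustration of what filling in missing frames can look like for a single particle, here is a minimal sketch. It assumes linear interpolation between the neighboring detected frames, which the text above does not state explicitly; the function name `interpolate_track` and the tuple-based input are hypothetical and unrelated to the sdt API.

```python
def interpolate_track(track):
    """Fill frame gaps in one particle's track.

    track: list of (x, y, frame) tuples sorted by frame.
    Returns a list of (x, y, frame, interp) tuples, where interp is 1 for
    rows created by interpolation and 0 for original localizations.
    """
    result = []
    for (x0, y0, f0), (x1, y1, f1) in zip(track, track[1:]):
        result.append((x0, y0, f0, 0))
        # Assumption: positions change linearly between detected frames
        for f in range(f0 + 1, f1):
            t = (f - f0) / (f1 - f0)
            result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0), f, 1))
    if track:
        x, y, f = track[-1]
        result.append((x, y, f, 0))
    return result


# Frame 1 is missing and gets interpolated between frames 0 and 2
track = [(10.0, 10.0, 0), (14.0, 12.0, 2), (14.0, 12.0, 3)]
filled = interpolate_track(track)
for row in filled:
    print(row)
```

For tracking data with several particles, the same routine would be applied to each particle's rows separately.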
- sdt.spatial.polygon_area(vertices)
Calculate the (signed) area of a simple polygon
The polygon must not self-intersect.
This is based on JavaScript code from http://www.mathopenref.com/coordpolygonarea2.html.
    function polygonArea(X, Y, numPoints)
    {
      area = 0;             // Accumulates area in the loop
      j = numPoints - 1;    // The last vertex is the 'previous' one to the first
      for (i=0; i<numPoints; i++) {
        area = area + (X[j]+X[i]) * (Y[j]-Y[i]);
        j = i;              // j is previous vertex to i
      }
      return area/2;
    }
For triangles, a faster, specialized code path based on the cross product is used.
- Parameters:
vertices (Sequence[Sequence[float]]) – Coordinates of the polygon vertices.
- Returns:
Signed area of the polygon. Area is > 0 if vertices are given counterclockwise.
- Return type:
float
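The signed-area computation can be sketched in Python with the shoelace formula. This is an illustrative stand-in, not the sdt source: the name `shoelace_area` is hypothetical, and the cross-product form is chosen so that counterclockwise vertices give a positive area, as stated above. Evaluated literally, the quoted JavaScript snippet yields the opposite sign for the same vertex order.

```python
def shoelace_area(vertices):
    """Signed area of a simple polygon; positive for counterclockwise
    vertex order (hypothetical helper)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return twice_area / 2


square = [[0, 0], [10, 0], [10, 10], [0, 10]]
ccw = shoelace_area(square)        # counterclockwise order
cw = shoelace_area(square[::-1])   # same square, clockwise order
print(ccw, cw)
```

Reversing the vertex order flips the sign but not the magnitude, which is why the absolute value is commonly taken when only the size of the area matters.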
- sdt.spatial.polygon_center(vertices, area=None)
Compute center of mass of a polygon according to the formula
\[
\begin{aligned}
C_\mathrm{x} &= \frac{1}{6A} \sum_{i=0}^{n-1} (x_{i} + x_{i+1}) (x_{i} y_{i+1} - x_{i+1} y_{i})\\
C_\mathrm{y} &= \frac{1}{6A} \sum_{i=0}^{n-1} (y_{i} + y_{i+1}) (x_{i} y_{i+1} - x_{i+1} y_{i})
\end{aligned}
\]
where \(A\) is the signed polygon area as computed by polygon_area(). Note that the formula assumes a closed polygon, i.e., \((x_n, y_n) = (x_0, y_0)\). This function also works for open polygons, where the first vertex is not repeated at the end.
- Parameters:
vertices (Sequence[Sequence[float]]) – Sequence of \((x, y)\) coordinate pairs.
area (float | None) – If already computed, pass area of the polygon for efficiency
- Return type:
Coordinates of the center of mass
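The centroid formula above translates almost directly into code. The following is a minimal sketch under the assumption of an open vertex list (first vertex not repeated), with the wrap-around handled by a modulo index; `polygon_centroid` is a hypothetical name, not the sdt function. Since the cross terms sum to \(2A\), the factor \(6A\) equals three times that sum.

```python
def polygon_centroid(vertices):
    """Center of mass of a simple polygon given as (x, y) pairs
    (hypothetical helper following the formula above)."""
    n = len(vertices)
    twice_area = 0.0
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # closes the polygon implicitly
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    # 6 * A == 3 * twice_area
    return cx / (3 * twice_area), cy / (3 * twice_area)


square_center = polygon_centroid([[0, 0], [10, 0], [10, 10], [0, 10]])
triangle_center = polygon_centroid([[0, 0], [6, 0], [0, 3]])
print(square_center, triangle_center)
```

Because both the sums and the area change sign together when the vertex order is reversed, the result does not depend on orientation.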
- sdt.spatial.smallest_enclosing_circle(coords, shuffle=True)
Find the smallest circle enclosing a set of points
- Parameters:
coords (Sequence[Sequence[float]]) – 2D coordinates of points
shuffle (bool | RandomState) – If True, shuffle coordinate list before calculating circle. If a RandomState instance is passed, use it for shuffling. Note that coordinates should be in a random order for performance reasons.
- Return type:
Center coordinates and radius
Note
If you want to calculate the smallest enclosing circle in a numba.jit-ed function, have a look at smallest_enclosing_circle_impl().
Smallest enclosing circle algorithm
The smallest enclosing circle \(C_n\) for points \(p_1, p_2, …, p_n\) in randomized order is found iteratively. Let’s assume that \(C_{i-1}\) has already been found. If \(p_i\) lies within \(C_{i-1}\), then \(C_i = C_{i-1}\). Else, \(C_i\) needs to be the smallest enclosing circle for \(p_1, p_2, …, p_i\); \(p_i\) has to lie on the circle.
This new problem is again solved iteratively. Assume \(C'_{j-1}\) is the smallest enclosing circle for \(p_1, p_2, …, p_{j-1}\), \(j < i\) with \(p_i\) on the circle. Then \(C'_j = C'_{j-1}\) if \(p_j\) lies within \(C'_{j-1}\). Else the smallest enclosing circle for \(p_1, p_2, …, p_j\) with \(p_j\) and \(p_i\) on the circle needs to be found.
If all \(p_1, p_2, …, p_j\) lie within the circle whose diameter is the line \(l\) connecting \(p_j\) and \(p_i\), this circle is \(C'_j\). Otherwise, two possible candidates for \(C'_j\) are given by the circles through \(p_i\), \(p_j\), and a third point \(p_k\), \(k < j\), one for each side of the line \(l\), chosen such that the circle centers are farthest away from \(l\). Of those two, the circle with the smaller radius is chosen.
Source: Project Nayuki, in particular this presentation.
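The three nested steps described above can be sketched in pure Python as follows. This is an illustrative reimplementation of the described procedure, not the sdt or numba code; all function names, the `seed` parameter, and the relative tolerance in `_contains` are assumptions made for this example. Circles are represented as `(cx, cy, r)` tuples.

```python
import math
import random


def _contains(circle, p):
    # Small relative tolerance (an assumption) to absorb rounding errors
    return math.hypot(p[0] - circle[0], p[1] - circle[1]) <= circle[2] * (1 + 1e-12)


def _cross(p, q, s):
    # > 0 if s lies to the left of the directed line p -> q, < 0 if to the right
    return (q[0] - p[0]) * (s[1] - p[1]) - (q[1] - p[1]) * (s[0] - p[0])


def _diameter_circle(a, b):
    cx, cy = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2
    return cx, cy, math.hypot(a[0] - cx, a[1] - cy)


def _circumcircle(a, b, c):
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if d == 0:
        return None  # collinear points have no circumcircle
    a2, b2, c2 = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    cx = (a2 * (b[1] - c[1]) + b2 * (c[1] - a[1]) + c2 * (a[1] - b[1])) / d
    cy = (a2 * (c[0] - b[0]) + b2 * (a[0] - c[0]) + c2 * (b[0] - a[0])) / d
    return cx, cy, math.hypot(a[0] - cx, a[1] - cy)


def _circle_two_points(points, p, q):
    # Smallest circle enclosing `points` with p and q on its boundary
    circ = _diameter_circle(p, q)
    left = right = None  # (circle, signed offset of its center from line l)
    for s in points:
        if _contains(circ, s):
            continue
        cand = _circumcircle(p, q, s)
        if cand is None:
            continue
        side, offset = _cross(p, q, s), _cross(p, q, cand)
        if side > 0 and (left is None or offset > left[1]):
            left = (cand, offset)
        elif side < 0 and (right is None or offset < right[1]):
            right = (cand, offset)
    candidates = [c for c in (left, right) if c is not None]
    if not candidates:
        return circ
    return min(candidates, key=lambda c: c[0][2])[0]


def _circle_one_point(points, p):
    # Smallest circle enclosing `points` with p on its boundary
    c = (p[0], p[1], 0.0)
    for j, q in enumerate(points):
        if not _contains(c, q):
            c = _diameter_circle(p, q) if c[2] == 0.0 else _circle_two_points(points[:j], p, q)
    return c


def enclosing_circle(points, seed=0):
    pts = [(float(x), float(y)) for x, y in points]
    random.Random(seed).shuffle(pts)  # random order, as the text recommends
    c = None
    for i, p in enumerate(pts):
        if c is None or not _contains(c, p):
            c = _circle_one_point(pts[:i], p)
    return c


corners = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
cx, cy, radius = enclosing_circle(corners)
print(cx, cy, radius)
```

For the four corners of a square plus its midpoint, the result is the circle centered on the midpoint that passes through the corners. The shuffle only affects running time, not the result.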