Data input/output#
sdt.io provides convenient ways to save and load many kinds of data.
Image sequences can be saved as multi-page TIFF files, including metadata, with the help of save_as_tiff(). Using the SdtTiffStack package, these files can easily be read again.
There is support for reading single molecule data as produced by the sdt package and various MATLAB tools using the load() function. Most data formats can be written by save().
Further, there are helpers for common filesystem-related tasks, such as chdir() and get_files().
YAML is a way of storing data in both a human- and a machine-readable way. The sdt.io.yaml submodule extends PyYAML to give a nice representation of numpy.ndarrays. Further, it provides a mechanism to easily add representations for custom data types. sdt.io.yaml has support for ROI types from sdt.roi, slice, OrderedDict, and numpy.ndarray.
Examples
Open an image sequence. Create substacks without actually loading any data; data is loaded only when accessing single frames.
>>> seq = ImageSequence("images.SPE").open()
>>> len(seq)
100
>>> seq2 = seq[::2] # No data is loaded here
>>> len(seq2)
50
>>> frame = seq2[1] # Load frame 1 (i.e., frame 2 in the original `seq`)
>>> frame.shape
(100, 150)
>>> seq.close()
Save an image sequence to a TIFF file using save_as_tiff()
:
>>> with ImageSequence("images.SPE") as seq:
... save_as_tiff(seq, "images.tif")
>>> seq = [frame1, frame2, frame3]  # list of arrays representing images
>>> save_as_tiff(seq, "images2.tif")
load() can read many types of single molecule data into a pandas.DataFrame:
>>> d1 = load("features.h5")
>>> d1.head()
x y signal bg mass size frame
0 97.333295 61.423270 252.900938 217.345552 1960.274055 1.110691 0
1 60.857730 82.120585 315.317311 229.205847 724.322652 0.604647 0
2 83.271210 6.144862 215.995479 224.119462 911.167854 0.819383 0
3 8.354563 33.013809 177.990405 216.341051 1284.869645 1.071868 0
4 46.215290 40.053183 207.207850 219.746090 1719.788381 1.149329 0
>>> d2 = load("tracks.trc")
>>> d2.head()
x y mass frame particle
0 14.328209 53.256334 17558.629 1.0 0.0
1 14.189825 53.204634 17850.164 2.0 0.0
2 14.371586 53.391367 18323.903 3.0 0.0
3 14.363836 53.415152 16024.740 4.0 0.0
4 14.528098 53.242159 14341.417 5.0 0.0
>>> d3 = load("features.pkc")
>>> d3.head()
x y size mass bg bg_dev frame
0 39.888750 97.023047 1.123692 8332.624410 506.853598 102.278242 0.0
1 41.918963 102.717941 1.062784 8197.686482 306.632393 126.153321 0.0
2 38.584142 66.143237 0.883132 7314.566544 273.506181 29.597416 0.0
3 68.595091 96.649889 0.904778 6837.369352 275.512017 29.935145 0.0
4 55.593909 109.955202 1.094519 7331.581064 279.787186 38.772275 0.0
Single molecule data can be saved in various formats using save()
:
>>> save("output.h5", d1)
>>> save("output.trc", d2)
Temporarily change the working directory using chdir()
:
>>> with chdir("subdir"):
...     pass  # here the working directory is "subdir"
>>> # here we are back in the original directory
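The behavior shown above can be sketched as a small context manager built on the standard library; this is a hypothetical re-implementation to illustrate the idea, not the actual sdt.io code:

```python
import contextlib
import os


@contextlib.contextmanager
def chdir_sketch(path):
    """Temporarily change the working directory, restoring it on exit."""
    old = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the old directory even if the body raised an exception
        os.chdir(old)
```

Because the restore happens in a `finally` block, the original working directory is reinstated even when an exception propagates out of the `with` block.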
Recursively search files matching a regular expression in a subdirectory by
means of get_files()
:
>>> names, ids = get_files(r"^image_.*_(\d{3}).tif$", "subdir")
>>> names
['image_xxx_001.tif', 'image_xxx_002.tif', 'image_yyy_003.tif']
>>> ids
[(1,), (2,), (3,)]
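The name/ID splitting seen above can be sketched in plain Python: walk a directory tree, match each relative path against the regular expression, and collect the captured groups, converting numeric ones to int. This illustrates the idea with a hypothetical helper name; it is not the real get_files() implementation:

```python
import os
import re


def get_files_sketch(pattern, path="."):
    """Recursively find files matching `pattern`; return (names, ids)."""
    prog = re.compile(pattern)
    names = []
    ids = []
    for root, _, files in os.walk(path):
        for f in files:
            rel = os.path.relpath(os.path.join(root, f), path)
            m = prog.search(rel.replace(os.sep, "/"))
            if m:
                names.append(rel)
                # Numeric capture groups become ints, as in the example above
                ids.append(tuple(int(g) if g.isdigit() else g
                                 for g in m.groups()))
    if not names:
        return [], []
    names, ids = zip(*sorted(zip(names, ids)))
    return list(names), list(ids)
```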
sdt.io.yaml
extends PyYAML’s yaml
package and can be used
in place of it:
>>> from io import StringIO # standard library io, not sdt.io
>>> sio = StringIO() # write to StringIO instead of a file
>>> from sdt.io import yaml
>>> import numpy
>>> a = numpy.arange(10).reshape((2, -1)) # example data to be dumped
>>> yaml.safe_dump(a, sio) # sdt.io.yaml.safe_dump instead of PyYAML safe_dump
>>> print(sio.getvalue())
!array
[[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]]
Image files#
- class sdt.io.ImageSequence(uri, **kwargs)[source]#
Sliceable, lazy-loading interface to multi-image files
Single images can be retrieved by index, while substacks can be created by slicing and fancy indexing using lists/arrays of indices or boolean indices. Creating substacks does not load data into memory, which allows dealing with files containing many images.
Examples
Load 3rd frame:
>>> with ImageSequence("some_file.tif") as stack:
...     img = stack[3]
Use fancy indexing to create substacks:
>>> stack = ImageSequence("some_file.tif").open()
>>> len(stack)
30
>>> substack1 = stack[1::2]  # Slice, will not load any data
>>> len(substack1)
15
>>> np.all(substack1[1] == stack[3])  # Actually load data using int index
True
>>> substack2 = stack[[3, 5]]  # Create lazy substack using list of indices
>>> substack3 = stack[[True, False] * (len(stack) // 2)]  # or boolean index
>>> stack.close()
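The lazy-substack behavior can be modeled with a tiny index-mapping class: slicing or fancy indexing only composes an index list, and data is actually read only when an integer index is requested. This is a simplified stand-in to illustrate the mechanism, not the real ImageSequence code:

```python
class LazySeqSketch:
    """Minimal model of lazy slicing: only an index map is stored."""

    def __init__(self, load_func, indices):
        self._load = load_func        # called only for integer indexing
        self._indices = list(indices)

    def __len__(self):
        return len(self._indices)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Compose index maps; no data is loaded here
            return LazySeqSketch(self._load, self._indices[key])
        if isinstance(key, list):
            if key and isinstance(key[0], bool):
                sel = [i for i, k in zip(self._indices, key) if k]
            else:
                sel = [self._indices[k] for k in key]
            return LazySeqSketch(self._load, sel)
        # Integer index: actually load the frame
        return self._load(self._indices[key])


loads = []  # record which original frames were read
seq = LazySeqSketch(lambda i: loads.append(i) or i, range(100))
sub = seq[::2]   # no loading happens here
frame = sub[1]   # loads original frame 2 only now
```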
- Parameters:
uri (str | pathlib.Path | bytes | IO) – File or file location or data to read from.
format – File format. Use None for automatic detection.
**kwargs – Keyword arguments passed to
imageio.v3.imopen()
when opening the file.
- property is_slice: bool#
Whether this instance is the result of slicing another instance
- uri: str | Path | bytes | IO#
File or file location or data to read from.
- reader_args: Mapping#
Keyword arguments passed to
imageio.v3.imopen()
when opening file.
- get_data(t, **kwargs)[source]#
Get a single frame
- Parameters:
t (int) – Frame number
**kwargs – Additional keyword arguments to pass to the imageio plugin’s
read()
method.
- Returns:
Image data. This has a frame_no attribute holding the original frame
number.
- Return type:
Image
- get_metadata(t=None)[source]#
Get metadata for a frame
If t is not given, return the global metadata.
- Parameters:
t (int | None) – Frame number
- Returns:
Metadata dictionary. A “frame_no” entry with the original frame
number (i.e., before slicing the sequence) is also added.
- Return type:
Dict
- get_meta_data(t=None)[source]#
Alias for
get_metadata()
- Parameters:
t (int | None) –
- Return type:
Dict
- property closed: bool#
True if the file is currently closed.
- sdt.io.save_as_tiff(filename, frames, metadata=None, contiguous=True)[source]#
Write a sequence of images to a TIFF stack
If the items in frames contain a dict named metadata, an attempt is made to serialize it to YAML and save it as the TIFF file’s ImageDescription tag.
- Parameters:
filename (str | Path) – Name of the output file
frames (Iterable[ndarray]) – Frames to be written to TIFF file.
metadata (None | Iterable[Mapping] | Mapping) – Metadata to be written. If a single dict, save with the first frame. If an iterable, save each entry with the corresponding frame.
contiguous (bool) – Whether to write to the TIFF file contiguously or not. This has implications when reading the data. If using PIMS, set to True. If using imageio, use
"I"
mode for reading if True. Setting to False allows for per-image metadata.
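The metadata handling described above (a single dict is saved with the first frame; an iterable supplies one entry per frame) can be sketched as a pairing step. The helper below only illustrates how such an argument might be normalized and is not taken from the library:

```python
from collections.abc import Mapping
from itertools import chain, repeat


def pair_metadata_sketch(frames, metadata):
    """Yield (frame, metadata) pairs following the rules described above."""
    if metadata is None:
        md_iter = repeat(None)
    elif isinstance(metadata, Mapping):
        # A single dict is saved with the first frame only
        md_iter = chain([metadata], repeat(None))
    else:
        # An iterable supplies per-frame metadata
        md_iter = iter(metadata)
    return list(zip(frames, md_iter))
```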
Single molecule data#
Generic functions#
- sdt.io.load(filename, typ='auto', fmt='auto', color='red')[source]#
Load localization or tracking data from file
Use the load_*() function appropriate for the file type in order to load the data. The file type is determined by the file’s extension or the fmt parameter.
Supported file types:
HDF5 files (*.h5)
ThunderSTORM CSV files (*.csv)
particle_tracking_2D positions (*_positions.mat)
particle_tracking_2D tracks (*_tracks.mat)
pkc files (*.pkc)
pks files (*.pks)
trc files (*.trc)
- Parameters:
filename (str or pathlib.Path) – Name of the file
typ (str, optional) – If the file is HDF5, load this key (usually either “features” or “tracks”), unless it is “auto”. In that case try to read “tracks” and if that fails, try to read “features”. If the file is in particle_tracker format, this can be either “auto”, “features” or “tracks”. Defaults to “auto”.
fmt ({"auto", "hdf5", "particle_tracker", "pkc", "pks", "trc", "csv"}, optional) – Input format. If "auto", infer the format from filename. Otherwise, assume the given format.
color ({"red", "green"}, optional) – For pkc files, specify whether to load the red (default) or green channel.
- Returns:
Loaded data
- Return type:
pandas.DataFrame
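The fmt="auto" file-type detection can be sketched as a suffix-based dispatch mirroring the supported types listed above. This is an illustration with a hypothetical helper name, not the library's internals:

```python
import pathlib


def guess_fmt_sketch(filename):
    """Guess the format name from the file name, as fmt="auto" would."""
    name = pathlib.Path(filename).name
    # Special-cased particle_tracking_2D MATLAB files
    if name.endswith("_positions.mat") or name.endswith("_tracks.mat"):
        return "particle_tracker"
    suffix_map = {
        ".h5": "hdf5",   # HDF5 files
        ".csv": "csv",   # ThunderSTORM CSV files
        ".pkc": "pkc",
        ".pks": "pks",
        ".trc": "trc",
    }
    try:
        return suffix_map[pathlib.Path(name).suffix]
    except KeyError:
        raise ValueError(f"unknown file type: {filename}")
```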
- sdt.io.save(filename, data, typ='auto', fmt='auto')[source]#
Save feature/tracking data
This supports HDF5, trc, and particle_tracker formats.
- Parameters:
filename (str or pathlib.Path) – Name of the file to write to
data (pandas.DataFrame) – Data to save
typ ({"auto", "features", "tracks"}) – Specify whether to save feature data or tracking data. If “auto”, consider data tracking data if a “particle” column is present, otherwise treat as feature data.
fmt ({"auto", "hdf5", "particle_tracker", "trc"}) – Output format. If “auto”, infer the format from filename. Otherwise, write the given format.
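The typ="auto" rule described above (a "particle" column marks tracking data) can be sketched on any column collection. A hypothetical, stdlib-only helper so it works on plain lists as well as DataFrame.columns:

```python
def guess_typ_sketch(columns):
    """Classify data as "tracks" or "features" from its column names."""
    # Tracking data carries a "particle" column assigning localizations
    # to trajectories; feature (localization-only) data does not.
    return "tracks" if "particle" in columns else "features"
```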
Specific functions#
- sdt.io.load_msdplot(filename)[source]#
Load msdplot data from .mat file
- Parameters:
filename (str or pathlib.Path) – Name of the file to load
- Returns:
dict with keys d, stderr, qianerr, pa, and emsd, where d is the diffusion coefficient in μm²/s, stderr its standard error, qianerr its Qian error, pa the positional accuracy in nm, and emsd a pandas.DataFrame containing the msd-vs.-tlag data.
- Return type:
dict
- sdt.io.load_pt2d(filename, typ, load_protocol=True)[source]#
Load a _positions.mat file created by particle_tracking_2D
Use scipy.io.loadmat() to load the file and convert the data to a pandas.DataFrame.
- Parameters:
filename (str or pathlib.Path) – Name of the file to load
typ ({"features", "tracks"}) – Specify whether to load feature data (positions.mat) or tracking data (tracks.mat)
load_protocol (bool, optional) – Look for a _protocol.mat file (i.e. replace the “_positions.mat” part of filename with “_protocol.mat”) in order to load the column names. This may be buggy for some older versions of particle_tracking_2D. If reading the protocol fails, this behaves as if load_protocol=False. Defaults to True.
- Returns:
Loaded data.
- Return type:
pandas.DataFrame
- sdt.io.load_pkmatrix(filename, green=False)[source]#
Load a pkmatrix from a .mat file
Use scipy.io.loadmat() to load the file and convert the data to a pandas.DataFrame.
- Parameters:
filename (str or pathlib.Path) – Name of the file to load
green (bool) – If True, load pkmatrix_green, which is the right half of the image when using
prepare_peakposition
in 2 color mode. Otherwise, load pkmatrix. Defaults to False.
- Returns:
Loaded data.
- Return type:
pandas.DataFrame
- sdt.io.load_pks(filename)[source]#
Load a pks matrix from a MATLAB file
Use scipy.io.loadmat() to load the file and convert the data to a pandas.DataFrame.
- Parameters:
filename (str or pathlib.Path) – Name of the file to load
- Returns:
Loaded data.
- Return type:
pandas.DataFrame
- sdt.io.load_trc(filename)[source]#
Load tracking data from a .trc file
- Parameters:
filename (str or pathlib.Path) – Name of the file to load
- Returns:
Loaded data.
- Return type:
pandas.DataFrame
- sdt.io.load_csv(filename)[source]#
Load localization data from a CSV file created by ThunderSTORM
- Parameters:
filename (str or pathlib.Path) – Name of the file to load
- Returns:
Single molecule data
- Return type:
pandas.DataFrame
- sdt.io.save_pt2d(filename, data, typ='tracks')[source]#
Save feature/tracking data in particle_tracker format
- Parameters:
filename (str or pathlib.Path) – Name of the file to write to
data (pandas.DataFrame) – Data to save
typ ({"features", "tracks"}) – Specify whether to save feature data or tracking data.
YAML#
- sdt.io.yaml.load(stream, Loader=<class 'sdt.io.yaml.Loader'>)[source]#
Wrap PyYAML’s yaml.load() using Loader
- sdt.io.yaml.load_all(stream, Loader=<class 'sdt.io.yaml.Loader'>)[source]#
Wrap PyYAML’s yaml.load_all() using Loader
- sdt.io.yaml.safe_load(stream)[source]#
Wrap PyYAML’s yaml.load() using SafeLoader
- sdt.io.yaml.safe_load_all(stream)[source]#
Wrap PyYAML’s yaml.load_all() using SafeLoader
- sdt.io.yaml.dump(data, stream=None, Dumper=<class 'sdt.io.yaml.Dumper'>, **kwds)[source]#
Wrap PyYAML’s yaml.dump() using Dumper
- sdt.io.yaml.dump_all(documents, stream=None, Dumper=<class 'sdt.io.yaml.Dumper'>, **kwds)[source]#
Wrap PyYAML’s yaml.dump_all() using Dumper
- sdt.io.yaml.safe_dump(data, stream=None, **kwds)[source]#
Wrap PyYAML’s yaml.dump() using SafeDumper
- sdt.io.yaml.safe_dump_all(documents, stream=None, **kwds)[source]#
Wrap PyYAML’s yaml.dump_all() using SafeDumper
- class sdt.io.yaml.Loader(stream)[source]#
An ArrayLoader with support for many more types of the sdt package, e.g. roi.ROI.
- class sdt.io.yaml.SafeLoader(stream)[source]#
A SafeArrayLoader with support for many more types of the sdt package, e.g. roi.ROI.
- class sdt.io.yaml.Dumper(stream, default_style=None, default_flow_style=False, canonical=None, indent=None, width=None, allow_unicode=None, line_break=None, encoding=None, explicit_start=None, explicit_end=None, version=None, tags=None, sort_keys=True)[source]#
An ArrayDumper with support for many more types of the sdt package, e.g. roi.ROI.
- class sdt.io.yaml.SafeDumper(stream, default_style=None, default_flow_style=False, canonical=None, indent=None, width=None, allow_unicode=None, line_break=None, encoding=None, explicit_start=None, explicit_end=None, version=None, tags=None, sort_keys=True)[source]#
A SafeArrayDumper with support for many more types of the sdt package, e.g. roi.ROI.
- sdt.io.yaml.register_yaml_class(cls)[source]#
Add support for representing and loading a class
A representer is added to Dumper and SafeDumper. A constructor is added to Loader and SafeLoader.
The class should have a yaml_tag attribute and may have a yaml_flow_style attribute.
If to_yaml or from_yaml class methods exist, they will be used for representing or constructing class instances (see PyYAML’s yaml.Dumper.add_representer() and yaml.Loader.add_constructor() for details). Otherwise, the default Dumper.represent_yaml_object() and Loader.construct_yaml_object() are used.
- Parameters:
cls (type) – Class to add support for
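The registration mechanism can be modeled without PyYAML: a registry maps a class’s yaml_tag to representer/constructor callables, preferring to_yaml/from_yaml class methods and falling back to a default based on the instance __dict__. This is a stand-alone illustration of the dispatch logic, not the actual implementation:

```python
representers = {}  # type -> callable(obj) -> (tag, mapping)
constructors = {}  # tag -> callable(mapping) -> obj


def register_sketch(cls):
    """Register a representer and a constructor for `cls`."""
    tag = cls.yaml_tag
    if hasattr(cls, "to_yaml"):
        representers[cls] = lambda obj: (tag, cls.to_yaml(obj))
    else:
        # Default: represent the instance by its attribute dict
        representers[cls] = lambda obj: (tag, dict(obj.__dict__))
    if hasattr(cls, "from_yaml"):
        constructors[tag] = cls.from_yaml
    else:
        # Default: construct by passing the mapping as keyword arguments
        constructors[tag] = lambda mapping: cls(**mapping)


class Point:
    yaml_tag = "!Point"

    def __init__(self, x, y):
        self.x, self.y = x, y


register_sketch(Point)
tag, mapping = representers[Point](Point(1, 2))  # represent
restored = constructors[tag](mapping)            # construct
```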