CNN -> Datasets

Image Augmentation

class tardis_em.cnn.datasets.augmentation.CenterCrop(size: tuple)

CENTER CROP ARRAY

Center-crop the image and mask to a given size for 3D [DxHxW] or 2D [HxW] images.

Parameters:

size (tuple) – Output size of image in DHW/HW.
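A minimal NumPy sketch of a center crop is shown here for illustration. The function name center_crop is hypothetical and this is not the CenterCrop source. It only shows how one centered window can be applied to both image and mask.

```python
import numpy as np


def center_crop(array: np.ndarray, size: tuple) -> np.ndarray:
    """Crop the central region of a 2D (HxW) or 3D (DxHxW) array."""
    if len(size) != array.ndim:
        raise ValueError("size must have one entry per array dimension")

    slices = []
    for dim, out in zip(array.shape, size):
        if out > dim:
            raise ValueError("crop size cannot exceed the array size")
        start = (dim - out) // 2  # offset that centers the window
        slices.append(slice(start, start + out))
    return array[tuple(slices)]


image = np.arange(36).reshape(6, 6)
mask = (image % 2).astype(np.uint8)

# The same window is applied to image and mask so they stay aligned.
image_crop = center_crop(image, (4, 4))
mask_crop = center_crop(mask, (4, 4))
```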

class tardis_em.cnn.datasets.augmentation.RandomFlip

180 RANDOM FLIP ARRAY

Randomly flip the array by 180 degrees along the z, x, or y axis for 2D or 3D arrays.

  • 0 is z axis, 1 is x axis, 2 is y axis for 3D

  • 0 is x axis, 1 is y axis for 2D
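The axis selection above can be sketched with NumPy as follows. The helper random_flip is hypothetical and is not the RandomFlip implementation. It assumes one axis index is drawn at random and reused for both arrays.

```python
import numpy as np


def random_flip(image, mask, rng):
    """Flip image and mask along one randomly chosen axis."""
    # Axis index is 0..2 for a 3D array and 0..1 for a 2D array.
    axis = int(rng.integers(0, image.ndim))
    return np.flip(image, axis=axis), np.flip(mask, axis=axis), axis


rng = np.random.default_rng(0)
image = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
mask = (image > 11).astype(np.uint8)

flipped_image, flipped_mask, axis = random_flip(image, mask, rng)
```

Reusing the drawn axis for the mask keeps every label on top of the voxel it describes.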

class tardis_em.cnn.datasets.augmentation.RandomRotation

MULTIPLE 90 RANDOM ROTATIONS

Rotate a 2D or 3D array by 90, 180, or 270 degrees to the left or right.

  • 0 is 90, 1 is 180, 2 is 270

  • 0 is left, 1 is right
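The two random choices above (angle and direction) can be sketched with np.rot90. This is an illustration, not the RandomRotation source. It assumes the rotation happens in the HxW plane and that "left" means counterclockwise; the helper name random_rotation is hypothetical.

```python
import numpy as np


def random_rotation(image, mask, rng):
    """Rotate image and mask by a random multiple of 90 degrees."""
    k = int(rng.integers(0, 3)) + 1      # 0 -> 90, 1 -> 180, 2 -> 270 degrees
    direction = int(rng.integers(0, 2))  # 0 -> left, 1 -> right
    signed_k = k if direction == 0 else -k

    # Assumption: rotate in the plane of the last two axes (H and W).
    axes = (image.ndim - 2, image.ndim - 1)
    rotated_image = np.rot90(image, signed_k, axes=axes)
    rotated_mask = np.rot90(mask, signed_k, axes=axes)
    return rotated_image, rotated_mask, signed_k


rng = np.random.default_rng(0)
image = np.arange(32, dtype=np.float64).reshape(2, 4, 4)
mask = (image > 15).astype(np.uint8)

rotated_image, rotated_mask, signed_k = random_rotation(image, mask, rng)
```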

class tardis_em.cnn.datasets.augmentation.ComposeRandomTransformation(transformations: list)

RANDOM TRANSFORMATION WRAPPER

Wrapper that applies the same random transformations to both the image and the mask.

Parameters:

transformations (list) – List of transform objects from which one or more transformations are selected.
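One way such a wrapper could work is sketched below. The class ComposeRandom and the two toy transforms are hypothetical and do not reproduce ComposeRandomTransformation. The sketch assumes each transform is a callable taking and returning an (image, mask) pair.

```python
import numpy as np


class ComposeRandom:
    """Apply a random subset of transforms to image and mask together."""

    def __init__(self, transformations, seed=None):
        self.transformations = transformations
        self.rng = np.random.default_rng(seed)

    def __call__(self, image, mask):
        # Draw how many transforms to apply (at least one), then which ones.
        n = int(self.rng.integers(1, len(self.transformations) + 1))
        chosen = self.rng.choice(len(self.transformations), size=n, replace=False)
        for idx in chosen:
            image, mask = self.transformations[idx](image, mask)
        return image, mask


def flip_first_axis(image, mask):
    return np.flip(image, 0), np.flip(mask, 0)


def rotate_180(image, mask):
    return np.rot90(image, 2), np.rot90(mask, 2)


image = np.arange(16, dtype=np.float64).reshape(4, 4)
mask = (image > 7).astype(np.uint8)

transform = ComposeRandom([flip_first_axis, rotate_180], seed=0)
out_image, out_mask = transform(image, mask)
```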

tardis_em.cnn.datasets.augmentation.preprocess(image: numpy.ndarray, transformation: bool, size: tuple | int | None = <class 'int'>, mask: numpy.ndarray | None = None, output_dim_mask=1) -> Tuple[ndarray, ndarray] | ndarray

Module to augment dataset.

Parameters:
  • image (np.ndarray) – 2D/3D array with image data.

  • mask (np.ndarray, optional) – 2D/3D array with semantic labels.

  • transformation (bool) – If True, apply the same random transformation to the image and the mask.

  • size (tuple, int) – Image size output for center crop.

  • output_dim_mask (int) – Number of output channel dimensions for label mask.

Returns:

Image and optionally label mask after transformation.

Return type:

Tuple[np.ndarray, np.ndarray] or np.ndarray

DataLoader

class tardis_em.cnn.datasets.dataloader.CNNDataset(img_dir: str, mask_dir: str, size=64, mask_suffix='_mask', transform=True, out_channels=1)

DATASET BUILDER FOR IMAGES AND SEMANTIC LABEL MASKS FOR TRAINING

Parameters:
  • img_dir (str) – Directory with the 2D/3D .tif image files.

  • mask_dir (str) – Directory with the 2D/3D .tif mask images.

  • size (int) – Output patch size for image and mask.

  • mask_suffix (str) – Suffix name for mask images.

  • transform (bool) – If True, apply random transformations to the image and mask.

  • out_channels (int) – Number of output channels.
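How img_dir, mask_dir and mask_suffix could relate is sketched below using only the standard library. The naming rule (image stem + mask_suffix + ".tif") and the helper pair_images_with_masks are assumptions made for illustration. CNNDataset may resolve files differently.

```python
import tempfile
from pathlib import Path


def pair_images_with_masks(img_dir, mask_dir, mask_suffix="_mask"):
    """Return (image_path, mask_path) pairs matched by file stem."""
    pairs = []
    for img_path in sorted(Path(img_dir).glob("*.tif")):
        # Assumed convention: "<image stem><mask_suffix>.tif" in mask_dir.
        mask_path = Path(mask_dir) / f"{img_path.stem}{mask_suffix}{img_path.suffix}"
        if mask_path.exists():
            pairs.append((img_path, mask_path))
    return pairs


with tempfile.TemporaryDirectory() as root:
    imgs, masks = Path(root, "imgs"), Path(root, "masks")
    imgs.mkdir()
    masks.mkdir()
    for name in ("0_1.tif", "0_2.tif"):
        (imgs / name).touch()
    (masks / "0_1_mask.tif").touch()  # only the first image has a mask

    pairs = pair_images_with_masks(imgs, masks)
    pair_names = [(i.name, m.name) for i, m in pairs]
```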

class tardis_em.cnn.datasets.dataloader.PredictionDataset(img_dir: str, out_channels=1)

DATASET BUILDER FOR IMAGES AND SEMANTIC LABEL MASKS FOR PREDICTION

All transformations are turned off in this module.

Parameters:
  • img_dir (str) – Directory with the 2D/3D .tif image files.

  • out_channels (int) – Number of output channels.

Dataset Builder

tardis_em.cnn.datasets.build_dataset.build_train_dataset(dataset_dir: str, circle_size: int, resize_pixel_size: float | None, trim_xy: int, trim_z: int, benchmark=False, correct_pixel_size=None)

Module for building train datasets from compatible files.

This module builds a train dataset from each file in the specified directory. It works with image files in .tif/.mrc/.rec/.am format and mask files in .csv or .am format:
  • If the mask is .csv, the module expects the image file in .tif/.mrc/.rec/.am format.

  • If the mask is .am, the module expects the image file in .am format.

For the given directory, the module recognizes the files and, for each image and mask pair, builds the mask from the point cloud if needed (e.g. when the mask is not a .tif). The module expects the mask file (.csv, .am) to contain coordinate values, and it draws a 2D label for each point.
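Drawing a 2D label for each point can be sketched as filling a disk around every coordinate. The helper draw_disk_labels, the sample values, and the conversion from circle_size to a pixel radius are assumptions for illustration, not the module's actual drawing routine.

```python
import numpy as np


def draw_disk_labels(shape, coords_xy, radius):
    """Draw a filled circle of `radius` pixels at each (x, y) point."""
    mask = np.zeros(shape, dtype=np.uint8)
    yy, xx = np.ogrid[: shape[0], : shape[1]]
    for x, y in coords_xy:
        mask[(yy - y) ** 2 + (xx - x) ** 2 <= radius**2] = 1
    return mask


# Hypothetical values: object size in angstroms, pixel size in angstroms/pixel.
circle_size, pixel_size = 100.0, 25.0
radius = circle_size / pixel_size / 2  # assumed: diameter in pixels, halved

points = np.array([[5, 5], [14, 10]])  # (x, y) pixel coordinates
mask = draw_disk_labels((16, 20), points, radius)
```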

The image arrays are then scaled up or down to fit the given pixel size, and finally trimmed with overlap (stride) into patches of the specified size.
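Trimming with a stride can be sketched as a sliding window over the array. The 2D helper trim_with_stride is hypothetical and skips the rescaling step and any edge padding the module may perform.

```python
import numpy as np


def trim_with_stride(array, size, stride):
    """Split a 2D array into size x size patches taken every `stride` pixels."""
    patches = []
    for y in range(0, array.shape[0] - size + 1, stride):
        for x in range(0, array.shape[1] - size + 1, stride):
            patches.append(array[y : y + size, x : x + size])
    return patches


image = np.arange(64, dtype=np.float64).reshape(8, 8)

# With size=4 and stride=2, neighbouring patches share a 2-pixel band.
patches = trim_with_stride(image, size=4, stride=2)
```

A stride smaller than the patch size produces overlapping patches. A stride equal to the patch size tiles the array without overlap.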

Files are saved in ./dir/train/imgs and ./dir/train/masks.

Masks are computed as np.uint8 arrays and images as np.float64 arrays normalized to values between 0 and 1.
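The output data types can be illustrated with the sketch below. Min-max scaling is an assumption here. The text only states that image values end up between 0 and 1, not which normalization is used.

```python
import numpy as np


def normalize_image(image):
    """Scale an image to float64 values in the range [0, 1]."""
    image = image.astype(np.float64)
    low, high = image.min(), image.max()
    if high == low:  # constant image: avoid division by zero
        return np.zeros_like(image)
    return (image - low) / (high - low)


raw = np.array([[10, 20], [30, 50]], dtype=np.int16)
labels = np.array([[0, 3], [0, 1]])

normalized = normalize_image(raw)
mask = (labels > 0).astype(np.uint8)  # binary semantic mask
```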

Parameters:
  • dataset_dir (str) – Directory with train and test folders.

  • circle_size (int) – Size of the segmented object in angstroms (Å).

  • resize_pixel_size (float, None) – Pixel size for image resizing.

  • trim_xy (int) – Voxel size of output image in x and y dimension.

  • trim_z (int) – Voxel size of output image in z dimension.

  • benchmark (bool) – If True construct data for benchmark.

  • correct_pixel_size (float, None) – Optionally, correction for pixel size value.

tardis_em.cnn.datasets.build_dataset.load_img_mask_data(image: str, mask: str) -> Tuple[ndarray, ndarray, ndarray]

Detect the file format and load the image and the mask/coordinate file accordingly.

Expected combinations are:
  • Amira (image) + Amira (coord)

  • Amira (image) + csv (coord)

  • Amira (image) + Amira (mask) or MRC/REC (mask)

  • Amira (image) + tif (mask)

  • MRC/REC (image) + Amira (coord). Note: check that the coordinates are not transformed.

  • MRC/REC (image) + csv (coord)

  • MRC/REC (image) + Amira (mask) or MRC/REC (mask)

  • MRC/REC (image) + tif (mask)

Parameters:
  • image (str) – Path to the image file.

  • mask (str) – Path to the mask/coordinate file.

Returns:

Ndarray of image, mask, and pixel size

Return type:

Tuple[np.ndarray, np.ndarray, np.ndarray]
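Recognizing the combination from the file names can be sketched as a lookup on the file extension. The tables and the helper classify_pair are hypothetical and only mirror the combinations listed above. An extension alone cannot tell an Amira coordinate file from an Amira mask, so the real loader has to inspect the file content.

```python
from pathlib import Path

# Assumed extension tables, derived from the combinations listed above.
IMAGE_FORMATS = {".am": "Amira", ".mrc": "MRC/REC", ".rec": "MRC/REC"}
MASK_FORMATS = {
    ".am": "Amira",
    ".csv": "csv",
    ".mrc": "MRC/REC",
    ".rec": "MRC/REC",
    ".tif": "tif",
}


def classify_pair(image, mask):
    """Return the (image format, mask format) names for a file pair."""
    image_format = IMAGE_FORMATS.get(Path(image).suffix.lower())
    mask_format = MASK_FORMATS.get(Path(mask).suffix.lower())
    if image_format is None or mask_format is None:
        raise ValueError(f"Unsupported combination: {image} + {mask}")
    return image_format, mask_format


pair = classify_pair("tomogram.mrc", "tomogram.csv")
```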

tardis_em.cnn.datasets.build_dataset.error_log_build_data(dir_: str, log_file: ndarray, id_: int, i: str) -> ndarray

Update the log file with an error entry for data that could not be loaded.

Parameters:
  • dir_ (str) – Save directory.

  • log_file (np.ndarray) – Current log file list.

  • id_ (int) – Data id.

  • i (str) – Data name.

Returns:

Updated log file list.

Return type:

np.ndarray
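Appending an entry to a log array and rewriting it to disk can be sketched as follows. The helper append_error, the file name log.csv, and the two-column layout (id, name) are assumptions for illustration and may differ from what error_log_build_data writes.

```python
import tempfile
from pathlib import Path

import numpy as np


def append_error(save_dir, log, data_id, name):
    """Add one (id, name) row to the log array and rewrite the log file."""
    row = np.array([[str(data_id), name]])
    log = row if log.size == 0 else np.vstack((log, row))
    # Hypothetical file name; every call rewrites the complete log.
    np.savetxt(Path(save_dir) / "log.csv", log, fmt="%s", delimiter=",")
    return log


with tempfile.TemporaryDirectory() as tmp:
    log = np.empty((0, 2), dtype=str)
    log = append_error(tmp, log, 0, "tomo_a.mrc")
    log = append_error(tmp, log, 3, "tomo_b.am")
    saved_lines = (Path(tmp) / "log.csv").read_text().splitlines()
```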