Welcome to Munge’s documentation!

Dataset

Class to represent a dataset as a whole or for each study

class munge.Dataset.Dataset(config_file='config.json')

Bases: object

The Dataset class can be instantiated with the following arguments.

Parameters: config_file (string) – full path of the application config file
asarray()

Returns the array representation of all the data points in this dataset.

Returns: array of data and labels of this dataset
get_all()

Maps the images to their contours and returns a generator of data points.

Returns: generator of DataElement instances, each holding an image and its corresponding contour
get_by_study(patient_id)

Maps the images to their contours and returns a generator of data points for the given study.

Parameters: patient_id – unique ID of the study
Returns: generator of DataElement instances, each holding an image and its corresponding contour, for the given study
plot_verification_for_study(patiend_id, filename=None, rows=5, columns=5)

Plots a series of images with the corresponding contour patches for the given study

Parameters:
  • patient_id – unique ID of the study
  • filename – filename to save the plot in
  • rows – number of rows in the plot
  • columns – number of columns in the plot
to_dict(patient_id=None)

Returns the dict representation of the dataset.

Returns: dict with the id, dcm_path and contour_path attributes of the data points in this dataset

DataElement

Class to represent a data point in the dataset with relevant features and methods

class munge.DataElement.DataElement(dicom_path, icontour_path, ocontour_path=None)

Bases: object

The DataElement class can be instantiated with the following arguments.

Parameters:
  • dicom_path (string) – full path of the DICOM image
  • icontour_path (string) – full path of the corresponding inner contour (i-contour) file
  • ocontour_path (string, optional) – full path of the corresponding outer contour (o-contour) file; defaults to None
asarray()

Returns the DataElement in the form (data, label).

Returns: array of data and labels
get_area_in_sqmm(roi='icontour')

Gets the area of the ROI in square millimeters (sq. mm). The conversion uses the PixelSpacing tag of the DICOM image.

Returns: area in sq. mm
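
The conversion described above can be illustrated with a minimal sketch. It assumes the ROI is available as a 2D boolean mask and that the area is the pixel count multiplied by the row and column spacing from PixelSpacing. The function name area_in_sqmm and the example mask are hypothetical and not part of munge.

```python
def area_in_sqmm(mask, pixel_spacing):
    """Area of the True region of `mask`, in square millimeters.

    mask          -- 2D list of booleans (rows of pixels), True inside the ROI
    pixel_spacing -- (row_spacing_mm, column_spacing_mm), as stored in the
                     DICOM PixelSpacing tag
    """
    row_mm, col_mm = (float(value) for value in pixel_spacing)
    # Count the pixels that fall inside the ROI.
    n_pixels = sum(1 for row in mask for inside in row if inside)
    # Each pixel covers row_mm x col_mm square millimeters.
    return n_pixels * row_mm * col_mm


# Hypothetical 4x4 mask with a 2x2 ROI and 1.5 mm x 1.5 mm pixels.
example_mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]
example_area = area_in_sqmm(example_mask, (1.5, 1.5))  # 4 pixels * 2.25 sq. mm
```
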
get_image_icontour_overlay(window=30, patch_color=[255, 0, 0])

Gets a bounding box around the inner contour, with and without the i-contour overlaid (horizontally stacked). This is useful for manual verification of the annotation.

Parameters:
  • window – size of bounding box required around the marked contour
  • patch_color – [r, g, b] value of the color in which the patch should be overlaid
Returns:

horizontally stacked array, with the original image on the left and the image with the patch drawn on the right

get_image_ocontour_overlay(window=30, patch_color=[255, 0, 0])

Gets a bounding box around the outer contour, with and without the o-contour overlaid (horizontally stacked). This is useful for manual verification of the annotation.

Parameters:
  • window – size of bounding box required around the marked contour
  • patch_color – [r, g, b] value of the color in which the patch should be overlaid
Returns:

horizontally stacked array, with the original image on the left and the image with the patch drawn on the right

get_roi_avg_relative_intensity(roi='icontour')

Gets the average intensity of the ROI as a percentage of the maximum intensity of the image.

Returns: average intensity in percent
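
As a rough illustration of this measure, the sketch below averages the pixels inside a boolean ROI mask and divides by the image maximum. The function name roi_avg_relative_intensity, the list-of-lists image representation, and the choice to return 0.0 for an empty ROI or an all-zero image are assumptions made for the example, not documented munge behavior.

```python
def roi_avg_relative_intensity(image, mask):
    """Mean ROI intensity as a percentage of the image's maximum intensity."""
    # Collect the intensities of all pixels that lie inside the ROI.
    roi_values = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    ]
    max_intensity = max(max(row) for row in image)
    # Assumed convention: empty ROI or all-zero image yields 0.0.
    if not roi_values or max_intensity == 0:
        return 0.0
    return 100.0 * (sum(roi_values) / len(roi_values)) / max_intensity


# Hypothetical 2x2 image; the ROI covers the pixels with values 50 and 100.
example_image = [[0, 50], [100, 200]]
example_roi = [[False, True], [True, False]]
example_pct = roi_avg_relative_intensity(example_image, example_roi)
```
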
overlay_contours(window=30, patch_colors=[[0, 0, 255], [255, 0, 0]])

Overlays both inner and outer contours for visualization

Parameters:
  • window – Bounding box window size around the ROI
  • patch_colors – Array of colors for the outer and inner contours

DataLoader

Class to load data in the second stage of the pipeline

class munge.DataLoader.DataLoader(dataset)

Bases: object

The DataLoader class can be instantiated with the following arguments.

Parameters: dataset – instance of the Dataset class
load_train_data(epochs=10, batch_size=8, log_file='data_loader.log')

Returns an array of DataElement instances split into batches and epochs

Parameters:
  • epochs – number of epochs needed
  • batch_size – number of images to be used per batch
  • log_file – path to the log_file
Returns:

array of shape epochs × (data_size / batch_size) × batch_size containing DataElement instances
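
The shape of the return value can be illustrated with a small standalone sketch. It assumes that the data is reshuffled for every epoch and that an incomplete final batch is dropped; neither detail is stated in this reference. The function name split_into_epochs and the seed parameter are hypothetical.

```python
import random


def split_into_epochs(items, epochs=10, batch_size=8, seed=None):
    """Return a nested list of shape epochs x n_batches x batch_size."""
    rng = random.Random(seed)
    # Assumption: items that do not fill a whole batch are dropped.
    n_batches = len(items) // batch_size
    result = []
    for _ in range(epochs):
        shuffled = list(items)
        rng.shuffle(shuffled)  # assumption: a fresh random order per epoch
        batches = [
            shuffled[i * batch_size:(i + 1) * batch_size]
            for i in range(n_batches)
        ]
        result.append(batches)
    return result


# 20 stand-in items, 3 epochs, batches of 8: shape is 3 x 2 x 8.
example_batches = split_into_epochs(list(range(20)), epochs=3, batch_size=8, seed=0)
```

With real data, the integers in the example would be replaced by DataElement instances.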

static plot_random_epoch(data, epoch_size=10, filename=None)

Method to plot the images from a randomly selected epoch

Parameters:
  • data – return value of load_train_data function
  • epoch_size – size of epoch
  • filename – file to which the plot should be saved

ImageThresholder

Class to threshold an image and plot necessary figures related to thresholding

class munge.ImageThresholder.ImageThresholder(data_element, n_components=2, method='gmm', postprocess=False)

Bases: object

The ImageThresholder class can be instantiated with the following arguments.

Parameters:
  • data_element – instance of the DataElement class
  • n_components – number of components used to fit the model
dilate(thresholded_img)

Performs binary dilation on the given image using a disk-shaped structuring element with a fixed radius of 3.

Parameters: thresholded_img – thresholded image
Returns: dilated image
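
To make the operation concrete, here is a pure-Python sketch of binary dilation with a disk of radius 3. It is an illustration of the technique only; the library may rely on an image-processing package instead. The names disk_offsets and dilate_binary are hypothetical.

```python
def disk_offsets(radius):
    """Offsets (dy, dx) of all pixels within `radius` of the origin."""
    return [
        (dy, dx)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
        if dy * dy + dx * dx <= radius * radius
    ]


def dilate_binary(mask, radius=3):
    """Binary dilation of a 2D boolean mask with a disk-shaped element."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    offsets = disk_offsets(radius)
    out = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            # Stamp the disk around every foreground pixel.
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width:
                    out[ny][nx] = True
    return out


# A single foreground pixel in a 9x9 image grows into a disk of radius 3.
example_img = [[False] * 9 for _ in range(9)]
example_img[4][4] = True
example_dilated = dilate_binary(example_img)
```
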
get_jaccard_coeff()

Gets the Jaccard similarity coefficient between the reference region and the detected region.

Returns: Jaccard coefficient
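
The Jaccard coefficient of two regions is the size of their intersection divided by the size of their union. The sketch below computes it for two boolean masks of equal shape. The function name jaccard_coeff and the convention of returning 1.0 when both masks are empty are assumptions made for the example.

```python
def jaccard_coeff(reference, detected):
    """Jaccard coefficient |A and B| / |A or B| of two 2D boolean masks."""
    intersection = 0
    union = 0
    for ref_row, det_row in zip(reference, detected):
        for ref, det in zip(ref_row, det_row):
            intersection += bool(ref and det)
            union += bool(ref or det)
    # Assumed convention: two empty masks are treated as identical.
    return intersection / union if union else 1.0


# One shared pixel out of three covered pixels.
example_reference = [[True, True], [False, False]]
example_detected = [[True, False], [True, False]]
example_jaccard = jaccard_coeff(example_reference, example_detected)
```
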
get_thresholded_contour_mask()

Thresholds the o-contour region and returns a mask.

Returns: Boolean mask containing the thresholded image
plot_model_fit(filename=None)

Plots the histogram of the o-contour region, together with the Gaussians that were used to fit the model and the selected threshold value.

Parameters: filename – file path to save the plot
plot_thresholding_result(filename=None)

Plots the thresholded region along with the ground truth region, overlaid in different colors.

Parameters: filename – file path to save the plot

Utils

Contour related util functions

munge.utils.contour.get_dcm_num_for_contour(contour_file_name)

Gets the DICOM series number for a given contour file name or full file path.

Example: for ‘data/contourfiles/SC-HF-I-1/i-contours/IM-0001-0048-icontour-manual.txt’ the return value is 48.

Parameters: contour_file_name – name or full path of the contour file
Returns: integer corresponding to the DICOM series number
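
One way to extract the number is sketched below. It assumes that contour file names always follow the IM-xxxx-NNNN-... pattern seen in the example above; the function name dcm_num_from_contour_name and the regular expression are illustrative, not the library's implementation.

```python
import os
import re


def dcm_num_from_contour_name(contour_file_name):
    """Extract the DICOM number from a contour file name or full path."""
    base_name = os.path.basename(contour_file_name)
    # Assumed pattern: IM-<digits>-<DICOM number>-<rest of the name>
    match = re.match(r"IM-\d+-(\d+)-", base_name)
    if match is None:
        raise ValueError("unexpected contour file name: %r" % base_name)
    # int() drops the leading zeros, so "0048" becomes 48.
    return int(match.group(1))


example_num = dcm_num_from_contour_name(
    "data/contourfiles/SC-HF-I-1/i-contours/IM-0001-0048-icontour-manual.txt"
)
```
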
munge.utils.contour.parse_contour_file(filename)

Parses the given contour file.

Parameters: filename – path to the contour file to parse
Returns: list of tuples holding the x, y coordinates of the contour
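
The file format is not described in this reference. The sketch below assumes that each non-empty line holds an x and a y coordinate separated by whitespace, and parses an iterable of lines so that it works on an open file as well as on in-memory text. The function name parse_contour_lines and the sample coordinates are hypothetical.

```python
import io


def parse_contour_lines(lines):
    """Parse lines of the assumed form "<x> <y>" into (x, y) float tuples."""
    coords = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        coords.append((float(parts[0]), float(parts[1])))
    return coords


# In-memory stand-in for a contour file.
example_file = io.StringIO("120.50 137.00\n120.00 137.50\n\n119.50 138.00\n")
example_coords = parse_contour_lines(example_file)
```
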
munge.utils.contour.poly_to_mask(polygon, width, height)

Converts a polygon to a Boolean mask.

Parameters:
  • polygon – list of pairs of x, y coords [(x1, y1), (x2, y2), …] in units of pixels
  • width – scalar image width
  • height – scalar image height
Returns:

Boolean mask of shape (height, width)
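A polygon can be rasterized in several ways. The sketch below uses the even-odd (ray casting) rule evaluated at pixel centers, which is an assumption chosen for illustration; the library may use a drawing routine from an imaging package and can differ at the polygon boundary. The names point_in_polygon and polygon_to_mask are hypothetical.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd rule: count polygon edges crossed by a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(polygon, width, height):
    """Boolean mask of shape (height, width), True inside the polygon."""
    # Assumption: a pixel belongs to the polygon if its center does.
    return [
        [point_in_polygon(x + 0.5, y + 0.5, polygon) for x in range(width)]
        for y in range(height)
    ]


# Axis-aligned rectangle from (1, 1) to (4, 3) in a 5x4 image.
example_square = [(1, 1), (4, 1), (4, 3), (1, 3)]
example_poly_mask = polygon_to_mask(example_square, width=5, height=4)
```
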

Image related util functions

munge.utils.image.get_dcm_resolution(dcm_img)

Gets the resolution of the DICOM image.

Parameters: dcm_img – pydicom instance of the DICOM image
Returns: resolution of the DICOM image, i.e. the physical spacing of one pixel in millimeters
munge.utils.image.grayscale_to_rgb(img_raw)

Converts the given grayscale image to a three-channel image.

Parameters: img_raw – the raw grayscale image
Returns: three-channel RGB image
munge.utils.image.parse_dicom_file(filename)

Parses the given DICOM file.

Parameters: filename – path to the DICOM file to parse
Returns: dictionary with the DICOM image data

Miscellaneous util functions

munge.utils.misc.csv2dict(csv_file)

Converts a CSV file to a dict and returns it.

Parameters: csv_file – path to the CSV file
Returns: dict representation of the CSV file
munge.utils.misc.get_app_config(config_file)

Gets the application configuration as a dict from the given file.

Parameters: config_file – path to the configuration file
Returns: dict containing the application configuration
munge.utils.misc.get_bounding_box_coords(contour, window=30)

Given a contour and a window, gets the min and max coordinates of a bounding box around that contour.

Parameters:
  • contour – Array of co-ordinates defining the contour
  • window – The window size of the bounding box
Returns:

min_x, max_x, min_y and max_y of the bounding box
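The exact meaning of window is not spelled out here. The sketch below assumes it is a margin, in pixels, added on every side of the contour's extent, and that the lower bounds are clamped at 0 so they can be used as array indices. Both points are assumptions, and the function name bounding_box_coords is hypothetical.

```python
def bounding_box_coords(contour, window=30):
    """Return (min_x, max_x, min_y, max_y) of a padded bounding box."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    # Assumption: `window` pads the contour's extent on every side,
    # and the lower bounds never go below 0.
    min_x = max(int(min(xs)) - window, 0)
    max_x = int(max(xs)) + window
    min_y = max(int(min(ys)) - window, 0)
    max_y = int(max(ys)) + window
    return min_x, max_x, min_y, max_y


# Hypothetical contour points in pixel coordinates.
example_box = bounding_box_coords(
    [(100.5, 80.0), (120.0, 95.5), (110.0, 70.0)], window=30
)
```
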

munge.utils.misc.get_ocontour_for_icontour(icontour_file, ocontour_dir)

Gets the ocontour file corresponding to the given icontour file. If the ocontour file does not exist, None is returned.

Parameters:
  • icontour_file – full path to the icontour file
  • ocontour_dir – directory in which to look for the ocontour file
Returns: path of the corresponding ocontour file if it exists, otherwise None
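
The lookup could work along the following lines. The sketch assumes that an ocontour file carries the same name as its icontour counterpart with "icontour" replaced by "ocontour", a convention extrapolated from the icontour file name shown earlier and not confirmed by this reference. The function name find_ocontour is hypothetical.

```python
import os
import tempfile


def find_ocontour(icontour_file, ocontour_dir):
    """Return the matching ocontour path, or None if it does not exist."""
    # Assumed naming convention: swap "icontour" for "ocontour".
    ocontour_name = os.path.basename(icontour_file).replace("icontour", "ocontour")
    candidate = os.path.join(ocontour_dir, ocontour_name)
    return candidate if os.path.isfile(candidate) else None


# Demonstration in a temporary directory with one ocontour file.
with tempfile.TemporaryDirectory() as tmp_dir:
    open(os.path.join(tmp_dir, "IM-0001-0048-ocontour-manual.txt"), "w").close()
    found = find_ocontour("i-contours/IM-0001-0048-icontour-manual.txt", tmp_dir)
    found_name = os.path.basename(found) if found else None
    missing = find_ocontour("i-contours/IM-0001-0099-icontour-manual.txt", tmp_dir)
```
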
munge.utils.misc.get_uuid()

Generates and returns a random GUID.

Returns: random version 4 GUID
