API Reference

This page documents the code in the MSIght package.

Created on Fri Nov 15 17:01:41 2024

@author: lafields2

MSIght.refactor_affine_transform.display_and_save_image(image_array, title, filename, output_directory)[source]

Displays a binary image and saves it as a PNG file.

Parameters
  • image_array (numpy.ndarray) – The binary image array to be displayed and saved. Assumes a 2D grayscale format.

  • title (str) – The title to be displayed above the image when rendered.

  • filename (str) – The name of the output file (without the file extension) used for saving the image.

  • output_directory (str) – The path to the directory where the image will be saved.

Returns

This function does not return any value.

Return type

None

Notes

  • The image is displayed using matplotlib with the ‘gray’ colormap.

  • The axis is turned off for a cleaner display.

  • The output file is saved as a PNG image in the specified directory.

  • The function does not create the output directory; if it does not exist, saving raises an error that the caller must handle.

MSIght.refactor_affine_transform.register_he_msi(cropped_image, resized_msi_image, msi_threshold, he_threshold, output_directory, sample_name)[source]

Registers a cropped H&E image to a resized MSI image using affine transformation.

Parameters
  • cropped_image (numpy.ndarray) – The cropped H&E image, expected to be grayscale or RGB.

  • resized_msi_image (numpy.ndarray) – The resized MSI image, expected to be grayscale or RGB.

  • msi_threshold (int) – Threshold value for binarizing the MSI image (0-255).

  • he_threshold (int) – Threshold value for binarizing the H&E image (0-255).

  • output_directory (str) – Directory where registration results will be saved.

  • sample_name (str) – Name used to label the saved registration output files.

Returns

  • optimal_M (numpy.ndarray) – The 3x3 affine transformation matrix obtained after optimization.

  • final_registered_image (numpy.ndarray) – The final registered binary MSI image after applying the optimal affine transformation.

Notes

  • Converts RGB images to grayscale if needed.

  • Binarizes images using specified thresholds.

  • Uses phase cross-correlation for initial alignment.

  • Optimizes alignment using Sum of Squared Differences (SSD).

  • Saves the initial and optimized registration results as PNG files.

  • Displays intermediate binary and registered images along with SSD values.
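
The alignment steps listed above can be illustrated with a minimal NumPy sketch. It is not the package's implementation: the helper names (binarize, phase_correlation_shift, ssd) are hypothetical, only an integer translation is estimated, and the full affine optimization is omitted.

```python
import numpy as np

def binarize(image, threshold):
    """Return a 0/255 uint8 image; pixels strictly above threshold become 255."""
    return np.where(image > threshold, 255, 0).astype(np.uint8)

def phase_correlation_shift(reference, moving):
    """Estimate the integer (row, col) shift that aligns `moving` to `reference`."""
    cross_power = np.fft.fft2(reference) * np.conj(np.fft.fft2(moving))
    cross_power /= np.abs(cross_power) + 1e-12  # keep only the phase
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, correlation.shape))

def ssd(a, b):
    """Sum of squared differences between two images of equal shape."""
    return float(np.sum((a.astype(np.float64) - b.astype(np.float64)) ** 2))

# Synthetic stand-ins for the binarized H&E and MSI images (assumed data).
rng = np.random.default_rng(0)
he_binary = binarize(rng.integers(0, 256, size=(32, 32)), threshold=127)
msi_binary = np.roll(he_binary, shift=(3, -5), axis=(0, 1))  # known misalignment

shift = phase_correlation_shift(he_binary, msi_binary)
aligned = np.roll(msi_binary, shift=shift, axis=(0, 1))
ssd_before = ssd(he_binary, msi_binary)
ssd_after = ssd(he_binary, aligned)
```

A lower SSD after applying the estimated shift indicates better overlap between the two binary images; an optimizer can use the same quantity as its objective.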

Created on Fri Nov 15 17:11:11 2024

@author: lafields2

MSIght.refactor_bspline.perform_bspline(sized_he_image, transformed_ms_image, number_histograms, gradient_tolerance, optimizer_iterations, courseness)[source]

Performs B-spline image registration between a sized H&E image and a transformed MSI image.

Parameters
  • sized_he_image (numpy.ndarray) – The fixed H&E image, expected as a 2D grayscale array.

  • transformed_ms_image (numpy.ndarray) – The moving MSI image after affine transformation, expected as a 2D grayscale array.

  • number_histograms (int) – Number of histogram bins used for Mattes mutual information metric.

  • gradient_tolerance (float) – Convergence tolerance for the gradient during optimization.

  • optimizer_iterations (int) – Maximum number of iterations for the L-BFGS-B optimizer.

  • courseness (int) – Controls the spacing of the B-spline grid. Larger values create a coarser grid.

Returns

final_transform – The final B-spline transform after registration.

Return type

sitk.Transform

Notes

  • Converts the input images to SimpleITK format.

  • Initializes a B-spline transform based on the specified grid spacing.

  • Uses the Mattes mutual information metric for registration.

  • Configures the optimizer with specified tolerance and iteration limits.

  • Applies the final transformation to the MSI image.

  • Displays the fixed image, affine-transformed image, and B-spline refined image.

Created on Fri Nov 15 17:19:57 2024

@author: lafields2

Performs a global proteomics search on FragPipe results by filtering peptides based on mass differences and minimum protein occurrences.

Parameters
  • fragger_results_path (str) – Path to the FragPipe results file (.tsv or .txt format).

  • threshold (float) – Mass difference threshold for filtering peptides with similar calculated masses.

  • min_prot_instances (int) – Minimum number of instances a protein must appear in the PSM report for inclusion.

  • ppm_error (float) – PPM error tolerance for mass accuracy filtering.

  • output_path (str) – Directory where the processed results will be saved.

Returns

output_path_report – Path to the saved CSV file containing the processed global proteomics search results.

Return type

str

Notes

  • Filters peptides based on unique status and mass differences.

  • Excludes proteins with fewer than min_prot_instances.

  • Calculates theoretical mass-to-charge ratios (m/z) and PPM-based error thresholds.

  • Saves the filtered results as a CSV file.
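
As a rough illustration of the m/z and PPM calculations mentioned in the notes, the sketch below computes a theoretical m/z from a neutral peptide mass and a symmetric PPM window around it. The function names are hypothetical, and the sketch assumes positively charged, protonated ions; the package may compute these values differently.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da

def theoretical_mz(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons (assumes positive mode)."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def ppm_window(mz, ppm_error):
    """Return the (low, high) m/z bounds for a given PPM tolerance."""
    delta = mz * ppm_error / 1e6
    return mz - delta, mz + delta

mz = theoretical_mz(1000.0, 2)     # doubly charged peptide of 1000 Da
low, high = ppm_window(mz, 10.0)   # 10 ppm tolerance
```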

MSIght.refactor_fragger_process.process_fragger(protein_oi_list, ppm_error, psm_path, sized_he_image, output_path)[source]

Processes FragPipe PSM reports by filtering peptides based on proteins of interest and calculating mass error thresholds for mass spectrometry integration.

Parameters
  • protein_oi_list (list of str) – List of protein IDs of interest to filter from the PSM report.

  • ppm_error (float) – PPM error tolerance for mass accuracy filtering.

  • psm_path (str) – Path to the PSM report file (.tsv or .txt format).

  • sized_he_image (numpy.ndarray) – Reference H&E image used for MSI data integration (not directly used in this function).

  • output_path (str) – Directory where the processed results will be saved.

Returns

output_path_report – Path to the saved CSV file containing the processed FragPipe results.

Return type

str

Notes

  • Filters unique peptides from the PSM report based on the ‘Protein ID’ column.

  • Calculates theoretical mass-to-charge ratios (m/z) and PPM-based error thresholds.

  • Saves the processed data as a CSV file for MSIght integration.

  • If no matching proteins are found, the CSV will be empty.
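
The filtering step can be sketched in plain Python as follows. This is only an illustration: the rows are invented, the 'Peptide' column name is an assumption, and "unique" is interpreted here as keeping the first occurrence of each peptide sequence.

```python
def filter_psm_rows(rows, ids_of_interest, id_column="Protein ID"):
    """Keep the first row per peptide whose id_column value is of interest."""
    wanted = set(ids_of_interest)
    seen_peptides = set()
    kept = []
    for row in rows:
        if row[id_column] in wanted and row["Peptide"] not in seen_peptides:
            seen_peptides.add(row["Peptide"])
            kept.append(row)
    return kept

# Invented example rows standing in for a parsed PSM report.
rows = [
    {"Peptide": "AAAK", "Protein ID": "P12345"},
    {"Peptide": "AAAK", "Protein ID": "P12345"},  # repeated peptide
    {"Peptide": "CCCR", "Protein ID": "Q99999"},
    {"Peptide": "DDDK", "Protein ID": "P12345"},
]

matches = filter_psm_rows(rows, ["P12345"])
no_matches = filter_psm_rows(rows, ["A00000"])  # empty result, so an empty CSV
```

Passing id_column="Gene" would give the equivalent gene-based filter, provided the rows carry that column.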

MSIght.refactor_fragger_process.process_fragger_gene(gene_oi_list, ppm_error, psm_path, sized_he_image, output_path)[source]

Processes FragPipe PSM reports based on a list of genes of interest and calculates error thresholds for mass spectrometry integration.

Parameters
  • gene_oi_list (list of str) – List of genes of interest to filter from the PSM report.

  • ppm_error (float) – PPM error tolerance for mass accuracy filtering.

  • psm_path (str) – Path to the PSM report file (.tsv or .txt format).

  • sized_he_image (numpy.ndarray) – Reference H&E image used for MSI data integration (not directly used in this function).

  • output_path (str) – Directory where the processed results will be saved.

Returns

output_path_report – Path to the saved CSV file containing the processed FragPipe results.

Return type

str

Notes

  • Filters unique peptides from the PSM report based on the ‘Gene’ column.

  • Calculates theoretical mass-to-charge ratios (m/z) and PPM-based error thresholds.

  • Saves the processed data as a CSV file for MSIght integration.

  • If no matching genes are found, the CSV will be empty.

Created on Fri Nov 15 17:08:52 2024

@author: lafields2

MSIght.refactor_histology_preprocess.bin_he_image(threshold_value, red_channel)[source]

Binarizes the red channel of an H&E image using a specified threshold.

Parameters
  • threshold_value (int) – The threshold value (0-255) used for binarization.

  • red_channel (numpy.ndarray) – The extracted red channel from the H&E image.

Returns

thresholded_image – A binary image where pixels above the threshold are set to 255 (white) and others to 0 (black).

Return type

numpy.ndarray

Notes

  • Uses OpenCV’s cv2.threshold for binarization.

  • Ensure the input red channel is a 2D array of type numpy.ndarray.
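
For readers without OpenCV at hand, the binarization rule described above (pixels above the threshold become 255, all others 0) can be reproduced with NumPy. The helper name is hypothetical; the package itself calls cv2.threshold.

```python
import numpy as np

def binarize_channel(channel, threshold_value):
    """Set pixels strictly above threshold_value to 255 and the rest to 0."""
    return np.where(channel > threshold_value, 255, 0).astype(np.uint8)

red_channel = np.array([[10, 200],
                        [127, 128]], dtype=np.uint8)
binary = binarize_channel(red_channel, 127)
```

Note that a pixel equal to the threshold (127 here) is mapped to 0.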

MSIght.refactor_histology_preprocess.foreground_extract(image, foreground_mask)[source]

Extracts the foreground (tissue region) from an H&E image using a binary mask.

Parameters
  • image (numpy.ndarray) – The input image from which the background should be removed.

  • foreground_mask (numpy.ndarray) – A binary mask where the tissue regions are white (255) and the background is black (0).

Returns

foreground_image – The extracted foreground image with the background removed.

Return type

numpy.ndarray

Notes

  • Uses OpenCV’s cv2.bitwise_and to apply the foreground mask.

  • Pixels outside the foreground mask are set to black (0).

  • Ensure the mask and the input image have the same dimensions.
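
The masking behaviour can be sketched with NumPy as below. The function name is hypothetical and the sketch only mirrors the documented effect (pixels outside the mask become 0); the package uses cv2.bitwise_and.

```python
import numpy as np

def apply_foreground_mask(image, mask):
    """Zero out every pixel of `image` where `mask` is 0."""
    if image.shape[:2] != mask.shape:
        raise ValueError("mask and image must have the same height and width")
    result = image.copy()
    result[mask == 0] = 0  # applies to all channels of a colour image
    return result

image = np.full((2, 2, 3), 9, dtype=np.uint8)
mask = np.array([[255, 0],
                 [0, 255]], dtype=np.uint8)
foreground = apply_foreground_mask(image, mask)
```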

MSIght.refactor_histology_preprocess.foreground_mask_make(image)[source]

Creates a foreground mask by isolating tissue regions from an H&E image using HSV color thresholding.

Parameters

image (numpy.ndarray) – The input image in BGR format.

Returns

foreground_mask – A binary mask where the tissue regions are white (255) and the background is black (0).

Return type

numpy.ndarray

Notes

  • Converts the image to HSV color space using OpenCV.

  • Applies a color threshold to separate the background from the tissue.

  • Inverts the background mask to obtain the foreground (tissue) mask.

  • Threshold values can be adjusted for better segmentation depending on the sample.
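
One way to picture the thresholding is a saturation-based rule: a white slide background has low saturation, so it can be separated from stained tissue. The sketch below is an assumption about the general idea, not the thresholds the package uses; the function name and the min_saturation value are hypothetical.

```python
import numpy as np

def saturation_foreground_mask(bgr_image, min_saturation=0.1):
    """Return 255 where HSV saturation is at least min_saturation, else 0."""
    img = bgr_image.astype(np.float64)
    cmax = img.max(axis=2)
    cmin = img.min(axis=2)
    # HSV saturation: (max - min) / max, defined as 0 for black pixels.
    saturation = np.where(cmax > 0, (cmax - cmin) / np.maximum(cmax, 1e-12), 0.0)
    background = saturation < min_saturation
    # Inverting the background mask gives the foreground (tissue) mask.
    return np.where(background, 0, 255).astype(np.uint8)

# One white background pixel and one pink, tissue-like pixel (BGR order).
bgr = np.array([[[255, 255, 255], [180, 105, 255]]], dtype=np.uint8)
mask = saturation_foreground_mask(bgr)
```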

MSIght.refactor_histology_preprocess.load_he_image(image_path)[source]

Loads an H&E image from the specified file path.

Parameters

image_path (str) – Path to the H&E image file.

Returns

image – The loaded image as a BGR array.

Return type

numpy.ndarray

Notes

  • Uses OpenCV to read the image, returning it in BGR format.

  • The image can be converted to other color formats using OpenCV functions as needed.

MSIght.refactor_histology_preprocess.preprocess_he(image_path, threshold_value, sample_name, output_directory)[source]

Preprocesses an H&E image by extracting the tissue region, binarizing, and smoothing it.

Parameters
  • image_path (str) – Path to the input H&E image file.

  • threshold_value (int) – Threshold value (0-255) for binarization of the red channel.

  • sample_name (str) – Name used for labeling the saved output file.

  • output_directory (str) – Directory where the processed image will be saved.

Returns

final_he_image – The preprocessed H&E image after binarization and smoothing.

Return type

numpy.ndarray

Notes

  • Applies several preprocessing steps:
    1. Loads the H&E image.

    2. Creates a foreground mask using HSV thresholding.

    3. Extracts the tissue region using the mask.

    4. Extracts the red channel from the tissue region.

    5. Binarizes the red channel using the given threshold.

    6. Applies Gaussian smoothing to reduce noise.

  • Displays each preprocessing step for visualization.

  • Saves the final processed image as a PNG file.

MSIght.refactor_histology_preprocess.red_channel_extract(foreground_image)[source]

Extracts the red channel from the foreground image.

Parameters

foreground_image (numpy.ndarray) – The input image from which the red channel will be extracted. Expected to be in BGR format.

Returns

red_channel – A 2D array representing the red channel of the input image.

Return type

numpy.ndarray

Notes

  • Assumes the input image is in BGR format.

  • Extracts the third channel (index 2) corresponding to the red channel.

MSIght.refactor_histology_preprocess.smooth_he_image(thresholded_image)[source]

Applies Gaussian smoothing to a binarized H&E image.

Parameters

thresholded_image (numpy.ndarray) – The binarized H&E image to be smoothed.

Returns

smoothed_image – The smoothed binary image after applying Gaussian blur.

Return type

numpy.ndarray

Notes

  • Uses OpenCV’s cv2.GaussianBlur with a kernel size of (5, 5).

  • The Gaussian standard deviation (sigma) is passed as 0, so OpenCV derives it from the kernel size.

  • Smoothing reduces noise and sharp edges in the binarized image.

Created on Fri Nov 15 17:05:38 2024

@author: lafields2

MSIght.refactor_interpolation.interpolate_MSI(filename, image_path, msi_image, smoothed_image, output_directory, sample_name)[source]

Interpolates an MSI image to match the dimensions of a corresponding H&E image.

Parameters
  • filename (str) – Path to the .imzML file containing MSI data.

  • image_path (str) – Path to the corresponding H&E image file (TIFF format).

  • msi_image (numpy.ndarray) – The MSI image to be resized.

  • smoothed_image (numpy.ndarray) – The smoothed and binarized H&E image used for cropping.

  • output_directory (str) – Directory where the resized MSI image will be saved.

  • sample_name (str) – Name used for labeling the saved output file.

Returns

  • cropped_image (numpy.ndarray) – The cropped H&E image after binarization and thresholding.

  • resized_msi_image (numpy.ndarray) – The resized MSI image matching the cropped H&E image’s dimensions.

Notes

  • Extracts image dimensions from the TIFF file and the .imzML file.

  • Applies a binary mask to the smoothed H&E image.

  • Determines cropping boundaries based on tissue presence.

  • Resizes the MSI image to match the cropped H&E image’s dimensions using linear interpolation.

  • Saves the resized MSI image as a PNG file.
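
The cropping and resizing steps can be approximated in NumPy as shown below. Both helpers are hypothetical, and the bilinear resize is a simplified version that is not guaranteed to match OpenCV's output pixel for pixel.

```python
import numpy as np

def tissue_bounding_box(binary_image):
    """Return (row_start, row_stop, col_start, col_stop) enclosing nonzero pixels."""
    rows = np.any(binary_image > 0, axis=1)
    cols = np.any(binary_image > 0, axis=0)
    if not rows.any():
        raise ValueError("no tissue pixels found")
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return int(r0), int(r1) + 1, int(c0), int(c1) + 1

def resize_bilinear(image, out_h, out_w):
    """Resize a 2D array with bilinear interpolation (pixel-centre alignment)."""
    in_h, in_w = image.shape
    ys = np.clip((np.arange(out_h) + 0.5) * in_h / out_h - 0.5, 0, in_h - 1)
    xs = np.clip((np.arange(out_w) + 0.5) * in_w / out_w - 0.5, 0, in_w - 1)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, in_h - 1), np.minimum(x0 + 1, in_w - 1)
    wy, wx = (ys - y0)[:, None], (xs - x0)[None, :]
    img = image.astype(np.float64)
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

he_binary = np.zeros((10, 12), dtype=np.uint8)
he_binary[2:7, 3:9] = 255                      # synthetic tissue region
r0, r1, c0, c1 = tissue_bounding_box(he_binary)
cropped = he_binary[r0:r1, c0:c1]

msi = np.arange(6, dtype=np.float64).reshape(2, 3)  # tiny stand-in MSI image
resized = resize_bilinear(msi, *cropped.shape)
```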

MSIght.refactor_interpolation.interpolate_and_visualize(filename, image_path, msi_image, smoothed_image, output_directory, sample_name, original_areas_to_zoom)[source]

Interpolates an MSI image to match the H&E image dimensions and visualizes different interpolation methods.

Parameters
  • filename (str) – Path to the .imzML file containing MSI data.

  • image_path (str) – Path to the corresponding H&E image file (TIFF format).

  • msi_image (numpy.ndarray) – The MSI image to be resized.

  • smoothed_image (numpy.ndarray) – The smoothed and binarized H&E image used for cropping.

  • output_directory (str) – Directory where the visualization output will be saved.

  • sample_name (str) – Name used for labeling the saved output file.

  • original_areas_to_zoom (dict) – Dictionary containing areas to zoom in as tuples (x1, y1, x2, y2).

Returns

  • cropped_image (numpy.ndarray) – The cropped H&E image after binarization and thresholding.

  • resized_msi_image (numpy.ndarray) – The resized MSI image matching the cropped H&E image’s dimensions.

Notes

  • Extracts image dimensions from the .imzML file and the TIFF file.

  • Binarizes the H&E image and determines cropping boundaries.

  • Adjusts zoom areas to match the resized MSI image.

  • Compares multiple interpolation methods: Bilinear, Bicubic, Nearest Neighbor, and Lanczos.

  • Displays and saves the visualization as a PNG file.
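
Adjusting zoom areas after a resize amounts to scaling each corner by the ratio of the new size to the old one. The sketch below assumes sizes are given as (width, height) and that areas are (x1, y1, x2, y2) tuples as documented; the function name is hypothetical.

```python
def scale_zoom_areas(areas, original_size, resized_size):
    """Scale (x1, y1, x2, y2) boxes from the original to the resized image."""
    (orig_w, orig_h), (new_w, new_h) = original_size, resized_size
    sx, sy = new_w / orig_w, new_h / orig_h
    return {
        name: (int(round(x1 * sx)), int(round(y1 * sy)),
               int(round(x2 * sx)), int(round(y2 * sy)))
        for name, (x1, y1, x2, y2) in areas.items()
    }

areas = {"region_a": (10, 20, 30, 40)}
scaled = scale_zoom_areas(areas, original_size=(100, 200), resized_size=(200, 100))
```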

Created on Thu Nov 21 15:28:00 2024

@author: lafields2

MSIght.refactor_common_functions.apply_dimensionality_reduction(intensity_matrix, pca_components, tsne_components, tsne_perplexity, tsne_interations, tsne_learning_rate)[source]

Applies PCA and t-SNE for dimensionality reduction on an intensity matrix.

Parameters
  • intensity_matrix (numpy.ndarray) – The 2D array where rows correspond to pixels and columns correspond to m/z values.

  • pca_components (int) – Number of principal components to retain during PCA.

  • tsne_components (int) – Number of components for t-SNE dimensionality reduction.

  • tsne_perplexity (float) – Perplexity parameter for t-SNE, balancing local and global data structure.

  • tsne_interations (int) – Number of iterations for the t-SNE optimization process. The name is spelled as in the function signature.

  • tsne_learning_rate (float) – Learning rate parameter for t-SNE, controlling the step size during optimization.

Returns

  • pca_result (numpy.ndarray) – PCA-transformed matrix of reduced dimensions.

  • tsne_result (numpy.ndarray) – t-SNE-transformed matrix of reduced dimensions.

Notes

  • Applies PCA for initial dimensionality reduction to speed up t-SNE.

  • Applies t-SNE on the PCA-transformed matrix for further reduction.

  • Returns both PCA and t-SNE results for further analysis or visualization.
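
The PCA stage can be sketched with a singular value decomposition in NumPy. This is an illustration of the technique, not the package's code; the function name is hypothetical and the subsequent t-SNE step is not shown.

```python
import numpy as np

def pca_reduce(intensity_matrix, n_components):
    """Project rows onto the first n_components principal axes."""
    centered = intensity_matrix - intensity_matrix.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
intensity_matrix = rng.normal(size=(50, 10))   # 50 pixels, 10 m/z channels
pca_result = pca_reduce(intensity_matrix, 3)
variances = pca_result.var(axis=0)
```

Reducing to a modest number of components first keeps the pairwise-distance computations in t-SNE manageable for large images.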

MSIght.refactor_common_functions.create_intensity_matrix(coordinates, mz_values, intensities)[source]

Creates an intensity matrix from preprocessed MSI data.

Parameters
  • coordinates (list of tuples) – List of pixel coordinates (x, y) from the .imzML file.

  • mz_values (list of numpy.ndarray) – List of m/z values corresponding to each pixel.

  • intensities (list of numpy.ndarray) – List of preprocessed intensity values for each pixel.

Returns

  • intensity_matrix (numpy.ndarray) – A 2D array where each row represents a pixel, and each column corresponds to a unique m/z value.

  • all_mz_values (numpy.ndarray) – A sorted array of unique m/z values across all pixels.

Notes

  • Extracts unique m/z values across all pixels.

  • Initializes an intensity matrix with zeros.

  • Fills the matrix with intensity values using np.searchsorted for fast indexing.

  • Returns the intensity matrix and the corresponding m/z values.
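
The indexing approach described in the notes can be sketched as follows. The function name is hypothetical and the coordinates argument is left out because it does not affect how the matrix is filled.

```python
import numpy as np

def build_intensity_matrix(mz_values, intensities):
    """Return (matrix, all_mz) with one row per pixel and one column per m/z."""
    all_mz = np.unique(np.concatenate(mz_values))       # sorted, unique
    matrix = np.zeros((len(mz_values), all_mz.size))
    for row, (mz, inten) in enumerate(zip(mz_values, intensities)):
        # searchsorted finds each m/z value's column in the sorted axis.
        matrix[row, np.searchsorted(all_mz, mz)] = inten
    return matrix, all_mz

mz_values = [np.array([100.0, 200.0]), np.array([200.0, 300.0])]
intensities = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
intensity_matrix, all_mz_values = build_intensity_matrix(mz_values, intensities)
```

Because every m/z value is present in the unique axis, searchsorted returns exact column positions; m/z values a pixel lacks stay at zero.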

MSIght.refactor_common_functions.load_and_preprocess_imzml(filename, sigma, structuring_element_size)[source]

Loads and preprocesses MSI data from an .imzML file by applying Gaussian smoothing and top-hat baseline correction.

Parameters
  • filename (str) – Path to the .imzML file containing the MSI data.

  • sigma (float) – Standard deviation for Gaussian smoothing applied to the intensity values.

  • structuring_element_size (int) – Size of the structuring element used for top-hat baseline correction.

Returns

  • coordinates (list of tuples) – List of pixel coordinates (x, y) from the .imzML file.

  • mz_values (list of numpy.ndarray) – List of m/z values corresponding to each pixel.

  • intensities (list of numpy.ndarray) – List of preprocessed intensity values for each pixel.

Notes

  • Uses PyImzML to parse the .imzML file.

  • Applies Gaussian smoothing to reduce noise in the intensity spectra.

  • Applies top-hat baseline correction to remove background noise.

  • Returns preprocessed data suitable for further analysis.
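
The two preprocessing operations can be illustrated on a single spectrum with NumPy alone. This is a sketch under assumptions: the helper names are hypothetical, the structuring element is flat with an odd size, and the package may rely on a library implementation instead.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def gaussian_smooth(signal, sigma):
    """Smooth a 1D signal with a normalized Gaussian kernel."""
    radius = max(1, int(round(4 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(np.asarray(signal, dtype=np.float64), radius, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")

def _sliding(signal, size, func, pad_value):
    """Apply func over a centred sliding window of the given odd size."""
    half = size // 2
    padded = np.pad(signal, half, mode="constant", constant_values=pad_value)
    return func(sliding_window_view(padded, size), axis=1)

def white_tophat(signal, size):
    """Subtract the morphological opening (erosion then dilation) from the signal."""
    if size % 2 == 0:
        raise ValueError("this sketch expects an odd structuring element size")
    signal = np.asarray(signal, dtype=np.float64)
    eroded = _sliding(signal, size, np.min, np.inf)
    opened = _sliding(eroded, size, np.max, -np.inf)
    return signal - opened

spectrum = np.full(50, 5.0)   # flat baseline of 5
spectrum[25] = 15.0           # one narrow peak
corrected = white_tophat(spectrum, size=5)
smoothed_flat = gaussian_smooth(np.ones(20), sigma=2.0)
```

The opening removes features narrower than the structuring element, so subtracting it keeps the peak and discards the baseline.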

Created on Fri Nov 15 17:27:23 2024

@author: lafields2

MSIght.refactor_manual_affine.manual_register_he_msi(pts_ms, pts_he, resized_msi_image, cropped_image, output_directory, sample_name)[source]

Manually registers an MSI image to an H&E image using affine transformation.

Parameters
  • pts_ms (numpy.ndarray) – Coordinates from the MSI image (source points).

  • pts_he (numpy.ndarray) – Corresponding coordinates from the H&E image (destination points).

  • resized_msi_image (numpy.ndarray) – The resized MSI image to be registered.

  • cropped_image (numpy.ndarray) – The cropped and smoothed H&E image.

  • output_directory (str) – Directory where the transformed MSI image will be saved.

  • sample_name (str) – Name used for labeling the saved output file.

Returns

M – The estimated affine transformation matrix.

Return type

numpy.ndarray

Notes

  • Uses OpenCV’s cv2.estimateAffinePartial2D to calculate the affine matrix (a 2x3 matrix limited to rotation, uniform scaling, and translation).

  • Applies the transformation to the resized MSI image.

  • Displays and saves the transformed MSI image.

  • Assumes the input points are selected manually or computed separately.
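
To show what estimating such a transform from point pairs involves, the sketch below solves for a rotation, uniform scale, and translation by ordinary least squares. It is a simplified stand-in, not OpenCV's algorithm: there is no outlier rejection, and the function name is hypothetical.

```python
import numpy as np

def estimate_similarity(src_points, dst_points):
    """Least-squares 2x3 matrix [[a, -b, tx], [b, a, ty]] mapping src to dst."""
    src = np.asarray(src_points, dtype=np.float64)
    dst = np.asarray(dst_points, dtype=np.float64)
    n = len(src)
    design = np.zeros((2 * n, 4))
    # x' = a*x - b*y + tx   and   y' = b*x + a*y + ty
    design[0::2] = np.column_stack([src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)])
    design[1::2] = np.column_stack([src[:, 1], src[:, 0], np.zeros(n), np.ones(n)])
    a, b, tx, ty = np.linalg.lstsq(design, dst.reshape(-1), rcond=None)[0]
    return np.array([[a, -b, tx], [b, a, ty]])

# Points related by a 90 degree rotation, scale 2, and translation (10, -3).
pts_ms = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
pts_he = np.array([[10.0, -3.0], [10.0, -1.0], [8.0, -3.0]])
M = estimate_similarity(pts_ms, pts_he)
```

At least two point pairs are needed to determine the four unknowns; more pairs average out small errors in manually picked landmarks.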

MSIght.refactor_manual_affine.show_msi_he_coords(final_MSI_image, final_he_image)[source]

Displays the MSI and H&E images side by side with coordinates using Plotly.

Parameters
  • final_MSI_image (numpy.ndarray) – The final MSI image after interpolation and transformation.

  • final_he_image (numpy.ndarray) – The final H&E image after processing and transformation.

Return type

None

Notes

  • Uses Plotly’s imshow for interactive visualization.

  • Displays coordinate axes for better comparison.

  • Titles the plots as ‘MSI Image’ and ‘H&E Image’.

Created on Fri Nov 15 17:26:18 2024

@author: lafields2

MSIght.refactor_manual_affine_transform.manual_register_he_msi(pts_ms, pts_he, resized_msi_image, cropped_image, output_directory, sample_name)[source]

Manually registers an MSI image to an H&E image using affine transformation.

Parameters
  • pts_ms (numpy.ndarray) – Coordinates from the MSI image (source points).

  • pts_he (numpy.ndarray) – Corresponding coordinates from the H&E image (destination points).

  • resized_msi_image (numpy.ndarray) – The resized MSI image to be registered.

  • cropped_image (numpy.ndarray) – The cropped and smoothed H&E image.

  • output_directory (str) – Directory where the transformed MSI image will be saved.

  • sample_name (str) – Name used for labeling the saved output file.

Returns

  • M (numpy.ndarray) – The estimated affine transformation matrix.

  • transformed_ms_image (numpy.ndarray) – The transformed MSI image after registration.

Notes

  • Uses OpenCV’s cv2.estimateAffinePartial2D to calculate the affine matrix (a 2x3 matrix limited to rotation, uniform scaling, and translation).

  • Applies the transformation to the resized MSI image.

  • Displays and saves the transformed MSI image alongside the original images.

  • Assumes the input points are selected manually or computed separately.

Created on Fri Nov 15 17:28:29 2024

@author: lafields2

MSIght.refactor_mz_image_extract.apply_bspline_transform_to_msi(b_spline_transform, msi_data_image)[source]

Applies a B-spline transformation to an MSI image using SimpleITK.

Parameters
  • b_spline_transform (sitk.Transform) – The B-spline transformation object obtained from registration.

  • msi_data_image (numpy.ndarray) – The MSI image to be transformed, expected as a 2D array.

Returns

transformed_msi_image – The transformed MSI image as a numpy array.

Return type

numpy.ndarray

Notes

  • Converts the input MSI image to a SimpleITK image.

  • Uses sitk.ResampleImageFilter to apply the B-spline transformation.

  • Sets the interpolator to linear and the default pixel value to 0.

  • Converts the transformed image back to a numpy array for further processing.

MSIght.refactor_mz_image_extract.extract_mz_image_transform(filename, mz, mz_tolerance, z_value, b_spline_apply, sized_he_image)[source]

Extracts an m/z image from an .imzML file, resizes it, and applies a B-spline transformation.

Parameters
  • filename (str) – Path to the .imzML file containing the MSI data.

  • mz (float) – The target m/z value to extract from the MSI data.

  • mz_tolerance (float) – Tolerance for the target m/z value during image extraction.

  • z_value (int) – Charge state value for m/z extraction.

  • b_spline_apply (sitk.Transform) – The B-spline transform to apply to the extracted m/z image.

  • sized_he_image (numpy.ndarray) – The reference H&E image used for resizing the MSI image.

Returns

msi_result – The transformed m/z image after resizing and applying the B-spline transform.

Return type

numpy.ndarray

Notes

  • Extracts an m/z image using getionimage from PyImzML.

  • Resizes the m/z image to match the H&E image’s dimensions.

  • Applies a B-spline transformation using SimpleITK if provided.

  • Returns the transformed m/z image as a numpy array.
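
Conceptually, an m/z image sums, for each pixel, the intensities whose m/z falls within the tolerance of the target. The sketch below shows that idea on in-memory data; the function name is hypothetical, the coordinates are assumed to be zero-based (x, y) pairs, and the resizing and B-spline steps are not shown.

```python
import numpy as np

def ion_image(coordinates, mz_values, intensities, target_mz, mz_tolerance, shape):
    """Build a 2D image of summed intensities within target_mz +/- mz_tolerance."""
    image = np.zeros(shape)
    for (x, y), mz, inten in zip(coordinates, mz_values, intensities):
        in_window = np.abs(mz - target_mz) <= mz_tolerance
        image[y, x] = inten[in_window].sum()
    return image

coordinates = [(0, 0), (1, 0)]                       # two pixels in one row
mz_values = [np.array([499.8, 500.2, 600.0]), np.array([400.0, 500.6])]
intensities = [np.array([1.0, 2.0, 3.0]), np.array([5.0, 7.0])]
image = ion_image(coordinates, mz_values, intensities,
                  target_mz=500.0, mz_tolerance=0.5, shape=(1, 2))
```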

MSIght.refactor_mz_image_extract.overlay_msi_he(msi_result, sized_he_image, mz)[source]

Overlays an MSI image onto an H&E image and displays the result.

Parameters
  • msi_result (numpy.ndarray) – The transformed MSI image.

  • sized_he_image (numpy.ndarray) – The reference H&E image.

  • mz (float) – The m/z value corresponding to the MSI image.

Return type

None

Notes

  • Ensures the MSI image has the same data type as the H&E image.

  • Uses OpenCV’s cv2.addWeighted for overlaying the images with equal weights.

  • Displays the overlay using Matplotlib.
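
An equal-weight overlay is a per-pixel average of the two images. The NumPy sketch below mirrors the documented steps (matching the data type, then blending) for 8-bit images; the function name is hypothetical and the package itself uses cv2.addWeighted.

```python
import numpy as np

def overlay_equal_weights(he_image, msi_image):
    """Blend two same-shaped images with weights 0.5 and 0.5, returning uint8."""
    if he_image.shape != msi_image.shape:
        raise ValueError("images must have the same shape")
    msi_image = msi_image.astype(he_image.dtype)  # match the H&E data type
    blended = 0.5 * he_image.astype(np.float64) + 0.5 * msi_image.astype(np.float64)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)

he_image = np.full((2, 2), 100, dtype=np.uint8)
msi_image = np.full((2, 2), 200.0)                # float input, cast before blending
overlay = overlay_equal_weights(he_image, msi_image)
```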