API Reference
This page documents the code in the MSIght package.
Created on Fri Nov 15 17:01:41 2024
@author: lafields2
- MSIght.refactor_affine_transform.display_and_save_image(image_array, title, filename, output_directory)
Displays a binary image and saves it as a PNG file.
- Parameters
image_array (numpy.ndarray) – The binary image array to be displayed and saved. Assumes a 2D grayscale format.
title (str) – The title to be displayed above the image when rendered.
filename (str) – The name of the output file (without the file extension) used for saving the image.
output_directory (str) – The path to the directory where the image will be saved.
- Returns
This function does not return any value.
- Return type
None
Notes
The image is displayed using matplotlib with the ‘gray’ colormap.
The axis is turned off for a cleaner display.
The output file is saved as a PNG image in the specified directory.
If the output directory does not exist, an error will be raised unless handled externally.
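Because a missing output directory is not handled by the function, one way to handle it externally is to create the directory before calling it. The following is a minimal sketch; `ensure_output_directory` is a hypothetical helper, not part of MSIght, and the folder and file names are placeholders.

```python
import os
import tempfile


def ensure_output_directory(output_directory):
    """Create the directory (and any missing parents) if it does not exist."""
    os.makedirs(output_directory, exist_ok=True)
    return output_directory


# Placeholder location inside a temporary folder, used only for illustration.
base = tempfile.mkdtemp()
target = ensure_output_directory(os.path.join(base, "results", "sample_01"))

# The filename parameter is documented as having no extension, so the
# saved file would end in ".png" once the extension is appended.
png_path = os.path.join(target, "binary_he" + ".png")
```

After this step, `target` can be passed as `output_directory`.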
- MSIght.refactor_affine_transform.register_he_msi(cropped_image, resized_msi_image, msi_threshold, he_threshold, output_directory, sample_name)
Registers a cropped H&E image to a resized MSI image using affine transformation.
- Parameters
cropped_image (numpy.ndarray) – The cropped H&E image, expected to be grayscale or RGB.
resized_msi_image (numpy.ndarray) – The resized MSI image, expected to be grayscale or RGB.
msi_threshold (int) – Threshold value for binarizing the MSI image (0-255).
he_threshold (int) – Threshold value for binarizing the H&E image (0-255).
output_directory (str) – Directory where registration results will be saved.
sample_name (str) – Name used to label the saved registration output files.
- Returns
optimal_M (numpy.ndarray) – The 3x3 affine transformation matrix obtained after optimization.
final_registered_image (numpy.ndarray) – The final registered binary MSI image after applying the optimal affine transformation.
Notes
Converts RGB images to grayscale if needed.
Binarizes images using specified thresholds.
Uses phase cross-correlation for initial alignment.
Refines the alignment by minimizing the sum of squared differences (SSD).
Saves the initial and optimized registration results as PNG files.
Displays intermediate binary and registered images along with SSD values.
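The two alignment ideas named in these notes can be illustrated with NumPy alone. This is a sketch under stated assumptions, not the MSIght implementation: `estimate_shift` and `ssd` are hypothetical names, only integer translations are recovered, and the test images are synthetic rectangles.

```python
import numpy as np


def estimate_shift(fixed, moving):
    """Estimate the integer (row, col) shift that aligns `moving` to `fixed`
    using phase cross-correlation."""
    cross_power = np.fft.fft2(fixed) * np.conj(np.fft.fft2(moving))
    cross_power /= np.abs(cross_power) + 1e-12  # keep only the phase
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (circular wrap).
    return tuple(int(p) - n if p > n // 2 else int(p)
                 for p, n in zip(peak, fixed.shape))


def ssd(a, b):
    """Sum of squared differences between two images of equal shape."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.sum(diff ** 2))


# Synthetic binary "tissue" and a copy displaced by 3 rows and 5 columns.
fixed = np.zeros((32, 32))
fixed[8:19, 10:17] = 1.0
moving = np.roll(fixed, shift=(3, 5), axis=(0, 1))

shift = estimate_shift(fixed, moving)
aligned = np.roll(moving, shift=shift, axis=(0, 1))
```

Here the SSD between `fixed` and `aligned` drops to zero, which is the quantity an optimizer would drive down when refining the remaining affine parameters.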
Created on Fri Nov 15 17:11:11 2024
@author: lafields2
- MSIght.refactor_bspline.perform_bspline(sized_he_image, transformed_ms_image, number_histograms, gradient_tolerance, optimizer_iterations, courseness)
Performs B-spline image registration between a sized H&E image and a transformed MSI image.
- Parameters
sized_he_image (numpy.ndarray) – The fixed H&E image, expected as a 2D grayscale array.
transformed_ms_image (numpy.ndarray) – The moving MSI image after affine transformation, expected as a 2D grayscale array.
number_histograms (int) – Number of histogram bins used for Mattes mutual information metric.
gradient_tolerance (float) – Convergence tolerance for the gradient during optimization.
optimizer_iterations (int) – Maximum number of iterations for the LBFGSB optimizer.
courseness (int) – Controls the spacing of the B-spline grid. Larger values create a coarser grid.
- Returns
final_transform – The final B-spline transform after registration.
- Return type
sitk.Transform
Notes
Converts the input images to SimpleITK format.
Initializes a B-spline transform based on the specified grid spacing.
Uses the Mattes mutual information metric for registration.
Configures the optimizer with specified tolerance and iteration limits.
Applies the final transformation to the MSI image.
Displays the fixed image, affine-transformed image, and B-spline refined image.
Created on Fri Nov 15 17:19:57 2024
@author: lafields2
- MSIght.refactor_fragger_process.global_proteomics_search(fragger_results_path, threshold, min_prot_instances, ppm_error, output_path)
Performs a global proteomics search on FragPipe results by filtering peptides based on mass differences and minimum protein occurrences.
- Parameters
fragger_results_path (str) – Path to the FragPipe results file (.tsv or .txt format).
threshold (float) – Mass difference threshold for filtering peptides with similar calculated masses.
min_prot_instances (int) – Minimum number of times a protein must appear in the PSM report to be included.
ppm_error (float) – PPM error tolerance for mass accuracy filtering.
output_path (str) – Directory where the processed results will be saved.
- Returns
output_path_report – Path to the saved CSV file containing the processed global proteomics search results.
- Return type
str
Notes
Filters peptides based on unique status and mass differences.
Excludes proteins with fewer than min_prot_instances.
Calculates theoretical mass-to-charge ratios (m/z) and PPM-based error thresholds.
Saves the filtered results as a CSV file.
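The m/z and PPM calculations mentioned above follow standard formulas. The sketch below assumes positive-mode protonated ions and a rounded proton mass; the function names are hypothetical and the exact constants used by MSIght are not stated in this reference.

```python
PROTON_MASS = 1.007276  # Da, rounded; assumed adduct is a proton


def theoretical_mz(neutral_mass, charge):
    """m/z of a peptide with the given neutral mass carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def ppm_window(mz, ppm_error):
    """Lower and upper m/z bounds for a tolerance expressed in ppm."""
    delta = mz * ppm_error / 1e6
    return mz - delta, mz + delta


mz = theoretical_mz(1000.0, 2)    # doubly charged peptide of 1000 Da
low, high = ppm_window(mz, 10.0)  # 10 ppm tolerance
```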
- MSIght.refactor_fragger_process.process_fragger(protein_oi_list, ppm_error, psm_path, sized_he_image, output_path)
Processes FragPipe PSM reports by filtering peptides based on proteins of interest and calculating mass error thresholds for mass spectrometry integration.
- Parameters
protein_oi_list (list of str) – List of protein IDs of interest to filter from the PSM report.
ppm_error (float) – PPM error tolerance for mass accuracy filtering.
psm_path (str) – Path to the PSM report file (.tsv or .txt format).
sized_he_image (numpy.ndarray) – Reference H&E image used for MSI data integration (not directly used in this function).
output_path (str) – Directory where the processed results will be saved.
- Returns
output_path_report – Path to the saved CSV file containing the processed FragPipe results.
- Return type
str
Notes
Filters unique peptides from the PSM report based on the ‘Protein ID’ column.
Calculates theoretical mass-to-charge ratios (m/z) and PPM-based error thresholds.
Saves the processed data as a CSV file for MSIght integration.
If no matching proteins are found, the CSV will be empty.
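As a rough illustration of the filtering step, the sketch below keeps one row per peptide among the proteins of interest. It works on plain dictionaries rather than a real PSM file. The 'Peptide' column name, the `filter_psms` helper, and the reading of "unique" as "first occurrence of each peptide sequence" are all assumptions.

```python
def filter_psms(rows, protein_ids):
    """Keep the first row of each peptide whose 'Protein ID' is of interest."""
    wanted = set(protein_ids)
    seen_peptides = set()
    kept = []
    for row in rows:
        if row["Protein ID"] not in wanted:
            continue
        if row["Peptide"] in seen_peptides:
            continue
        seen_peptides.add(row["Peptide"])
        kept.append(row)
    return kept


# Made-up rows used only to exercise the helper.
rows = [
    {"Protein ID": "P1", "Peptide": "AAK"},
    {"Protein ID": "P1", "Peptide": "AAK"},
    {"Protein ID": "P2", "Peptide": "CCR"},
    {"Protein ID": "P3", "Peptide": "DDK"},
]
selected = filter_psms(rows, ["P1", "P3"])
no_match = filter_psms(rows, ["P9"])
```

An empty result such as `no_match` corresponds to the empty CSV case described in the notes.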
- MSIght.refactor_fragger_process.process_fragger_gene(gene_oi_list, ppm_error, psm_path, sized_he_image, output_path)
Processes FragPipe PSM reports based on a list of genes of interest and calculates error thresholds for mass spectrometry integration.
- Parameters
gene_oi_list (list of str) – List of genes of interest to filter from the PSM report.
ppm_error (float) – PPM error tolerance for mass accuracy filtering.
psm_path (str) – Path to the PSM report file (.tsv or .txt format).
sized_he_image (numpy.ndarray) – Reference H&E image used for MSI data integration (not directly used in this function).
output_path (str) – Directory where the processed results will be saved.
- Returns
output_path_report – Path to the saved CSV file containing the processed FragPipe results.
- Return type
str
Notes
Filters unique peptides from the PSM report based on the ‘Gene’ column.
Calculates theoretical mass-to-charge ratios (m/z) and PPM-based error thresholds.
Saves the processed data as a CSV file for MSIght integration.
If no matching genes are found, the CSV will be empty.
Created on Fri Nov 15 17:08:52 2024
@author: lafields2
- MSIght.refactor_histology_preprocess.bin_he_image(threshold_value, red_channel)
Binarizes the red channel of an H&E image using a specified threshold.
- Parameters
threshold_value (int) – The threshold value (0-255) used for binarization.
red_channel (numpy.ndarray) – The extracted red channel from the H&E image.
- Returns
thresholded_image – A binary image where pixels above the threshold are set to 255 (white) and others to 0 (black).
- Return type
numpy.ndarray
Notes
Uses OpenCV’s cv2.threshold for binarization.
Ensure the input red channel is a 2D array of type numpy.ndarray.
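The binarization rule itself is simple enough to show without OpenCV. This NumPy sketch reproduces the "above threshold becomes 255, otherwise 0" behavior described in the return value; `binarize` is a hypothetical stand-in, not the library function.

```python
import numpy as np


def binarize(channel, threshold_value):
    """Set pixels strictly above the threshold to 255 and all others to 0."""
    return np.where(channel > threshold_value, 255, 0).astype(np.uint8)


red = np.array([[10, 200], [128, 129]], dtype=np.uint8)
binary = binarize(red, 128)
```

Note that a pixel equal to the threshold (128 here) maps to 0, since the comparison is strict.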
- MSIght.refactor_histology_preprocess.foreground_extract(image, foreground_mask)
Extracts the foreground (tissue region) from an H&E image using a binary mask.
- Parameters
image (numpy.ndarray) – The input image from which the background should be removed.
foreground_mask (numpy.ndarray) – A binary mask where the tissue regions are white (255) and the background is black (0).
- Returns
foreground_image – The extracted foreground image with the background removed.
- Return type
numpy.ndarray
Notes
Uses OpenCV’s cv2.bitwise_and to apply the foreground mask.
Pixels outside the foreground mask are set to black (0).
Ensure the mask and the input image have the same dimensions.
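The masking behavior described here (keep pixels under the mask, set the rest to black) can be sketched in NumPy as follows. `apply_mask` is a hypothetical helper that also makes the dimension requirement explicit; it is not the OpenCV call the package uses.

```python
import numpy as np


def apply_mask(image, mask):
    """Zero out every pixel of `image` where `mask` is 0."""
    if image.shape[:2] != mask.shape:
        raise ValueError("image and mask must have the same height and width")
    keep = mask > 0
    if image.ndim == 3:
        keep = keep[:, :, None]  # broadcast the mask over the color channels
    return np.where(keep, image, 0).astype(image.dtype)


image = np.full((2, 2, 3), 9, dtype=np.uint8)
mask = np.array([[255, 0], [0, 255]], dtype=np.uint8)
foreground = apply_mask(image, mask)
```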
- MSIght.refactor_histology_preprocess.foreground_mask_make(image)
Creates a foreground mask by isolating tissue regions from an H&E image using HSV color thresholding.
- Parameters
image (numpy.ndarray) – The input image in BGR format.
- Returns
foreground_mask – A binary mask where the tissue regions are white (255) and the background is black (0).
- Return type
numpy.ndarray
Notes
Converts the image to HSV color space using OpenCV.
Applies a color threshold to separate the background from the tissue.
Inverts the background mask to obtain the foreground (tissue) mask.
Threshold values can be adjusted for better segmentation depending on the sample.
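To make the thresholding idea concrete, the sketch below computes HSV-style saturation and value directly in NumPy and treats bright, unsaturated pixels as background. The threshold values and the name `tissue_mask` are assumptions chosen for illustration; the actual thresholds used by MSIght are not given in this reference.

```python
import numpy as np


def tissue_mask(image_bgr, max_saturation=0.1, min_value=0.8):
    """Return 255 for tissue and 0 for bright, unsaturated background."""
    scaled = image_bgr.astype(np.float64) / 255.0
    value = scaled.max(axis=2)                 # HSV value
    minimum = scaled.min(axis=2)
    saturation = np.where(
        value > 0, (value - minimum) / np.maximum(value, 1e-12), 0.0
    )                                          # HSV saturation
    background = (saturation < max_saturation) & (value > min_value)
    # Inverting the background mask yields the foreground mask.
    return np.where(background, 0, 255).astype(np.uint8)


# One white slide pixel and one pink, eosin-like pixel (BGR order).
image = np.array([[[255, 255, 255], [180, 100, 220]]], dtype=np.uint8)
mask = tissue_mask(image)
```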
- MSIght.refactor_histology_preprocess.load_he_image(image_path)
Loads an H&E image from the specified file path.
- Parameters
image_path (str) – Path to the H&E image file.
- Returns
image – The loaded image as a BGR array.
- Return type
numpy.ndarray
Notes
Uses OpenCV to read the image, returning it in BGR format.
The image can be converted to other color formats using OpenCV functions as needed.
- MSIght.refactor_histology_preprocess.preprocess_he(image_path, threshold_value, sample_name, output_directory)
Preprocesses an H&E image by extracting the tissue region, binarizing, and smoothing it.
- Parameters
image_path (str) – Path to the input H&E image file.
threshold_value (int) – Threshold value (0-255) for binarization of the red channel.
sample_name (str) – Name used for labeling the saved output file.
output_directory (str) – Directory where the processed image will be saved.
- Returns
final_he_image – The preprocessed H&E image after binarization and smoothing.
- Return type
numpy.ndarray
Notes
- Applies several preprocessing steps:
Loads the H&E image.
Creates a foreground mask using HSV thresholding.
Extracts the tissue region using the mask.
Extracts the red channel from the tissue region.
Binarizes the red channel using the given threshold.
Applies Gaussian smoothing to reduce noise.
Displays each preprocessing step for visualization.
Saves the final processed image as a PNG file.
- MSIght.refactor_histology_preprocess.red_channel_extract(foreground_image)
Extracts the red channel from the foreground image.
- Parameters
foreground_image (numpy.ndarray) – The input image from which the red channel will be extracted. Expected to be in BGR format.
- Returns
red_channel – A 2D array representing the red channel of the input image.
- Return type
numpy.ndarray
Notes
Assumes the input image is in BGR format.
Extracts the third channel (index 2) corresponding to the red channel.
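Since the channel order matters, a short example of the indexing may help. The array below is a made-up BGR image; index 2 selects red and index 0 would select blue.

```python
import numpy as np

# Made-up 2x2 BGR image: blue = 50, green = 0, red = 200 everywhere.
image_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
image_bgr[:, :, 0] = 50
image_bgr[:, :, 2] = 200

red_channel = image_bgr[:, :, 2]  # 2D array holding only the red values
```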
- MSIght.refactor_histology_preprocess.smooth_he_image(thresholded_image)
Applies Gaussian smoothing to a binarized H&E image.
- Parameters
thresholded_image (numpy.ndarray) – The binarized H&E image to be smoothed.
- Returns
smoothed_image – The smoothed image after applying Gaussian blur. Pixels near edges take intermediate gray values, so the result is no longer strictly binary.
- Return type
numpy.ndarray
Notes
Uses OpenCV’s cv2.GaussianBlur with a kernel size of (5, 5).
The standard deviation for the Gaussian kernel is set to 0 (calculated automatically).
Smoothing reduces noise and sharp edges in the binarized image.
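The effect of a 5x5 smoothing pass on a binary edge can be sketched with a separable filter in NumPy. This assumes the 5-tap binomial kernel [1, 4, 6, 4, 1] / 16 as a Gaussian approximation and reflective padding; both are assumptions for illustration and the output may not match cv2.GaussianBlur exactly.

```python
import numpy as np

# Assumed 5-tap binomial approximation of a Gaussian kernel.
KERNEL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0


def smooth_5x5(image):
    """Separable 5x5 smoothing: filter along rows, then along columns."""
    padded = np.pad(image.astype(np.float64), 2, mode="reflect")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, KERNEL, mode="valid"), 1, padded
    )
    cols = np.apply_along_axis(
        lambda c: np.convolve(c, KERNEL, mode="valid"), 0, rows
    )
    return np.rint(cols).astype(np.uint8)


# Binary image with a vertical edge between columns 2 and 3.
binary = np.zeros((6, 6), dtype=np.uint8)
binary[:, 3:] = 255
smoothed = smooth_5x5(binary)
```

The pixels next to the edge end up with intermediate values, which is how the sharp transition of the binarized image is softened.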
Created on Fri Nov 15 17:05:38 2024
@author: lafields2
- MSIght.refactor_interpolation.interpolate_MSI(filename, image_path, msi_image, smoothed_image, output_directory, sample_name)
Interpolates an MSI image to match the dimensions of a corresponding H&E image.
- Parameters
filename (str) – Path to the .imzML file containing MSI data.
image_path (str) – Path to the corresponding H&E image file (TIFF format).
msi_image (numpy.ndarray) – The MSI image to be resized.
smoothed_image (numpy.ndarray) – The smoothed and binarized H&E image used for cropping.
output_directory (str) – Directory where the resized MSI image will be saved.
sample_name (str) – Name used for labeling the saved output file.
- Returns
cropped_image (numpy.ndarray) – The cropped H&E image after binarization and thresholding.
resized_msi_image (numpy.ndarray) – The resized MSI image matching the cropped H&E image’s dimensions.
Notes
Extracts image dimensions from the TIFF file and the .imzML file.
Applies a binary mask to the smoothed H&E image.
Determines cropping boundaries based on tissue presence.
Resizes the MSI image to match the cropped H&E image’s dimensions using linear interpolation.
Saves the resized MSI image as a PNG file.
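The cropping and resizing steps can be sketched as follows. Both helpers are hypothetical: `tissue_bounding_box` assumes the mask contains at least one nonzero pixel, and `resize_bilinear` samples on a corner-aligned grid, which is one common convention and may differ slightly from the resampling used by the package.

```python
import numpy as np


def tissue_bounding_box(mask):
    """Return (row_start, row_stop, col_start, col_stop) of nonzero pixels."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


def resize_bilinear(img, new_h, new_w):
    """Resize a 2D array with bilinear interpolation (corner-aligned grid)."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, new_h)
    xs = np.linspace(0, w - 1, new_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


# Made-up H&E mask with tissue in a 3x3 region, and a 2x2 MSI image.
he_mask = np.zeros((6, 8), dtype=np.uint8)
he_mask[2:5, 1:4] = 255
r0, r1, c0, c1 = tissue_bounding_box(he_mask)
cropped = he_mask[r0:r1, c0:c1]

msi = np.array([[0.0, 2.0], [4.0, 6.0]])
resized_msi = resize_bilinear(msi, *cropped.shape)
```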
- MSIght.refactor_interpolation.interpolate_and_visualize(filename, image_path, msi_image, smoothed_image, output_directory, sample_name, original_areas_to_zoom)
Interpolates an MSI image to match the H&E image dimensions and visualizes different interpolation methods.
- Parameters
filename (str) – Path to the .imzML file containing MSI data.
image_path (str) – Path to the corresponding H&E image file (TIFF format).
msi_image (numpy.ndarray) – The MSI image to be resized.
smoothed_image (numpy.ndarray) – The smoothed and binarized H&E image used for cropping.
output_directory (str) – Directory where the visualization output will be saved.
sample_name (str) – Name used for labeling the saved output file.
original_areas_to_zoom (dict) – Dictionary of areas to zoom in on, each given as a tuple (x1, y1, x2, y2).
- Returns
cropped_image (numpy.ndarray) – The cropped H&E image after binarization and thresholding.
resized_msi_image (numpy.ndarray) – The resized MSI image matching the cropped H&E image’s dimensions.
Notes
Extracts image dimensions from the .imzML file and the TIFF file.
Binarizes the H&E image and determines cropping boundaries.
Adjusts zoom areas to match the resized MSI image.
Compares multiple interpolation methods: Bilinear, Bicubic, Nearest Neighbor, and Lanczos.
Displays and saves the visualization as a PNG file.
Created on Thu Nov 21 15:28:00 2024
@author: lafields2
- MSIght.refactor_common_functions.apply_dimensionality_reduction(intensity_matrix, pca_components, tsne_components, tsne_perplexity, tsne_interations, tsne_learning_rate)
Applies PCA and t-SNE for dimensionality reduction on an intensity matrix.
- Parameters
intensity_matrix (numpy.ndarray) – The 2D array where rows correspond to pixels and columns correspond to m/z values.
pca_components (int) – Number of principal components to retain during PCA.
tsne_components (int) – Number of components for t-SNE dimensionality reduction.
tsne_perplexity (float) – Perplexity parameter for t-SNE, balancing local and global data structure.
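tsne_iterations (int) – Number of iterations for the t-SNE optimization process. Note that the function signature spells this parameter tsne_interations, so keyword calls must use that spelling.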
tsne_learning_rate (float) – Learning rate parameter for t-SNE, controlling the step size during optimization.
- Returns
pca_result (numpy.ndarray) – PCA-transformed matrix of reduced dimensions.
tsne_result (numpy.ndarray) – t-SNE-transformed matrix of reduced dimensions.
Notes
Applies PCA for initial dimensionality reduction to speed up t-SNE.
Applies t-SNE on the PCA-transformed matrix for further reduction.
Returns both PCA and t-SNE results for further analysis or visualization.
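The PCA stage can be sketched with NumPy's SVD. This illustrates the technique only: `pca_reduce` is a hypothetical name, the input is random data standing in for an intensity matrix, and the t-SNE stage is left out because it needs a dedicated library.

```python
import numpy as np


def pca_reduce(matrix, n_components):
    """Project rows of `matrix` onto the top principal components."""
    centered = matrix - matrix.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained_variance = singular_values[:n_components] ** 2 / (matrix.shape[0] - 1)
    return scores, explained_variance


# Random stand-in: 50 pixels (rows) by 8 m/z bins (columns).
rng = np.random.default_rng(0)
intensity_matrix = rng.random((50, 8))
pca_result, variance = pca_reduce(intensity_matrix, 3)
```

Reducing to a few components first keeps the pairwise-distance computations in a later t-SNE step cheaper, which is the motivation given in the notes.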
- MSIght.refactor_common_functions.create_intensity_matrix(coordinates, mz_values, intensities)
Creates an intensity matrix from preprocessed MSI data.
- Parameters
coordinates (list of tuples) – List of pixel coordinates (x, y) from the .imzML file.
mz_values (list of numpy.ndarray) – List of m/z values corresponding to each pixel.
intensities (list of numpy.ndarray) – List of preprocessed intensity values for each pixel.
- Returns
intensity_matrix (numpy.ndarray) – A 2D array where each row represents a pixel, and each column corresponds to a unique m/z value.
all_mz_values (numpy.ndarray) – A sorted array of unique m/z values across all pixels.
Notes
Extracts unique m/z values across all pixels.
Initializes an intensity matrix with zeros.
Fills the matrix with intensity values using np.searchsorted for fast indexing.
Returns the intensity matrix and the corresponding m/z values.
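The construction described in these notes can be sketched as follows. `build_intensity_matrix` is a hypothetical stand-in that takes only the per-pixel m/z and intensity arrays; the spectra are made-up values.

```python
import numpy as np


def build_intensity_matrix(mz_values, intensities):
    """Rows are pixels, columns are the sorted unique m/z values."""
    all_mz = np.unique(np.concatenate(mz_values))
    matrix = np.zeros((len(mz_values), all_mz.size))
    for row, (mzs, ints) in enumerate(zip(mz_values, intensities)):
        # searchsorted finds each m/z's column in the sorted axis.
        matrix[row, np.searchsorted(all_mz, mzs)] = ints
    return matrix, all_mz


mz_values = [np.array([100.0, 200.0]), np.array([200.0, 300.0])]
intensities = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
intensity_matrix, all_mz_values = build_intensity_matrix(mz_values, intensities)
```

Any m/z value absent from a pixel's spectrum keeps the initial zero in that pixel's row.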
- MSIght.refactor_common_functions.load_and_preprocess_imzml(filename, sigma, structuring_element_size)
Loads and preprocesses MSI data from an .imzML file by applying Gaussian smoothing and top-hat baseline correction.
- Parameters
filename (str) – Path to the .imzML file containing the MSI data.
sigma (float) – Standard deviation for Gaussian smoothing applied to the intensity values.
structuring_element_size (int) – Size of the structuring element used for top-hat baseline correction.
- Returns
coordinates (list of tuples) – List of pixel coordinates (x, y) from the .imzML file.
mz_values (list of numpy.ndarray) – List of m/z values corresponding to each pixel.
intensities (list of numpy.ndarray) – List of preprocessed intensity values for each pixel.
Notes
Uses PyImzML to parse the .imzML file.
Applies Gaussian smoothing to reduce noise in the intensity spectra.
Applies top-hat baseline correction to remove background noise.
Returns preprocessed data suitable for further analysis.
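Both preprocessing operations can be illustrated on a single spectrum with NumPy. This is a sketch of the general techniques, not the package's code: the function names are hypothetical, the structuring element is assumed to be flat with an odd size, and the spectrum is synthetic.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gaussian_smooth(y, sigma):
    """Smooth a 1D signal with a normalized Gaussian kernel."""
    radius = int(4 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(y, radius, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")


def white_tophat(y, size):
    """Subtract the morphological opening (a baseline estimate) from y.
    `size` is assumed to be odd."""
    pad = size // 2

    def erode(a):
        return sliding_window_view(np.pad(a, pad, mode="edge"), size).min(axis=1)

    def dilate(a):
        return sliding_window_view(np.pad(a, pad, mode="edge"), size).max(axis=1)

    return y - dilate(erode(y))


# Synthetic spectrum: flat baseline of 5 with one narrow peak of height 10.
spectrum = np.full(21, 5.0)
spectrum[10] = 15.0
corrected = white_tophat(spectrum, 5)
smoothed = gaussian_smooth(np.ones(20), 1.0)
```

The opening removes features narrower than the structuring element, so subtracting it leaves the peak and removes the flat baseline.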
Created on Fri Nov 15 17:27:23 2024
@author: lafields2
- MSIght.refactor_manual_affine.manual_register_he_msi(pts_ms, pts_he, resized_msi_image, cropped_image, output_directory, sample_name)
Manually registers an MSI image to an H&E image using affine transformation.
- Parameters
pts_ms (numpy.ndarray) – Coordinates from the MSI image (source points).
pts_he (numpy.ndarray) – Corresponding coordinates from the H&E image (destination points).
resized_msi_image (numpy.ndarray) – The resized MSI image to be registered.
cropped_image (numpy.ndarray) – The cropped and smoothed H&E image.
output_directory (str) – Directory where the transformed MSI image will be saved.
sample_name (str) – Name used for labeling the saved output file.
- Returns
M – The estimated affine transformation matrix.
- Return type
numpy.ndarray
Notes
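Uses OpenCV’s cv2.estimateAffinePartial2D to estimate the matrix; this function restricts the transform to rotation, uniform scaling, and translation and returns a 2x3 matrix.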
Applies the transformation to the resized MSI image.
Displays and saves the transformed MSI image.
Assumes the input points are selected manually or computed separately.
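To show what kind of matrix results from matched points, the sketch below fits a rotation, uniform scale, and translation by ordinary least squares. It is a simplified stand-in: `fit_similarity` is a hypothetical name, the points are made up, and there is no outlier rejection, whereas the OpenCV estimator uses a robust method by default.

```python
import numpy as np


def fit_similarity(src, dst):
    """Least-squares fit of x' = a*x - b*y + tx, y' = b*x + a*y + ty."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    x, y = src[:, 0], src[:, 1]
    ones, zeros = np.ones_like(x), np.zeros_like(x)
    design = np.vstack([
        np.column_stack([x, -y, ones, zeros]),  # equations for x'
        np.column_stack([y, x, zeros, ones]),   # equations for y'
    ])
    target = np.concatenate([dst[:, 0], dst[:, 1]])
    a, b, tx, ty = np.linalg.lstsq(design, target, rcond=None)[0]
    return np.array([[a, -b, tx], [b, a, ty]])


# Made-up MSI points and their H&E counterparts under a known transform:
# 90 degree rotation, scale 2, translation (5, -1).
true_matrix = np.array([[0.0, -2.0, 5.0], [2.0, 0.0, -1.0]])
pts_ms = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
pts_he = pts_ms @ true_matrix[:, :2].T + true_matrix[:, 2]

estimated = fit_similarity(pts_ms, pts_he)
```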
- MSIght.refactor_manual_affine.show_msi_he_coords(final_MSI_image, final_he_image)
Displays the MSI and H&E images side by side with coordinates using Plotly.
- Parameters
final_MSI_image (numpy.ndarray) – The final MSI image after interpolation and transformation.
final_he_image (numpy.ndarray) – The final H&E image after processing and transformation.
- Return type
None
Notes
Uses Plotly’s imshow for interactive visualization.
Displays coordinate axes for better comparison.
Titles the plots as ‘MSI Image’ and ‘H&E Image’.
Created on Fri Nov 15 17:26:18 2024
@author: lafields2
- MSIght.refactor_manual_affine_transform.manual_register_he_msi(pts_ms, pts_he, resized_msi_image, cropped_image, output_directory, sample_name)
Manually registers an MSI image to an H&E image using affine transformation.
- Parameters
pts_ms (numpy.ndarray) – Coordinates from the MSI image (source points).
pts_he (numpy.ndarray) – Corresponding coordinates from the H&E image (destination points).
resized_msi_image (numpy.ndarray) – The resized MSI image to be registered.
cropped_image (numpy.ndarray) – The cropped and smoothed H&E image.
output_directory (str) – Directory where the transformed MSI image will be saved.
sample_name (str) – Name used for labeling the saved output file.
- Returns
M (numpy.ndarray) – The estimated affine transformation matrix.
transformed_ms_image (numpy.ndarray) – The transformed MSI image after registration.
Notes
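Uses OpenCV’s cv2.estimateAffinePartial2D to estimate the matrix; this function restricts the transform to rotation, uniform scaling, and translation and returns a 2x3 matrix.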
Applies the transformation to the resized MSI image.
Displays and saves the transformed MSI image alongside the original images.
Assumes the input points are selected manually or computed separately.
Created on Fri Nov 15 17:28:29 2024
@author: lafields2
- MSIght.refactor_mz_image_extract.apply_bspline_transform_to_msi(b_spline_transform, msi_data_image)
Applies a B-spline transformation to an MSI image using SimpleITK.
- Parameters
b_spline_transform (sitk.Transform) – The B-spline transformation object obtained from registration.
msi_data_image (numpy.ndarray) – The MSI image to be transformed, expected as a 2D array.
- Returns
transformed_msi_image – The transformed MSI image as a numpy array.
- Return type
numpy.ndarray
Notes
Converts the input MSI image to a SimpleITK image.
Uses sitk.ResampleImageFilter to apply the B-spline transformation.
Sets the interpolator to linear and the default pixel value to 0.
Converts the transformed image back to a numpy array for further processing.
- MSIght.refactor_mz_image_extract.extract_mz_image_transform(filename, mz, mz_tolerance, z_value, b_spline_apply, sized_he_image)
Extracts an m/z image from an .imzML file, resizes it, and applies a B-spline transformation.
- Parameters
filename (str) – Path to the .imzML file containing the MSI data.
mz (float) – The target m/z value to extract from the MSI data.
mz_tolerance (float) – Tolerance for the target m/z value during image extraction.
z_value (int) – Charge state value for m/z extraction.
b_spline_apply (sitk.Transform) – The B-spline transform to apply to the extracted m/z image.
sized_he_image (numpy.ndarray) – The reference H&E image used for resizing the MSI image.
- Returns
msi_result – The transformed m/z image after resizing and applying the B-spline transform.
- Return type
numpy.ndarray
Notes
Extracts an m/z image using getionimage from PyImzML.
Resizes the m/z image to match the H&E image’s dimensions.
Applies a B-spline transformation using SimpleITK if provided.
Returns the transformed m/z image as a numpy array.
- MSIght.refactor_mz_image_extract.overlay_msi_he(msi_result, sized_he_image, mz)
Overlays an MSI image onto an H&E image and displays the result.
- Parameters
msi_result (numpy.ndarray) – The transformed MSI image.
sized_he_image (numpy.ndarray) – The reference H&E image.
mz (float) – The m/z value corresponding to the MSI image.
- Return type
None
Notes
Ensures the MSI image has the same data type as the H&E image.
Uses OpenCV’s cv2.addWeighted for overlaying the images with equal weights.
Displays the overlay using Matplotlib.
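An equal-weight overlay amounts to averaging the two images pixel by pixel after bringing them to a common data type. The NumPy sketch below mirrors that arithmetic; `blend_equal` is a hypothetical helper, 8-bit images are assumed, and the pixel values are made up.

```python
import numpy as np


def blend_equal(msi, he):
    """Average two same-shaped images with weights 0.5 and 0.5."""
    msi = msi.astype(he.dtype)  # match the H&E image's data type first
    mixed = 0.5 * msi.astype(np.float64) + 0.5 * he.astype(np.float64)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)


msi_result = np.array([[100.0, 255.0]])
sized_he_image = np.array([[200, 255]], dtype=np.uint8)
overlay = blend_equal(msi_result, sized_he_image)
```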