ProcessSCanningData is a Python class designed to process data from MRXS, NDPI and SVS files, merge it with inventory data, calculate immunopositivity statistics, and generate useful visualizations like heatmaps and scatterplots.
This is the second half of the HDAB - Immunopositivity Tutorial available here ().
- Python 3.8
- Required Python packages:
pandas==1.5.3, numpy==1.21.4, seaborn==0.11.2, matplotlib, scipy==1.7.3, openpyxl==3.0.10 - Pip library openpyxl > 3.2.0
- QuPath 5.01 required!
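Before running anything, it can help to confirm that the installed packages match the pinned versions listed above. The following is a minimal sketch, not part of this repository: the `check_requirements` helper and the `PINNED` table are hypothetical names, and the check uses only the standard-library `importlib.metadata` module available from Python 3.8.

```python
from importlib import metadata

# Pinned versions copied from the prerequisites list; matplotlib has no pin there.
PINNED = {
    "pandas": "1.5.3",
    "numpy": "1.21.4",
    "seaborn": "0.11.2",
    "matplotlib": None,
    "scipy": "1.7.3",
    "openpyxl": "3.0.10",
}


def check_requirements(pinned, get_version=metadata.version):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        # A pin of None means "any installed version is accepted".
        if wanted is not None and installed != wanted:
            problems.append(f"{name}: found {installed}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_requirements(PINNED) or ["All pinned packages match."]:
        print(line)
```

The `get_version` parameter exists only so the check can be exercised without touching the real environment.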
- Clone or download this repository to your local machine.
- Extract the archive wherever you like and open the directory:
unzip process_scan-main
cd process_scan-main
- Call the class directly (the internal script already produces the resulting rates and images):
There are different ways to do this from a Bash terminal:
Using the workflow_template script already provided in this directory:
This example uses MRXS files as the reference, but it works with the other scan formats mentioned above.
python workflow_template.py path/to/your/results path/to/your/Inventory.xlsx path/to/your/directory/to/outputs xlsx/csv
(For the last argument, choose one of the two: xlsx or csv.)
Calling the script directly with the full paths stated:
Choose the script that matches your scan file extension (process_mrxs_data.py, process_svs_data.py, or process_ndpi_data.py); this example uses the MRXS one.
python process_mrxs_data.py path/to/your/results path/to/your/Inventory.xlsx path/to/your/directory/to/outputs xlsx/csv
Or using environment variables:
export mrxs_directory="path/to/your/mrxs_files"
export inventory_file="path/to/your/Inventory.xlsx"
export output_path="path/to/your/directory/to/outputs" # make sure to create it beforehand
export output_extension="xlsx" # or "csv"
python process_mrxs_data.py "$mrxs_directory" "$inventory_file" "$output_path" "$output_extension"
Please note that the current version is only intended to be run with the prerequisites met, on an operating system with an SH-compatible shell.
The Inventory file can be .csv or .xlsx, and a multi-sheet format is expected.
For the output_filename, both .csv and .xlsx are implemented, so you can choose whichever you prefer.
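The command-line calls above all follow the same pattern: a script chosen by scan format, followed by four positional arguments. A minimal sketch of assembling that command from Python is shown here; `build_command` and `SCRIPTS` are hypothetical names used for illustration and do not exist in this repository, and the validation rules simply mirror the notes above (inventory as .csv or .xlsx, output as xlsx or csv).

```python
import sys
from pathlib import Path

# Script names taken from this README; one per supported scan format.
SCRIPTS = {
    ".mrxs": "process_mrxs_data.py",
    ".svs": "process_svs_data.py",
    ".ndpi": "process_ndpi_data.py",
}


def build_command(scan_extension, results_dir, inventory_file, output_dir, output_extension):
    """Assemble the argument list for one of the processing scripts."""
    # Accept "mrxs", ".mrxs", "MRXS", and so on.
    ext = "." + scan_extension.lower().lstrip(".")
    if ext not in SCRIPTS:
        raise ValueError(f"unsupported scan format: {scan_extension!r}")
    if output_extension not in ("xlsx", "csv"):
        raise ValueError("output_extension must be 'xlsx' or 'csv'")
    if Path(inventory_file).suffix.lower() not in (".xlsx", ".csv"):
        raise ValueError("inventory file must be .xlsx or .csv")
    return [
        sys.executable,        # the Python interpreter currently running
        SCRIPTS[ext],
        str(results_dir),
        str(inventory_file),
        str(output_dir),
        output_extension,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from inside the repository directory, which avoids shell quoting issues with paths that contain spaces.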
- Clone or download this repository to your local machine.
- Import the ProcessScanData class into your Python script:
Use the class that matches your scan file extension: ProcessMRXSData, ProcessSVSData, or ProcessNDPIData.
from process_mrxs_data import ProcessMRXSData
- Create an instance of the ProcessMRXSData class by providing the paths to the directory containing your MRXS files and to the inventory file, along with the other arguments needed for the call:
mrxs_directory = "path/to/your/mrxs_files"
inventory_file = "path/to/your/Inventory.xlsx"
output_path = "path/to/your/directory/to/outputs"
output_extension = "xlsx"  # or "csv"
processor = ProcessMRXSData.process_directory(mrxs_directory, inventory_file, output_path, output_extension)
- Call the other functions to calculate the rate and generate the related images:
# output_filename is the name of the merged rate file (.csv or .xlsx); define it before this call
rate = ProcessMRXSData.process_rate(output_path, output_filename)
for file in rate:
    ProcessMRXSData.process_heatmaps(rate)
    ProcessMRXSData.process_scatterplots(rate)
- Check the output images in the output_path.
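To check the output images from Python rather than by browsing the folder, a short helper along these lines could be used. This is a sketch under assumptions: `list_output_images` is a hypothetical name, and the set of image extensions is a guess, since this README does not state which format the plotting methods write.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the plotting methods actually write.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".svg", ".pdf"}


def list_output_images(output_path):
    """Return the image files found directly inside output_path, sorted by path."""
    folder = Path(output_path)
    if not folder.is_dir():
        raise FileNotFoundError(f"output directory not found: {folder}")
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

An empty list after a run suggests that the plotting step did not write anything to that directory, or that it uses an extension not covered by `IMAGE_SUFFIXES`.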
Call the process_data method to process the MRXS data, merge it with the inventory, and calculate immunopositivity statistics for every slide (process_directory calls this method internally for each file):
result_df = processor.process_data()
To process the immunopositivity rate from Excel/CSV files and merge it into a single main DataFrame, call the process_positivity method:
xlsx_file = "path/to/your/immunopositivity_data.xlsx"
final_df = ProcessMRXSData.process_positivity(xlsx_file, result_df)
Process MRXS data from a directory, save antibody-specific data, and return the final DataFrame using the process_directory method (which calls process_data internally):
directory_path = "path/to/your/mrxs/files/directory"
output_path = "path/to/output/data/directory"
final_data = ProcessMRXSData.process_directory(directory_path, inventory_file, output_path, output_extension)
Process the immunopositivity rate from saved files and merge it into a final DataFrame using the process_rate method:
final_rate = ProcessMRXSData.process_rate(output_path, final_filename)
To generate and save correlation heatmaps based on immunopositivity rate data, call the process_heatmaps method:
data_file = "path/to/your/immunopositivity_data.csv"
ProcessMRXSData.process_heatmaps(data_file)
To generate and save scatterplots based on immunopositivity rate data, use the process_scatterplots method:
data_file = "path/to/your/immunopositivity_data.csv"
ProcessMRXSData.process_scatterplots(data_file)
For more details, examples, and command-line usage, please refer to the code and documentation in this repository.
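As a rough illustration of what a correlation heatmap displays, the sketch below computes a pairwise correlation matrix from per-antibody immunopositivity rates. It assumes Pearson correlation, which this README does not confirm for process_heatmaps; the antibody names and rate values are hypothetical, and only the standard library is used so the arithmetic stays visible.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map every pair of column names to its Pearson coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical immunopositivity rates (%) for three antibodies on four slides.
rates = {
    "antibody_A": [10.0, 20.0, 30.0, 40.0],
    "antibody_B": [12.0, 22.0, 32.0, 42.0],
    "antibody_C": [40.0, 30.0, 20.0, 10.0],
}
matrix = correlation_matrix(rates)
```

Each cell of a heatmap would then be colored by the corresponding value in `matrix`, with values near 1 meaning two antibodies rise together across slides and values near -1 meaning they move in opposite directions.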
This project is licensed under the MIT License - see the LICENSE file for details.
Note: This is a simplified example README file. You should replace the placeholder paths and filenames with the actual paths to your data files and directories.