Segmentation
Segmentation is the process of identifying and delineating individual cells in IMC images. OpenIMC provides four segmentation methods: CellSAM (via DeepCell API), Cellpose, Watershed, and Ilastik.
Overview
Cell segmentation is a critical first step in IMC analysis, as it defines the boundaries of individual cells from which features will be extracted. The choice of segmentation method depends on your data characteristics, computational resources, and accuracy requirements.
Options
OpenIMC supports four segmentation methods:
CellSAM (default): Deep learning-based segmentation using DeepCell’s CellSAM model via API
Cellpose: Local deep learning-based segmentation (supports GPU acceleration)
Watershed: Traditional marker-controlled watershed segmentation
Ilastik: Segmentation using pre-trained Ilastik models (.ilp project files)
Parameters
Common Parameters
These parameters apply to all segmentation methods:
nuclear_channels (required): List of channel names to use for nuclear detection
  - Example: ["DNA1_Ir191", "DNA2_Ir193"]
  - Used to identify cell nuclei for seeding segmentation
cytoplasm_channels (optional for CellSAM/Cellpose, required for Watershed): List of channel names for cytoplasm detection
  - Example: ["CD3_1841", "CD4_2293"]
  - Used to define cell boundaries
nuclear_fusion_method (default: "mean"): Method to combine multiple nuclear channels
  - Options: "single", "mean", "weighted", "max", "pca1"
  - "single": Use only the first channel
  - "mean": Average all channels
  - "weighted": Weighted average (requires nuclear_weights)
  - "max": Maximum intensity across channels
  - "pca1": First principal component
cyto_fusion_method (default: "mean"): Method to combine cytoplasm channels (same options as nuclear_fusion_method)
nuclear_weights (optional): List of weights for weighted fusion of nuclear channels
  - Example: [0.5, 0.3, 0.2]
  - Must match the number of nuclear channels
cyto_weights (optional): List of weights for weighted fusion of cytoplasm channels
normalization_method (default: "None"): Normalization method to apply before segmentation
  - "None": No normalization
  - "arcsinh": Arcsinh transformation (helps normalize intensity distributions, recommended for data with high dynamic range)
  - "channelwise_minmax": Min-max normalization per channel (scales each channel independently to the 0-1 range, useful when channels have different intensity ranges)
  - "percentile_clip": Percentile-based clipping normalization
arcsinh_cofactor (default: 10.0): Cofactor for arcsinh transformation (only used if normalization_method is "arcsinh")
  - Lower values increase compression of high intensities
  - Typical range: 5.0-20.0
arcsinh (default: false): Legacy parameter, equivalent to setting normalization_method to "arcsinh"
  - Deprecated: Use normalization_method instead
denoise_settings (optional): Dictionary with denoise settings per channel
  - Format: {"Channel1": {"method": "gaussian", "sigma": 1.0}}
  - Can also be a path to a JSON file
acquisition (optional): Specific acquisition ID or name to segment
  - If not specified, processes all acquisitions
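The fusion and normalization options above can be sketched with NumPy. This is a minimal illustration, not OpenIMC's actual implementation: the function names `normalize_channel` and `fuse_channels`, the percentile bounds, the rescaling to 0-1 after percentile clipping, and the normalization of weights to sum to 1 are all assumptions made for the example.

```python
import numpy as np


def normalize_channel(channel, method="None", cofactor=10.0, p_low=1.0, p_high=99.0):
    """Normalize one 2D channel. p_low/p_high are assumed defaults, not OpenIMC's."""
    x = np.asarray(channel, dtype=np.float64)
    if method == "None":
        return x
    if method == "arcsinh":
        # Smaller cofactors compress high intensities more strongly.
        return np.arcsinh(x / cofactor)
    if method == "percentile_clip":
        low, high = np.percentile(x, [p_low, p_high])
        x = np.clip(x, low, high)  # assumption: clipped data is then rescaled to 0-1
    elif method != "channelwise_minmax":
        raise ValueError(f"unknown normalization method: {method}")
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def fuse_channels(stack, method="mean", weights=None):
    """Combine a (n_channels, height, width) stack into a single 2D image."""
    stack = np.asarray(stack, dtype=np.float64)
    if method == "single":
        return stack[0]
    if method == "mean":
        return stack.mean(axis=0)
    if method == "max":
        return stack.max(axis=0)
    if method == "weighted":
        if weights is None or len(weights) != stack.shape[0]:
            raise ValueError("weights must match the number of channels")
        w = np.asarray(weights, dtype=np.float64)
        # Assumption: weights are normalized so they sum to 1.
        return np.tensordot(w / w.sum(), stack, axes=1)
    if method == "pca1":
        flat = stack.reshape(stack.shape[0], -1).T  # pixels x channels
        centered = flat - flat.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        # Projection onto the first principal axis; its sign is arbitrary.
        return (centered @ vt[0]).reshape(stack.shape[1:])
    raise ValueError(f"unknown fusion method: {method}")
```

Note that the "pca1" output can be negative and its sign is not fixed, so a real pipeline would likely rescale or orient it before segmentation.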
CellSAM Parameters
deepcell_api_key (optional): DeepCell API access token
  - If not provided, the DEEPCELL_ACCESS_TOKEN environment variable is used
  - A token is required for the CellSAM method, supplied through either this parameter or the environment variable
  - Get your API key from DeepCell
bbox_threshold (default: 0.4): Bounding box detection threshold
  - Lower values (0.01-0.1) detect fainter cells
  - Higher values (0.4-0.8) detect only bright cells
  - Adjust based on cell visibility in your data
use_wsi (default: false): Use whole-slide imaging (WSI) mode
  - Enable for ROIs with >500 cells
  - Improves performance for large images
low_contrast_enhancement (default: false): Apply low contrast enhancement
  - Helps with faint or low-contrast cells
gauge_cell_size (default: false): Automatically gauge cell size
  - Can improve segmentation for variable cell sizes
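The token lookup described for deepcell_api_key (explicit value first, then the DEEPCELL_ACCESS_TOKEN environment variable) can be sketched as follows. The helper name `resolve_deepcell_token` and the error message are hypothetical; only the parameter and environment variable names come from this page.

```python
import os


def resolve_deepcell_token(deepcell_api_key=None, env=None):
    """Return the explicit key if given, else fall back to the environment."""
    env = os.environ if env is None else env
    token = deepcell_api_key or env.get("DEEPCELL_ACCESS_TOKEN")
    if not token:
        raise RuntimeError(
            "CellSAM needs a DeepCell token: pass deepcell_api_key "
            "or set DEEPCELL_ACCESS_TOKEN"
        )
    return token
```

Failing early like this avoids sending a request that the API would reject anyway.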
Cellpose Parameters
model (default: "cyto3"): Cellpose model type
  - "cyto3": General cytoplasm segmentation (requires cytoplasm channels)
  - "nuclei": Nuclear segmentation only
diameter (optional): Expected cell diameter in pixels
  - If not specified, Cellpose estimates automatically
  - Specify if you know the approximate cell size
flow_threshold (default: 0.4): Flow error threshold (in Cellpose, the maximum flow error allowed for a mask to be kept)
  - Lower values are stricter and discard more candidate masks
  - Higher values (0.4-0.8) are more permissive and keep more masks
  - Adjust if cells are over- or under-segmented
cellprob_threshold (default: 0.0): Cell probability threshold
  - Lower values (negative) include more uncertain regions
  - Higher values (0.0-0.5) require higher confidence
  - Typically kept at 0.0 for best results
gpu_id (optional): GPU device ID to use for acceleration
  - Example: 0 for first GPU, 1 for second GPU
  - If not specified, uses CPU (slower but works on all systems)
Watershed Parameters
min_cell_area (default: 100): Minimum cell area in pixels
  - Filters out small objects (likely noise)
  - Increase if you have many small false positives
max_cell_area (default: 10000): Maximum cell area in pixels
  - Filters out very large objects (likely merged cells)
  - Decrease if cells are being merged
compactness (default: 0.01): Watershed compactness parameter
  - Lower values (0.001-0.01) allow irregular shapes
  - Higher values (0.01-0.1) prefer compact, round shapes
  - Adjust based on expected cell morphology
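The area filter controlled by min_cell_area and max_cell_area can be illustrated on an integer label mask (0 = background). This is a sketch, not OpenIMC's code: the function name `filter_by_area` is hypothetical, and treating both bounds as inclusive is an assumption.

```python
import numpy as np


def filter_by_area(labels, min_cell_area=100, max_cell_area=10000):
    """Zero out labelled objects whose pixel area falls outside the bounds."""
    labels = np.asarray(labels)
    areas = np.bincount(labels.ravel())  # areas[i] = pixel count of label i
    keep = (areas >= min_cell_area) & (areas <= max_cell_area)
    keep[0] = False  # label 0 is background and is never a cell
    return np.where(keep[labels], labels, 0)
```

Removed objects leave gaps in the label numbering; relabel afterwards if downstream steps expect consecutive IDs.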
Ilastik Parameters
ilp_file (required): Path to Ilastik project file (.ilp)
  - Must be a trained Ilastik project file
  - Train your model in the Ilastik GUI and save it as an .ilp file
  - The model should be trained on similar data for best results
output_format (default: "Simple Segmentation"): Output format from Ilastik
  - "Simple Segmentation": Segmentation masks (default)
  - "Probabilities": Probability maps for each class
Note
Ilastik must be installed separately and available in your PATH. See the Installation guide for details.
Using Segmentation in the GUI
1. Load your IMC data file (.mcd or OME-TIFF directory)
2. Navigate to Analysis → Segmentation or click the segmentation button in the toolbar
3. In the segmentation dialog:
  - Select the segmentation method (CellSAM, Cellpose, Watershed, or Ilastik)
  - Choose nuclear channels from the channel list (not required for Ilastik)
  - Optionally select cytoplasm channels (not required for Ilastik)
  - For Ilastik: Browse and select your trained .ilp project file
  - Adjust method-specific parameters
  - Configure preprocessing options (denoising, normalization)
4. Click Run Segmentation to start the process
5. Segmentation masks are automatically saved and can be visualized in the image viewer
6. Masks are stored per acquisition and can be used for subsequent feature extraction
Using Segmentation in the CLI
Basic Command
openimc segment input.mcd output/ --method cellpose \
    --nuclear-channels DNA1_Ir191,DNA2_Ir193 \
    --cytoplasm-channels CD3_1841,CD4_2293
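When several files need the same settings, the basic command can be assembled and run from Python. The sketch below only uses the `openimc segment` arguments shown on this page; the helper names and the one-output-directory-per-file layout are assumptions for illustration.

```python
import subprocess
from pathlib import Path


def build_segment_command(input_file, output_dir, method,
                          nuclear_channels, cytoplasm_channels=()):
    """Build the argument list for one `openimc segment` call."""
    cmd = [
        "openimc", "segment", str(input_file), str(output_dir),
        "--method", method,
        "--nuclear-channels", ",".join(nuclear_channels),
    ]
    if cytoplasm_channels:
        cmd += ["--cytoplasm-channels", ",".join(cytoplasm_channels)]
    return cmd


def segment_directory(data_dir, output_root, **settings):
    """Run the same segmentation on every .mcd file in data_dir."""
    for mcd in sorted(Path(data_dir).glob("*.mcd")):
        out = Path(output_root) / mcd.stem  # assumed layout: one folder per file
        subprocess.run(build_segment_command(mcd, out, **settings), check=True)
```

Passing the command as a list avoids shell quoting problems with channel names, and `check=True` stops the batch on the first failed file.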
CellSAM Example
openimc segment input.mcd output/ --method cellsam \
    --nuclear-channels DNA1_Ir191,DNA2_Ir193 \
    --cytoplasm-channels CD3_1841 \
    --bbox-threshold 0.3 \
    --use-wsi
Cellpose Example
openimc segment input.mcd output/ --method cellpose \
    --nuclear-channels DNA1_Ir191 \
    --cytoplasm-channels CD3_1841,CD4_2293 \
    --model cyto3 \
    --diameter 30 \
    --flow-threshold 0.4 \
    --gpu-id 0
Watershed Example
openimc segment input.mcd output/ --method watershed \
    --nuclear-channels DNA1_Ir191,DNA2_Ir193 \
    --cytoplasm-channels CD3_1841 \
    --min-cell-area 100 \
    --max-cell-area 10000 \
    --compactness 0.01
Ilastik Example
Note
Ilastik segmentation is primarily available through the GUI. For CLI usage, ensure Ilastik is installed and the ilastik command is available in your PATH.
In the GUI, select “Ilastik” as the segmentation method and browse to your trained .ilp project file.
Workflow YAML Example
segmentation:
  enabled: true
  method: "cellsam"
  nuclear_channels:
    - "DNA1_Ir191"
    - "DNA2_Ir193"
  cytoplasm_channels:
    - "CD3_1841"
  nuclear_fusion_method: "mean"
  cyto_fusion_method: "mean"
  normalization_method: "channelwise_minmax"  # Options: "None", "arcsinh", "channelwise_minmax", "percentile_clip"
  arcsinh_cofactor: 10.0
  bbox_threshold: 0.4
  use_wsi: false
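Before launching a workflow, the segmentation block can be checked against the constraints stated in the parameter reference. The checker below is a hypothetical sketch that works on the already-parsed `segmentation` mapping. The function name, the "ilastik" method string, and requiring cyto_weights to match the cytoplasm channel count are assumptions, and OpenIMC may validate differently.

```python
FUSION_METHODS = {"single", "mean", "weighted", "max", "pca1"}
NORMALIZATION_METHODS = {"None", "arcsinh", "channelwise_minmax", "percentile_clip"}


def check_segmentation_config(cfg):
    """Return a list of problems found in a parsed `segmentation` block."""
    problems = []
    method = cfg.get("method", "cellsam")
    nuclear = cfg.get("nuclear_channels") or []
    cyto = cfg.get("cytoplasm_channels") or []

    if method == "watershed" and not cyto:
        problems.append("watershed requires cytoplasm_channels")
    if method == "ilastik" and not cfg.get("ilp_file"):
        problems.append("ilastik requires ilp_file")
    if method != "ilastik" and not nuclear:
        problems.append("nuclear_channels is required")

    for prefix, channels in (("nuclear", nuclear), ("cyto", cyto)):
        fusion = cfg.get(f"{prefix}_fusion_method", "mean")
        if fusion not in FUSION_METHODS:
            problems.append(f"unknown {prefix}_fusion_method: {fusion}")
        weights = cfg.get(f"{prefix}_weights")
        if fusion == "weighted" and (weights is None or len(weights) != len(channels)):
            problems.append(f"{prefix}_weights must list one weight per channel")

    norm = cfg.get("normalization_method", "None")
    if norm not in NORMALIZATION_METHODS:
        problems.append(f"unknown normalization_method: {norm}")
    return problems
```

An empty list means none of these particular checks failed; it does not guarantee that the channel names exist in the data file.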
Method Details
CellSAM
CellSAM uses DeepCell’s CellSAM model, which is a state-of-the-art deep learning model for cell segmentation. It leverages the Segment Anything Model (SAM) architecture adapted for cell segmentation tasks.
How it works:
1. Nuclear channels are combined using the specified fusion method
2. The combined nuclear image is sent to DeepCell’s API
3. CellSAM detects cell bounding boxes using the bbox_threshold
4. Within each bounding box, precise cell boundaries are segmented
5. Results are returned as segmentation masks
Advantages:
- High accuracy, especially for complex cell morphologies
- Handles variable cell sizes well
- No local GPU required (uses cloud API)
Limitations:
- Requires internet connection and API key
- Processing time depends on API availability
- May have usage limits depending on API plan
Citation:
- DeepCell CellSAM: DeepCell Platform
- Segment Anything Model: Kirillov, A., et al. (2023). “Segment Anything.” arXiv:2304.02643
Cellpose
Cellpose is a generalist algorithm for cell segmentation that uses a deep neural network trained on diverse cell types.
How it works:
1. Nuclear and/or cytoplasm channels are preprocessed and combined
2. The Cellpose model predicts a flow field and cell probability map
3. The flow field guides cell boundary detection
4. Thresholds (flow_threshold, cellprob_threshold) filter the results
5. Final segmentation masks are generated
Advantages:
- Works offline (no API required)
- Supports GPU acceleration for faster processing
- Generalizable across many cell types
- Can be fine-tuned for specific applications
Limitations:
- Requires local installation and potentially GPU setup
- May need parameter tuning for optimal results
Citation:
- Stringer, C., et al. (2021). “Cellpose: a generalist algorithm for cellular segmentation.” Nature Methods, 18(1), 100-106. DOI: 10.1038/s41592-020-01018-x
- Cellpose GitHub
Watershed
Watershed segmentation is a classical image processing technique that uses marker-controlled watershed transformation.
How it works:
1. Nuclear channels are combined to create a nuclear marker image
2. Cytoplasm channels are combined to create a membrane/cytoplasm image
3. Distance transform is applied to the nuclear markers
4. Watershed algorithm floods from markers, using membrane image as boundaries
5. Post-processing filters cells by area (min_cell_area, max_cell_area)
Advantages:
- Fast and computationally efficient
- No external dependencies or API keys required
- Deterministic results (reproducible)
- Good for well-separated cells with clear nuclei
Limitations:
- Less accurate for touching or overlapping cells
- Requires good nuclear and membrane channel contrast
- May need careful parameter tuning
Citation:
- Vincent, L., & Soille, P. (1991). “Watersheds in digital spaces: an efficient algorithm based on immersion simulations.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6), 583-598. DOI: 10.1109/34.87344
- Implementation based on scikit-image: scikit-image Watershed
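Steps 1 to 4 of the watershed description can be sketched with SciPy and scikit-image, taking already-fused nuclear and membrane images as input. This is one plausible reading of those steps, not OpenIMC's implementation: the function name `watershed_cells`, the fixed nuclear threshold, and seeding from distance-transform peaks with `min_distance` are assumptions.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed


def watershed_cells(nuclear, membrane, nuclear_threshold,
                    compactness=0.01, min_distance=3, mask=None):
    """Marker-controlled watershed seeded from nuclei, flooded over the membrane image."""
    # Binary nuclear marker image from the fused nuclear channel.
    nuclei = np.asarray(nuclear) > nuclear_threshold
    # Distance transform: high values lie deep inside each nucleus.
    distance = ndi.distance_transform_edt(nuclei)
    # One seed per local maximum of the distance map (assumed seeding rule).
    peaks = peak_local_max(distance, min_distance=min_distance)
    markers = np.zeros(nuclei.shape, dtype=np.int32)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # Flood from the seeds; bright membrane pixels act as ridges between cells.
    return watershed(np.asarray(membrane, dtype=float), markers,
                     mask=mask, compactness=compactness)
```

Without a `mask`, every pixel is assigned to some cell, so a foreground mask (or the area filter from step 5) is normally needed to keep cells from spreading into empty background.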
Ilastik
Ilastik is an interactive learning and segmentation toolkit that allows you to train custom segmentation models on your specific data.
How it works:
1. Train a segmentation model in Ilastik’s GUI using your IMC data
2. Save the trained model as a .ilp project file
3. In OpenIMC, select Ilastik as the segmentation method
4. Load your trained .ilp project file
5. OpenIMC runs inference using Ilastik’s headless mode
6. Results are returned as segmentation masks
Advantages:
- Train custom models tailored to your specific data
- Interactive training allows fine-tuning on difficult cases
- Can handle complex segmentation tasks
- Works well for specialized cell types or tissue structures
Limitations:
- Requires separate Ilastik installation
- Requires training a model before use (time investment)
- Model quality depends on training data quality
- Processing can be slower than other methods
Citation:
- Berg, S., et al. (2019). “ilastik: interactive machine learning for (bio)image analysis.” Nature Methods, 16(12), 1226-1232. DOI: 10.1038/s41592-019-0582-9
- Ilastik Website
- Ilastik Documentation
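For orientation, a headless Ilastik run of the kind mentioned in step 5 is typically started with a command like the one built below. The flags are Ilastik's documented headless options, but how OpenIMC actually invokes Ilastik is not shown on this page, so the helper name and the exact argument set are assumptions.

```python
def build_ilastik_command(ilp_file, input_image, output_pattern,
                          export_source="Simple Segmentation"):
    """Build an argument list for running a trained .ilp project headlessly."""
    return [
        "ilastik",                       # assumes the executable is on PATH
        "--headless",
        f"--project={ilp_file}",
        f"--export_source={export_source}",
        "--output_format=tiff",
        f"--output_filename_format={output_pattern}",
        str(input_image),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. The `export_source` values correspond to the two output_format options listed under Ilastik Parameters.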
Tips and Best Practices
Channel Selection: Choose nuclear channels with strong, consistent staining. For cytoplasm channels, select markers that outline cell boundaries well.
Preprocessing:
- Use channelwise_minmax normalization when channels have different intensity ranges (recommended for most cases)
- Apply arcsinh normalization if your data has a wide dynamic range
- Use denoising for noisy channels
Parameter Tuning: Start with default parameters and adjust based on results:
- If cells are over-segmented (too many small pieces), increase thresholds or min_cell_area
- If cells are under-segmented (merged together), decrease thresholds or adjust fusion methods
Method Selection:
- Use CellSAM for best accuracy and when API access is available
- Use Cellpose for quicker processing with good accuracy
- Use Watershed for fast processing of well-separated cells
- Use Ilastik when you need custom segmentation models tailored to your specific data or cell types
Validation: Always visually inspect segmentation results before proceeding to feature extraction. Adjust parameters as needed.
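Visual inspection can be complemented by a few summary numbers per mask, which make over-segmentation (many tiny objects) or under-segmentation (a few very large ones) easier to spot across acquisitions. The helper below is a hypothetical sketch that assumes the mask is an integer label image with 0 as background.

```python
import numpy as np


def summarize_mask(labels):
    """Return cell count, area statistics, and foreground coverage for a label mask."""
    labels = np.asarray(labels)
    areas = np.bincount(labels.ravel())[1:]  # drop background (label 0)
    areas = areas[areas > 0]                 # ignore unused label IDs
    if areas.size == 0:
        return {"n_cells": 0, "median_area": 0.0, "min_area": 0,
                "max_area": 0, "coverage": 0.0}
    return {
        "n_cells": int(areas.size),
        "median_area": float(np.median(areas)),
        "min_area": int(areas.min()),
        "max_area": int(areas.max()),
        "coverage": float((labels > 0).mean()),  # fraction of pixels inside cells
    }
```

Comparing these values between parameter settings gives a quick first check before opening the masks in the image viewer.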