Custom Algorithms¶

OpenIMC provides base classes that allow developers to easily integrate novel segmentation, clustering, and feature extraction algorithms into the framework. These base classes define clear interfaces with standardized input/output formats, making it straightforward to add new methods while maintaining compatibility with the existing OpenIMC pipeline.

Overview¶

The base classes are located in openimc.processing.base and provide:

Clear interface definitions: Standardized input/output formats
Input validation: Automatic validation of inputs before processing
Output validation: Automatic validation of outputs after processing
Documentation: Comprehensive docstrings explaining expected formats
Error handling: Consistent error messages and exception types
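
The points above can be pictured as a template pattern: the base class owns the checks, and a subclass only supplies the algorithm. The class below is a hypothetical stand-in written for illustration. It is not the source of openimc.processing.base, and the real checks and error messages may differ; only the method names validate_inputs, validate_output, and segment are taken from this page.

```python
from abc import ABC, abstractmethod

import numpy as np


class SketchSegmenterBase(ABC):
    """Hypothetical stand-in, NOT the real BaseSegmenter."""

    def __init__(self, name):
        self.name = name

    def validate_inputs(self, nuclear_image, cyto_image=None):
        # Reject anything that is not a single 2D image
        if not isinstance(nuclear_image, np.ndarray) or nuclear_image.ndim != 2:
            raise ValueError(f"{self.name}: nuclear_image must be a 2D np.ndarray")
        if cyto_image is not None and cyto_image.shape != nuclear_image.shape:
            raise ValueError(f"{self.name}: cyto_image shape must match nuclear_image")

    def validate_output(self, mask, expected_shape):
        if mask.shape != tuple(expected_shape):
            raise ValueError(f"{self.name}: mask shape {mask.shape} != {tuple(expected_shape)}")
        if mask.dtype != np.uint32:
            raise ValueError(f"{self.name}: mask dtype must be uint32, got {mask.dtype}")

    @abstractmethod
    def segment(self, nuclear_image, cyto_image=None, **kwargs):
        """Return a uint32 label mask with the same shape as nuclear_image."""


class AllBackgroundSegmenter(SketchSegmenterBase):
    """Trivial subclass: labels every pixel as background (0)."""

    def segment(self, nuclear_image, cyto_image=None, **kwargs):
        self.validate_inputs(nuclear_image, cyto_image)
        mask = np.zeros(nuclear_image.shape, dtype=np.uint32)
        self.validate_output(mask, nuclear_image.shape)
        return mask
```

Because both checks raise ValueError with the segmenter name in the message, a caller can handle failures from any subclass the same way.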

Base Classes¶

BaseSegmenter¶

Abstract base class for segmentation algorithms.

Location: openimc.processing.base.BaseSegmenter

Expected Inputs:

nuclear_image: np.ndarray, shape (H, W), dtype float32 - Preprocessed nuclear channel image (0-1 normalized)
cyto_image: np.ndarray, shape (H, W), dtype float32, optional - Preprocessed cytoplasm channel image (0-1 normalized)
**kwargs: Additional algorithm-specific parameters
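
When exercising a segmenter outside the pipeline, the inputs still need to be float32 in the 0-1 range. The helper below is a minimal sketch using plain min-max scaling; to_unit_float32 is a hypothetical name, and OpenIMC's own preprocessing may normalize differently.

```python
import numpy as np


def to_unit_float32(image):
    """Min-max scale an image to [0, 1] as float32 (illustrative helper)."""
    img = np.asarray(image, dtype=np.float32)
    lo, hi = float(img.min()), float(img.max())
    if hi == lo:
        # Constant image: avoid division by zero and return all zeros
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


raw = np.array([[0, 50], [100, 200]], dtype=np.uint16)
nuclear_image = to_unit_float32(raw)
```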

Expected Output:

mask: np.ndarray, shape (H, W), dtype uint32 - Segmentation mask where each cell has a unique integer label
- 0 = background, 1+ = cell labels

Example Implementation:

from openimc.processing.base import BaseSegmenter
import numpy as np
from scipy import ndimage

class MyCustomSegmenter(BaseSegmenter):
    def __init__(self):
        super().__init__(name="my_custom_segmenter")

    def segment(self, nuclear_image, cyto_image=None, **kwargs):
        # Validate inputs (optional, but recommended)
        self.validate_inputs(nuclear_image, cyto_image)

        # Your segmentation algorithm here
        # ...

        # Create mask (example: threshold, then label connected components
        # so that each foreground region gets its own integer label)
        threshold = kwargs.get('threshold', 0.5)
        labeled, _ = ndimage.label(nuclear_image > threshold)
        mask = labeled.astype(np.uint32)

        # Validate output (optional, but recommended)
        self.validate_output(mask, nuclear_image.shape)

        return mask

BaseClusterer¶

Abstract base class for clustering algorithms.

Location: openimc.processing.base.BaseClusterer

Expected Inputs:

features_df: pd.DataFrame - Feature matrix with one row per cell and one column per feature
- Required columns: none (all numeric columns are used, apart from the excluded columns below)
- Excluded columns: 'cell_id', 'acquisition_id', 'acquisition_name', 'well', 'cluster', 'label', 'source_file', etc.
columns: List[str], optional - Specific feature columns to use for clustering
- If None, all numeric columns are auto-detected
**kwargs: Additional algorithm-specific parameters
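
The column selection described above can be sketched with pandas alone. This is an illustration under assumptions, not the body of BaseClusterer.validate_inputs: pick_feature_columns is a hypothetical name, and the exclusion set contains only the names listed on this page (the "etc." is left open).

```python
import pandas as pd

# Metadata columns named in this document; the real list may be longer
EXCLUDED = {
    'cell_id', 'acquisition_id', 'acquisition_name',
    'well', 'cluster', 'label', 'source_file',
}


def pick_feature_columns(features_df, columns=None):
    """Return the feature columns to cluster on (illustrative helper)."""
    if columns is not None:
        missing = [c for c in columns if c not in features_df.columns]
        if missing:
            raise ValueError(f"Columns not found in features_df: {missing}")
        return list(columns)
    # Auto-detect: every numeric column that is not known metadata
    numeric = features_df.select_dtypes(include="number").columns
    return [c for c in numeric if c not in EXCLUDED]


df = pd.DataFrame({
    'cell_id': [1, 2],
    'CD3': [0.1, 0.2],
    'area': [10, 12],
    'source_file': ['a.mcd', 'b.mcd'],
})
selected = pick_feature_columns(df)
```

Here cell_id is numeric but excluded as metadata, and source_file is dropped because it is not numeric.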

Expected Output:

features_df: pd.DataFrame - Same DataFrame as input with 'cluster' column added
- 'cluster' column: int, 1-based cluster labels (0 = unassigned/noise)
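
Many clustering libraries return 0-based labels and mark noise with -1, so the raw output usually needs remapping before it satisfies this format. The sketch below shows one way to do that; to_one_based is a hypothetical helper, and renumbering the clusters consecutively is an added assumption rather than a documented requirement.

```python
import numpy as np


def to_one_based(raw_labels, noise_value=-1):
    """Map raw labels to 1..k, with noise mapped to 0 (illustrative helper)."""
    raw = np.asarray(raw_labels)
    out = np.zeros(raw.shape, dtype=int)
    assigned = raw != noise_value
    # return_inverse renumbers the assigned labels as 0..k-1 in sorted order
    _, inverse = np.unique(raw[assigned], return_inverse=True)
    out[assigned] = inverse + 1
    return out


cluster_labels = to_one_based([-1, 0, 0, 3, -1, 7])
```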

Example Implementation:

from openimc.processing.base import BaseClusterer
import pandas as pd
from sklearn.cluster import KMeans

class MyCustomClusterer(BaseClusterer):
    def __init__(self):
        super().__init__(name="my_custom_clusterer")

    def cluster(self, features_df, columns=None, **kwargs):
        # Validate and prepare inputs
        data, column_names = self.validate_inputs(features_df, columns)
        original_shape = features_df.shape

        # Your clustering algorithm here
        n_clusters = kwargs.get('n_clusters', 5)
        seed = kwargs.get('seed', 42)  # fixed by default for reproducible results
        kmeans = KMeans(n_clusters=n_clusters, random_state=seed)
        cluster_labels = kmeans.fit_predict(data.values)

        # Convert to 1-based labels
        cluster_labels = (cluster_labels + 1).astype(int)

        # Add cluster column
        result_df = features_df.copy()
        result_df['cluster'] = cluster_labels

        # Validate output
        self.validate_output(result_df, original_shape)

        return result_df

BaseFeatureExtractor¶

Abstract base class for feature extraction algorithms.

Location: openimc.processing.base.BaseFeatureExtractor

Expected Inputs:

mask: np.ndarray, shape (H, W), dtype uint32 - Segmentation mask with cell labels (0 = background, 1+ = cells)
image_stack: np.ndarray, shape (H, W, C), dtype float32 - Image stack with C channels
channel_names: List[str], length C - Names of each channel in image_stack
**kwargs: Additional algorithm-specific parameters

Expected Output:

features_df: pd.DataFrame - Feature matrix with one row per cell - Required columns:
- 'cell_id': int, unique identifier for each cell (1-based)
- 'label': int, cell label from mask (1-based)
- Additional feature columns (algorithm-specific)

Example Implementation:

from openimc.processing.base import BaseFeatureExtractor
import numpy as np
import pandas as pd

class MyCustomFeatureExtractor(BaseFeatureExtractor):
    def __init__(self):
        super().__init__(name="my_custom_extractor")

    def extract(self, mask, image_stack, channel_names, **kwargs):
        # Validate inputs
        self.validate_inputs(mask, image_stack, channel_names)

        # Get unique cell labels (exclude background = 0)
        unique_labels = np.unique(mask)
        unique_labels = unique_labels[unique_labels > 0]

        features_list = []
        for idx, label in enumerate(unique_labels):
            cell_id = idx + 1  # 1-based
            features = {'cell_id': cell_id, 'label': int(label)}

            # Create binary mask for this cell
            cell_mask = (mask == label)

            # Extract your custom features here
            # ...

            features_list.append(features)

        # Create DataFrame
        features_df = pd.DataFrame(features_list)

        # Validate output
        expected_n_cells = len(unique_labels)
        self.validate_output(features_df, expected_n_cells)

        return features_df
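
The placeholder inside the loop above could be filled in many ways. As a minimal sketch, the standalone function below computes each cell's pixel area and its mean intensity per channel. The column names 'area' and '<channel>_mean' are illustrative assumptions, not names that OpenIMC requires.

```python
import numpy as np
import pandas as pd


def basic_features(mask, image_stack, channel_names):
    """Per-cell area and mean channel intensity (illustrative sketch)."""
    labels = np.unique(mask)
    labels = labels[labels > 0]  # drop background

    rows = []
    for idx, label in enumerate(labels):
        cell_mask = mask == label
        row = {'cell_id': idx + 1, 'label': int(label),
               'area': int(cell_mask.sum())}
        # Boolean indexing on (H, W, C) yields an (n_pixels, C) array
        pixels = image_stack[cell_mask]
        for c, name in enumerate(channel_names):
            row[f'{name}_mean'] = float(pixels[:, c].mean())
        rows.append(row)

    # Passing columns explicitly keeps the schema stable for empty masks
    columns = ['cell_id', 'label', 'area'] + [f'{n}_mean' for n in channel_names]
    return pd.DataFrame(rows, columns=columns)


mask = np.array([[0, 1, 1], [2, 2, 0]], dtype=np.uint32)
stack = np.zeros((2, 3, 1), dtype=np.float32)
stack[..., 0] = [[0, 1, 3], [2, 4, 9]]
features = basic_features(mask, stack, ['DNA'])
```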

Integration with OpenIMC¶

Once you’ve implemented a custom algorithm, you can integrate it into OpenIMC by modifying the core functions to support your new algorithm. Here’s how:

Segmentation Integration¶

Modify openimc.core.segment() to add your segmenter:

def segment(loader, acquisition, method, ...):
    # ... existing code ...

    elif method == 'my_custom_segmenter':
        from my_module import MyCustomSegmenter

        # Preprocess channels (same as other methods)
        nuclear_img, cyto_img = _preprocess_channels_for_segmentation(...)

        # Create segmenter instance
        segmenter = MyCustomSegmenter()

        # Run segmentation
        mask = segmenter.segment(
            nuclear_img,
            cyto_image=cyto_img,
            threshold=0.5,  # Your custom parameters
            min_cell_area=50
        )

    # ... rest of code ...
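
The call above passes min_cell_area=50, which the threshold example earlier on this page does not consume. One way a segmenter might honor such a parameter is sketched below, assuming it means a minimum pixel count per cell. drop_small_cells is a hypothetical helper, and relabeling the surviving cells consecutively is a design choice made here, not documented OpenIMC behavior.

```python
import numpy as np


def drop_small_cells(mask, min_cell_area):
    """Remove labels smaller than min_cell_area pixels and relabel 1..k."""
    index = mask.astype(np.intp)
    counts = np.bincount(index.ravel(), minlength=1)
    keep = counts >= min_cell_area
    keep[0] = False  # background never becomes a cell

    # Lookup table: old label -> new consecutive label (0 if removed)
    lut = np.zeros(counts.size, dtype=np.uint32)
    lut[keep] = np.arange(1, keep.sum() + 1, dtype=np.uint32)
    return lut[index]


mask = np.array([[1, 1, 1],
                 [0, 2, 0],
                 [3, 3, 0]], dtype=np.uint32)
filtered = drop_small_cells(mask, min_cell_area=2)
```

The lookup table avoids a Python loop over cells and keeps the output dtype at uint32.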

Clustering Integration¶

Modify openimc.core.cluster() to add your clusterer:

def cluster(features_df, method='leiden', ...):
    # ... existing code ...

    elif method == 'my_custom_clusterer':
        from my_module import MyCustomClusterer

        # Create clusterer instance
        clusterer = MyCustomClusterer()

        # Run clustering
        result_df = clusterer.cluster(
            features_df,
            columns=columns,
            n_clusters=5,  # Your custom parameters
            seed=42
        )

        return result_df

    # ... rest of code ...

Feature Extraction Integration¶

Modify openimc.processing.feature_worker.extract_features_for_acquisition() to add your extractor:

def extract_features_for_acquisition(..., feature_extractor=None):
    # ... existing code ...

    if feature_extractor is not None:
        # Use custom extractor
        from my_module import MyCustomFeatureExtractor
        extractor = MyCustomFeatureExtractor()
        features_df = extractor.extract(
            mask,
            img_stack,
            channel_names,
            morphological=True,
            intensity=True
        )
    else:
        # Use default extractor
        # ... existing code ...

Example Implementations¶

Complete example implementations are available in openimc.processing.examples:

ExampleThresholdSegmenter: Simple thresholding-based segmentation
ExampleKMeansClusterer: K-means clustering implementation
ExampleBasicFeatureExtractor: Basic morphological and intensity features

These examples demonstrate:

Proper input validation
Correct output format
Error handling
Integration patterns

Best Practices¶

Always validate inputs: Use the validate_inputs() method before processing
Always validate outputs: Use the validate_output() method after processing
Handle edge cases: Empty masks, no cells, missing channels, etc.
Document parameters: Clearly document all **kwargs parameters
Preserve data types: Ensure outputs match expected dtypes (uint32, float32, etc.)
Use 1-based indexing: Cell IDs and labels should start at 1 (0 = background/unassigned)
Handle memory efficiently: For large datasets, process in chunks if needed
Provide informative errors: Raise ValueError with clear messages for invalid inputs
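
On the memory point: building one boolean mask per cell costs a full H x W array for every label. A sketch of a lighter alternative is shown below, using np.bincount to accumulate per-label sums one channel at a time. mean_intensity_per_label is a hypothetical helper, not part of OpenIMC.

```python
import numpy as np


def mean_intensity_per_label(mask, image_stack):
    """Return (labels, means) where means has shape (n_cells, C)."""
    flat = mask.ravel().astype(np.intp)
    counts = np.bincount(flat, minlength=1)  # pixels per label
    present = counts > 0

    n_channels = image_stack.shape[2]
    means = np.zeros((counts.size, n_channels), dtype=np.float64)
    for c in range(n_channels):
        # Sum of channel c over each label, computed in a single pass
        sums = np.bincount(flat, weights=image_stack[..., c].ravel(),
                           minlength=counts.size)
        means[present, c] = sums[present] / counts[present]

    labels = np.flatnonzero(present)
    labels = labels[labels > 0]  # drop background
    return labels, means[labels]


mask = np.array([[0, 1, 1], [3, 3, 0]], dtype=np.uint32)
stack = np.zeros((2, 3, 1), dtype=np.float32)
stack[..., 0] = [[0, 1, 3], [2, 4, 9]]
labels, means = mean_intensity_per_label(mask, stack)
```

Only one channel is flattened at a time, so peak extra memory stays near one H x W array regardless of the number of cells.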

Common Pitfalls¶

Wrong dtype: Segmentation masks must be uint32, not uint8 or int32
Wrong indexing: Cell IDs and labels must be 1-based (1, 2, 3, …), not 0-based
Missing required columns: Feature DataFrames must have 'cell_id' and 'label' columns
Shape mismatches: Output shapes must match input shapes
NaN values: Cluster labels cannot contain NaN (use 0 for unassigned cells)
Background handling: Background pixels should be labeled as 0 in masks
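
Several of these pitfalls can be caught with a few lines of checking before an algorithm is wired into the pipeline. The helpers below are an illustrative sketch with hypothetical names. They are not the base classes' validate_output methods, which may check more or report differently.

```python
import numpy as np


def mask_problems(mask, expected_shape):
    """Return a list of contract violations for a segmentation mask."""
    problems = []
    if mask.dtype != np.uint32:
        problems.append(f"dtype is {mask.dtype}, expected uint32")
    if mask.shape != tuple(expected_shape):
        problems.append(f"shape is {mask.shape}, expected {tuple(expected_shape)}")
    return problems


def cluster_label_problems(labels):
    """Return a list of contract violations for cluster labels."""
    problems = []
    values = np.asarray(labels, dtype=float)
    if np.isnan(values).any():
        problems.append("contains NaN (use 0 for unassigned cells)")
    elif (values < 0).any():
        problems.append("contains negative labels (0 = unassigned, clusters start at 1)")
    return problems
```

Returning a list instead of raising on the first failure lets a test report every violation at once.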

Testing Your Implementation¶

Before integrating your algorithm, test it with the base class validation:

import numpy as np
from my_module import MyCustomSegmenter

# Create test data
nuclear_img = np.random.rand(100, 100).astype(np.float32)

# Test segmenter
segmenter = MyCustomSegmenter()
mask = segmenter.segment(nuclear_img, threshold=0.5)

# Validation is automatic, but you can also check:
assert mask.dtype == np.uint32
assert mask.shape == nuclear_img.shape
assert mask.min() >= 0

For more complex testing, use the OpenIMC test fixtures and integration tests.
