Simple Spatial Analysis

Simple Spatial Analysis provides fundamental spatial analysis tools for exploring cell spatial relationships, including spatial graph construction, pairwise enrichment, distance distributions, spatial visualization, and spatial community detection.

Overview

Spatial analysis examines how cells are organized in tissue space, identifying spatial patterns, cell-cell interactions, and tissue architecture. Simple Spatial Analysis includes core spatial analysis methods that work without additional dependencies like squidpy.

Options

Simple Spatial Analysis includes:

  1. Spatial Graph Construction: Build spatial networks connecting neighboring cells

  2. Pairwise Enrichment: Test for spatial co-occurrence or avoidance between cell types

  3. Distance Distributions: Analyze nearest-neighbor distances between cell types

  4. Spatial Visualization: Visualize cell spatial organization

  5. Spatial Communities: Identify spatially coherent cell communities

Parameters

Spatial Graph Construction

  • method (default: "kNN"): Graph construction method - "kNN": k-nearest neighbors graph - "Radius": Connect all cells within a specified radius - "Delaunay": Delaunay triangulation (connects cells in triangular mesh)

  • k_neighbors (default: 10): Number of nearest neighbors for kNN method - More neighbors (15-30) create denser graphs - Fewer neighbors (5-10) create sparser graphs - Typical range: 5-30

  • radius (required for Radius method): Maximum distance for edges in pixels - Only used when method is “Radius” - Larger radius (50-100) connects more distant cells - Smaller radius (20-50) connects only nearby cells - Should be adjusted based on cell density

  • pixel_size_um (default: 1.0): Pixel size in micrometers - Used to convert pixel distances to physical distances - Important for distance-based analyses - Should match your image acquisition settings

  • seed (default: 42): Random seed for reproducibility - Used for permutation tests and community detection

Pairwise Enrichment

  • n_permutations (default: 100): Number of permutations for statistical testing - More permutations (500-1000) give finer p-value resolution, since the smallest attainable p-value is roughly 1/n_permutations - Fewer permutations (100-200) are faster but less precise - Typical range: 100-1000

  • workers (default: auto): Number of parallel workers for permutation tests - More workers speed up computation - Default: number of CPU cores minus 2

Distance Distributions

  • workers (default: auto): Number of parallel workers for distance computation - More workers speed up computation for large datasets

Spatial Communities

  • min_cells (default: 5): Minimum number of cells in a community - Filters out very small communities - Increase to focus on larger spatial structures

Using Simple Spatial Analysis in the GUI

  1. Ensure clustering has been completed (cells need cluster assignments)

  2. Navigate to Analysis → Spatial Analysis → Simple Spatial Analysis

  3. In the spatial analysis dialog:

    • Build Spatial Graph: - Select graph construction method (kNN, Radius, or Delaunay) - Set k_neighbors (for kNN) or radius (for Radius) - Set pixel size if known - Click “Build Graph”

    • Pairwise Enrichment Tab: - Set number of permutations - Set number of workers - Click “Run Enrichment Analysis” - Results show z-scores and p-values for each cluster pair

    • Distance Distributions Tab: - Click “Run Distance Analysis” - Select clusters to display in the plot - Results show nearest-neighbor distance distributions

    • Spatial Visualization Tab: - Select ROI to visualize - Choose color encoding (cluster or feature) - Optionally show edges - Click “Generate Spatial Plot”

    • Spatial Communities Tab: - Select ROI - Set minimum cells per community - Optionally exclude specific cell types - Click “Run Community Analysis”

  4. Export results using the “Export Results” or “Export Graph” buttons

Using Simple Spatial Analysis in the CLI

Basic Command

openimc spatial features.csv spatial_edges.csv \
    --method kNN \
    --k-neighbors 10 \
    --pixel-size-um 1.0

With Radius Method

openimc spatial features.csv spatial_edges.csv \
    --method Radius \
    --radius 50.0 \
    --pixel-size-um 1.0

With Community Detection

openimc spatial features.csv spatial_edges.csv \
    --method kNN \
    --k-neighbors 10 \
    --detect-communities \
    --seed 42

Workflow YAML Example

spatial_analysis:
  enabled: true
  method: "kNN"
  k_neighbors: 10
  radius: null  # Not used for kNN
  pixel_size_um: 1.0
  detect_communities: false
  seed: 42

Method Details

Spatial Graph Construction

Spatial graphs represent cell neighborhoods by connecting cells that are spatially close.

k-Nearest Neighbors (kNN): - Connects each cell to its k nearest neighbors - Creates a directed graph (can be made undirected) - Good for uniform cell densities - Fast computation using KD-tree

Radius-based: - Connects all cells within a specified radius - Creates an undirected graph - Good for variable cell densities - More edges than kNN for dense regions

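Delaunay Triangulation: - Connects cells in a triangular mesh - Ensures all cells are connected to neighbors - Good for exploring local neighborhoods - Needs no k or radius parameter; each cell gets about 6 neighbors on average in 2D, and edges can span large gaps in sparse regions or along the tissue boundary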

Citation: - Implementation based on scipy.spatial: scipy.spatial.cKDTree and scipy.spatial.Delaunay
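The three construction methods above can be sketched with scipy.spatial. This is a minimal illustration, not the tool's actual implementation: the function names (knn_edges, radius_edges, delaunay_edges) and the edge-array layout are assumptions made for the example, and the kNN sketch assumes more than k cells with no duplicate coordinates.

```python
import numpy as np
from scipy.spatial import cKDTree, Delaunay


def knn_edges(coords, k):
    """Directed kNN edges (i -> j) for an (n, 2) coordinate array."""
    tree = cKDTree(coords)
    # Ask for k + 1 neighbors because each point's closest hit is itself.
    _, idx = tree.query(coords, k=k + 1)
    src = np.repeat(np.arange(len(coords)), k)
    dst = idx[:, 1:].ravel()
    return np.column_stack([src, dst])


def radius_edges(coords, radius):
    """Undirected edges (i < j) between all cell pairs within `radius`."""
    tree = cKDTree(coords)
    return tree.query_pairs(r=radius, output_type="ndarray")


def delaunay_edges(coords):
    """Undirected edges (i < j) taken from the sides of Delaunay triangles."""
    simplices = Delaunay(coords).simplices  # shape (n_triangles, 3)
    sides = np.vstack([simplices[:, [0, 1]],
                       simplices[:, [1, 2]],
                       simplices[:, [0, 2]]])
    # Sort each pair and drop duplicates shared by adjacent triangles.
    return np.unique(np.sort(sides, axis=1), axis=0)


# Hypothetical centroids in pixels, for illustration only.
rng = np.random.default_rng(42)
coords = rng.uniform(0, 500, size=(200, 2))
print(len(knn_edges(coords, 10)), "kNN edges")
print(len(radius_edges(coords, 50.0)), "radius edges")
print(len(delaunay_edges(coords)), "Delaunay edges")
```

Note that the kNN sketch returns directed edges, so a pair can appear once or twice depending on whether the relation is mutual; the other two functions list each undirected edge once.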

Pairwise Enrichment

Pairwise enrichment tests whether two cell types co-occur or avoid each other more than expected by chance.

How it works:

  1. Observed Co-occurrence: Count edges between cell type A and cell type B in the spatial graph

  2. Expected Co-occurrence: Compute expected number of edges under random spatial distribution - Based on proportions of each cell type

  3. Permutation Test: Randomly shuffle cell type labels while preserving graph structure - Repeat n_permutations times - Compute z-score: (observed - mean(permuted)) / std(permuted)

  4. P-value: Proportion of permutations whose A-B edge count is as extreme as or more extreme than the observed count

Interpretation: - Positive z-score + significant p-value: Enrichment (co-occurrence) - Negative z-score + significant p-value: Depletion (avoidance) - Non-significant: Random spatial distribution

Citation: - Based on standard spatial co-occurrence analysis methods used in spatial transcriptomics and imaging mass cytometry - Similar to methods in: Schapiro, D., et al. (2017). “histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data.” Nature Methods, 14(9), 873-876. DOI: 10.1038/nmeth.4391
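The permutation procedure described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the tool's code: the function names are hypothetical, edges are assumed to be an (n_edges, 2) integer array listing each undirected edge once, and the two-sided p-value with a +1 correction is one common convention chosen here for the example.

```python
import numpy as np


def count_pair_edges(edges, labels, a, b):
    """Count edges that join a cell labeled `a` to a cell labeled `b`."""
    la, lb = labels[edges[:, 0]], labels[edges[:, 1]]
    hits = ((la == a) & (lb == b)) | ((la == b) & (lb == a))
    return int(hits.sum())


def pairwise_enrichment(edges, labels, a, b, n_permutations=100, seed=42):
    """Return (z_score, p_value) for the a-b pair under label shuffling."""
    edges = np.asarray(edges)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)

    observed = count_pair_edges(edges, labels, a, b)
    permuted = np.empty(n_permutations)
    for i in range(n_permutations):
        # Shuffle labels across cells; the graph itself stays fixed.
        shuffled = rng.permutation(labels)
        permuted[i] = count_pair_edges(edges, shuffled, a, b)

    mean, std = permuted.mean(), permuted.std()
    z = (observed - mean) / std if std > 0 else float("nan")
    # Two-sided empirical p-value; the +1 terms keep it above zero (assumed convention).
    extreme = np.abs(permuted - mean) >= abs(observed - mean)
    p = (extreme.sum() + 1) / (n_permutations + 1)
    return z, p


# Toy example: a chain of 20 cells, first half type "A", second half type "B".
edges = np.column_stack([np.arange(19), np.arange(1, 20)])
labels = np.array(["A"] * 10 + ["B"] * 10)
z, p = pairwise_enrichment(edges, labels, "A", "B", n_permutations=200)
print(f"z = {z:.2f}, p = {p:.4f}")  # negative z: A and B avoid each other
```

In the toy chain only one edge joins an A cell to a B cell, far fewer than shuffled labels produce, so the z-score is negative (depletion).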

Distance Distributions

Distance distribution analysis computes the distribution of nearest-neighbor distances between cell types.

How it works:

  1. For each cell: Find nearest neighbor of each cell type (including same type)

  2. Distance Calculation: Compute Euclidean distance to nearest neighbor - Converted to micrometers using pixel_size_um

  3. Distribution Analysis: Aggregate distances across all cells - Compare distances between different cell type pairs - Visualize as violin/box plots

Interpretation: - Shorter distances: Cell types are spatially close - Longer distances: Cell types are spatially separated - Compare distributions to identify spatial relationships
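As a rough sketch of the steps above, nearest-neighbor distances from one cell type to another can be computed with a KD-tree and scaled by pixel_size_um. The function name and argument layout are assumptions for illustration; only scipy.spatial.cKDTree and the pixel_size_um parameter come from this document.

```python
import numpy as np
from scipy.spatial import cKDTree


def nearest_neighbor_distances(coords, labels, source, target, pixel_size_um=1.0):
    """Distance (in micrometers) from each `source` cell to its nearest `target` cell."""
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)
    src = coords[labels == source]
    tree = cKDTree(coords[labels == target])

    if source == target:
        # Within one type the closest hit is the cell itself, so take the second.
        # If the type has a single cell, that second distance is inf.
        dist, _ = tree.query(src, k=2)
        dist = dist[:, 1]
    else:
        dist, _ = tree.query(src, k=1)
    return dist * pixel_size_um


# Hypothetical centroids in pixels with a pixel size of 0.5 micrometers.
coords = [[0, 0], [3, 4], [10, 0]]
labels = ["A", "B", "A"]
print(nearest_neighbor_distances(coords, labels, "A", "B", pixel_size_um=0.5))
print(nearest_neighbor_distances(coords, labels, "A", "A", pixel_size_um=0.5))
```

The returned arrays are the per-cell values that a violin or box plot would then summarize for each source and target pair.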

Spatial Visualization

Spatial visualization displays cells in their spatial coordinates, colored by cluster or feature values.

Features: - Color cells by cluster assignment or feature expression - Optionally display spatial graph edges - Adjustable point sizes - Per-ROI visualization

Use cases: - Visual inspection of spatial organization - Identifying spatial patterns - Validating clustering results - Exploring feature spatial distributions

Spatial Communities

Spatial community detection identifies spatially coherent groups of cells using graph-based clustering.

How it works:

  1. Graph Construction: Build spatial graph (kNN, Radius, or Delaunay)

  2. Community Detection: Apply Leiden algorithm to spatial graph - Identifies communities based on graph structure - Communities are spatially coherent groups

  3. Filtering: Remove communities smaller than min_cells

Interpretation: - Communities represent spatially organized cell groups - May correspond to tissue structures or functional units - Can be used to identify spatial niches

Citation: - Leiden algorithm: Traag, V. A., et al. (2019). “From Louvain to Leiden: guaranteeing well-connected communities.” Scientific Reports, 9(1), 5233. DOI: 10.1038/s41598-019-41695-z - Implementation: leidenalg Python Package
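The filtering step (step 3) is independent of the detection algorithm and can be sketched on its own. The example assumes a per-cell community membership vector is already available, for instance the membership list of a leidenalg partition; the function name and the use of -1 for unassigned cells are assumptions made for illustration.

```python
import numpy as np


def filter_small_communities(membership, min_cells=5):
    """Mark cells in communities with fewer than `min_cells` members as -1."""
    membership = np.asarray(membership)
    ids, counts = np.unique(membership, return_counts=True)
    keep = ids[counts >= min_cells]
    # Cells in retained communities keep their ID; all others become -1.
    return np.where(np.isin(membership, keep), membership, -1)


# Hypothetical membership: community 0 has 5 cells, 1 has 2 cells, 2 has 1 cell.
membership = [0, 0, 0, 0, 0, 1, 1, 2]
print(filter_small_communities(membership, min_cells=5))
```

Raising min_cells removes progressively larger communities, which is why it acts as a control on the spatial scale of the structures that remain.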

Tips and Best Practices

  1. Graph Construction Method: - Use kNN for most cases (fast, good default) - Use Radius when cell density varies significantly - Use Delaunay for detailed local neighborhood analysis

  2. Parameter Selection: - k_neighbors: Start with 10, adjust based on cell density - radius: Should be 1-2 cell diameters - pixel_size_um: Critical for distance-based analyses, verify from metadata

  3. Pairwise Enrichment: - Use at least 100 permutations for reliable results - Increase to 500-1000 for publication-quality p-values - Interpret z-scores in context of p-values

  4. Distance Distributions: - Compare distances between different cell type pairs - Look for systematic differences indicating spatial relationships - Consider biological context when interpreting results

  5. Spatial Visualization: - Always visually inspect spatial organization - Use different color encodings to explore different aspects - Compare across ROIs to identify consistent patterns

  6. Validation: - Verify that spatial patterns are biologically meaningful - Check that graph construction parameters are appropriate - Ensure pixel size is correct for distance measurements

  7. Performance: - Use parallel workers for large datasets - Consider processing ROIs separately if memory is limited - Graph construction is fast, but enrichment analysis can be slow for many permutations

Spatial Analysis Visualizations

Simple Spatial Analysis provides several visualization options to explore spatial relationships and patterns. All visualizations are accessible from the spatial analysis dialog after building the spatial graph.

Available Visualizations

  1. Pairwise Enrichment: Heatmap showing spatial co-occurrence/avoidance between cluster pairs

  2. Distance Distributions: Violin/box plots of nearest-neighbor distances between cell types

  3. Spatial Visualization: Scatter plot of cells in spatial coordinates

  4. Spatial Communities: Visualization of spatially coherent cell communities

Pairwise Enrichment Visualization

Shows a heatmap of z-scores and p-values for spatial co-occurrence or avoidance between cluster pairs.

Parameters:

  • Permutations: Number of permutations for statistical testing (default: 100, range: 10-10000) - More permutations provide more accurate p-values - Recommended: 500-1000 for publication

  • Workers: Number of parallel workers for permutation tests (default: auto) - More workers speed up computation - Default: number of CPU cores minus 2

How it works:

  1. Computes observed co-occurrence between cluster pairs in the spatial graph

  2. Performs permutation tests by randomly shuffling cluster labels

  3. Calculates z-scores: (observed - mean(permuted)) / std(permuted)

  4. Computes p-values from permutation distribution

  5. Displays results as a heatmap with z-scores color-coded

Interpretation:

  • Positive z-score + significant p-value: Enrichment (cell types co-occur more than expected)

  • Negative z-score + significant p-value: Depletion (cell types avoid each other)

  • Non-significant: Random spatial distribution

  • Color intensity indicates strength of association

Export:

  • Click “Save Plot” button to export

  • Options: PNG, JPG, or PDF format

  • Adjustable DPI (default: 300)

  • Optional font size and figure size override

Distance Distributions Visualization

Shows the distribution of nearest-neighbor distances between cell types using violin or box plots.

Parameters:

  • Clusters to display: Select which source clusters to analyze (multi-select) - When you select cluster(s), the plot shows distances FROM those clusters TO every cluster, including the source cluster itself - For example, selecting “Cluster 3” shows distances from Cluster 3 cells to their nearest neighbors in each cluster - Can compare distances to same cluster vs. different clusters - Useful for identifying spatial relationships

How it works:

  1. For each cell in the selected cluster(s), finds the nearest neighbor of each cluster type

  2. Computes Euclidean distance to nearest neighbor

  3. Converts to micrometers using pixel_size_um

  4. Aggregates distances across all cells

  5. Displays as violin or box plots grouped by cluster pair (Source → Target)

Important Note on Directionality:

  • Distance measurements are directional (asymmetric)

  • “Cluster 3 → Cluster 4” measures distances FROM Cluster 3 cells TO their nearest Cluster 4 neighbors

  • “Cluster 4 → Cluster 3” measures distances FROM Cluster 4 cells TO their nearest Cluster 3 neighbors

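  • These can differ because the nearest-neighbor relation is not symmetric: a rare cluster embedded in an abundant one has short distances to it, while most cells of the abundant cluster lie far from the rare one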
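A tiny made-up example shows the asymmetry concretely. The coordinates are invented for illustration: one Cluster 3 cell sits next to one Cluster 4 cell, while a second Cluster 4 cell lies far away.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical centroids in pixels.
cluster3 = np.array([[0.0, 0.0]])
cluster4 = np.array([[1.0, 0.0], [100.0, 0.0]])

# Cluster 3 -> Cluster 4: each Cluster 3 cell looks for its nearest Cluster 4 cell.
d_3_to_4, _ = cKDTree(cluster4).query(cluster3)
# Cluster 4 -> Cluster 3: each Cluster 4 cell looks for its nearest Cluster 3 cell.
d_4_to_3, _ = cKDTree(cluster3).query(cluster4)

print(float(np.median(d_3_to_4)))  # 1.0
print(float(np.median(d_4_to_3)))  # 50.5
```

The single Cluster 3 cell is 1 pixel from its nearest Cluster 4 neighbor, but the two Cluster 4 cells are 1 and 100 pixels from it, so the two directions give different distributions.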

Interpretation:

  • Shorter distances: Cell types are spatially close

  • Longer distances: Cell types are spatially separated

  • Compare distributions to identify spatial relationships

  • Same-cluster distances (e.g., 3→3) show within-cluster spatial organization

  • Cross-cluster distances (e.g., 3→4) show how far Cluster 3 cells are from the nearest Cluster 4 cell

Export:

  • Click “Save Plot” button to export

  • Same export options as other visualizations

Spatial Visualization

Displays cells in their spatial coordinates (x, y positions), colored by cluster or feature expression.

Parameters:

  • ROI: Select which ROI to visualize (dropdown) - Each ROI is visualized separately - Select from available ROIs in the dataset

  • Color by: Choose how to color cells - "cluster": Color by cluster assignment (default) - Feature columns: Color by continuous feature expression (e.g., marker intensities) - Searchable dropdown for easy feature selection

  • Point Size: Multiplier for point sizes (default: 1.0, range: 0.1-10.0) - 1.0 = default size - Increase for larger points (useful for sparse plots) - Decrease for smaller points (useful for dense plots)

  • Show edges: Checkbox to display spatial graph edges - Shows connections between neighboring cells - Can be slow for large datasets - Useful for visualizing graph structure

How it works:

  1. Extracts spatial coordinates (centroid_x, centroid_y) for selected ROI

  2. Colors cells based on selected attribute (cluster or feature)

  3. Optionally draws edges from spatial graph

  4. Displays as scatter plot with legend
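The steps above correspond roughly to the following matplotlib sketch. It is an approximation for illustration, not the dialog's rendering code: the function name plot_roi, the base marker size of 10, and the inverted y-axis (image-style coordinates) are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection


def plot_roi(coords, clusters, edges=None, point_size=1.0):
    """Scatter cells of one ROI by centroid, colored by cluster, with optional edges."""
    coords = np.asarray(coords, dtype=float)
    clusters = np.asarray(clusters)
    fig, ax = plt.subplots(figsize=(8, 6))

    if edges is not None:
        # Indexing with an (n_edges, 2) array yields (n_edges, 2, 2) line segments.
        segments = coords[np.asarray(edges)]
        ax.add_collection(LineCollection(segments, colors="lightgray",
                                         linewidths=0.5, zorder=1))

    # One scatter call per cluster so that each gets its own legend entry.
    for c in np.unique(clusters):
        mask = clusters == c
        ax.scatter(coords[mask, 0], coords[mask, 1], s=10 * point_size,
                   label=f"Cluster {c}", zorder=2)

    ax.set_xlabel("centroid_x")
    ax.set_ylabel("centroid_y")
    ax.set_aspect("equal")
    ax.invert_yaxis()  # assumed image convention: y increases downward
    ax.legend(title="cluster")
    return fig, ax


# Hypothetical three-cell ROI with two graph edges.
fig, ax = plot_roi([[0, 0], [1, 0], [0, 1]], [1, 1, 2], edges=[[0, 1], [1, 2]])
```

Drawing all edges as a single LineCollection is much cheaper than one plot call per edge, which matters because edge display is the slow part for large ROIs.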

Use cases:

  • Visual inspection of spatial organization

  • Identifying spatial patterns and domains

  • Validating clustering results

  • Exploring feature spatial distributions

  • Checking for batch effects across ROIs

Export:

  • Click “Save Plot” button to export

  • Same export options as other visualizations

Spatial Communities Visualization

Shows spatially coherent communities of cells identified using graph-based clustering.

Parameters:

  • ROI: Select which ROI to analyze (dropdown)

  • Min cells: Minimum number of cells in a community (default: 5, range: 1-100) - Filters out very small communities - Increase to focus on larger spatial structures

  • Exclude cell types: Optionally exclude specific cell types from community detection - Enable exclusion checkbox - Multi-select clusters to exclude - Useful for focusing on specific cell populations

How it works:

  1. Builds spatial graph for selected ROI

  2. Applies Leiden algorithm to identify communities

  3. Filters communities smaller than min_cells

  4. Visualizes communities as colored regions in spatial coordinates

  5. Shows community assignments and spatial organization

Interpretation:

  • Communities represent spatially organized cell groups

  • May correspond to tissue structures or functional units

  • Can be used to identify spatial niches

  • Compare community structure across ROIs

Export:

  • Click “Save Plot” button to export

  • Same export options as other visualizations

Exporting Plots

All visualizations can be exported using the “Save Plot” button in each tab.

Export Options:

  1. Format: Choose output format - PNG: Raster image (default, good for presentations) - JPG: Compressed raster image - PDF: Vector format (good for publications, scalable)

  2. DPI (Dots Per Inch): Resolution for raster formats - Default: 300 DPI (publication quality) - Range: 72-1200 DPI - Higher DPI = larger file size, better quality

  3. Font Size Override: Optionally override all font sizes - Check “Override figure font size” - Set font size in points (default: 10.0, range: 6.0-72.0) - Useful for adjusting text size for publications

  4. Figure Size Override: Optionally change figure dimensions - Check “Override figure size” - Set width and height in inches (default: 8.0 x 6.0) - Range: 1.0-100.0 inches

Export Workflow:

  1. Run the desired analysis (enrichment, distance, spatial viz, or communities)

  2. Adjust any parameters (point size, show edges, etc.)

  3. Click “Save Plot” button in the relevant tab

  4. In the save dialog: - Choose filename and location - Select format (PNG/JPG/PDF) - Set DPI (for raster formats) - Optionally override font size - Optionally override figure size

  5. Click “Save”

Tips for Export:

  • Use PDF format for publications (vector graphics, scalable)

  • Use PNG at 300 DPI for presentations and web

  • Increase font size for small figures in publications

  • Adjust figure size to match journal requirements

  • Spatial visualizations benefit from larger figure sizes to show detail

Accessing Visualizations in the GUI

  1. Open Spatial Analysis Dialog: Navigate to Analysis → Spatial Analysis → Simple Spatial Analysis

  2. Build Spatial Graph: In the dialog, build the spatial graph using the controls at the top - Select graph construction method (kNN, Radius, or Delaunay) - Set parameters (k_neighbors, radius, pixel_size_um) - Click “Build Graph” - Graph must be built before visualizations are available

  3. Select Tab: Use the tabs to access different visualizations - Pairwise Enrichment: Run enrichment analysis and view heatmap - Distance Distributions: Run distance analysis and view distributions - Spatial Visualization: Generate spatial scatter plots - Spatial Communities: Run community detection and view communities

  4. Adjust Parameters: Use controls in each tab to customize visualizations

  5. Export: Click “Save Plot” in each tab to export visualizations

Tab-Specific Controls:

  • Pairwise Enrichment: Permutations, Workers, Run button, Save Plot button

  • Distance Distributions: Cluster selection, Run button, Save Plot button

  • Spatial Visualization: ROI selection, Color by, Point Size, Show edges, Generate button, Save Plot button

  • Spatial Communities: ROI selection, Min cells, Exclude cell types, Run button, Save Plot button

Tips and Best Practices for Visualizations

  1. Pairwise Enrichment: - Use at least 100 permutations for reliable results - Increase to 500-1000 for publication-quality p-values - Interpret z-scores in context of p-values - Look for consistent patterns across multiple ROIs

  2. Distance Distributions: - Compare distances between different cell type pairs - Look for systematic differences indicating spatial relationships - Consider biological context when interpreting results - Compare same-cluster vs. cross-cluster distances

  3. Spatial Visualization: - Always visually inspect spatial organization - Use different color encodings to explore different aspects - Compare across ROIs to identify consistent patterns - Adjust point size for optimal visibility - Use “Show edges” sparingly (can be slow for large datasets)

  4. Spatial Communities: - Adjust min_cells to focus on relevant spatial scales - Exclude cell types that are not of interest - Compare community structure across ROIs - Use communities to identify spatial niches

  5. Export: - Use PDF for publications (vector graphics) - Use PNG at 300 DPI for presentations - Adjust font sizes for small figures - Spatial visualizations may need larger figure sizes