Advanced Spatial Analysis¶
Advanced Spatial Analysis provides sophisticated spatial analysis methods using squidpy, including neighborhood enrichment, co-occurrence analysis, spatial autocorrelation, and Ripley functions.
Overview¶
Advanced Spatial Analysis extends Simple Spatial Analysis with additional methods from the squidpy package, enabling more sophisticated spatial pattern analysis, statistical testing, and spatial statistics.
Key Features:
- Squidpy Integration: All analyses are implemented using the squidpy package
- AnnData Format: Data is converted to AnnData format for compatibility with the scverse ecosystem
- Export Capability: AnnData objects can be exported to H5AD files for downstream analysis in other tools
Note
Advanced Spatial Analysis requires the squidpy and anndata packages to be installed. Install with: pip install squidpy anndata
Options¶
Advanced Spatial Analysis includes:
Neighborhood Enrichment: Analyze enrichment of cell types in neighborhoods
Co-occurrence Analysis: Test for spatial co-occurrence patterns
Spatial Autocorrelation: Measure spatial correlation of features
Ripley Functions: Analyze spatial point patterns (K and L functions)
Additional Spatial Statistics: Various spatial metrics and tests
Parameters¶
Graph Construction¶
The graph-construction controls resemble Simple Spatial Analysis in the GUI, but the implementation is different:
method: kNN, Radius, or Delaunay
k_neighbors: Number of neighbors for kNN graph construction through Squidpy
radius: Maximum distance for Radius graphs, in micrometers after coordinate scaling
pixel_size_um: Pixel size in micrometers used to convert centroid coordinates before building AnnData graphs
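As a rough illustration of how pixel_size_um, radius, and k_neighbors interact, the sketch below scales centroid coordinates from pixels to micrometers and then builds a radius graph and a kNN neighbor list. It is a plain-Python stand-in, not the OpenIMC or Squidpy implementation; the function names scale_to_um, radius_edges, and knn_neighbors are hypothetical.

```python
import math

def scale_to_um(coords_px, pixel_size_um):
    """Convert (x, y) centroids from pixels to micrometers."""
    return [(x * pixel_size_um, y * pixel_size_um) for x, y in coords_px]

def radius_edges(coords_um, radius):
    """Return undirected edges (i, j), i < j, for cells within `radius` micrometers."""
    edges = []
    for i in range(len(coords_um)):
        for j in range(i + 1, len(coords_um)):
            if math.dist(coords_um[i], coords_um[j]) <= radius:
                edges.append((i, j))
    return edges

def knn_neighbors(coords_um, k):
    """Return, for each cell, the indices of its k nearest other cells."""
    neighbors = []
    for i, point in enumerate(coords_um):
        others = sorted(
            (j for j in range(len(coords_um)) if j != i),
            key=lambda j: math.dist(point, coords_um[j]),
        )
        neighbors.append(others[:k])
    return neighbors

# Hypothetical centroids in pixels; with 2 um pixels they span 0-80 um.
centroids_px = [(0, 0), (10, 0), (0, 10), (40, 30)]
centroids_um = scale_to_um(centroids_px, pixel_size_um=2.0)

print(radius_edges(centroids_um, radius=25.0))
print(knn_neighbors(centroids_um, k=1))
```

Because the radius is applied after scaling, the same radius value selects different neighbors when pixel_size_um changes, so it is worth confirming the pixel size before comparing graphs across datasets.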
Neighborhood Enrichment¶
ROI (optional): Run a single ROI or use All ROIs to aggregate across the current ROI set
Aggregation: Mean or Sum when All ROIs is selected
cluster_column: Cluster annotation used for the Squidpy enrichment matrix
Note
The current OpenIMC wrapper does not expose Squidpy’s neighborhood-enrichment n_perms or enrichment-level seed options. The GUI and CLI surface the resulting z-score/count-style matrices rather than neighborhood-enrichment p-values.
Co-occurrence Analysis¶
reference_cluster (optional): Reference cluster for one-vs-others analysis
- If specified, compares reference cluster against all others
- If not specified, performs pairwise comparisons
method (default: "pairwise"): Analysis method
- "pairwise": Compare all cluster pairs
- "one_vs_others": Compare reference cluster against all others
Spatial Autocorrelation¶
feature (required): Feature column to analyze
- Can be a marker expression or other numeric feature
method (default: "moran"): Autocorrelation method
- "moran": Moran’s I statistic
- "geary": Geary’s C statistic
n_permutations (default: 100): Number of permutations for significance testing
Ripley Functions¶
cluster_column (required): Column name containing cluster assignments
- Typically "cluster"
mode (default: "K"): Ripley function type
- "K": Ripley’s K function
- "L": Ripley’s L function (normalized K function)
max_dist (optional): Maximum distance to compute function
- If not specified, uses a default based on data extent
roi_column (optional): Column name for ROI grouping
- If specified, computes Ripley functions per ROI
Using Advanced Spatial Analysis in the GUI¶
Ensure clustering has been completed
Navigate to Analysis → Spatial Analysis → Advanced Spatial Analysis in the menu bar
In the advanced spatial analysis dialog:
Build Spatial Graph:
Select graph construction method (kNN, Radius, or Delaunay)
Set parameters (k_neighbors, radius, pixel_size_um)
Click “Build Graph”
This converts your data to AnnData format and builds spatial graphs using squidpy
Neighborhood Enrichment Tab:
Optionally choose a single ROI or All ROIs
Choose aggregation mode and cluster column
Click “Run Neighborhood Enrichment”
Results show a cluster-by-neighbor-cluster z-score matrix
Single-ROI results are stored in AnnData objects; multi-ROI results are aligned and aggregated for display
Co-occurrence Analysis Tab:
Select analysis method (pairwise or one-vs-others)
Optionally specify reference cluster
Click “Run Co-occurrence Analysis”
Results are stored in AnnData objects
Spatial Autocorrelation Tab:
Select feature to analyze
Choose autocorrelation method (Moran’s I or Geary’s C)
Set number of permutations
Click “Run Autocorrelation Analysis”
Results are stored in AnnData objects
Ripley Functions Tab:
Select cluster column
Choose function type (K or L)
Set maximum distance
Click “Run Ripley Analysis”
Results are stored in AnnData objects
Export AnnData:
Click “Export AnnData” button to save AnnData objects
Choose to export as:
Combined file: Single H5AD file with all ROIs
Separate files: One H5AD file per ROI
Exported H5AD files can be used in other tools (scanpy, squidpy, etc.)
Export analysis results using the export buttons
Using Advanced Spatial Analysis in the CLI¶
Neighborhood Enrichment¶
openimc spatial-nhood-enrichment features.csv \
--output enrichment_results.h5ad \
--method kNN \
--k-neighbors 10 \
--cluster-column cluster \
--aggregation mean
Co-occurrence Analysis¶
openimc spatial-cooccurrence features.csv cooccurrence_results.csv \
--method pairwise \
--reference-cluster "Cluster_1"
Spatial Autocorrelation¶
openimc spatial-autocorr features.csv autocorr_results.csv \
--feature CD3_1841_mean \
--method moran \
--n-permutations 500
Ripley Functions¶
openimc spatial-ripley features.csv ripley_results.h5ad \
--cluster-column cluster \
--mode K \
--max-dist 100.0 \
--pixel-size-um 1.0
Build Spatial Graph and Export AnnData¶
openimc spatial-anndata features.csv --output spatial_graph.h5ad \
--method kNN \
--k-neighbors 10 \
--pixel-size-um 1.0 \
--combined
Export AnnData Objects¶
openimc export-anndata input.h5ad output.h5ad \
--combined
Method Details¶
Neighborhood Enrichment¶
OpenIMC advanced neighborhood enrichment wraps squidpy.gr.nhood_enrichment on ROI-specific AnnData spatial graphs.
The result is best interpreted as a focal-cluster by neighbor-cluster enrichment matrix rather than the undirected edge-pair summary used by Simple Spatial Analysis.
How it works:
Graph Construction: Build or reuse ROI-specific AnnData spatial connectivities with Squidpy
Neighborhood Counting: For each cluster, count neighboring clusters in the AnnData connectivity graph
Squidpy Statistic: Run Squidpy’s neighborhood-enrichment permutation test to compute z-scores and counts
- Squidpy stores these results under adata.uns['{cluster_key}_nhood_enrichment']
OpenIMC Aggregation: If multiple ROIs are selected, OpenIMC aligns ROI matrices to the union of clusters and aggregates the z-score matrices by mean or sum for display
Interpretation:
- Rows correspond to the focal cell cluster and columns correspond to the neighboring cluster
- Positive z-scores indicate more neighbors than expected; negative z-scores indicate fewer neighbors than expected
- The current OpenIMC wrapper surfaces z-score/count-style outputs from Squidpy, not neighborhood-enrichment p-values
- The matrix is not the same statistic as simple pairwise enrichment, and its values also depend on the graph construction and aggregation settings
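To make the row/column semantics concrete, the following sketch computes a focal-cluster by neighbor-cluster count matrix and permutation z-scores in plain Python. It illustrates the general permutation idea only and is not Squidpy's code; the function names, the n_perms and seed arguments, and the toy chain of six cells are all assumptions made for the example.

```python
import random
import statistics

def neighbor_counts(labels, neighbors, clusters):
    """Count neighbors per (focal cluster row, neighbor cluster column)."""
    index = {c: k for k, c in enumerate(clusters)}
    counts = [[0] * len(clusters) for _ in clusters]
    for cell, cell_neighbors in enumerate(neighbors):
        for other in cell_neighbors:
            counts[index[labels[cell]]][index[labels[other]]] += 1
    return counts

def enrichment_zscores(labels, neighbors, n_perms=200, seed=0):
    """Z-score each observed count against counts from shuffled labels."""
    clusters = sorted(set(labels))
    observed = neighbor_counts(labels, neighbors, clusters)
    rng = random.Random(seed)
    shuffled = list(labels)
    permuted = []
    for _ in range(n_perms):
        rng.shuffle(shuffled)  # the graph stays fixed, only labels move
        permuted.append(neighbor_counts(shuffled, neighbors, clusters))
    size = len(clusters)
    z = [[float("nan")] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            null = [p[a][b] for p in permuted]
            sd = statistics.pstdev(null)
            if sd > 0:
                z[a][b] = (observed[a][b] - statistics.mean(null)) / sd
    return clusters, observed, z

# Toy graph: six cells in a chain, three "A" cells followed by three "B" cells.
labels = ["A", "A", "A", "B", "B", "B"]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]

clusters, observed, z = enrichment_zscores(labels, neighbors)
print(clusters)
print(observed)
print(z)
```

In this toy layout the A row has many A neighbors and few B neighbors, so the A/A entry is positive and the A/B entry is negative, which matches the reading of rows as focal clusters and columns as neighbor clusters.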
Why Results Differ from Simple Pairwise Enrichment¶
Statistic definition: Advanced neighborhood enrichment uses Squidpy’s cluster-by-neighbor-cluster enrichment statistic. Simple pairwise enrichment counts unordered edges between cluster pairs.
Graph representation: Advanced enrichment uses AnnData spatial connectivities. Simple enrichment uses OpenIMC’s deduplicated undirected edge list.
Symmetry and directionality: Advanced heatmaps are interpreted as row cluster versus neighbor cluster. Simple pairwise heatmaps are symmetric by construction.
ROI aggregation: Advanced OpenIMC displays align ROI matrices and aggregate z-scores by mean or sum. Simple GUI displays average ROI-level z_score and p_value values by cluster pair.
Reproducibility controls: Simple pairwise enrichment exposes n_permutations and seed directly for the enrichment step. Advanced neighborhood enrichment currently does not expose Squidpy’s neighborhood-enrichment n_perms or enrichment-level seed controls through the OpenIMC wrapper.
Citation:
- Based on methods in: Schapiro, D., et al. (2017). “histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data.” Nature Methods, 14(9), 873-876. DOI: 10.1038/nmeth.4391
- Implementation: squidpy.gr.nhood_enrichment
- Additional implementation references: Squidpy graph builder and Squidpy neighborhood-enrichment source
Co-occurrence Analysis¶
Co-occurrence analysis tests whether cell types tend to appear together in spatial proximity.
How it works:
Spatial Proximity: Define spatial proximity based on spatial graph (kNN, Radius, etc.)
Observed Co-occurrence: Count how often cell type pairs appear in proximity
Expected Co-occurrence: Compute expected co-occurrence under random distribution
Statistical Testing: Use permutation tests to assess significance
Pairwise Mode: Compares all pairs of cell types
One-vs-Others Mode: Compares a reference cell type against all others
Citation: - Implementation: squidpy.gr.co_occurrence
Spatial Autocorrelation¶
Spatial autocorrelation measures how similar feature values are for spatially nearby cells.
Moran’s I:
- Range: approximately -1 to 1
- Positive values: Similar values cluster together (positive autocorrelation)
- Negative values: Dissimilar values sit next to each other (negative autocorrelation)
- Near 0: No spatial autocorrelation (random spatial distribution)
Geary’s C:
- Range: typically 0 to 2
- Values < 1: Positive autocorrelation
- Values > 1: Negative autocorrelation
- Values near 1: No autocorrelation
How it works:
Spatial Weights: Define spatial weights matrix based on spatial graph
Autocorrelation Statistic: Compute Moran’s I or Geary’s C using spatial weights
Statistical Testing: Use permutation tests to assess significance
Interpretation:
- Positive autocorrelation: Feature values are spatially clustered
- Negative autocorrelation: Feature values are spatially dispersed
- Useful for identifying spatial gradients or domains
Citation:
- Moran, P. A. P. (1950). “Notes on continuous stochastic phenomena.” Biometrika, 37(1/2), 17-23. DOI: 10.2307/2332142
- Geary, R. C. (1954). “The contiguity ratio and statistical mapping.” The Incorporated Statistician, 5(3), 115-145. DOI: 10.2307/2986645
- Implementation: squidpy.gr.spatial_autocorr
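The two statistics in this section can be written out directly from their textbook definitions. The sketch below is a minimal plain-Python version using a dense weights matrix, with a simple two-sided permutation p-value for Moran’s I; it is meant to show what the numbers mean, not how Squidpy computes them. The function names and the four-cell chain graph are assumptions for illustration.

```python
import random

def morans_i(values, weights):
    """Moran's I: (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)

def gearys_c(values, weights):
    """Geary's C: ((n - 1) / (2 W)) * sum_ij w_ij (x_i - x_j)^2 / sum_i (x_i - m)^2."""
    n = len(values)
    mean = sum(values) / n
    w_total = sum(sum(row) for row in weights)
    sq_diff = sum(
        weights[i][j] * (values[i] - values[j]) ** 2 for i in range(n) for j in range(n)
    )
    return ((n - 1) / (2 * w_total)) * sq_diff / sum((v - mean) ** 2 for v in values)

def permutation_p_value(values, weights, n_permutations=100, seed=0):
    """Two-sided pseudo p-value for Moran's I from shuffled feature values."""
    rng = random.Random(seed)
    observed = abs(morans_i(values, weights))
    shuffled = list(values)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(morans_i(shuffled, weights)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)

# Binary adjacency for four cells in a chain: 0-1, 1-2, 2-3.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, 5.0, 5.0]    # similar values are adjacent
alternating = [1.0, 5.0, 1.0, 5.0]  # dissimilar values are adjacent

print(morans_i(clustered, weights), gearys_c(clustered, weights))
print(morans_i(alternating, weights), gearys_c(alternating, weights))
print(permutation_p_value(clustered, weights, n_permutations=99))
```

The clustered values give a positive Moran’s I and a Geary’s C below 1, while the alternating values give a negative Moran’s I and a Geary’s C above 1. With this p-value convention the smallest attainable value is 1 / (n_permutations + 1), which is one reason to raise the permutation count when small p-values matter.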
Ripley Functions¶
Ripley functions analyze spatial point patterns to test for clustering or dispersion.
Ripley’s K Function:
- Measures the expected number of points within distance r of a randomly chosen point
- Under complete spatial randomness (CSR): K(r) = πr²
- K(r) > πr²: Clustering
- K(r) < πr²: Dispersion
Ripley’s L Function:
- Normalized version: L(r) = √(K(r)/π) - r
- Under CSR: L(r) = 0
- L(r) > 0: Clustering
- L(r) < 0: Dispersion
How it works:
Distance Calculation: For each point, count neighbors within distance r
Edge Correction: Apply edge correction for points near ROI boundaries
Function Computation: Compute K(r) or L(r) for a range of distances
Comparison to CSR: Compare observed function to expected under complete spatial randomness
Interpretation:
- Clustering: Cell type is more clustered than random
- Dispersion: Cell type is more dispersed than random
- Useful for identifying spatial organization patterns
Citation:
- Ripley, B. D. (1976). “The second-order analysis of stationary point processes.” Journal of Applied Probability, 13(2), 255-266. DOI: 10.2307/3212829
- Ripley, B. D. (1977). “Modelling spatial patterns.” Journal of the Royal Statistical Society: Series B, 39(2), 172-192. DOI: 10.1111/j.2517-6161.1977.tb01615.x
- Implementation: squidpy.gr.ripley
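For intuition about the K and L formulas in this section, the sketch below evaluates a naive estimator in plain Python. It deliberately omits edge correction and uses the area-normalized form K(r) = A · (number of ordered pairs within r) / (n(n - 1)), so it is a simplified illustration rather than the estimator Squidpy applies; the function names and the toy point sets are assumptions.

```python
import math

def ripley_k(points, r, area):
    """Naive Ripley's K at distance r (no edge correction)."""
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= r
    )
    return area * pairs / (n * (n - 1))

def ripley_l(points, r, area):
    """Centered L function: sqrt(K(r) / pi) - r, which is 0 under CSR."""
    return math.sqrt(ripley_k(points, r, area) / math.pi) - r

# Two tight groups of cells in a 100 x 100 um ROI (clustered pattern).
clustered = [(10, 10), (11, 10), (10, 11), (90, 90), (91, 90), (90, 91)]
# A regular 3 x 3 grid with 30 um spacing in a 90 x 90 um ROI (dispersed pattern).
grid = [(15 + 30 * i, 15 + 30 * j) for i in range(3) for j in range(3)]

print(ripley_k(clustered, r=5, area=100 * 100), math.pi * 5 ** 2)
print(ripley_l(clustered, r=5, area=100 * 100))
print(ripley_l(grid, r=10, area=90 * 90))
```

The clustered pattern gives K(5) far above πr² and a positive L, while the grid has no pairs within 10 µm and gives a negative L. Without edge correction, points near the ROI boundary see fewer neighbors than they should, which biases K downward at larger distances.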
Squidpy and AnnData Integration¶
Advanced Spatial Analysis uses the squidpy package, which provides a comprehensive toolkit for spatial omics analysis. All analyses are performed using AnnData objects, which provide a standardized format for single-cell and spatial omics data.
Data Flow:
Input: Feature DataFrame (CSV) with cell features and spatial coordinates
Conversion: DataFrame is converted to AnnData format using dataframe_to_anndata()
- Features stored in adata.X or adata.obs
- Spatial coordinates stored in adata.obsm['spatial']
- Metadata stored in adata.obs
Graph Construction: Spatial graphs are built using squidpy
- Graphs stored in adata.obsp['spatial_connectivities'] and adata.obsp['spatial_distances']
Analysis: Squidpy functions operate on AnnData objects
- Results stored in adata.uns, adata.obs, or adata.obsm
Export: AnnData objects can be exported to H5AD files for use in other tools
AnnData Export:
AnnData objects can be exported in two formats:
Combined Export: All ROIs combined into a single H5AD file
- Useful for downstream analysis in scanpy or other tools
- Preserves ROI information in adata.obs
Separate Export: One H5AD file per ROI
- Useful when analyzing ROIs independently
- Files named as anndata_roi_{roi_id}.h5ad
Using Exported AnnData:
Exported H5AD files can be used in:
scanpy: For additional single-cell analysis
squidpy: For additional spatial analysis methods
Python scripts: Load with anndata.read_h5ad()
R/Bioconductor: Using the zellkonverter package
Citation:
- Palla, G., et al. (2022). “Squidpy: a scalable framework for spatial omics analysis.” Nature Methods, 19(2), 171-178. DOI: 10.1038/s41592-021-01358-2
- squidpy Documentation
- squidpy GitHub
- AnnData: Virshup, I., et al. (2023). “The scverse project provides a computational ecosystem for single-cell omics data analysis.” Nature Biotechnology, 41(5), 604-606. DOI: 10.1038/s41587-023-01733-8
- AnnData Documentation
Tips and Best Practices¶
Installation: Ensure squidpy and anndata are installed: pip install squidpy anndata
Method Selection:
- Use Neighborhood Enrichment to identify cell type interactions
- Use Co-occurrence Analysis for pairwise spatial relationships
- Use Spatial Autocorrelation to identify spatial gradients
- Use Ripley Functions to test for clustering/dispersion
Parameter Tuning:
- Tune graph parameters first (k_neighbors, radius, and ROI selection), since they strongly affect neighborhood-based statistics
- max_dist (Ripley): Should cover relevant spatial scales (1-5 cell diameters)
Statistical Interpretation:
- Interpret each method using the outputs it actually returns
- For neighborhood enrichment, focus on the z-score matrix and row/column semantics
- Multiple testing correction may be needed for many comparisons
- Visualize results to understand spatial patterns
Validation:
- Compare results across different graph construction methods
- Verify that spatial patterns are biologically meaningful
- Check edge effects in Ripley functions
Performance:
- Advanced methods can be computationally intensive
- Use parallel processing when available
- Consider analyzing subsets of data for exploration
Integration with Simple Spatial Analysis:
- Use Simple Spatial Analysis for initial exploration
- Use Advanced Spatial Analysis for Squidpy-based neighborhood and spatial statistics
- Treat simple pairwise enrichment and advanced neighborhood enrichment as complementary, not interchangeable
AnnData Export:
- Export AnnData objects after building graphs or running analyses
- Use combined export for multi-ROI analysis in other tools
- Use separate export for ROI-specific analysis
- Exported H5AD files are compatible with the entire scverse ecosystem
Workflow Integration:
- Build spatial graphs first (creates AnnData objects)
- Run analyses (results stored in AnnData)
- Export AnnData for downstream analysis or visualization
- Reload exported AnnData files when further analysis is needed
Advanced Spatial Analysis Visualizations¶
Advanced Spatial Analysis provides sophisticated visualization options for exploring spatial patterns using squidpy methods. All visualizations are accessible from the advanced spatial analysis dialog after building the spatial graph.
Available Visualizations¶
Neighborhood Enrichment: Heatmap showing enrichment/depletion of cell types in neighborhoods
Co-occurrence Analysis: Heatmap showing spatial co-occurrence patterns
Spatial Autocorrelation: Visualization of spatial autocorrelation statistics
Ripley Functions: Plots of Ripley’s K and L functions for spatial point patterns
Neighborhood Enrichment Visualization¶
Shows a heatmap of neighborhood-enrichment z-scores, with focal cell clusters on the rows and neighbor clusters on the columns.
Parameters:
ROI: Plot a selected ROI or an aggregated view across All ROIs
Aggregation: Mean or Sum when plotting multiple ROIs together
Cluster column: Select the cluster annotation used to build the Squidpy enrichment matrix
How it works:
Build or reuse the AnnData spatial graph for each ROI
Run Squidpy neighborhood enrichment on the selected cluster column
Read Squidpy’s z-score/count results from the AnnData object
If multiple ROIs are selected, align cluster sets and aggregate z-score matrices by mean or sum
Display the resulting matrix as a heatmap
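The alignment and aggregation step can be pictured with a small plain-Python sketch. Each ROI contributes a z-score matrix over its own clusters; the matrices are aligned to the union of clusters and then combined by mean or sum. How OpenIMC treats a cluster pair that is absent from an ROI is not stated here, so skipping absent entries (and returning NaN when no ROI has the pair) is an assumption, as are the function name aggregate_zscores and the toy values.

```python
def aggregate_zscores(roi_matrices, how="mean"):
    """Align per-ROI z-score matrices to the union of clusters and aggregate.

    roi_matrices: list of dicts mapping (focal_cluster, neighbor_cluster) -> z-score.
    Pairs absent from an ROI are skipped rather than zero-filled (an assumption).
    """
    if how not in ("mean", "sum"):
        raise ValueError("how must be 'mean' or 'sum'")
    clusters = sorted({c for matrix in roi_matrices for pair in matrix for c in pair})
    result = {}
    for focal in clusters:
        for neighbor in clusters:
            key = (focal, neighbor)
            values = [matrix[key] for matrix in roi_matrices if key in matrix]
            if not values:
                result[key] = float("nan")  # pair never observed in any ROI
            elif how == "mean":
                result[key] = sum(values) / len(values)
            else:
                result[key] = sum(values)
    return clusters, result

# Toy ROI matrices: ROI 1 contains clusters A and B, ROI 2 contains A and C.
roi_1 = {("A", "A"): 2.0, ("A", "B"): -1.0, ("B", "A"): -0.5, ("B", "B"): 1.5}
roi_2 = {("A", "A"): 4.0, ("A", "C"): 1.0, ("C", "A"): 0.5, ("C", "C"): 3.0}

clusters, mean_z = aggregate_zscores([roi_1, roi_2], how="mean")
_, sum_z = aggregate_zscores([roi_1, roi_2], how="sum")
print(clusters)
print(mean_z[("A", "A")], sum_z[("A", "A")])
```

Sum grows with the number of ROIs that contain a pair, whereas mean stays on the scale of a single ROI, so the two aggregation modes are not directly comparable on the same color scale.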
Interpretation:
Positive z-score: The column cluster appears more often than expected in the row cluster’s neighborhood
Negative z-score: The column cluster appears less often than expected in the row cluster’s neighborhood
OpenIMC currently plots z-score matrices for this analysis rather than neighborhood-enrichment p-values
Color intensity indicates strength of enrichment/depletion
Export:
Click “Save Plot” button to export
Options: PNG, JPG, or PDF format
Adjustable DPI (default: 300)
Optional font size and figure size override
Co-occurrence Analysis Visualization¶
Shows a heatmap of co-occurrence scores indicating whether cell types tend to appear together in spatial proximity.
Parameters:
Method: Analysis method
- "pairwise": Compare all cluster pairs (default)
- "one_vs_others": Compare reference cluster against all others
Reference cluster (for one_vs_others mode): Select cluster to compare against all others
How it works:
Defines spatial proximity based on spatial graph (kNN, Radius, etc.)
Counts observed co-occurrence of cell type pairs in proximity
Computes expected co-occurrence under random distribution
Performs permutation tests to assess significance
Displays results as heatmap with co-occurrence scores and p-values
Pairwise Mode: Compares all pairs of cell types
One-vs-Others Mode: Compares a reference cell type against all others
Interpretation:
Positive score + significant p-value: Co-occurrence (cell types appear together more than expected)
Negative score + significant p-value: Avoidance (cell types appear together less than expected)
Non-significant: Random spatial distribution
Export:
Click “Save Plot” button to export
Same export options as other visualizations
Spatial Autocorrelation Visualization¶
Shows spatial autocorrelation statistics (Moran’s I or Geary’s C) for selected features, indicating spatial clustering or dispersion.
Parameters:
Feature: Select feature column to analyze (required)
- Can be a marker expression or other numeric feature
- Dropdown with all available features
Method: Autocorrelation method
- "moran": Moran’s I statistic (default)
- "geary": Geary’s C statistic
n_permutations (default: 100): Number of permutations for significance testing
- Recommended: 500-1000 for publication
Moran’s I:
- Range: approximately -1 to 1
- Positive values: Similar values cluster together (positive autocorrelation)
- Negative values: Dissimilar values sit next to each other (negative autocorrelation)
- Near 0: No spatial autocorrelation (random spatial distribution)
Geary’s C:
- Range: typically 0 to 2
- Values < 1: Positive autocorrelation
- Values > 1: Negative autocorrelation
- Values near 1: No autocorrelation
How it works:
Defines spatial weights matrix based on spatial graph
Computes Moran’s I or Geary’s C using spatial weights
Performs permutation tests to assess significance
Displays statistic value, p-value, and visualization
Interpretation:
Positive autocorrelation: Feature values are spatially clustered
Negative autocorrelation: Feature values are spatially dispersed
Useful for identifying spatial gradients or domains
High autocorrelation indicates spatial structure in feature expression
Export:
Click “Save Plot” button to export
Same export options as other visualizations
Ripley Functions Visualization¶
Shows Ripley’s K or L functions for analyzing spatial point patterns, testing for clustering or dispersion.
Parameters:
Cluster column: Column name containing cluster assignments (default: "cluster")
Mode: Ripley function type
- "K": Ripley’s K function (default)
- "L": Ripley’s L function (normalized K function)
max_dist (optional): Maximum distance to compute function
- If not specified, uses a default based on data extent
- Should cover relevant spatial scales (1-5 cell diameters)
Ripley’s K Function:
- Measures the expected number of points within distance r of a randomly chosen point
- Under complete spatial randomness (CSR): K(r) = πr²
- K(r) > πr²: Clustering
- K(r) < πr²: Dispersion
Ripley’s L Function:
- Normalized version: L(r) = √(K(r)/π) - r
- Under CSR: L(r) = 0
- L(r) > 0: Clustering
- L(r) < 0: Dispersion
How it works:
For each point, counts neighbors within distance r
Applies edge correction for points near ROI boundaries
Computes K(r) or L(r) for a range of distances
Compares observed function to expected under complete spatial randomness
Displays function plot with confidence intervals
Interpretation:
Clustering: Cell type is more clustered than random
Dispersion: Cell type is more dispersed than random
Useful for identifying spatial organization patterns
Compare functions across different cell types
Look for peaks (clustering) or valleys (dispersion) at specific distances
Export:
Click “Save Plot” button to export
Same export options as other visualizations
Exporting Plots¶
All visualizations can be exported using the “Save Plot” button in each tab.
Export Options:
Format: Choose output format
- PNG: Raster image (default, good for presentations)
- JPG: Compressed raster image
- PDF: Vector format (good for publications, scalable)
DPI (Dots Per Inch): Resolution for raster formats
- Default: 300 DPI (publication quality)
- Range: 72-1200 DPI
- Higher DPI = larger file size, better quality
Font Size Override: Optionally override all font sizes
- Check “Override figure font size”
- Set font size in points (default: 10.0, range: 6.0-72.0)
- Useful for adjusting text size for publications
Figure Size Override: Optionally change figure dimensions
- Check “Override figure size”
- Set width and height in inches (default: 8.0 x 6.0)
- Range: 1.0-100.0 inches
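For raster formats, the pixel dimensions of the saved image follow from figure size in inches multiplied by DPI. The helper below is a small arithmetic sketch using the ranges listed above; the function name raster_size_px is hypothetical and is not part of OpenIMC.

```python
def raster_size_px(width_in=8.0, height_in=6.0, dpi=300):
    """Return (width, height) in pixels for a raster export."""
    if not 72 <= dpi <= 1200:
        raise ValueError("dpi must be between 72 and 1200")
    for side in (width_in, height_in):
        if not 1.0 <= side <= 100.0:
            raise ValueError("figure sides must be between 1.0 and 100.0 inches")
    return round(width_in * dpi), round(height_in * dpi)

print(raster_size_px())               # default 8.0 x 6.0 inches at 300 DPI
print(raster_size_px(4.0, 3.0, 600))  # half the size at double the DPI
```

Both calls give the same pixel dimensions, so halving the figure size while doubling the DPI mainly changes how large text and lines appear relative to the plot, not the image resolution.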
Export Workflow:
Build spatial graph (creates AnnData objects)
Run the desired analysis (neighborhood enrichment, co-occurrence, autocorrelation, or Ripley)
Adjust any parameters
Click “Save Plot” button in the relevant tab
In the save dialog:
- Choose filename and location
- Select format (PNG/JPG/PDF)
- Set DPI (for raster formats)
- Optionally override font size
- Optionally override figure size
Click “Save”
Tips for Export:
Use PDF format for publications (vector graphics, scalable)
Use PNG at 300 DPI for presentations and web
Increase font size for small figures in publications
Adjust figure size to match journal requirements
Heatmaps may need larger figure sizes to show all labels clearly
Accessing Visualizations in the GUI¶
Open Advanced Spatial Analysis Dialog: Navigate to Analysis → Spatial Analysis → Advanced Spatial Analysis in the menu bar
Build Spatial Graph: Build the spatial graph using the controls at the top of the dialog
- Select graph construction method (kNN, Radius, or Delaunay)
- Set parameters (k_neighbors, radius, pixel_size_um)
- Click “Build Graph”
- This converts data to AnnData format and builds spatial graphs using squidpy
- Graph must be built before analyses are available
Select Tab: Use the tabs to access different analyses and visualizations
- Neighborhood Enrichment: Run enrichment analysis and view heatmap
- Co-occurrence Analysis: Run co-occurrence analysis and view heatmap
- Spatial Autocorrelation: Run autocorrelation analysis and view results
- Ripley Functions: Run Ripley analysis and view function plots
Adjust Parameters: Use controls in each tab to customize analyses
Export: Click “Save Plot” in each tab to export visualizations
Export AnnData: Click “Export AnnData” button to save AnnData objects for downstream analysis
Tab-Specific Controls:
Neighborhood Enrichment: ROI selection, Aggregation, Cluster column, Run button, Save Plot button
Co-occurrence Analysis: Method, Reference cluster (for one_vs_others), Run button, Save Plot button
Spatial Autocorrelation: Feature selection, Method, n_permutations, Run button, Save Plot button
Ripley Functions: Cluster column, Mode, max_dist, Run button, Save Plot button
Tips and Best Practices for Visualizations¶
Neighborhood Enrichment:
- Interpret the heatmap by row cluster versus neighbor cluster
- Compare results across graph settings and ROI aggregation modes
- Look for consistent patterns across multiple ROIs
- Consider biological context when interpreting results
Co-occurrence Analysis:
- Use pairwise mode to explore all relationships
- Use one_vs_others mode to focus on specific cell types
- Compare results across different graph construction methods
- Consider multiple testing correction for many comparisons
Spatial Autocorrelation:
- Select informative features (known spatial markers)
- Use Moran’s I for most cases (more commonly used)
- Use Geary’s C for alternative perspective
- High autocorrelation indicates spatial structure
Ripley Functions:
- Set max_dist to cover relevant spatial scales (1-5 cell diameters)
- Compare functions across different cell types
- Look for peaks (clustering) or valleys (dispersion) at specific distances
- Consider edge effects near ROI boundaries
Export:
- Use PDF for publications (vector graphics)
- Use PNG at 300 DPI for presentations
- Adjust font sizes for small figures
- Heatmaps may need larger figure sizes
- Export AnnData objects for further analysis in other tools