Working with histological annotations and regions#
## The following code ensures that all functions and init files are reloaded before execution.
%load_ext autoreload
%autoreload 2
from pathlib import Path
from insitupy import InSituData, CACHE
Load Xenium data into InSituData object#
Now the Xenium data can be loaded by providing the path to the InSituPy project folder.
insitupy_project = Path(CACHE / "out/demo_insitupy_project")
xd = InSituData.read(insitupy_project)
xd
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
No modalities loaded.
Here we load only the images and cells modalities. Since InSituPy v0.6.2, transcripts are loaded lazily, so using load_all() here would be just as fast.
xd.load_images()
xd.load_cells()
xd
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
➤ images
CD20: (25778, 35416)
HE: (25778, 35416, 3)
HER2: (25778, 35416)
nuclei: (25778, 35416)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 157600 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'spatial'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
Histological information in InSituPy#
InSituPy organizes histological information into two categories: Annotations and Regions.
Annotations#
Annotations are user-defined geometries (polygons, points, or lines) that mark biological features:
Class: The biological category (e.g., “tumor”, “immune cells”, “stroma”)
Key: The annotation source (e.g., pathologist name, annotation batch)
ID: Unique identifier to distinguish individual annotations
Classes within a key need not be unique: multiple annotations can share the same class label.
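The relationship between key, class, and ID can be pictured with a small stand-alone sketch. The `Annotation` dataclass and the `summarize` helper are hypothetical illustrations and not part of the InSituPy API; they only mimic the kind of per-key summary that the `InSituData` representation prints.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Annotation:
    """Hypothetical stand-in for one annotation (not an InSituPy class)."""
    key: str          # annotation source, e.g. a pathologist name
    class_name: str   # biological category, may repeat within a key
    id: str = field(default_factory=lambda: str(uuid4()))  # unique per annotation


def summarize(annotations):
    """Return {key: (number of annotations, sorted unique class names)}."""
    grouped = defaultdict(list)
    for annot in annotations:
        grouped[annot.key].append(annot.class_name)
    return {key: (len(names), sorted(set(names))) for key, names in grouped.items()}


annotations = [
    Annotation("demo", "Positive"),
    Annotation("demo", "Positive"),   # same class twice within one key is fine
    Annotation("demo", "Negative"),
    Annotation("demo2", "Other"),
]

for key, (n, classes) in summarize(annotations).items():
    print(f"{key}: {n} annotations, {len(classes)} classes {classes}")
```

Every annotation receives its own ID, so two annotations with the same key and class remain distinguishable.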
Regions#
Regions are polygons that define spatial areas within a dataset:
Key: Groups related regions (e.g., “TMA”, “tissue_sections”)
Name: Unique identifier within each key (e.g., “TMA_A1”, “TMA_A2”)
Common use cases:
TMA cores
Multiple tissue sections on one slide
Regions of interest for focused analysis
Adding histological information#
This notebook demonstrates two workflows to add histological annotations:
Interactive annotation using InSituPy’s napari viewer integration
Import annotations from QuPath
1. In napari viewer#
First visualize the Xenium data using .show().
xd.show()
INFO: Added 3 new annotations to key 'TestKey'
INFO: Added 6 new annotations to key 'test'
A new annotation layer can be added using the “Add geometries” widget on the bottom right.
As described above, InSituPy differentiates between “regions” and histological “annotations”. Since napari creates separate layers for point and shape annotations, the “annotations” are split further into two subtypes, resulting in a total of three possible geometry types one can choose from:
Since InSituPy uses different icons to differentiate between the types, it is important to add the geometries via this widget and not via the normal napari annotation panel.
After adding the respective shapes layer, one can now add shapes using the tool box of napari on the top left. E.g. for a geometric annotation, the tool set would look like this:

After adding the geometries, they can be imported into the InSituPy object using the sync_geometries() function or the "Sync Geometries" button on the bottom right in napari.
# from insitupy.interactive import sync_geometries
# sync_geometries()
xd
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
➤ images
CD20: (25778, 35416)
HE: (25778, 35416, 3)
HER2: (25778, 35416)
nuclei: (25778, 35416)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 157600 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'spatial'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ annotations
TestKey: 3 annotations, 1 class ('TestClass')
test: 6 annotations, 1 class ('testclass')
xd.annotations["TestKey"].head()
| id | objectType | geometry | name | color | origin | layer_type |
|---|---|---|---|---|---|---|
| babb8bc0-e60c-40b3-8914-19f8784341a3 | annotation | POLYGON ((1286.349 1709.97253, 1269.88757 1915... | TestClass | [255, 0, 0] | manual | Shapes |
| d9630404-c83b-486c-960b-747b90c897b4 | annotation | POLYGON ((3558.02319 2664.73413, 3327.56348 24... | TestClass | [255, 0, 0] | manual | Shapes |
| 02a54e9b-aff2-4709-825c-5e97e10cbc7a | annotation | POLYGON ((5105.39551 2105.04614, 5072.47266 22... | TestClass | [255, 0, 0] | manual | Shapes |
Export annotations or regions#
To export annotations or regions, one can save the AnnotationsData or RegionsData object as a .geojson file.
xd.annotations.save(path=CACHE / "out/annotations_export", overwrite=True)
2. Create annotations in QuPath#
To create annotations in QuPath, follow these steps:
Export the registered HE image as OME-TIFF by setting as_zarr=False:
xd.images.save(CACHE / "out/image_export", keys_to_save="HE", as_zarr=False, overwrite=True)
Open the exported image in QuPath and start annotating. Documentation for QuPath can be found here.
Select an annotation tool from the bar on the top left:
Add as many annotations as you want and label them by setting classes in the annotation list. Do not forget to press the “Set class” button:
Export annotations using File > Export objects as GeoJSON. Tick Pretty JSON to get an easily readable JSON file. The file name needs to have the following structure: annotation-{slide_id}__{sample_id}__{annotation_label}.
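The naming scheme can be checked before import with a few lines of standard-library Python. This is a minimal sketch: `parse_export_name` is a hypothetical helper, not an InSituPy function, and the regular expression assumes that double underscores act as separators and that any alphabetic prefix is allowed (the demo files used later in this notebook start with `annotations-` or `regions-`).

```python
import re
from pathlib import Path

# Assumed pattern: {prefix}-{slide_id}__{sample_id}__{label}
NAME_PATTERN = re.compile(
    r"^(?P<prefix>[A-Za-z]+)-(?P<slide_id>.+?)__(?P<sample_id>.+?)__(?P<label>.+)$"
)


def parse_export_name(path):
    """Split an export file name into its parts, or return None if it does not match."""
    match = NAME_PATTERN.match(Path(path).stem)  # .stem drops the .geojson suffix
    return match.groupdict() if match else None


print(parse_export_name("annotations-0001879__Replicate 1__demo.geojson"))
print(parse_export_name("annotations-mixed_types.geojson"))  # no double underscores
```

A file name that does not follow the scheme yields `None`, which makes it easy to spot before the import step.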
Import annotations into InSituData#
For demonstration purposes, we created dummy annotation files in ./demo_annotations/. To add the annotations to InSituData, follow the steps below.
3. Create regions in QuPath#
For creating regions, one can use the same annotation tools as described above. But instead of setting a class for the annotation, you can name the region by double-clicking on it, and selecting “Set Properties”:
For export, regions can then be selected and exported in the same way as described above for annotations.
Import annotations and regions#
Since annotations may have been drawn on images with different resolutions, it is important to specify the pixel size during the import (the scale_factor argument in the call below). In standard Xenium experiments the pixel size is 0.2125 µm, but if the images were downscaled before the annotation, this value might differ.
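To make the role of the pixel size concrete, the following sketch converts pixel coordinates to micrometers. It assumes that the scale factor acts as a plain multiplier on the coordinates; `pixels_to_micrometers` is a hypothetical helper for illustration and the vertex values are illustrative, not taken from a real export.

```python
def pixels_to_micrometers(coords, pixel_size):
    """Convert (x, y) pixel coordinates to micrometers by simple scaling."""
    return [(x * pixel_size, y * pixel_size) for x, y in coords]


XENIUM_PIXEL_SIZE = 0.2125  # µm per pixel, the value stated in the text

# Polygon vertices in pixel units (illustrative values)
polygon_px = [(8863, 10814), (8863, 10824), (8900, 10824)]
print(pixels_to_micrometers(polygon_px, XENIUM_PIXEL_SIZE))

# If the image was downscaled 4x before annotation, each pixel covers 4x the distance,
# so the effective pixel size grows by the same factor.
downscale = 4
print(pixels_to_micrometers([(2000, 3000)], XENIUM_PIXEL_SIZE * downscale))
```

Using the wrong pixel size would shift and shrink (or enlarge) every geometry relative to the cells, which is why the value should be verified against the image metadata.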
In QuPath the pixel size and other image metadata can be looked up under “Image” and “Pixel width” or “Pixel height”:
xd.import_annotations(
files=[
"../../demo_data/demo_annotations/annotations-0001879__Replicate 1__demo.geojson",
"../../demo_data/demo_annotations/annotations-0001879__Replicate 1__demo2.geojson",
"../../demo_data/demo_annotations/annotations-mixed_types.geojson"
],
keys=["demo", "demo2", "demo3"],
scale_factor=0.2125
)
xd.annotations["demo"]
| id | objectType | geometry | name | color | origin | layer_type |
|---|---|---|---|---|---|---|
| bd3aacca-1716-4df8-91dd-bf8f6413a7bd | annotation | POLYGON ((1883.3875 2297.975, 1883.3875 2300.1... | Positive | [250, 62, 62] | file | Shapes |
| 69814505-4059-42cd-8df2-752f7eb0810d | annotation | POLYGON ((2782.9 2654.55, 2777.885 2655.0175, ... | Positive | [250, 62, 62] | file | Shapes |
| 1957cd32-0a21-4b45-9dae-ecf236217140 | annotation | POLYGON ((6582.24275 4874.325, 6583.675 4874.3... | Negative | [112, 112, 225] | file | Shapes |
| 19d2197a-1b8e-456f-8223-fba74641ac1c | annotation | POLYGON ((6622.5625 3486.7, 6619.1625 3487.125... | Negative | [112, 112, 225] | file | Shapes |
xd.annotations["demo2"]
| id | objectType | geometry | name | color | origin | layer_type |
|---|---|---|---|---|---|---|
| 1970eccb-ad38-4b4b-b7a8-54509027b57d | annotation | POLYGON ((5380.2875 827.05, 5379.0125 827.475,... | Negative | [112, 112, 225] | file | Shapes |
| a3b32cce-1bb9-4a6f-b1d1-9e0c44420cfa | annotation | POLYGON ((6576.875 2306.6875, 6575.6 2307.1125... | Positive | [250, 62, 62] | file | Shapes |
| 92bfe928-a21f-4864-b7cb-f0d300113d88 | annotation | MULTIPOLYGON (((4575.975 4152.4625, 4575.975 4... | Other | [255, 200, 0] | file | Shapes |
| a6c17a54-6839-40b2-8531-c9227635f344 | annotation | POLYGON ((1381.4625 3639.275, 1380.1875 3639.7... | Other | [255, 200, 0] | file | Shapes |
| e78efe2f-d185-4ab6-9cc9-6621897f3662 | annotation | POLYGON ((6272.92138 3936.1375, 6263.65 3945.2... | Negative | [112, 112, 225] | file | Shapes |
xd.annotations["demo3"]
| id | objectType | geometry | name | color | origin | layer_type |
|---|---|---|---|---|---|---|
| 8f57c3c3-2216-48b7-99bd-aba12d8c3c41 | annotation | POLYGON ((3828.4 2261.74375, 3827.8135 2274.22... | Stroma | [150, 200, 150] | file | Shapes |
| 7e8f8db4-81d4-472e-8e93-0fc756df87aa | annotation | POLYGON ((2618.425 1436.075, 2618.07225 1436.1... | Stroma | [150, 200, 150] | file | Shapes |
| 38a48ddb-f33c-4c61-b996-330b25d84081 | annotation | LINESTRING (3600.5575 1648.7535, 3878.03575 13... | Necrosis | [50, 50, 50] | file | Shapes |
| eee244c9-e919-41ae-bb91-44c7abcc0cec | annotation | LINESTRING (2483.6235 1334.39588, 2855.93412 1... | Immune cells | [160, 90, 160] | file | Shapes |
| e3d4c0b6-0998-4692-ab7d-f580f713e275 | annotation | POINT (5096.23663 1632.56312) | unclassified | [0, 0, 0] | file | Points |
| e9105240-3b35-489e-994f-e8f9c4786516 | annotation | MULTIPOINT ((4219.655 1650.71912), (4261.94675... | Stroma | [150, 200, 150] | file | Points |
| 2802df97-78ad-44ac-8e6b-d9b9406c8e3f | annotation | MULTIPOINT ((3372.7915 2005.41562), (3530.0415... | Tumor | [200, 0, 0] | file | Points |
Load regions#
Regions can be created in QuPath either as described above or using tools like the TMA dearrayer. They are exported as objects in the same way as annotations, but unlike annotations they do not have a classification, and each region name has to be unique.
In the following, demo regions are read. One of the region files has non-unique names to demonstrate the warning that appears in this case.
In regions classes have to be unique#
When reading an “Annotation” .geojson as shown below, the import_regions function throws an error indicating that in regions only one geometry per class is allowed. Further, only normal polygons (shapely.Polygon-typed) are allowed. Any other types of geometries (Points, Lines, MultiPolygons, …) are skipped.
xd.import_regions(
files=[
"../../demo_data/demo_annotations/annotations-mixed_types.geojson"
],
keys=['test'],
scale_factor=0.2125
)
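The two rules stated above (polygons only, one geometry per name) can be checked on a GeoJSON file before it is imported. The sketch below uses only the standard library and a tiny made-up FeatureCollection; `check_regions` is a hypothetical helper, it assumes the region name is stored in the `name` property of each feature, and it is not how `import_regions` is implemented.

```python
import json
from collections import Counter

# Made-up FeatureCollection: two polygons sharing a name and one point
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "A-1"},
         "geometry": {"type": "Polygon", "coordinates": [[[0, 0], [4, 0], [4, 4], [0, 0]]]}},
        {"type": "Feature", "properties": {"name": "A-1"},
         "geometry": {"type": "Polygon", "coordinates": [[[5, 5], [9, 5], [9, 9], [5, 5]]]}},
        {"type": "Feature", "properties": {"name": "A-2"},
         "geometry": {"type": "Point", "coordinates": [1, 1]}},
    ],
})


def check_regions(text):
    """Return (polygon features, skipped geometry types, duplicated names)."""
    features = json.loads(text)["features"]
    polygons = [f for f in features if f["geometry"]["type"] == "Polygon"]
    skipped = [f["geometry"]["type"] for f in features if f["geometry"]["type"] != "Polygon"]
    counts = Counter(f["properties"]["name"] for f in polygons)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    return polygons, skipped, duplicates


polygons, skipped, duplicates = check_regions(geojson_text)
print(f"{len(polygons)} polygons, skipped {skipped}, duplicated names {duplicates}")
```

A non-empty list of duplicated names indicates a file that is better suited for import as annotations than as regions.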
Multiple regions can be imported simultaneously.
xd.import_regions(
files=[
"../../demo_data/demo_regions/regions-0001879__Replicate 1__demo_regions.geojson",
"../../demo_data/demo_regions/regions-0001879__Replicate 1__TMA.geojson",
],
keys=['demo_regions', 'TMA'],
scale_factor=0.2125
)
Properties of the annotations and regions modalities can be inspected in the InSituData representation:
xd
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
➤ images
CD20: (25778, 35416)
HE: (25778, 35416, 3)
HER2: (25778, 35416)
nuclei: (25778, 35416)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 157600 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'spatial'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ annotations
TestKey: 3 annotations, 1 class ('TestClass')
test: 6 annotations, 1 class ('testclass')
demo: 4 annotations, 2 classes ('Negative', 'Positive')
demo2: 5 annotations, 3 classes ('Negative', 'Other', 'Positive')
demo3: 7 annotations, 5 classes ('Immune cells', 'Necrosis', 'Stroma', 'Tumor', 'unclassified')
➤ regions
demo_regions: 3 regions, 3 classes ('Region1', 'Region2', 'Region3')
TMA: 6 regions, 6 classes ('A-1', 'A-2', 'A-3', 'B-1', 'B-2', 'B-3')
Visualization of annotations and regions using napari viewer#
If the InSituData object only contains .annotations or .regions attributes, one can choose between the “Add geometries” and “Show geometries” widgets:
Annotations and regions stored in the InSituData object can be visualized using the “Show geometries” widget:
To show the names of the annotations, tick “Show names”:
xd.show()
INFO: New layer '🔬 TestClass (TestKey)' created.
INFO: New layer '🌍 Region1 (demo_regions)' created.
INFO: New layer '🌍 Region2 (demo_regions)' created.
INFO: New layer '🌍 Region3 (demo_regions)' created.
Assign annotations to observations#
To use the annotations in analyses (e.g. to select only observations within a certain annotation or to compare gene expression between different annotations), one can use the assign_annotations function. It adds one column per annotation key, containing the annotation class of each observation, to xd.cells.matrix.obsm['annotations']. If an observation is not part of any annotation within a key, its entry is unassigned.
xd.assign_annotations(overwrite=True)
Using CellData from MultiCellData layer 'main'.
Assigning key 'TestKey'...
Added results to `.cells['main'].matrix.obsm['annotations']
Assigning key 'test'...
Added results to `.cells['main'].matrix.obsm['annotations']
Assigning key 'demo'...
Added results to `.cells['main'].matrix.obsm['annotations']
Assigning key 'demo2'...
Added results to `.cells['main'].matrix.obsm['annotations']
Assigning key 'demo3'...
Added results to `.cells['main'].matrix.obsm['annotations']
xd.assign_regions()
Using CellData from MultiCellData layer 'main'.
Assigning key 'demo_regions'...
Added results to `.cells['main'].matrix.obsm['regions']
Assigning key 'TMA'...
Added results to `.cells['main'].matrix.obsm['regions']
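Conceptually, the assignment is a point-in-polygon test for each observation. The following pure-Python sketch illustrates the idea with a ray-casting test on toy squares; it is not InSituPy's implementation, the helper names are hypothetical, and the rule that the first matching annotation wins is an assumption made only for this example.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: toggle 'inside' each time a rightward ray from (x, y) crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_classes(cells, annotations):
    """Return one label per cell; cells outside every polygon get 'unassigned'."""
    labels = []
    for x, y in cells:
        label = "unassigned"
        for class_name, polygon in annotations:
            if point_in_polygon(x, y, polygon):
                label = class_name
                break  # first matching annotation wins in this sketch
        labels.append(label)
    return labels


annotations = [
    ("Positive", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ("Negative", [(20, 0), (30, 0), (30, 10), (20, 10)]),
]
cells = [(5, 5), (25, 3), (15, 5)]  # toy observation coordinates
print(assign_classes(cells, annotations))
```

The third cell lies between the two squares and therefore keeps the `unassigned` label, analogous to the entries seen in the assignment results.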
xd
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
➤ images
CD20: (25778, 35416)
HE: (25778, 35416, 3)
HER2: (25778, 35416)
nuclei: (25778, 35416)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 157600 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'spatial', 'annotations', 'regions'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ annotations
TestKey: 3 annotations, 1 class ('TestClass') ✔
test: 6 annotations, 1 class ('testclass') ✔
demo: 4 annotations, 2 classes ('Negative', 'Positive') ✔
demo2: 5 annotations, 3 classes ('Negative', 'Other', 'Positive') ✔
demo3: 7 annotations, 5 classes ('Immune cells', 'Necrosis', 'Stroma', 'Tumor', 'unclassified') ✔
➤ regions
demo_regions: 3 regions, 3 classes ('Region1', 'Region2', 'Region3') ✔
TMA: 6 regions, 6 classes ('A-1', 'A-2', 'A-3', 'B-1', 'B-2', 'B-3') ✔
After assigning the annotations and regions, the keys that have already been assigned are marked with a ✔ in the InSituData representation.
xd.regions["demo_regions"]
| id | objectType | name | geometry | origin | color | layer_type |
|---|---|---|---|---|---|---|
| 2d0da635-c408-459f-9178-839097fe5a98 | annotation | Region1 | POLYGON ((1564.425 1321.9625, 2267.8 1321.9625... | file | [255, 0, 0] | Shapes |
| ce6c2342-620d-4f44-be03-68a4454e9b33 | annotation | Region2 | POLYGON ((4541.7625 1356.3875, 5613.825 1356.3... | file | [255, 0, 0] | Shapes |
| 70a125ec-c53e-469b-8927-efe224e504c1 | annotation | Region3 | POLYGON ((2110.7625 2708.3125, 3387.675 2708.3... | file | [255, 0, 0] | Shapes |
The following cells show examples of how to explore the assigned annotations:
xd.cells.matrix.obsm['annotations']['demo2']
2 unassigned
5 unassigned
8 unassigned
10 unassigned
13 unassigned
...
167776 unassigned
167777 unassigned
167778 unassigned
167779 unassigned
167780 unassigned
Name: demo2, Length: 157600, dtype: object
# print number of cells within each annotation
annots = xd.cells.matrix.obsm['annotations']['demo2']
annots.value_counts()
demo2
unassigned 148458
Negative 4839
Other 2605
Positive 1698
Name: count, dtype: int64
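Once the labels are available as a pandas Series, they can be used to subset observations or to compare a metric between classes. The sketch below uses toy stand-ins with made-up values instead of the real `obs` table and annotation column, so it runs without any dataset.

```python
import pandas as pd

# Toy stand-ins for per-cell metadata and one annotation column (values made up)
obs = pd.DataFrame(
    {"total_counts": [120, 80, 200, 150, 60]},
    index=["1", "2", "3", "4", "5"],
)
annots = pd.Series(
    ["Positive", "unassigned", "Negative", "Positive", "unassigned"],
    index=obs.index,
    name="demo2",
)

# Keep only cells that lie inside a 'Positive' annotation
positive_cells = obs[annots == "Positive"]
print(positive_cells)

# Compare a per-cell metric between classes, ignoring unassigned cells
assigned = annots != "unassigned"
print(obs.loc[assigned, "total_counts"].groupby(annots[assigned]).mean())
```

The same boolean-mask pattern should carry over to the real data as long as the label Series and the metadata table share the same index.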
# show geopandas dataframe for one annotation
xd.annotations["demo2"]
| id | objectType | geometry | name | color | origin | layer_type |
|---|---|---|---|---|---|---|
| 1970eccb-ad38-4b4b-b7a8-54509027b57d | annotation | POLYGON ((5380.2875 827.05, 5379.0125 827.475,... | Negative | [112, 112, 225] | file | Shapes |
| a3b32cce-1bb9-4a6f-b1d1-9e0c44420cfa | annotation | POLYGON ((6576.875 2306.6875, 6575.6 2307.1125... | Positive | [250, 62, 62] | file | Shapes |
| 92bfe928-a21f-4864-b7cb-f0d300113d88 | annotation | MULTIPOLYGON (((4575.975 4152.4625, 4575.975 4... | Other | [255, 200, 0] | file | Shapes |
| a6c17a54-6839-40b2-8531-c9227635f344 | annotation | POLYGON ((1381.4625 3639.275, 1380.1875 3639.7... | Other | [255, 200, 0] | file | Shapes |
| e78efe2f-d185-4ab6-9cc9-6621897f3662 | annotation | POLYGON ((6272.92138 3936.1375, 6263.65 3945.2... | Negative | [112, 112, 225] | file | Shapes |
Save imported annotations in InSituPy project#
xd.save()
Saving to existing path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Updating project in C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Updating cells...
Updating annotations...
Updating regions...
Saved.



