Cropping the data#

## The following code ensures that all functions and init files are reloaded before executions.
%load_ext autoreload
%autoreload 2
from pathlib import Path
from insitupy import InSituData, CACHE

Load Xenium data into InSituData object#

Now the Xenium data can be parsed by providing the data path to the InSituPy project folder

insitupy_project = Path(CACHE / "out/demo_insitupy_project")
xd = InSituData.read(insitupy_project)
xd
InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project


No modalities loaded.
# read all data modalities but the transcripts
xd.load_all(skip="transcripts")
xd
InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project

    ➤ images
       CD20:	(25778, 35416)
       HE:	(25778, 35416, 3)
       HER2:	(25778, 35416)
       nuclei:	(25778, 35416)
    ➤ cells
       MultiCellData with main layer 'main'
           matrix
               AnnData object with n_obs × n_vars = 157600 × 297
               obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
               var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
               uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
               obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
               varm: 'PCs'
               layers: 'counts', 'norm_counts'
               obsp: 'connectivities', 'distances'
           boundaries
               BoundariesData object with 2 entries:
                   cells
                   nuclei annotations
       TestKey:	3 annotations, 1 class ('TestClass') ✔
       test:	6 annotations, 1 class ('testclass') ✔
       demo:	4 annotations, 2 classes ('Negative', 'Positive') ✔
       demo2:	5 annotations, 3 classes ('Negative', 'Other', 'Positive') ✔
       demo3:	7 annotations, 5 classes ('Immune cells', 'Necrosis', 'Stroma', 'Tumor', 'unclassified') ✔
    ➤ regions
       demo_regions:	3 regions, 3 classes ('Region1', 'Region2', 'Region3') ✔
       TMA:	6 regions, 6 classes ('A-1', 'A-2', 'A-3', 'B-1', 'B-2', 'B-3') ✔
# Visualize the data
xd.show()

Cropping of data#

There are two different methods implemented for cropping the data.

Option 1: Crop using limit values#

# alternatively you can also crop using the xlim/ylim arguments
xd_cropped = xd.crop(xlim=(2000,3000), ylim=(2000,3000))
xd_cropped
InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		None

    ➤ images
       CD20:	(4706, 4706)
       HE:	(4706, 4706, 3)
       HER2:	(4706, 4706)
       nuclei:	(4706, 4706)
    ➤ cells
       MultiCellData with main layer 'main'
           matrix
               AnnData object with n_obs × n_vars = 4550 × 297
               obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
               var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
               uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
               obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
               varm: 'PCs'
               layers: 'counts', 'norm_counts'
               obsp: 'connectivities', 'distances'
           boundaries
               BoundariesData object with 2 entries:
                   cells
                   nuclei annotations
       TestKey:	2 annotations, 1 class ('TestClass') 
       test:	1 annotations, 1 class ('testclass') 
       demo:	2 annotations, 1 class ('Positive') 
       demo3:	1 annotations, 1 class ('Stroma') 
    ➤ regions
       demo_regions:	1 regions, 1 class ('Region3') 
xd_cropped.show()
INFO: New layer '🔬 TestClass (TestKey)' created.

Option 2: Crop from regions#

We can also crop a region from the dataset. To specify the region, a tuple in the shape (region_key, region_name) is used.

xd_cropped = xd.crop(
    region_tuple=("demo_regions", "Region1"))
Region1
xd_cropped
InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		None

    ➤ images
       CD20:	(2701, 3309)
       HE:	(2701, 3309, 3)
       HER2:	(2701, 3309)
       nuclei:	(2701, 3309)
    ➤ cells
       MultiCellData with main layer 'main'
           matrix
               AnnData object with n_obs × n_vars = 2289 × 297
               obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
               var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
               uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
               obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
               varm: 'PCs'
               layers: 'counts', 'norm_counts'
               obsp: 'connectivities', 'distances'
           boundaries
               BoundariesData object with 2 entries:
                   cells
                   nuclei annotations
       TestKey:	2 annotations, 1 class ('TestClass') 
    ➤ regions
       demo_regions:	1 regions, 1 class ('Region1') 
xd_cropped.show()
INFO: New layer '🔬 TestClass (TestKey)' created.
INFO: New layer '🌍 Region1 (demo_regions)' created.

Saving the cropped data#

Saving to the existing project path is not possible#

Due to the cropping event, saving to the existing project path is not possible and the .save() function throws an error:

xd_cropped.save()

Reload also does not work because it was not saved as an InSituPy project.

xd_cropped.reload()
No modalities with existing save path found. Consider saving the data with `saveas()` first.

Saving to new project directory#

cropped_insitupy_project = insitupy_project.parent / f"{insitupy_project.name}_cropped"
xd_cropped.saveas(cropped_insitupy_project, overwrite=True)
Saving data to C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project_cropped
Saved.

Reload from InSituPy project folder#

Reloading from project folder makes visualizations more efficient. But of course only the modalities that had been loaded before the cropping event can be reloaded in this step.

# reload from insitupy project
xd_cropped = InSituData.read(cropped_insitupy_project)
xd_cropped.load_all()
xd_cropped
InSituData
Method:		Xenium
Slide ID:	0001879
Sample ID:	Replicate 1
Path:		C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project_cropped

    ➤ images
       CD20:	(2701, 3309)
       HE:	(2701, 3309, 3)
       HER2:	(2701, 3309)
       nuclei:	(2701, 3309)
    ➤ cells
       MultiCellData with main layer 'main'
           matrix
               AnnData object with n_obs × n_vars = 2289 × 297
               obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
               var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
               uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
               obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
               varm: 'PCs'
               layers: 'counts', 'norm_counts'
               obsp: 'connectivities', 'distances'
           boundaries
               BoundariesData object with 2 entries:
                   cells
                   nuclei annotations
       TestKey:	2 annotations, 1 class ('TestClass') 
    ➤ regions
       demo_regions:	1 regions, 1 class ('Region1') 
xd_cropped.show()