Cropping the data#
# The following code ensures that all functions and init files are reloaded before execution.
%load_ext autoreload
%autoreload 2
from pathlib import Path
from insitupy import InSituData, CACHE
Load Xenium data into InSituData object#
The Xenium data can now be read by passing the path of the InSituPy project folder to InSituData.read().
insitupy_project = Path(CACHE / "out/demo_insitupy_project")
xd = InSituData.read(insitupy_project)
xd
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Metadata file: .ispy
No modalities loaded.
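Before reading a project, it can be useful to check that the folder really looks like an InSituPy project. The following is a minimal sketch, assuming that the metadata file named in the output above (`.ispy`) sits directly inside the project folder. The helper `looks_like_insitupy_project` is hypothetical and not part of InSituPy.

```python
from pathlib import Path
import tempfile


def looks_like_insitupy_project(folder, metadata_name=".ispy"):
    """Return True if `folder` is a directory that contains the metadata file.

    Assumption: the metadata file is stored at the top level of the project folder.
    """
    folder = Path(folder)
    return folder.is_dir() and (folder / metadata_name).is_file()


# Demonstration on a throwaway directory instead of a real project
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "demo_project"
    project.mkdir()
    before = looks_like_insitupy_project(project)  # no metadata file yet
    (project / ".ispy").write_text("{}")           # create an empty placeholder file
    after = looks_like_insitupy_project(project)   # metadata file is now present

print(before, after)
```

With a real project, the same check could be run on `insitupy_project` before calling `InSituData.read`.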
# load all data modalities except the transcripts
xd.load_all(skip="transcripts")
xd
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project
Metadata file: .ispy
➤ images
nuclei: (25778, 35416)
CD20: (25778, 35416)
HER2: (25778, 35416)
HE: (25778, 35416, 3)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 157600 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ annotations
TestKey: 5 annotations, 2 classes ('TestClass', 'test') ✔
demo: 4 annotations, 2 classes ('Negative', 'Positive') ✔
demo2: 5 annotations, 3 classes ('Negative', 'Other', 'Positive') ✔
demo3: 7 annotations, 5 classes ('Immune cells', 'Necrosis', 'Stroma', 'Tumor', 'unclassified') ✔
➤ regions
demo_regions: 3 regions, 3 classes ('Region1', 'Region2', 'Region3') ✔
TMA: 6 regions, 6 classes ('A-1', 'A-2', 'A-3', 'B-1', 'B-2', 'B-3') ✔
# Visualize the data
xd.show()
Cropping of data#
Two methods are implemented for cropping the data: cropping by coordinate limits (xlim/ylim) and cropping by a named region.
Option 1: Crop using limit values#
# crop using the xlim/ylim arguments
xd_cropped = xd.crop(xlim=(2000,3000), ylim=(2000,3000))
xd_cropped
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: None
Metadata file: .ispy
➤ images
nuclei: (4706, 4706)
CD20: (4706, 4706)
HER2: (4706, 4706)
HE: (4706, 4706, 3)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 4550 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ annotations
TestKey: 1 annotations, 1 class ('TestClass')
demo: 2 annotations, 1 class ('Positive')
demo3: 1 annotations, 1 class ('Stroma')
➤ regions
demo_regions: 1 regions, 1 class ('Region3')
xd_cropped.show()
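The cropped images have 4706 pixels per side although the limits span only 1000 units. This is consistent with the limits being given in micrometers and a pixel size of 0.2125 µm per pixel, the value commonly reported for Xenium images. Both points are assumptions here, since the notebook does not state them, and the exact rounding used by InSituPy may differ. A minimal sketch of the arithmetic:

```python
PIXEL_SIZE_UM = 0.2125  # assumed Xenium pixel size in µm per pixel


def expected_crop_shape(xlim, ylim, pixel_size=PIXEL_SIZE_UM):
    """Estimate the (rows, columns) of an image cropped to limits given in µm."""
    width_px = round((xlim[1] - xlim[0]) / pixel_size)
    height_px = round((ylim[1] - ylim[0]) / pixel_size)
    return (height_px, width_px)


shape = expected_crop_shape(xlim=(2000, 3000), ylim=(2000, 3000))
print(shape)  # (4706, 4706)
```

The result matches the image shapes shown for `xd_cropped` above.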
Option 2: Crop from regions#
We can also crop the dataset to a region. The region is specified as a tuple of the form (region_key, region_name), which is passed via the region_tuple argument.
xd_cropped = xd.crop(region_tuple=("demo_regions", "Region1"))
Region1
xd_cropped
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: None
Metadata file: .ispy
➤ images
nuclei: (2701, 3309)
CD20: (2701, 3309)
HER2: (2701, 3309)
HE: (2701, 3309, 3)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 2289 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ annotations
➤ regions
demo_regions: 1 regions, 1 class ('Region1')
xd_cropped.show()
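Conceptually, the rectangular images of a region crop correspond to the bounding box of the region's outline. The sketch below derives such limits from a polygon. The polygon coordinates are hypothetical, and whether InSituPy additionally restricts the cells to the exact polygon outline is not stated in this notebook.

```python
def bounding_limits(polygon):
    """Return ((xmin, xmax), (ymin, ymax)) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), max(xs)), (min(ys), max(ys))


# Hypothetical region outline, not taken from the demo dataset
polygon = [(1200.0, 800.0), (1900.0, 950.0), (1750.0, 1370.0), (1250.0, 1300.0)]

xlim, ylim = bounding_limits(polygon)
print(xlim, ylim)
```

Limits obtained this way have the same form as the `xlim` and `ylim` arguments used in Option 1.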
Saving the cropped data#
Saving to the existing project path is not possible#
Because the data was cropped, it can no longer be saved to the existing project path (the Path of the cropped object is None), and the .save() method raises an error:
xd_cropped.save()
Reloading does not work either, because the cropped data has not yet been saved as an InSituPy project.
xd_cropped.reload()
No modalities with existing save path found. Consider saving the data with `saveas()` first.
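In a script, this behavior can be handled by trying `save()` first and falling back to `saveas()`. The following is a minimal sketch. The helper `save_or_saveas` and the stand-in class are hypothetical, and a broad `Exception` is caught only because the exact error type raised by `save()` is not shown in this notebook.

```python
from pathlib import Path


def save_or_saveas(data, fallback_dir, overwrite=False):
    """Try data.save(); if it fails, write to fallback_dir via data.saveas()."""
    try:
        data.save()
        return "save"
    except Exception:  # assumption: the concrete exception type is unknown here
        data.saveas(Path(fallback_dir), overwrite=overwrite)
        return "saveas"


class UnsavedStandIn:
    """Hypothetical stand-in that mimics an object without a project path."""

    def __init__(self):
        self.saved_to = None

    def save(self):
        raise RuntimeError("no project path available")

    def saveas(self, path, overwrite=False):
        self.saved_to = Path(path)


demo = UnsavedStandIn()
used = save_or_saveas(demo, "demo_insitupy_project_cropped", overwrite=True)
print(used, demo.saved_to)
```

With real data, `xd_cropped` would take the place of the stand-in object.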
Saving to new project directory#
cropped_insitupy_project = insitupy_project.parent / f"{insitupy_project.name}_cropped"
xd_cropped.saveas(cropped_insitupy_project, overwrite=True)
Saving data to C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project_cropped
Saved.
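The new project path above is built by appending a suffix to the folder name while keeping the parent directory. The sketch below shows what this expression yields, together with the equivalent `with_name` form from `pathlib`. The example path is hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical project location, used only for illustration
project = PurePosixPath("/data/out/demo_insitupy_project")

# Same construction as in the notebook: sibling folder with a "_cropped" suffix
cropped_a = project.parent / f"{project.name}_cropped"

# Equivalent form using with_name
cropped_b = project.with_name(project.name + "_cropped")

print(cropped_a)
print(cropped_a == cropped_b)
```

Both forms place the cropped project next to the original one, so the original project folder is left untouched.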
Reload from InSituPy project folder#
Reloading from the project folder makes visualization more efficient. Note that only the modalities that were loaded before cropping (here, everything except the transcripts) can be reloaded in this step.
# reload from insitupy project
xd_cropped = InSituData.read(cropped_insitupy_project)
xd_cropped.load_all()
xd_cropped
InSituData
Method: Xenium
Slide ID: 0001879
Sample ID: Replicate 1
Path: C:\Users\ge37voy\.cache\InSituPy\out\demo_insitupy_project_cropped
Metadata file: .ispy
➤ images
nuclei: (2701, 3309)
CD20: (2701, 3309)
HER2: (2701, 3309)
HE: (2701, 3309, 3)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 2289 × 297
obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'n_genes_by_counts', 'n_genes', 'leiden'
var: 'gene_ids', 'feature_types', 'genome', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'n_cells'
uns: 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
obsm: 'X_pca', 'X_umap', 'annotations', 'regions', 'spatial'
varm: 'PCs'
layers: 'counts', 'norm_counts'
obsp: 'connectivities', 'distances'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ annotations
➤ regions
demo_regions: 1 regions, 1 class ('Region1')
xd_cropped.show()