Reading data from QuPath projects
Overview
The functions read_qupath_project and read_qupath load and process spatial data exported from QuPath. read_qupath loads a single dataset into an InSituData object, while read_qupath_project loads all datasets of a project into an InSituExperiment. Both integrate annotations, cellular measurements, cell boundaries, and image data for downstream spatial analysis.
To export data from QuPath in the correct format, use the following script:
👉 QuPath Export Script
File Descriptions
The following files are generated by the script linked above:
annotation.geojson: Contains a single geometry used to shift coordinates to the origin.
measurements.tsv: Tabular data with measurements for ‘Cell’, ‘Nucleus’, ‘Cytoplasm’, and ‘Membrane’. These are stored in data.cells.matrix.
cells.geojson: Contains boundaries for individual cells.
image.ome.tif: The image file corresponding to the spatial data.
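Before reading a dataset, it can help to confirm that the export produced the files listed above. The following is a minimal sketch using only the standard library. The helper name missing_export_files is hypothetical and not part of insitupy, and whether the reader strictly requires all four files is an assumption.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = (
    "annotation.geojson",
    "measurements.tsv",
    "cells.geojson",
    "image.ome.tif",
)


def missing_export_files(dataset_dir):
    """Return the expected export files that are absent from dataset_dir.

    Hypothetical helper for illustration; not part of insitupy.
    """
    dataset_dir = Path(dataset_dir)
    return [name for name in EXPECTED_FILES if not (dataset_dir / name).is_file()]


# Example: check one dataset folder (path is an assumption about your layout).
missing = missing_export_files("../../downloads/QuPath-Demo/Demo-Data/circle")
print("Missing files:", missing if missing else "none")
```

An empty result means all four files are present in the folder.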
📦 Downloading Example Data
Example data for testing is publicly available and can be downloaded programmatically as follows:
# The following code ensures that all functions and init files are reloaded before execution.
%load_ext autoreload
%autoreload 2
import requests, zipfile, io
import os
from pathlib import Path
# Download link for data
url = "https://syncandshare.lrz.de/dl/fiCzXM29fitaAwgknh6LHr/QuPath-Demo.dir"
# Define your custom extraction path
download_path = Path("../../downloads/QuPath-Demo")
# Ensure the directory exists
os.makedirs(download_path, exist_ok=True)
# Download and extract the zip file
response = requests.get(url)
response.raise_for_status()  # stop early if the download failed
with zipfile.ZipFile(io.BytesIO(response.content)) as z:
    z.extractall(download_path)
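A share link can occasionally return an HTML error page instead of the archive, in which case zipfile fails with a fairly opaque error. The sketch below shows one way to validate the downloaded bytes before extraction. The function name extract_zip_bytes is hypothetical and only illustrates the idea.

```python
import io
import zipfile
from pathlib import Path


def extract_zip_bytes(content, target_dir):
    """Extract an in-memory ZIP archive into target_dir and return its member names.

    Hypothetical helper for illustration; raises ValueError if the bytes
    are not a ZIP archive.
    """
    buffer = io.BytesIO(content)
    # Check the payload first: an error page would not pass this test.
    if not zipfile.is_zipfile(buffer):
        raise ValueError("Downloaded content is not a ZIP archive")
    buffer.seek(0)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

With the variables defined in the download code, it could be called as extract_zip_bytes(response.content, download_path).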
from insitupy import read_qupath, read_qupath_project
With read_qupath_project you can read all datasets exported with the QuPath script by either pointing to the QuPath project folder or to the folder with the exported datasets. Pointing to the QuPath project folder has the advantage that the pixel sizes can be inferred automatically.
pixel_size = 0.3774
exp = read_qupath_project(
    path=download_path,
    pixel_size=pixel_size,
    # dataset_name="Slide001",
    # sample_name="SampleA"
)
Data folders found:
- 'Demo-Data': 3 dataset(s)
Reading 'Demo-Data'...
exp
InSituExperiment with 3 samples:
   uid       CITAR  slide_id   sample_id
0  77baa467  ++--+  Demo-Data  circle
1  8b3110ca  ++--+  Demo-Data  polygon
2  071bce84  ++--+  Demo-Data  rectangle
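The grouping reported above (one data folder containing three datasets) can be approximated outside the library when inspecting an export by hand. This is a rough sketch and not how insitupy discovers datasets internally. It assumes that every dataset folder contains a measurements.tsv file and that its parent folder is the data folder. The function name find_datasets is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def find_datasets(root, marker="measurements.tsv"):
    """Group dataset folders (those containing `marker`) by parent folder name.

    Hypothetical helper for illustration; not part of insitupy.
    """
    groups = defaultdict(list)
    for marker_file in sorted(Path(root).rglob(marker)):
        dataset_dir = marker_file.parent
        groups[dataset_dir.parent.name].append(dataset_dir.name)
    return dict(groups)


# Example: summarize the extracted demo data, if it is present.
root = Path("../../downloads/QuPath-Demo")
if root.is_dir():
    for folder, datasets in find_datasets(root).items():
        print(f"- '{folder}': {len(datasets)} dataset(s): {datasets}")
```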
To read a single dataset, use read_qupath:
circle = read_qupath(
    path=download_path / "Demo-Data/circle/",
    pixel_size=pixel_size,
    dataset_name="Demo-Data",
    sample_name="circle"
)
polygon = read_qupath(
    path=download_path / "Demo-Data/polygon/",
    pixel_size=pixel_size,
    dataset_name="Demo-Data",
    sample_name="polygon"
)
circle
InSituData
Method: mIF
Slide ID: Demo-Data
Sample ID: circle
Path: None
➤ images
mIF: (19, 502, 503)
➤ cells
MultiCellData with main layer 'main'
matrix
AnnData object with n_obs × n_vars = 494 × 19
obs: 'Object ID', 'Centroid X µm', 'Centroid Y µm', 'Nucleus: Area µm^2', 'Nucleus: Length µm', 'Nucleus: Circularity', 'Nucleus: Solidity', 'Nucleus: Max diameter µm', 'Nucleus: Min diameter µm', 'Cell: Area µm^2', 'Cell: Length µm', 'Cell: Circularity', 'Cell: Solidity', 'Cell: Max diameter µm', 'Cell: Min diameter µm'
obsm: 'spatial'
layers: 'nucleus', 'cytoplasm', 'membrane'
boundaries
BoundariesData object with 2 entries:
cells
nuclei
➤ regions
data: 1 regions, 1 class ('circle')
circle.show()
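As a quick sanity check on pixel_size, the image shape reported above, (19, 502, 503), can be converted into a physical extent. The sketch below assumes that the shape is ordered as (channels, height, width) and that pixel_size is given in µm per pixel. Both are assumptions that are not confirmed here, and the helper name physical_extent_um is hypothetical.

```python
def physical_extent_um(image_shape, pixel_size):
    """Return (height_um, width_um) for an image.

    Assumes image_shape is (channels, height, width) and pixel_size is
    in µm per pixel. Hypothetical helper for illustration.
    """
    _, height_px, width_px = image_shape
    return height_px * pixel_size, width_px * pixel_size


height_um, width_um = physical_extent_um((19, 502, 503), 0.3774)
print(f"{height_um:.1f} µm x {width_um:.1f} µm")
```

If the resulting extent is far from the size of the annotated region in QuPath, the pixel size passed to the reader is worth double-checking.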