Geometry encoding parameter for shapes #951

keller-mark · 2025-07-03T20:35:53Z

Fixes #799
The `to_parquet` function supports a `geometry_encoding` parameter. When it is set to `geoarrow`, reading and parsing the geometries is more efficient, because the data can stay in its Parquet/Arrow memory layout during downstream usage. Visualization applications will benefit from this, and other applications such as data processing pipelines should too.
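
To illustrate why the layout matters, here is a simplified, standard-library-only sketch comparing a WKB column with a columnar, geoarrow-like coordinate layout for points. It does not call geopandas or pyarrow, and the variable and function names are made up for this example. Real geoarrow arrays are Arrow buffers with richer nesting for polygons and multipolygons.

```python
import struct
from array import array

points = [(1.0, 2.0), (3.5, -4.25), (10.0, 0.5)]

# WKB: one self-describing byte string per geometry.
# A little-endian 2D point is: byte-order flag (1), geometry type (1 = Point), x, y.
wkb_column = [struct.pack("<BIdd", 1, 1, x, y) for x, y in points]


def parse_wkb_point(buf: bytes) -> tuple[float, float]:
    """Decode a single little-endian WKB point (illustrative, points only)."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", buf)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian WKB points")
    return x, y


# Every geometry has to be parsed individually before its coordinates are usable.
decoded = [parse_wkb_point(buf) for buf in wkb_column]

# Columnar (geoarrow-like) layout: coordinates live in contiguous typed buffers.
xs = array("d", (x for x, _ in points))
ys = array("d", (y for _, y in points))

# A consumer can view the buffers directly, with no per-geometry parsing step.
x_view = memoryview(xs)
y_view = memoryview(ys)
```

In the WKB case each value carries its own header and must be unpacked one at a time. In the columnar case the coordinates are already laid out as plain arrays of doubles, which is the property that lets downstream tools keep the data in its Arrow memory layout.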


codecov · 2025-07-18T18:49:23Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.20%. Comparing base (0731edd) to head (0637dac).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #951      +/-   ##
==========================================
+ Coverage   92.19%   92.20%   +0.01%     
==========================================
  Files          49       49              
  Lines        7561     7572      +11     
==========================================
+ Hits         6971     6982      +11     
  Misses        590      590

| Files with missing lines | Coverage Δ | |
| --- | --- | --- |
| src/spatialdata/__init__.py | `100.00% <100.00%> (ø)` | |
| src/spatialdata/_core/spatialdata.py | `91.93% <100.00%> (ø)` | |
| src/spatialdata/_io/io_shapes.py | `94.87% <100.00%> (+0.20%)` | ⬆️ |
| src/spatialdata/config.py | `100.00% <100.00%> (ø)` | |
| src/spatialdata/models/models.py | `88.61% <100.00%> (ø)` | |


LucaMarconato · 2026-01-04T19:45:29Z

Hi @keller-mark, this is ready for review, correct?

keller-mark · 2026-01-05T13:49:06Z

Yes, thanks for the reminder, I have just updated the status

LucaMarconato · 2026-01-05T15:21:49Z

Thanks! I am making some light adjustments to the PR and will push soon. Meanwhile, here is a Python benchmark of read and write operations with the new encoding: https://github.com/giovp/spatialdata-sandbox/blob/main/notebooks/benchmark_geoparquet_encoding.ipynb

Take-home message: write operations are only slightly slower with geoarrow, but read operations are generally faster. The benchmark uses pure geopandas and spatialdata. spatialdata has some overhead that disappears when the data is large.

LucaMarconato · 2026-01-05T16:10:19Z

Final changes are up. Key points:

- I added some global settings to reduce the number of arguments that need to be passed to functions.
- I will keep the default as WKB. We can experiment and change it later if needed.
- I kept the shapes format unchanged, since the APIs to read the data back share the same syntax.
- There is an edge case with geoarrow: mixed types force a coercion to the more general type. In practice, this means that a mixed polygon + multipolygon column is written to disk as a column of multipolygons (see tests). Since downstream applications should already expect multipolygons in the shapes layer, including multipolygons containing a single polygon, this does not introduce a breaking change.
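
The settings fallback and the mixed-type coercion described above can be sketched roughly as follows. This is a standard-library-only illustration, not the code from this PR: `Settings`, `shapes_geometry_encoding`, `resolve_encoding`, `promote_mixed`, and `prepare_column` are hypothetical names, and geometries are represented as GeoJSON-like dicts rather than shapely objects.

```python
from dataclasses import dataclass
from typing import Literal, Optional

Encoding = Literal["WKB", "geoarrow"]


@dataclass
class Settings:
    """Hypothetical global settings object (not the actual spatialdata API)."""

    shapes_geometry_encoding: Encoding = "WKB"  # the default stays WKB


settings = Settings()


def resolve_encoding(geometry_encoding: Optional[Encoding] = None) -> Encoding:
    """An explicit argument wins; otherwise fall back to the global setting."""
    if geometry_encoding is not None:
        return geometry_encoding
    return settings.shapes_geometry_encoding


def promote_mixed(geoms: list[dict]) -> list[dict]:
    """Coerce a mixed Polygon/MultiPolygon column to MultiPolygon.

    A Polygon becomes a MultiPolygon containing that single polygon.
    Columns with a single geometry type are returned unchanged.
    """
    if {g["type"] for g in geoms} != {"Polygon", "MultiPolygon"}:
        return geoms
    return [
        {"type": "MultiPolygon", "coordinates": [g["coordinates"]]}
        if g["type"] == "Polygon"
        else g
        for g in geoms
    ]


def prepare_column(geoms: list[dict], geometry_encoding: Optional[Encoding] = None):
    """Return the resolved encoding and the column as it would be written."""
    encoding = resolve_encoding(geometry_encoding)
    if encoding == "geoarrow":
        # A geoarrow column has one geometry type, so mixed input is promoted.
        geoms = promote_mixed(geoms)
    return encoding, geoms


ring = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
square = {"type": "Polygon", "coordinates": [ring]}
pair = {"type": "MultiPolygon", "coordinates": [[ring], [ring]]}

wkb_encoding, wkb_geoms = prepare_column([square, pair])
arrow_encoding, arrow_geoms = prepare_column([square, pair], "geoarrow")
```

With the default setting, the column keeps its mixed types, since each WKB value carries its own geometry type. With `geoarrow`, the polygon is wrapped as a single-polygon multipolygon, which matches what downstream readers of a shapes layer are expected to handle.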

keller-mark and others added 5 commits July 3, 2025 16:31

Geometry encoding parameter for shapes

22feee5

[pre-commit.ci] auto fixes from pre-commit.com hooks

74690e4

for more information, see https://pre-commit.ci

Update spatialdata.py

fc7d5ce

Update io_shapes.py

76dd54e

Update io_shapes.py

8680540

Merge branch 'main' into keller-mark/geometry-encoding

5c0446e

keller-mark mentioned this pull request Sep 17, 2025

Support ambiguous (unclear if WKB- or Geoarrow-encoded) Geopandas outputs vitessce/vitessce#2265

Closed

Merge branch 'main' into keller-mark/geometry-encoding

4f703cb

keller-mark marked this pull request as ready for review January 5, 2026 13:48

add setting for geometry_encoding; add tests

0637dac

LucaMarconato merged commit 2794fb0 into scverse:main Jan 5, 2026
9 checks passed

LucaMarconato added the release-added label Jan 6, 2026
