Performance of extract_points vs rasterio sample

One of the questions in the recent Earthmover's webinar on xvec was about the performance of `extract_points` compared to rasterio's `sample` method. I have never tested this before so wanted to give it a go and for a large lazy-loaded raster (digital terrain model), our `extract_points` is waaaay slower. See https://notebooksharing.space/view/4459f651d27b2f214f8590c30aba0782f7515d4c16b00dcd3283492f19f8e694#displayOptions=

The DTM is from https://geoportalpraha.cz/en/data-and-services/97d2c9c11aa9478cb21b469b8a4f820e in case you'd like to test the same but any raster should do the trick I assume.

Under the hood,  `extract_points` is simply passing the coordinates to `.sel` with `method='nearest'`, which should be doing exactly the same as rasterio's `sample`.  https://github.com/xarray-contrib/xvec/blob/66b541bd509b4bcaade0bbeed3dd90b852b602a3/xvec/accessor.py#L1261-L1263

This is not optimal.

We could possibly use `sample` via rioxarray if the raster is loaded via xarray as it is available through `dtm_da.rio._manager.acquire().sample(list(zip(x, y)))` but that is relying on a private API of rasterio. I'll open an issue there if there's an appetite to expose sample on the `rio` accessor.

Outside of relying on rasterio, is there a way of speeding it up using some xarray magic?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Performance of extract_points vs rasterio sample #81

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

	subset = self._obj.sel(
	{x_coords: x_, y_coords: y_}, method="nearest", tolerance=tolerance
	)

Performance of extract_points vs rasterio sample #81

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions