-
Notifications
You must be signed in to change notification settings - Fork 14
Description
One of the questions in the recent Earthmover's webinar on xvec was about the performance of extract_points compared to rasterio's sample method. I have never tested this before so wanted to give it a go and for a large lazy-loaded raster (digital terrain model), our extract_points is waaaay slower. See https://notebooksharing.space/view/4459f651d27b2f214f8590c30aba0782f7515d4c16b00dcd3283492f19f8e694#displayOptions=
The DTM is from https://geoportalpraha.cz/en/data-and-services/97d2c9c11aa9478cb21b469b8a4f820e in case you'd like to test the same but any raster should do the trick I assume.
Under the hood, extract_points is simply passing the coordinates to .sel with method='nearest', which should be doing exactly the same as rasterio's sample.
Lines 1261 to 1263 in 66b541b
| subset = self._obj.sel( | |
| {x_coords: x_, y_coords: y_}, method="nearest", tolerance=tolerance | |
| ) |
This is not optimal.
We could possibly use sample via rioxarray if the raster is loaded via xarray as it is available through dtm_da.rio._manager.acquire().sample(list(zip(x, y))) but that is relying on a private API of rasterio. I'll open an issue there if there's an appetite to expose sample on the rio accessor.
Outside of relying on rasterio, is there a way of speeding it up using some xarray magic?