import moraine as mr
rtree
Rectangle tree for fast point cloud data query
HilbertRtree
HilbertRtree (bounds_tree:numpy.ndarray, n_points:int, page_size=512)
*This class provides a read-only Hilbert R-tree for spatial indexing.
See Hilbert_R-tree for more info on the Hilbert R-tree.
This implementation stores the R-tree as an array representation of a binary tree. See Binary_tree for more info on the array representation of a binary tree.*
Type | Default | Details | |
---|---|---|---|
bounds_tree | ndarray | an array representation of a binary tree. shape [n_pages, 4]. Every row is a bounding box of a node: [x0, y0, xm, ym] | |
n_points | int | number of total points | |
page_size | int | 512 | number of points in every leaf node. |
HilbertRtree.build
HilbertRtree.build (x:numpy.ndarray, y:numpy.ndarray, page_size=512)
classmethod to build a HilbertRtree.
Type | Default | Details | |
---|---|---|---|
x | ndarray | x coordinates, e.g., lon, shape [n_points,] | |
y | ndarray | y coordinates, e.g., lat, shape [n_points,] | |
page_size | int | 512 | number of points in every leaf node. should be same as pc_chunk_size. |
HilbertRtree.save
HilbertRtree.save (path:str)
Save the HilbertRtree.
Type | Details | |
---|---|---|
path | str | zarr path |
HilbertRtree.load
HilbertRtree.load (zarr_path:str)
classmethod to load the saved HilbertRtree.
Type | Details | |
---|---|---|
zarr_path | str | zarr path |
HilbertRtree.bbox_query
HilbertRtree.bbox_query (bounds:tuple|list, x:numpy.ndarray|zarr.core.Array, y:numpy.ndarray|zarr.core.Array)
get the index of points that are in the bounding box.
Type | Details | |
---|---|---|
bounds | tuple | list | data query bounding box, [x0, y0, xm, ym] |
x | numpy.ndarray | zarr.core.Array | x coordinate |
y | numpy.ndarray | zarr.core.Array | y coordinate |
Example: create the data:
= np.random.randint(0, 10000*10000-1, size=1000000)
gix
gix.sort()= np.stack(np.unravel_index(gix,shape=(10000,10000)),axis=-1).astype(np.int32)
gix = mr.pc_hix(gix,shape=(10000,10000))
hix = mr.pc_sort(hix)
key
= gix[:,1][key]/10000; y = gix[:,0][key]/10000 # any kind of coordinates
x print(x.min(),y.min(),x.max(),y.max())
0.0 0.0 0.9999 0.9999
build r tree:
= HilbertRtree.build(x,y,page_size=4096) rtree
data query:
= 0.9, 0.9, 1.0, 1.0
x0, y0, xm, ym = [x0, y0, xm, ym]
bounds = rtree.bbox_query(bounds,x,y) q_idx
compare with the brute-force check:
= np.where((x>x0) & (x<xm) & (y>y0) & (y<ym))[0]
q_idx_ np.testing.assert_array_equal(q_idx, q_idx_)
speed compare:
= rtree.bbox_query(bounds,x,y) q_idx
426 µs ± 2.46 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
= np.where((x>x0) & (x<xm) & (y>y0) & (y<ym))[0] q_idx_
1.22 ms ± 3.39 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
HilbertRtree
is designed for query small amount of data from a big dataset. The bigger the dataset, the greater the improvement.
for data on the disk:
= zarr.open('rtree/x.zarr','w',shape=x.shape,chunks=(4096,),dtype=x.dtype)
x_zarr = zarr.open('rtree/y.zarr','w',shape=x.shape,chunks=(4096,),dtype=x.dtype)
y_zarr = x; y_zarr[:] = y x_zarr[:]
= zarr.open('rtree/x.zarr','r')
x_zarr = zarr.open('rtree/y.zarr','r') y_zarr
= rtree.bbox_query(bounds,x_zarr,y_zarr) q_idx
4.6 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
= x_zarr[:]; y = y_zarr[:]
x = np.where((x>x0) & (x<xm) & (y>y0) & (y<ym))[0] idx
215 ms ± 1.44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)