Opened 3 months ago
Last modified 6 days ago
#32170 new enhancement
Containers for families of geometric objects
Reported by:  mkoeppe  Owned by:  

Priority:  major  Milestone:  sage9.5 
Component:  geometry  Keywords:  
Cc:  ghDRKWang  Merged in:  
Authors:  Reviewers:  
Report Upstream:  N/A  Work issues:  
Branch:  public/32170 (Commits, GitHub, GitLab)  Commit:  d160842d5b2c1f129ef843377e443636393b8027 
Dependencies:  Stopgaps: 
Description (last modified by )
We define APIs for static and dynamic containers storing finite families of geometric objects, extending Container
, Set
, MutableSet
, Mapping
, MutableMapping
https://docs.python.org/3/library/collections.abc.html
The family must define a collection of "measurement maps" B_i
, each sending an object to an InternalRealInterval
.
The product of these intervals can be thought of as a "generalized bounding box".
Ideally the maps would be inclusionpreserving.
The implementation is provided by rtree
/libspatialindex
(#32000).
The Sagespecific class will take care of:
 replacing an
InternalRealInterval
by a rescaling or inclusionpreserving overestimation, making the coordinates suitable for the underlying library.  providing accelerated set operations such as:
 a fast path for computing empty
intersection
when the result will be empty,  a fast path for
__contains__
when the result will beFalse
.
 a fast path for computing empty
Geometric lookup operations to be supported:
 ...
Applications:
Change History (49)
comment:1 Changed 3 months ago by
comment:2 followup: ↓ 4 Changed 3 months ago by
OK, the next step would be to find out whether the library does any arithmetic with these double
values (in this case we would have to check whether they apply directed rounding correctly), or whether they only use comparisons on these double
values.
comment:3 Changed 3 months ago by
 Description modified (diff)
comment:4 in reply to: ↑ 2 Changed 3 months ago by
The library libspatialindex
uses arithmetic operations, including +
, 
and *
, with those double
values in order to support some functions, such as
LineSegment::Collinear
( https://libspatialindex.org/en/latest/doxygen/LineSegment_8cc_source.html#l00439),

LineSegment::getArea
(https://github.com/libspatialindex/libspatialindex/blob/master/src/spatialindex/Region.cc)
and etc.
Besides, the library libspatialindex
also uses std::numeric_limits<double>::epsilon()
to deal with machine precision issue, such as Region::touchesPoint(const Point& p)
(https://github.com/libspatialindex/libspatialindex/blob/master/src/spatialindex/Region.cc)
std::numeric_limits<double>::max();
(https://github.com/libspatialindex/libspatialindex/blob/master/src/spatialindex/Point.cc).
comment:5 Changed 3 months ago by
 Milestone changed from sage9.4 to sage9.5
comment:6 Changed 2 months ago by
A few test codes have been made for testing arithmetic precision. The conclusions are as follows, more details and codes can be found from this(https://github.com/DRKWang/Datastructureforpiecewiselinearfunction/blob/main/log35.ipynb).
 The dimension of the index database should always be greater than 1. (0 or 1 dimension will fail).
 The arithmetic precision is 16 decimal digits, which comes from the double type's precision.
 The minimal capturing of arithmetic precision is 324 decimal digits, which means the minimal difference to distinguish two numbers is 1*10^{{324}. otherwise, they will be viewed as the same number. }
 The intersection for retrieving is a closed intersection. (0.000) is also considered as the same number as 0, but both representations will be displayed at the same time.
 Rounding. The round processing is a little complex and untraceable, it is neither rounding up or rounding down.
comment:7 Changed 2 months ago by
I have made some debugs for this library. I would like to list the key points below. The entire log file can be found here https://github.com/DRKWang/Datastructureforpiecewiselinearfunction/blob/main/logsfile/log36.ipynb.
 The python wrapper part for rtree was only used for passing the values and data to 2 basic functions,
core.rt.Index_InsertData()
and
core.rt.Index_Intersects_obj()
. Those two functions are stored here https://github.com/Toblerity/rtree/blob/master/rtree/core.py. Thus, there is no arithmetic error in this part.
 The corresponding part in libspatialindex for
core.rt.Index_InsertData()
and
core.rt.Index_Intersects_obj()
are placed in line 499 and line 678 in https://github.com/libspatialindex/libspatialindex/blob/master/src/capi/sidx_api.cc .
 In the
Index_InsertData()
in C++, the author uses the concepts "points" and "boxes" to refine the input information, and then call the function
index().insertData(**)
to compute. As we discussed last time, the author mistakenly uses the
std::numeric_limits<double>::epsilon()
as the range of approximation, that is, if the bound difference is within this range, then they will be seen as a single point instead of a box. According to this note, https://stackoverflow.com/questions/48133572/whatcanstdnumericlimitsdoubleepsilonbeusedfor, we may correct it by multiplying the magnitudes of the comparison. However, because it is for scientific computation, we should get rid of approximation as much as possible.
 Except for the above place, the author also mistakenly has used
std::numeric_limits<double>::epsilon()
in many other places. I think we could inform the author to check them deeply.
comment:8 Changed 2 months ago by
 The extensions,
.idx
and.dat
, are two files used to serialize the rtree to disk, according to https://rtree.readthedocs.io/en/latest/tutorial.html.
 Those two files are created by
#include <fstream>
internally, based on this source code, https://github.com/libspatialindex/libspatialindex/blob/master/src/storagemanager/DiskStorageManager.cc.
 The
.idx
is used to store vital information like the page size, the next available page, a list of empty pages, and the sequence of pages associated with every entity ID.There is no note for the detail or purpose of.dat
file. Thus, I made a test where putting 50 boxes into this library. The shows that the.dat
is way larger than.idx
, which indicates the.dat
is used to store the real data, whereas.idx
is used for storage the rest of it.
 The correct way to close and reopen those serialized files can be found here, https://gis.stackexchange.com/questions/254781/savingpythonrtreespatialindextofile
 Except for
DiskStorageManager
, it also has aMemoryStorageManager
. The detail of it is that "Everything is stored in main memory using a simple vector. No properties are needed to initialize aMemoryStorageManager
object. When aMemoryStorageManager
instance goes out of scope, all data that it contains are lost."
 The
.dat_extension = xxx
and.idx_extension = xxx
can be used to change the names of those extensions.
comment:9 followup: ↓ 10 Changed 7 weeks ago by
I am trying to build a subclass called rtree_based_realset()
, which is inherited from the class RealSet()
. But got stuck on the purpose of this subclass. I would like to list my chain of thoughts below and hope to get a verification.
The intention of building this subclass is to keep the interface of this class, (I feel basically the interface can be considered as functions), the same and use rtree
data structure to be a preliminary filter for overestimating in order to reduce the computation time. But since rtree only serves as an overestimate from one side, in order to be accurate, this subclass, which we want to define, still needs the finite unit intervals
structure inside. In that case, seemingly, the subclass does not have a big difference from the RealSet()
class. And the difference between those two classes only comes from whether we use rtree
as an auxiliary tools or not.
comment:10 in reply to: ↑ 9 Changed 7 weeks ago by
Replying to ghDRKWang:
since rtree only serves as an overestimate from one side, in order to be accurate, this subclass, which we want to define, still needs the
finite unit intervals
structure inside.
That's right, we are not replacing the internal representation of RealSet
instances.
We are only adding an index data structure to it.
comment:11 Changed 6 weeks ago by
Since the rtree.count()
, which will return the number of objects whose boxes have intersections with the target point, is not exactly suitable for our expected function, I try to figure out some other functions to replace it. In my opinion, the best function would be rtree.getfirst()
, if we should have, which ideally can assist with both RealSet.contains()
and RealSet.interesction()
. However, I did not get such a function, instead, I found rtree.nearest()
, which will return the item whose box has minimal distance to the target point. However, after comparing rtree.nearest()
with rtree.count()
, the test results shows the rtree.nearest()
is slower than rtree.count()
.
comment:12 Changed 6 weeks ago by
As the boxes overestimate the true objects, a count
result of 0 is useful.
comment:13 followup: ↓ 14 Changed 6 weeks ago by
For intersection
, can you propose a function for rtree
that would help?
comment:14 in reply to: ↑ 13 Changed 6 weeks ago by
comment:15 followup: ↓ 16 Changed 6 weeks ago by
I am thinking about changing the internal code of function RealSet.union()
. Instead of ordinary function union(A,B)
, which has two equally operands, the function RealSet.union()
has already self
, and this format kind of indicates the self
set is dominating and the other set will be added to self
. In this case, I feel when we are trying to extend this function onto the rtree_bssed_Realset
class, it is better to return a rtree_based_RealSet
subclass instead of a RealSet
class, since the programmer kind of wants to maintain the internal rtree
data structure. And since this function only covers the +
operation, which correspondingly is also easy to implement with rtree
, thus, I feel I can try to achieve that.
comment:16 in reply to: ↑ 15 ; followup: ↓ 17 Changed 6 weeks ago by
Replying to ghDRKWang:
I feel when we are trying to extend this function onto the
rtree_bssed_Realset
class, it is better to return artree_based_RealSet
subclass instead of aRealSet
class,
Yes, I agree. This is a general change that should be made throughout the RealSet
class. For example, in RealSet.complement
, it should probably use self.__class__
instead of RealSet
when it constructs the result.
comment:17 in reply to: ↑ 16 ; followup: ↓ 18 Changed 6 weeks ago by
Yes, I agree. This is a general change that should be made throughout the
RealSet
class. For example, inRealSet.complement
, it should probably useself.__class__
instead ofRealSet
when it constructs the result.
I see. I have made the changes inside the RealSet
class by replacing RealSet
with self.__class__
. And I feel those replacements only need to happen under 2 following conditions,
 It was after the
return
. Therefore, the function will finallyreturn
the corresponding class, and apart fromreturn
, it is necessary to make a replacement.  We can not make those replacements under the function that has a
@staticmethod
decorator. Since function with@staticmethod
does not haveself
parameter, which means we can not useself.__class__
to help.
Specifically, the functions I made the change are union()
, intersection()
, complement()
, __mul__()
.
comment:18 in reply to: ↑ 17 ; followup: ↓ 20 Changed 6 weeks ago by
Replying to ghDRKWang:
 We can not make those replacements under the function that has a
@staticmethod
decorator. Since function with@staticmethod
does not haveself
parameter, which means we can not useself.__class__
to help.
Try changing these methods to @classmethod  https://docs.python.org/3/library/functions.html#classmethod
And push the branch please to the ticket
comment:19 Changed 6 weeks ago by
 Branch set to public/32170
 Commit set to 89595a6a1cb9ac62abdc1b3b5235ee81833bbb4a
New commits:
fa4e4ab  build/pkgs/cgal: New

30b487f  build/pkgs/swig:New

df9069a  build/pkgs/cgal_swig_bindings: New

82bf167  add the upstream_url to checksums

40b8a43  Merge tag '9.3.beta6' into t/31098/cgal_swig_bindings

6be25f7  build/pkgs/cgal_swig_bindings/checksums.ini: Fix upstream_url

5695105  build/pkgs/cgal_swig_bindings: Use unpatched cgalswigbindings, set Python_EXECUTABLE

e380217  merge th current version to public/32000

89595a6  make a whole change for replacing RealSet() objectas a return with 'self.__class__()' or 'cls()' as a return for 'RealSet' class; add 'Rtree_based_RealSet' class for rtree extension.

comment:20 in reply to: ↑ 18 Changed 6 weeks ago by
Try changing these methods to @classmethod  https://docs.python.org/3/library/functions.html#classmethod
Got it. I have made the change inside the RealSet
object for placing the @staticmethod
with @classmethod
and use cls()
as a wrapper for returning.
And push the branch please to the ticket
The above changes, including whole changes for RealSet
, Rtree_based_RealSet
class, have been made inside the sage/src/sage/sets/real_set.py
file. And the code have been submitted to the branch to the ticket.
comment:21 followup: ↓ 26 Changed 6 weeks ago by
Since I also want to take a test to check whether it performs correctly and since it has been changed under the src
folder, for which I thought we can use it explicitly without importing any more. However, sage
IDE shows that name 'Rtree_based_RealSet' is not defined
.
comment:22 Changed 6 weeks ago by
Also, I always have a question on how to pushing the latest changes correctly for the sagemath. Assuming there are no previous branches was uploaded for a ticket, If I want to upload a new initial branch for it, either should I merge to the latest version, for example, currently it is sage9.5, before uploading, or is it ok to keep an old version only if we can run the code without errors?
comment:23 Changed 6 weeks ago by
Yes, your current branch has a merge conflict with the current develop
, 9.5.beta0, so it is best to merge it in, resolving the conflicts
comment:24 Changed 6 weeks ago by
Actually I see the branch is on top of the cgal swig bindings stuff.
Best to do is the following:
 Start a new branch from 9.5.beta0
 Merge the branch of #32000
 cherrypick the commit 89595a6 with your realset changes
comment:25 Changed 6 weeks ago by
A comment on naming the class: For an implementation variant of a class, we would use an underscoreseparated suffix: Instead of Rtree_based_RealSet
, use RealSet_rtree
comment:26 in reply to: ↑ 21 ; followups: ↓ 28 ↓ 39 Changed 6 weeks ago by
Replying to ghDRKWang:
Since I also want to take a test to check whether it performs correctly and since it has been changed under the
src
folder, for which I thought we can use it explicitly without importing any more.
If you built Sage without enableeditable
(https://wiki.sagemath.org/ReleaseTours/sage9.3#Editable_.28.22inplace.22.2C_.22develop.22.29_installs_of_the_Sage_library), you will have to use ./sage b
to install the changes to the Python sources that you made.
comment:27 Changed 6 weeks ago by
 Commit changed from 89595a6a1cb9ac62abdc1b3b5235ee81833bbb4a to dfc5410a11c9e32eabf13648f4bce7dd20552cb4
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
594bb26  make a whole change for RealSet class with using 'self.__class__' and '@classmethod', as well as adding a new subclass RealSet_rtree.

2330ca9  add new package rtree and libspatialindex

5902e3c  build/pkgs/libspatialindex: Make it a normal package

791aafd  update SPKG.rst, correct distros, add url in checksums.ini, revise spkgconfigure

68735cb  temporarily rename the distributions files under package rtree and libspatialindex with adding prefix_ for later renaming those files with all lowercase, this is caused by git did not make casesensitive.

720ff79  Rename the distribution files under package rtree and libspatialindex with lowercases.

dfc5410  Merge branch 'public/32000' of trac.sagemath.org:sage into realset_rtree

comment:28 in reply to: ↑ 26 Changed 5 weeks ago by
Replying to mkoeppe:
If you built Sage without
enableeditable
(https://wiki.sagemath.org/ReleaseTours/sage9.3#Editable_.28.22inplace.22.2C_.22develop.22.29_installs_of_the_Sage_library), you will have to use./sage b
to install the changes to the Python sources that you made.
Since software is not built with enableeditable
, I choose to use ./sage b
. However, still, it is not able to import the RealSet_rtree
class explicitly.
comment:29 Changed 5 weeks ago by
 Commit changed from dfc5410a11c9e32eabf13648f4bce7dd20552cb4 to 674d9827224beb4f481451e70b332f989635d367
Branch pushed to git repo; I updated commit sha1. New commits:
674d982  "Fixed 2 bugs:

comment:30 followup: ↓ 31 Changed 5 weeks ago by
A few test programs have been made for this class RealSet_rtree()
, including case with +infinity
, type returned class test
. Those test codes can be found at the end here. https://github.com/DRKWang/Datastructureforpiecewiselinearfunction/blob/main/logsfile/test%20for%20real_set_with_rtree.ipynb
As a result, I found 2 bugs,
 one is for misplacing the
return self.__class__(*intervals)
;  the other is that since
floor()
andceil()
can not deal with+infinity
properly, we have to make it separately.
comment:31 in reply to: ↑ 30 ; followup: ↓ 32 Changed 5 weeks ago by
Replying to ghDRKWang:
As a result, I found 2 bugs,
 one is for misplacing the
return self.__class__(*intervals)
; the other is that since
floor()
andceil()
can not deal with+infinity
properly, we have to make it separately.
Best practice is to add a test to the doctests that demonstrate that it now works correctly.
comment:32 in reply to: ↑ 31 Changed 5 weeks ago by
Replying to mkoeppe:
Replying to ghDRKWang:
As a result, I found 2 bugs,
 one is for misplacing the
return self.__class__(*intervals)
; the other is that since
floor()
andceil()
can not deal with+infinity
properly, we have to make it separately.Best practice is to add a test to the doctests that demonstrate that it now works correctly.
Following this instruction https://doc.sagemath.org/html/en/developer/doctesting.html#beyondthesagelibrary, I tried to use ./sage t src/sage/sets/real_set.py
for docstring. However, All the testing results are failures. There are 2 types of arisen errors, one failure comes from SyntaxError: multiple statements found while compiling a single statement
, the other comes from NameError: name'RealSet_rtree' is not defined
. For the first, I guess it is caused by format error, something like an invisible return, but I have no idea about the second one. The log file is here. https://github.com/DRKWang/Datastructureforpiecewiselinearfunction/blob/main/logsfile/failure%20info.txt
comment:33 followup: ↓ 35 Changed 5 weeks ago by
The current branch defines class RealSet_rtree
3 times  that's probably the first thing that should be cleaned up
comment:34 followup: ↓ 42 Changed 5 weeks ago by
I tried the first example from your worksheet, after importing the class:
sage: from sage.sets.real_set import RealSet_rtree
sage: A = RealSet_rtree(0,1) ....: B = RealSet_rtree(0.5,3) ....: C = A.intersection(B) ....: type(C)
I get this error:
~/s/sage/sagerebasing/local/lib/python3.9/sitepackages/sage/sets/real_set.py in __init__(self, *intervals) 2707 if (x_max < lower) and (lower != +Infinity): 2708 x_max = lower > 2709 if (x_max < upper) and (upper != +Infinity): 2710 x_max = upper 2711 if x_max == 0: #rtree is not needed, since they are only +infinity and zero! NameError: name 'Infinity' is not defined
This means that you need to add some import statements to the file.
You can type import_statements('Infinity')
to help with this.
The doctests of the new class RealSet_rtree
are copypaste from RealSet
and do not test the new class. Best to replace them by the tests that you have in your worksheet.
comment:35 in reply to: ↑ 33 ; followup: ↓ 36 Changed 5 weeks ago by
Replying to mkoeppe:
The current branch defines
class RealSet_rtree
3 times  that's probably the first thing that should be cleaned up
So desperate! It is really a fatal and longhistory mistake, even existing for 3 commits. I guess it is caused by using Jupiter notebook a lot. I will be cautious next time.
comment:36 in reply to: ↑ 35 Changed 5 weeks ago by
comment:37 Changed 5 weeks ago by
Don't worry about it. It's easy to clean up the commit history later
comment:38 Changed 5 weeks ago by
 Commit changed from 674d9827224beb4f481451e70b332f989635d367 to 8c1f0c1602b5163a61a08474623f084f319b9858
comment:39 in reply to: ↑ 26 Changed 5 weeks ago by
Replying to mkoeppe:
you will have to use
./sage b
to install the changes to the Python sources that you made.
I tried to use ./sage b
again after the meeting this morning, the real_set.py
file under the directory of SAGE_LOCAL/lib/python3.7/sitepackages/sage/sets
did be rebuilt. But, Sage still shows NameError: name 'RealSet_rtree' is not defined
. Did I miss any other commands?
comment:40 Changed 5 weeks ago by
You need to import it before using, see comment:34
comment:41 Changed 5 weeks ago by
 Commit changed from 8c1f0c1602b5163a61a08474623f084f319b9858 to d160842d5b2c1f129ef843377e443636393b8027
Branch pushed to git repo; I updated commit sha1. New commits:
d160842  Doctests for class RealSet_rtree() have been adjusted and passed the test.

comment:42 in reply to: ↑ 34 Changed 5 weeks ago by
Replying to mkoeppe:
The doctests of the new class
RealSet_rtree
are copypaste fromRealSet
and do not test the new class. Best to replace them by the tests that you have in your worksheet.
I have updated the doctests of the new class RealSet_rtree
. And those doctests have been passed successfully. However, according to this documentation, https://doc.sagemath.org/pdf/en/reference/doctest/doctest.pdf, "Sage currently accepts backslashes as indicating that the end of the current line should be joined to the
next line. This feature allows for breaking large integers over multiple lines but is not standard for Python doctesting.", which infers the sage doctest can only deal with single line, except the cases with for
, while
and ending with :
. Therefore, I have to use ;
to make it into one line, which is not beautiful and clear.
comment:43 followup: ↓ 44 Changed 2 weeks ago by
In order to make the projecting directions more separable with respect to cones of a given fan, we could generate certain directions based on the inclination of cones of a given fan instead of randomly.
Suppose the dimension of universal space is $n$. Concretely,
 We will compute the inclination first. We measure this inclination of the fan with "average point" $q$. The average point is defined by $q = \sum p_i$, $\{p_i\}$ are the points along with the generating rays of the fan and whose 1norm is exactly 1.
 Then, choose a set of orthogonal basis $\{v_j\}$ coming from the complementary space of the space generated by $q$, i.e. $\forall i$, $v_j \perp q$. Note that, since the average point $q$ could be zero vector, in that case, we will choose $N(1,0,...0)$, $N(0,1,..., 0)$,..., $N(0,0,...,1)$ as the basis. The chosen set of orthogonal basis $\{v_j\}$ could be good candidates for seperating the cones of fan. One way of generating the orthogonal basis $\{v_j\}$ will be using [Gram–Schmidt process](https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process).
 After generating an orthogonal basis $\{v_j\}$, the basis will be stored in the class
RationalPolyhedralFan
as attributes of the class. and those attributes will be invoked for generating new boxes for function_contains()
.
comment:44 in reply to: ↑ 43 Changed 2 weeks ago by
Replying to ghDRKWang:
In order to make the projecting directions more separable with respect to cones of a given fan, we could generate certain directions based on the inclination of cones of a given fan instead of randomly.
Suppose the dimension of universal space is $n$. Concretely,
 We will compute the inclination first. We measure this inclination of the fan with "average point" $q$. The average point is defined by $q = \sum p_i$, $\{p_i\}$ are the points along with the generating rays of the fan and whose 1norm is exactly 1.
 Then, choose a set of orthogonal basis $\{v_j\}$ coming from the complementary space of the space generated by $q$, i.e. $\forall i$, $v_j \perp q$. Note that, since the average point $q$ could be zero vector, in that case, we will choose $N(1,0,...0)$, $N(0,1,..., 0)$,..., $N(0,0,...,1)$ as the basis. The chosen set of orthogonal basis $\{v_j\}$ could be good candidates for seperating the cones of fan. One way of generating the orthogonal basis $\{v_j\}$ will be using [Gram–Schmidt process](https://en.wikipedia.org/wiki/Gram%E2%80%93Schmidt_process).
 After generating an orthogonal basis $\{v_j\}$, the basis will be stored in the class
RationalPolyhedralFan
as attributes of the class. and those attributes will be invoked for generating new boxes for function_contains()
.
I have no idea that it does not support the Latex display here. For convenience, you may check the Latex display from the section "Generating projecting directions based on the inclination of cones of a given fan" https://github.com/DRKWang/cutgeneratingfunctions_new/blob/develop/log44.ipynb.
comment:45 followup: ↓ 46 Changed 2 weeks ago by
Another way to select a set of projection directions:
 A subset of the set of normal vectors of all facets of all cones.
 Perhaps there is a way to measure how "separating" one of these candidate normal vectors is?
comment:46 in reply to: ↑ 45 ; followup: ↓ 47 Changed 13 days ago by
Replying to mkoeppe:
Another way to select a set of projection directions:
 A subset of the set of normal vectors of all facets of all cones.
 Perhaps there is a way to measure how "separating" one of these candidate normal vectors is?
I agree that "the normal vectors of all facets of all cones could also be good candidates for separating these cones". In some sense, those facets can be basically categorized into 2 types, either a facet including an intersection of two maximal cones, or just a pure facet without any intersection. Comparing those two types with each other, the first one seems to be more important and "separating", since the first one focuses on neighbor separating, whereas the second one does not show any information. Thus, we could generate some projecting directions based on the normal vectors of some facets of cones.
Generally, the facets including an intersection can still be categorized into more subtypes based on what is the dimension of the intersection. And heuristically, the higher dimension of the intersection is, the more efficient this normal vector is. But, due to the complexity problem, I feel it is not necessary to categorize those facets at that level.
Even though the above method seems to be efficient, it could be weak when it comes to the case where all cones of a given do not have any intersection except the original points. In that case, there is no "first type of facets" at all. Thus, if that happens, we may need other ways to generate those directions. A good way is to use the direction coming from 2 center points of the tokens of two neighboring cones as supplied directions.
comment:47 in reply to: ↑ 46 Changed 13 days ago by
comment:48 Changed 6 days ago by
For the rtree.intersection()
function, we can make the return objects of it either a list of boxes with an rtree
data structure or a set of boxes without any structure. we can build a flag, like rtree_rebuilt = true/false
to indicate the intention of it. Concretely, the boxes with an rtree
data structure will be rebuilt by the class of rtree
that we defined.
comment:49 Changed 6 days ago by
For the static construction, we may build the rtree by 3 ways, topdown, bottomup or hybrid. Supposedly, there are $n$ boxes in $k$ dimension for construction.
The idea of topdown should be made by recursive operation, which means the parent tree will be built after their children's trees were built. My expectation will be something like an H tree (https://en.wikipedia.org/wiki/H_tree), which has a property of selfsimilar fractal, or like a $kD tree$. Anyway, the similarity of the subsets of boxes can be displayed on the rtree. Concretely, one way to do this is as follows:
 Separate the whole boxes into 2 subsets, equivalently in terms of numbers, based on their Xcoordinate. And take those 2 subsets as the materials for building the left subrtree and right subrtree.
 Cyclically switch the criteria from xcoordinate to ycoordinate, to zcoordinate with separating the subsets smaller and smaller until the number of subsets equal 1. For this case, the complexity of it will be $nlog(n)+2* n/2 log(n/2) + 4* n/2 log(n/4) + ... + n*1 log(1) <= n log(n) * log(n) $
For the bottomup, we may use the greedy strategy to do the local optimization on each step. We can always merge two boxes based on some criteria, like merging the two boxes that have minimal 1norm distance. And replace the 2 boxes with merged box, so that the size of the input will be decreased by 1, (now the number of boxes will be $n1$). Thus, after $n1$ steps, the rtree will be built completely.
The complexity will be at least n^2
, since we will compute the distance at least.
For the hybrid one, we can build the rtree partially by topdown and partially by bottomup in certain steps. But, how to combining them adaptively is still a hard problem for me. I would like to implement the first two before thinking about the third one.
The precision of interval values in Package
libspatialindex
isdouble
.They did not claim it explicitly, but I found it from
https://rtree.readthedocs.io/en/latest/class.html#rtree.index.Index
https://github.com/libspatialindex/libspatialindex/blob/master/include/spatialindex/capi/DataStream.h
.