A class of queries that are of interest and that are not trivial to treat efficiently with standard
relational database design are those that aim to find objects that are (or have been) spatially
close to each other.
A typical question is:
Find all halos within a distance of
(a parameter in SQL is indicated by a name prefixed with a colon),
This can be trivially translated to SQL as follows:
:R> Mpc from a given position
To execute this query though the database can not make use of standard techniques for performance improvement.
Such techniques generally require a representation of the data in which desired objects are placed nearby on disk.
Disks are basically 1-dimentsional structures, so an ordering in some quantity is generally used.
For spatial structures in 2 dimensions or higher, there is no exact way to accomplish this.
This implies that the database will have to scan through the whole table and evaluate the expression in the WHERE
clause. During this scan the whole table will have to be read from disk and it is this IO part of the query execution which
is really the expensive step and should if possible be minimized.
FROM MPAHaloTrees..MHalo H
WHERE SQRT((H.X-:X)*(H.X-:X)+(H.Y-:Y)*(H.Y-:Y)+(H.Z-:Z)*(H.Z-:Z)) <= :R
To support spatial queries that perform reasonably well we provide a number of options, some of
which are still under development. These are described below.
To an object we associate integer valued positional coordinates, obtained from binning the simulation box by a relatively course grid.
We have chosen grid cells of size about 10 h-1 Mpc, 503 cells in the Millennium, 73 in the millimil.
When the size of the grid cell is
L, the index columns are defined as
ix=floor(x/L) and similar for
For an interesting general discussion of this problem and various approaches solutions see
thesis by Volker Markl and references therein.
A very general reference to spatial indexes is