Here is a brief description of the layout of the Objects in my Project and how the binary search partition is built.
Overview of the Process
- At the bottom we have points that are either read in or generated randomly
- The points are collected into cells and the cells make up a grid. Each cell has a density(that can be calculated), and number of points associated with it.
- The grid is put into one bucket whose lowerBound and upperBound are not points, but are designated by cells in the grid. Therefore the lowerBound will always be (0,0,0,0) and the upperBound will be (X,X,X,X) for the first bucket where X is the last cell in each dimension.
- From here on, we deal only with cells
- A bucket is created to contain all the cells
- The bucket is split into two buckets until the desired number of buckets is reached.
Defining the Cells
The question is what parameters decide the cell size/shape and grid size and shape? We determine this in the following way:
- Find the lower and upper bound points in each dimesion are lowerBound = (wl,xl,yl,zl) and upperBound = (wu,xu,yu,zu). This may be given by the random point generating function, but most often is determined as we read the points.
- Given the resolution of the grid we can now determine the size/shape of the cells.
From bucket to buckets
Next the question is: how is the first bucket made and how do we split the into more buckets?
Each bucket has:
- lowerBound and upperBound cells
- cell partition
- number of points
- skew
- skew reduction
The first bucket is made up of all the cells in the Grid4D object. Bucket = lowerBound(0,0,0,0) and upperBound(X,X,X,X). Notice that these are cell numbers not actual positions.
When a bucket is created it is given cell bounds and the following are calculated:
- Number of points
- Skew
Note that we CANNOT CALCULATE THE:
- partition point - because this creates buckets and that would start an endless loop.
- skew reduction - because this requires the partition point.
Partitioning the Bucket
A bucket must be refreshed - that is we must calculate the partitionPoint and skewReduction by calling refresh();
The bucket then has a partitionCell. The partitionCell is the upper bound cell execept in the dimension it is split. When a bucket is split the lowerBucket and upperBucket are created as follows:
lowerBucket = [lowerBoundCell, partitionCell] upperBucket = [lowerBoundCell except in the split dimension substitute partitionCell+1, upperBound]
Clearly than the calculation of the skew reduction must be done along this same split.
Running the Simulation
The following steps must be done in order:
Create a new DataGrid4D object called g = new DataGrid4D(fileString,resolution)
Determine the two query points: Point4D q1, q2
Set the number of buckets in the input DataGrid4D.setBuckets(int NumBuckets)
(Here we start our actual contribution: )Build the parametric functions: DataGrid4D.buildParametricFunctions(Point4D q1,q2)
Run: BSP4D.MaxCount(float from, to)
Collecting statistics
For each run of the program we will collect in a comma seperated values list:
- Number of buckets (20, 40, 80)
- Number of points (10K, 20K, 30K, 40K)
- The grid is going 100x100x100x100 or a volume of 100,000,000.
- This is broken up into cells that are 2x2x2x2 (volume = 16) making a grand total of 6.25 Million hypercubes in the input to the indexing algorithm.
- Distance between query points, see the list of query points before.
- Dense vs. Sparse regions listed next to the points
- Time instances we use
Query Points: query1.add(0,new Point4D(95,95,95,95)); Dense query2.add(0,new Point4D(98,98,98,98)); query1.add(1,new Point4D(80,80,82,81)); Moderately Dense query2.add(1,new Point4D(84,83,88,87)); query1.add(2,new Point4D(60,55,62,59)); Mid Dense query2.add(2,new Point4D(65,58,68,59)); query1.add(3,new Point4D(30,31,30,31)); Moderately Sparse query2.add(3,new Point4D(40,40,40,40)); query1.add(4,new Point4D(2,5,2,5)); Sparse query2.add(4,new Point4D(5,10,9,10)); Time instances tested (.01, 10) (-.01, -10)
Notes
Grid Size Consideration
If the resolution is larger than the width in any single dimension then the cellWidth < 1 --> cellWidth=1. The problem here is that we don't recalculate the number of cells and cell boundaries. Thus
(upperBoundCell - lowerBoundCell) * cellWidth > DataGrid in that dimension.
and as a consequence we have a large number of extra empty cells. These contribute to the spatial skew of the bucket by lowering it. Thus having two small a grid can adversely affect the the ability to reduce the spatial skew in a bucket.
Max Count Formulas
Given the following information:
Given the query points: Q1 = (qw1, qx1, qy1, qz1) and Q2 = (qw2, qx2, qy2, qz2) Point that we wish to check: P = (w,x,y,z) Slope = to -t: m = -t
We can check to see if the point P is dominated by Q2 and dominates Q3 by the following inequalities:
Q2 dominates P x >= m(w-q1w) + q1x z >= m(y-q1y) + q1z P dominates Q1 x <= m(w-q2w) + q2x z <= m(y-q2y) + q2z