Differences between revisions 14 and 15

Dynamic Max Count

This contains the ideas and notes for a Dynamic Max Count (Dynamic Max-in-time) aggregate operator

Concept

Instead of using hyper-buckets with constant densities, which cannot be updated reasonably using the MaxCountProgramNotes ideas, we propose a probabilistic method whereby we define a probability density function in each hyper-bucket. In a sense, we are not trying to minimize skew in the bucket creation process, but recognizing and modeling skew in each bucket. An added advantage of this concept is that a region equivalent to a hyper-bucket containing no points may be excluded from the index, so the algorithm searches a smaller space. For example, if it becomes necessary to shrink the size of a hyper-bucket to $10~unit^{6}$ in a $10000~unit^6$ space, we will have $1 \times 10^{18}$ buckets (taking both figures as side lengths in six dimensions, $(10000/10)^6 = 10^{18}$). With points concentrated in specific regions, we will only have to track a fraction of these in the index.
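The sparsity argument can be illustrated with a minimal sketch. It assumes the sizes quoted above are side lengths (10 units per bucket side, 10000 units per space side, six dimensions) and keeps only non-empty buckets in a dictionary. The names `bucket_key` and `SparseBucketIndex` are hypothetical and are not taken from the C# implementation.

```python
from collections import defaultdict

DIMENSIONS = 6        # assumed: six-dimensional bucket space
SPACE_SIDE = 10_000   # assumed: side length of the whole space, in units
BUCKET_SIDE = 10      # assumed: side length of one hyper-bucket, in units


def total_bucket_count():
    """Number of buckets a dense grid over the whole space would need."""
    return (SPACE_SIDE // BUCKET_SIDE) ** DIMENSIONS


def bucket_key(point):
    """Map a 6-d coordinate tuple to the integer grid cell containing it."""
    return tuple(int(c // BUCKET_SIDE) for c in point)


class SparseBucketIndex:
    """Tracks a point count only for buckets that hold at least one point."""

    def __init__(self):
        self.counts = defaultdict(int)

    def insert(self, point):
        self.counts[bucket_key(point)] += 1

    def delete(self, point):
        key = bucket_key(point)
        if key not in self.counts:
            raise KeyError("no point recorded in that bucket")
        self.counts[key] -= 1
        if self.counts[key] == 0:
            del self.counts[key]  # empty buckets leave the index

    def tracked_buckets(self):
        return len(self.counts)


index = SparseBucketIndex()
index.insert((5, 5, 5, 5, 5, 5))
index.insert((7, 2, 9, 1, 3, 4))             # same bucket as the first point
index.insert((9995, 15, 15, 15, 15, 15))     # a second, distant bucket
```

With three points concentrated in two regions, the index tracks two buckets, while a dense grid under the same assumptions would need `total_bucket_count()` of them.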

Here is a description of the index:

1. Index defines
   1. Spatial dimensions
   2. Bucket dimensions
   3. Histogram divisions that determine the level of approximation in each hyper-bucket (see below)
   4. A multi-dimensional probability function, preferably a function that uses functions as parameters, e.g. $p (x_{u} (t), x_{l} (t), y_{u} (t), y_{l} (t) [, z_{u} (t), z_{l} (t)])$
2. A theory to update, delete, or insert points and the distributions based on changes to points.
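One way to picture the histogram divisions and the point updates is the sketch below. It is an assumption-laden illustration, not the C# implementation: it approximates the density inside a single hyper-bucket by equal-width cells, and it does not model the time-parameterized function $p$ given above. The class name `HistogramBucket` and its methods are hypothetical.

```python
class HistogramBucket:
    """Approximates the density inside one hyper-bucket with equal-width cells.

    `divisions` is the number of cells per axis; a larger value gives a finer
    approximation of the skew inside the bucket.
    """

    def __init__(self, lower, upper, divisions):
        self.lower = tuple(lower)
        self.upper = tuple(upper)
        self.divisions = divisions
        self.cell_counts = {}   # only non-empty cells are stored
        self.total = 0

    def _cell(self, point):
        cell = []
        for c, lo, hi in zip(point, self.lower, self.upper):
            if not lo <= c < hi:
                raise ValueError("point lies outside this bucket")
            width = (hi - lo) / self.divisions
            cell.append(min(int((c - lo) / width), self.divisions - 1))
        return tuple(cell)

    def insert(self, point):
        cell = self._cell(point)
        self.cell_counts[cell] = self.cell_counts.get(cell, 0) + 1
        self.total += 1

    def delete(self, point):
        cell = self._cell(point)
        if self.cell_counts.get(cell, 0) == 0:
            raise KeyError("no point recorded in that cell")
        self.cell_counts[cell] -= 1
        if self.cell_counts[cell] == 0:
            del self.cell_counts[cell]
        self.total -= 1

    def probability(self, point):
        """Fraction of the bucket's points that share a cell with `point`."""
        if self.total == 0:
            return 0.0
        return self.cell_counts.get(self._cell(point), 0) / self.total


# Two-dimensional example for brevity; the same code accepts more axes.
bucket = HistogramBucket(lower=(0, 0), upper=(10, 10), divisions=2)
for p in [(1, 1), (2, 2), (8, 8)]:
    bucket.insert(p)
```

Because every insert or delete touches only one cell count and the bucket total, the distribution is updated in place as points change.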

The above has been implemented in C# 8/15/2005.

Thus we maintain a database of 4-dimensional points that we index using 6-dimensional probability buckets.

-  ⇤ ← Revision 14 as of 2005-08-17 17:32:36 → 
  Size: 1600
  Editor: yakko
  Comment:
+   ← Revision 15 as of 2005-08-17 17:38:09 → ⇥
  Size: 1688
  Editor: yakko
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 8:
-Instead of using Hyper-buckets that have constant densities which can not be updated reasonably using the MaxCountProgramNotes ideas, we propose a  probabilistic method where by we define a probability density function in each hyper-bucket. In a sense we are not trying to minimize skew in the bucket creation process, but recognizing and modeling skew in each bucket. The added advantage to this concept is that a region equivalent to a hyper-bucket containing no points may be excluded from the index. Consequently the algorithm searches a smaller space. For example if it becomes necessary to shrink the size of a hyper-bucket to [[latex2($10~unit^{6}$)]] size in a [[latex2($10000~unit^6$)]] space we will have [[latex2($1 \times 10^18$)]] buckets. and points concentrated in specific inEach probability density will need the following properties:
+Instead of using Hyper-buckets that have constant densities which can not be updated reasonably using the MaxCountProgramNotes ideas, we propose a  probabilistic method where by we define a probability density function in each hyper-bucket. In a sense we are not trying to minimize skew in the bucket creation process, but recognizing and modeling skew in each bucket. The added advantage to this concept is that a region equivalent to a hyper-bucket containing no points may be excluded from the index. Consequently the algorithm searches a smaller space. For example if it becomes necessary to shrink the size of a hyper-bucket to [[latex2($10~unit^{6}$)]] size in a [[latex2($10000~unit^6$)]] space we will have [[latex2($1 \times 10^{18}$)]] buckets. With points concentrated in specific regions we will only have to track a fraction of these in the index.
 Line 10:
-. Parameters that define the distribution e.g.
      1. Center location
      1. Spatial size
      1. Standard deviation
      1. A measure of symmetry or skew
   1. A multi-dimensional probability function preferably a function that uses types functions as parameters e.g. [[latex2( $p (x_{u} (t), x_{l} (t), y_{u} (t), y_{l} (t) [, z_{u} (t), z_{l} (t)])$ )]]
+Here is a description of the index:

   1. Index defines
      1. Spatial dimensions
      1. Bucket Dimensions
      1. Histogram divisions that determine the level of approximation in each hyper-bucket (see below)
      1. A multi-dimensional probability function preferably a function that uses functions as parameters e.g. [[latex2( $p (x_{u} (t), x_{l} (t), y_{u} (t), y_{l} (t) [, z_{u} (t), z_{l} (t)])$ )]]
-Line 18:
+Line 19:
-Based on this last item, we must maintain a database of
4-dimensional points that we index using 4-dimensional, probability
buckets.
+'''The above has been implemented in C# 8/15/2005.'''

Thus we maintain a database of 4-dimensional points that we index using 6-dimensional, probability buckets.

Diff for "DynamicMaxCount"