Monday, June 6, 2011

Infinity: Hypercubes

We all know what squares and cubes are; their generalization to higher dimensions is called a hypercube.

Now, let's see... A square has four vertices (0,0), (0,1), (1,0), (1,1), thus representing 2 bits of information in its vertices. A cube has 8 = 2^3 vertices, and so on... You get the picture.
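
In general, a d-dimensional hypercube has 2^d vertices, one for every bit string of length d. A throwaway Python snippet to illustrate (nothing more than an enumeration):

from itertools import product

# The vertices of the d-dimensional hypercube are exactly the 2^d
# bit strings of length d.
d = 3
vertices = list(product([0, 1], repeat=d))
print(len(vertices))  # 8 = 2^3
print(vertices[:3])   # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]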

According to Wikipedia, the human genome has about 3 billion DNA base pairs. We all know there are four bases, so every base encodes 2 bits of information. Any human's DNA is therefore one vertex of a 6-billion-dimensional hypercube. Somewhat scary, isn't it? Even more so if we realize that all humans together, whoever lived and will live, only cover an incredibly small subset of the vertices of that hypercube.
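
Concretely, a genome string becomes a vertex by spelling out 2 bits per base; a tiny sketch (the particular base-to-bits assignment below is an arbitrary choice of mine, just for illustration):

# Map a DNA string to its hypercube vertex, i.e. a list of 0/1 coordinates.
BASE_TO_BITS = {'A': (0, 0), 'C': (0, 1), 'G': (1, 0), 'T': (1, 1)}  # arbitrary assignment

def dna_to_vertex(seq):
    return [bit for base in seq for bit in BASE_TO_BITS[base]]

print(dna_to_vertex("ACGT"))  # [0, 0, 0, 1, 1, 0, 1, 1]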

Now, let's get to something bigger - infinity. We can encode every natural number as an infinite sequence of 0's and 1's (its binary digits, padded with infinitely many 0's). Therefore, every natural number is a vertex of an infinite-dimensional hypercube. Note that every element of the power set of the natural numbers can likewise be expressed as an infinite sequence of 0's and 1's, indicating for each number whether it is in the subset or not. However, this does not show |N| = |P(N)|, as N does not cover any vertex with an infinite number of 1's in its sequence, while P(N) does. P(N) actually covers ALL vertices of that hypercube.
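
Both encodings are easy to write down, and the difference is visible right away: a natural number produces a sequence with only finitely many 1's, while a subset may produce infinitely many. A rough sketch (the function names are mine):

# A natural number n, encoded via its binary digits (least significant bit first):
def nat_to_bits(n, length):
    return [(n >> k) & 1 for k in range(length)]

# A subset of the naturals, encoded via its characteristic sequence:
def subset_to_bits(predicate, length):
    return [1 if predicate(k) else 0 for k in range(length)]

print(nat_to_bits(6, 8))                        # [0, 1, 1, 0, 0, 0, 0, 0] - only finitely many 1's
print(subset_to_bits(lambda k: k % 2 == 0, 8))  # [1, 0, 1, 0, 1, 0, 1, 0] - the 1's never stop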

Now to an even fancier vector space: the space of functions R -> R is also a vector space, but this space has uncountably many basis vectors. R -> {0,1} is a subset of that space - our hypercube again (this time with uncountably infinite dimensions).

... Wait a second, didn't the last one look familiar? Right, it looks like binary classification, doesn't it? Binary classifiers are the vertices of that last hypercube. This can be generalized to R^d -> R, but I won't discuss that further. Can we do search in such a space, maybe to find a binary classifier? Of course this is not as good as an SVM or even a neural network, but for the heck of it: why not build a Bounding Volume Hierarchy (bounding squares, that is)? Here is how I imagine such an algorithm:

Let train(i) be the i-th training data point (assumed sorted by its x-coordinate) and label(train(i)) its label. Let f(]x,y[) be defined such that f(x') = f(]x,y[) for all x' in ]x,y[ - that is, we are building some sort of hierarchical structure of nested intervals.

1.) set f(]-∞,∞[) = 0
2.) partition(f, -∞, ∞, 1, n)

partition(f, a, b, i, j) :-
3.) if i > j, return
4.) let m = ⌊(i + j)/2⌋
5.) set f(]a,b[) = label(train(m))
6.) partition(f, a, train(m), i, m - 1)
7.) partition(f, train(m), b, m + 1, j)
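
Just to make that concrete, here is a minimal Python sketch of the same recursion (all names are my own invention; it assumes the training points are sorted by their x-coordinate):

import math

class Node:
    """One interval ]a, b[ of the hierarchy, with its label and children."""
    def __init__(self, a, b, label, split):
        self.a, self.b, self.label, self.split = a, b, label, split
        self.left = None   # covers ]a, split[
        self.right = None  # covers ]split, b[

def build(points, labels, a=-math.inf, b=math.inf, i=0, j=None):
    """Recursively partition ]a, b[ at the median training point."""
    if j is None:
        j = len(points) - 1
    if i > j:
        return None
    m = (i + j) // 2
    node = Node(a, b, labels[m], points[m])
    node.left = build(points, labels, a, points[m], i, m - 1)
    node.right = build(points, labels, points[m], b, m + 1, j)
    return node

def predict(node, x, default=0):
    """Walk down the hierarchy; the deepest interval containing x decides."""
    label = default
    while node is not None:
        label = node.label
        if x < node.split:
            node = node.left
        elif x > node.split:
            node = node.right
        else:              # x is exactly a training point
            return node.label
    return label

# Toy usage:
points = [1.0, 2.0, 3.0, 4.0]   # sorted x-coordinates
labels = [0, 0, 1, 1]
root = build(points, labels)
print(predict(root, 2.5))       # 1 - the deepest interval containing 2.5 says so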

This should run in O(n) if the training points are already sorted, or O(n log n) including the sort, depending on our data structures.
This can be generalized to more than 1 dimension, but it's obvious that this yields a classifier prone to overfitting (and also, I didn't want to invest too much time in something obviously bad).
