Few people claim they can imagine more than three dimensions; most say it can’t be done. We’re stuck in the 400-year-old Cartesian box of René Descartes, the (x, y, z) coordinate frame we all learned in math.
We see a maximum of three dimensions because there are only three independent spatial directions. Try as we may, we can’t draw another axis perpendicular to the first three. Trying to imagine higher-dimensional space this way will forever crash our brains. And yet most data problems worth considering have more than three input variables.
What if I described Wisconsin’s April weather as: “somewhat cool, quite rainy, moderately humid, and very cloudy”? Notice this description is 4-dimensional, something we just claimed we couldn’t imagine! Yet somehow, stringing a set of adverb-adjective pairs together makes sense to people. What gives?
There is a direct parallel between math and language. In language, we use adjectives (or nouns) to refer to dimensions, and we replace numbers with adverbs that connote the degree, extent, or amount of something. Yet everyday language isn’t bound to the low-dimensional world of Descartes. We think and talk about high-dimensional notions every day; we just don’t think of them geometrically.
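To make the parallel concrete, here is a minimal sketch of replacing numbers with adverbs. The adverb scale, its cut points, and the sample values for April are illustrative assumptions of ours, not measurements or a fixed vocabulary.

```python
# Hypothetical adverb scale: (upper bound, adverb). Cut points are assumptions.
ADVERBS = [
    (0.2, "barely"),
    (0.4, "somewhat"),
    (0.6, "moderately"),
    (0.8, "quite"),
    (1.0, "very"),
]


def describe(features):
    """Turn {adjective: value in [0, 1]} into an adverb-adjective description."""
    phrases = []
    for adjective, value in features.items():
        value = min(max(value, 0.0), 1.0)  # clamp out-of-range inputs
        # Pick the first adverb whose upper bound covers the value.
        adverb = next(word for bound, word in ADVERBS if value <= bound)
        phrases.append(f"{adverb} {adjective}")
    return ", ".join(phrases)


# Made-up values for the four weather dimensions, each scaled to [0, 1].
april = {"cool": 0.35, "rainy": 0.75, "humid": 0.5, "cloudy": 0.95}
print(describe(april))
```

Each dictionary key plays the role of a dimension and each adverb stands in for a number, so a four-entry dictionary yields a four-dimensional description.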
Marvin Minsky, a pioneer of Artificial Intelligence, long ago pointed out that the way we represent information changes everything. It changes how we think about it, what it is suitable for, how we process it, and the form of the insights and answers it yields.
It’s easy to see why brains and school math work so differently: they are entirely different computational systems. The inputs, processes, and answers they spit out are different. One uses words, the other numbers. But the belief that we can’t imagine more than three quantitative dimensions is not valid. It’s just that Cartesian Think, the (x, y, z) geometric frame, doesn’t reach the high-D realm of human thought!
Can the way we describe things teach us about how to see high-dimensional data spaces? I think so. This is important because it would allow us to directly see and share data patterns found within them. It plays to a powerful human strength; our eyes are the best pattern detectors around! We call this approach Polynary.
The Nuts and Bolts of Polynary Strings and their Graphs
Let’s begin by talking about how a set of an object’s quantitative features becomes represented by a single Polynary string. You can think of it as a kind of generalization of binary. It is easiest to develop this notation intuitively through the Cartesian geometry we already know.
In the 2-D Cartesian plane a point is determined by two numbers (x, y). We can divide this plane into graph-paper-like cells. If we divide each dimension in half, we get 4 cells (below-left); divided in thirds we get 9 cells (below-right), and so on. We can informally describe these cells in language by citing a noun-adverb combination for each dimension.
But there are other ways of dividing it up. The first division in Polynary (shown schematically in the left figure below) divides the (x, y) Cartesian plane into 3 cells. We call them basins because every point will fall into one of these geometric regions. Each basin is denoted by a single letter (symbol), either X, Y, or 0, as the first letter of its Polynary string.
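The graph-paper division can be sketched in a few lines. We assume here that both coordinates have been scaled to the range 0 to 1; the function names are ours.

```python
def grid_cell(x, y, parts):
    """Return the (column, row) of the cell containing (x, y) in the unit square."""
    # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
    col = min(int(x * parts), parts - 1)
    row = min(int(y * parts), parts - 1)
    return col, row


def cell_count(parts, dimensions=2):
    """Number of cells when every dimension is cut into `parts` pieces."""
    return parts ** dimensions


print(cell_count(2), cell_count(3))   # halves and thirds of the plane
print(grid_cell(0.7, 0.2, 2))         # x in the upper half, y in the lower half
```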
Notice that basin 0 is described by the points where both x and y have relatively small values—the bottom-left corner of the graph. Basin X represents the points where x is relatively large and larger than y. Basin Y represents points where y is relatively large and larger than x. Every object is a point that falls into one of these three regions based on the single thing that most stands out about it.
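As a rough sketch, the first division might be coded as follows. We assume coordinates scaled to the range 0 to 1 and take “relatively small” to mean below one half; both the threshold and the rule that ties go to X are our assumptions, not part of the definition above.

```python
def first_basin(x, y, threshold=0.5):
    """First Polynary letter for a 2-D point (illustrative sketch)."""
    # Assumption: "relatively small" means below `threshold` on a 0-to-1 scale.
    if max(x, y) < threshold:
        return "0"
    # Otherwise the larger coordinate names the basin (ties go to X by assumption).
    return "X" if x >= y else "Y"


for point in [(0.2, 0.3), (0.9, 0.4), (0.3, 0.8)]:
    print(point, first_basin(*point))
```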
The figure (above right) shows the second partition of the plane as 9 basins, each denoted by a unique Polynary string two letters long. Here each previous region is subdivided into three according to the next relatively largest coordinate within it.
We might convey these basin regions through descriptions such as: 00 (both are extremely to fairly low); 0X (X is moderate and Y very to somewhat low); X0 (X is somewhat high and Y very to barely low); XX (X is extremely high and Y is low); and XY (X is somewhat to extremely high, Y quite to moderately to barely high). All the points within a basin share a common description. Within the context of a dimensional frame, descriptions point to the location, shape, and size of a sub-region of space. It’s brain math.
The way Polynary divides space is visually more complex than the graph-paper grids we learned in school. But the logic of this sequential division principle can be applied to any number of dimensions in a completely analogous way. Importantly, there is an extremely simple algorithm that converts an N-dimensional set of numbers into its corresponding Polynary string, and it’s all done by a computer. We will never have to think about these odd geometric partitions again.
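The conversion algorithm itself is not spelled out in this post. The sketch below is one hypothetical rule, offered only to make the idea tangible: at each step it emits 0 if every coordinate is below one half, otherwise it emits the symbol of the largest coordinate and removes one half from it, then doubles everything to zoom in. We chose it because along a single axis it reduces to ordinary binary digits, in keeping with the “generalization of binary” remark; the threshold, the scaling to the range 0 to 1, and the tie-breaking are all assumptions.

```python
def polynary_string(values, symbols, length, threshold=0.5):
    """Hypothetical conversion of N numbers in [0, 1] to a Polynary-style string.

    `symbols` holds one letter per dimension; "0" is reserved for
    "all coordinates relatively low". This is a sketch, not the author's code.
    """
    coords = list(values)
    letters = []
    for _ in range(length):
        # Index of the largest coordinate (the first one wins on ties).
        top = max(range(len(coords)), key=lambda i: coords[i])
        if coords[top] < threshold:
            letters.append("0")
        else:
            letters.append(symbols[top])
            coords[top] -= threshold  # remove the part this letter accounts for
        coords = [2 * c for c in coords]  # zoom in for the next letter
    return "".join(letters)


for point in [(0.1, 0.1), (0.3, 0.0), (0.6, 0.0), (0.8, 0.0), (0.6, 0.4)]:
    print(point, polynary_string(point, "XY", 2))
```

The same function accepts any number of dimensions; only the `symbols` argument grows.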
The only geometric notions we need to carry forward are these:
1) A Polynary string is a unique address to a solid chunk of space; the spelling of this address contains information about where the chunk lies within that space,
2) The longer the string, the smaller the chunk,
3) Each N-dimensional chunk has from N to 2N identifiable neighboring chunks, and
4) The collection of chunks accounts for the entire space.
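Points 2 and 4 can be illustrated by simply listing every possible address. In this sketch we assume that each position in a string may hold any of the N + 1 symbols, which matches the 3 and 9 basins counted earlier for the plane.

```python
from itertools import product


def all_strings(symbols, length):
    """Every string of the given length over the given symbols."""
    return ["".join(letters) for letters in product(symbols, repeat=length)]


one_letter = all_strings("0XY", 1)
two_letter = all_strings("0XY", 2)
print(len(one_letter), one_letter)
print(len(two_letter), two_letter)

# Each two-letter address starts with a one-letter address, so the
# longer string names a smaller chunk nested inside the shorter one.
print(all(s[0] in one_letter for s in two_letter))
```

With N + 1 symbols and strings of length k there are (N + 1) to the power k addresses, so each added letter cuts every chunk into N + 1 smaller ones.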
These geometric properties are all we need to extend spatial reasoning into high-dimensional spaces. The dimensionality of a chunk and its shape make no difference; a chunk of space is just a chunk of space. All the points in the same basin characterize objects that are similar and share a common description. This similarity increases as the string gets longer and the corresponding description becomes more specific.
Objects are multi-dimensional creatures; welding their numeric properties into a single string maintains their logical integrity as a single object. One object, one string, one description. This is the level at which brain math works. In order to visualize these high-dimensional objects we need a coordinate system that displays all the potential Polynary strings.
We can do this through a coordinate system based on a two-dimensional fractal. For 2-D problems we need a fractal based on a triangle, with one vertex for each symbol used in the Polynary strings. For 3-D we use a square fractal, and for 4-D a pentagon: in each case a geometric figure with one more vertex than the number of formal dimensions.
Let’s show this for our earlier example where we partitioned a 2-dimensional plane. We will sequentially show how its corresponding Polynary strings are laid out on its fractal.
With the graph above we can start to see the fractal nature of the coordinate system and some of its properties. Notice, for example, the ordering of Polynary strings along the outer edges of the graph. The basins from 00…XX and from 00…YY appear in ‘binary’ order. And the basins YY…XX along the top show a ‘binary’ exchange in the values of x and y.
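One way to get a layout with these edge orderings is a Sierpinski-style addressing scheme, sketched below. This is our assumption about how such a fractal could be built, not a description of the actual Polynary graph: each symbol is assigned a polygon vertex (0 at the bottom, then X and Y counter-clockwise), and each successive letter moves the point toward its vertex by a shrinking step. The contraction ratio of one half suits the triangle; polygons with more vertices would likely need a smaller ratio to keep sub-regions from overlapping.

```python
import math


def polygon_vertices(count):
    """Vertices of a regular polygon on the unit circle, first vertex at the bottom."""
    return [
        (math.cos(-math.pi / 2 + 2 * math.pi * k / count),
         math.sin(-math.pi / 2 + 2 * math.pi * k / count))
        for k in range(count)
    ]


def layout_point(string, symbols, ratio=0.5):
    """Centre of the sub-polygon addressed by `string` (hypothetical layout)."""
    vertices = dict(zip(symbols, polygon_vertices(len(symbols))))
    x = y = 0.0
    weight = 1.0 - ratio  # the first letter carries the most weight
    for letter in string:
        vx, vy = vertices[letter]
        x += weight * vx
        y += weight * vy
        weight *= ratio  # later letters make ever finer adjustments
    return x, y


# Strings along the top edge, from the Y corner to the X corner.
for s in ["YY", "YX", "XY", "XX"]:
    print(s, layout_point(s, "0XY"))
```

Under these assumptions the four strings share the same height and run left to right in the order listed, and the strings 00, 0X, X0, XX line up the same way along the edge from the 0 corner to the X corner.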
The ideas developed here using the 2-D plane are general. They apply to higher-dimensional spaces in directly analogous ways. For example, a 5-dimensional space uses six symbols: one for each formal dimension, plus a symbol to indicate ‘lower values’ across the 5 formal dimensions. This results in a fractal based on a hexagon like that shown below.
Let’s apply this coordinate system to the simplest mathematical functions: addition (x + y) and subtraction (x - y). The X and Y values define the coordinate system and the color-coded height of the bar indicates the result. What does the visual interpretation of these graphs tell you?
These examples show how different relationships reveal different patterns on the coordinate system and how the organization of the graph allows their visual interpretation. You can see the contingencies, that is, the combinations of X and Y inputs for which the outcome is high, low, or in between.
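A crude numeric stand-in for reading such a graph is to average a function over each basin. The sketch below uses only the first-letter basins and assumes inputs scaled to the range 0 to 1 with a threshold of one half; those choices, and the sampling grid, are ours.

```python
def first_basin(x, y, threshold=0.5):
    """First Polynary letter for a 2-D point (threshold is an assumption)."""
    if max(x, y) < threshold:
        return "0"
    return "X" if x >= y else "Y"


def basin_means(func, steps=20):
    """Average func(x, y) over a grid of sample points, grouped by basin."""
    totals, counts = {}, {}
    for i in range(steps):
        for j in range(steps):
            x, y = (i + 0.5) / steps, (j + 0.5) / steps  # cell midpoints
            basin = first_basin(x, y)
            totals[basin] = totals.get(basin, 0.0) + func(x, y)
            counts[basin] = counts.get(basin, 0) + 1
    return {basin: totals[basin] / counts[basin] for basin in totals}


subtraction = basin_means(lambda x, y: x - y)
addition = basin_means(lambda x, y: x + y)
print("x - y:", subtraction)
print("x + y:", addition)
```

Even at this coarse level the two functions differ: subtraction separates basin X (positive average) from basin Y (negative average), while addition is lowest in basin 0 and higher in both X and Y.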
While this is a simple example, this same technique can be applied to understand the behavior of any high-dimensional function. Otherwise incomprehensible equations can be turned over to the visual system for scrutiny, making them easy for anyone to grasp.
The visual system can glean a lot of detailed information from a graph. But directly expressing this information in words would outstrip working memory. A verbal description must summarize a graph more succinctly.
For example, the previous graph of Subtraction is simplified by looking at Polynary strings only three letters long, as shown below. The outcome of X - Y was divided into two categories: where the result is low and where it is high. We then clustered basins with similar outcome values within these categories. This leads to 5 verbal generalizations that state the contingencies under which these two outcomes occur.
Each cluster (color) forms a single spatial region whose location, size, and shape are implied by its natural language description. Just how the clusters and their descriptions are derived is beyond the scope of this presentation. It is through language that we can make sense of the odd way we partition Cartesian space in Polynary. This partitioning is necessary to align with how people characterize high-dimensional objects in a few adverb-adjective phrases. This would be impossible if brain math were Cartesian.
The point here is this. Whether graphing data or functions, we can directly see the nature of the patterns and relationships in high-dimensional spaces. Visualization bypasses language. But what we come to learn from such pictures, the conclusions drawn from them, must typically be stated and shared in words. Polynary provides a way to translate the visual picture into words.
Polynary can be applied in many domains. A critical one is data analysis. We will demonstrate the application of Polynary to major analysis problems elsewhere.