Binary traversal

From Organic Design wiki
Revision as of 00:02, 26 May 2016 by Nad (talk | contribs)

Binary traversal is the algorithm that the nodal model uses to give a simple list space functionality of an associative array. It does it by mapping the binary digits of an list-item's binary address to a traversal path between list-items in the space, effectively forming a bitwise trie data structure.

Although this process might sound slow, it is very cache-local and highly parallelisable due to the lack of register dependencies and therefore in fact performs excellently on modern out-of-order execution CPUs. A red-black tree for example performs much better on paper, but is highly cache-unfriendly and causes multiple pipeline and TLB stalls on modern CPUs which makes that algorithm bound by memory latency rather than CPU speed.

In comparison, a bitwise trie rarely accesses memory and when it does it does so only to read, thus avoiding SMP cache coherency overhead, and hence is becoming increasingly the algorithm of choice for code which does a lot of insertions and deletions such as memory allocators (e.g. recent versions of the famous Doug Lea's allocator (dlmalloc) and its descendents).

Associations

Association

Making a unique bit sequence from a node index

The nodes of a node space are all identified by an integer index which starts at 0 for the first element created (the root node), and 1 for the next and so on. All integers can be represented as a unique binary sequence which allows them to all exhibit an address based on a unique path, this kind of data structure is called trie. Here is a table showing some integers and their binary and path representations:

Decimal   Binary   Nodal Path
0    0    -
1    1    -
2    10    0
3    11    1
4    100    0-0
⋮    ⋮   
100    1100100    0-0-1-0-0-1
⋮    ⋮   
1000    1111101000    0-0-0-1-0-1-1-1-1
⋮    ⋮   
1023    1111111111    1-1-1-1-1-1-1-1-1
1024    10000000000    0-0-0-0-0-0-0-0-0-0
1025    10000000001    1-0-0-0-0-0-0-0-0-0

Both the decimal and the binary numbers really have an infinite number of leading zeros before them, but we can still maintain a unique sequence of digits to represent every integer even if we omit all leading zeros as we have in this table. The point is that this rule means that every single number except for 0 always ends with a non-zero digit (numbers are read from right to left). While this may not be of much interest for most number systems, you will notice that with binary it means that every number except zero ends with a 1 - and if we forget about the two first items in the list, then we could drop the last bit altogether. As can be seen from the Nodal Path column in the list, that's exactly what we do in path traversal. This can't usually work with normal numeric data types because 0 and 00 evaluate to the same, but a sequence of binary digits is more like a string comparison of "0" and "00".

In the actual implementations of the listTraverse function, a multiplication by three must also occur to account for the fact that list items are groups of three node references.

Zero length traversal

In the table above showing the conversion of integer node references into binary paths, we see that the first two indexes, 0 and 1 both translate to a null path which involves no traversal at all, and then index 2 and 3 translate to the paths of "0" and "1" which involve a single iteration of traversal. In the actual implementations of the listTraverse function an addition of 1 occurs before translation to a binary sequence so that only index 0 results in no traversal, and indexes 1 and 2 result in the two single iteration traversals.

This zero length traversal is still valid because the subject node from which traversal would start has a value just like every other node or list item. If a non-zero value is returned from nodeGetValue when passed a zero key, the value is the focus of the subject node's loop.

Binary traversal function in C

// Start at subject listItem and traverse the key as an association to a new listItem
// - key is also a listItem reference and its binary address is used as the traversal path
// - subject and key (all list-item references are ints starting at item 0)
// - if key is 0 the subject is returned unchanged, key 1 and 2 are the two single-iteration traversals
item listTraverse(item subject, item key) {
	if(key++ > 0) {
		int i,j;
		for(i = 1; i <= key >> 1; i <<= 1) {
			j = subject * 3;
			if(key & i) j++;
			if(space[j] == 0) space[j] = listInsert();
			subject = space[j];
		}
	}
	return subject;
}

Eastern philosophy

As mentioned above, the meaning of the root node (the node at address 0) should be reserved for the meaning that is most commonly traversed, then the two single-step traversals reserved for the next most popular. The most popular traversal is the one that goes withinward, called loop or within, and the next two most popular are next and previous.

Another more philosophical reason to make the first node be the movement withinward is that if the size of the space is reduced to just a single item having address zero, then the nodal reduction process still makes sense. In such a space, references occupy no bits (a space of size two has single-bit-references), and the single only reference refers to itself.

The first thing nodal reduction does is start at node 0 and go withinward by following the zero-length traversal, this is set to zero (itself) which is an infinite process. This fits perfectly with the eastern traditions of Taoism, Zen and Advaita in which the ultimate source of everything is nothing and exhibits the attribute of self-reference or awareness-of-awareness.

Notes

  • It may be possible to make traversal work as a Patricia trie which combines all path segments that have only one direction

See also