Tuesday, August 20, 2013

Types and Calling Orders


Assume that once a blob of code reaches some length it gets re-factored into smaller functions.  Without a type system you can't cut the search for what could go wrong off at the inputs, the outputs, and a very limited set of side-effects.  With one, proving that something is right is more like O(n) code review, instead of anywhere from O(n lg n) up to O(n^2) review for code full of side-effects.
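To make that concrete, here is a tiny Rust sketch (my own illustration, not from any particular codebase) of what cutting the search off at the boundary buys you: the signature pins down the inputs and the output, and there are no hidden side-effects, so reviewing the function is a purely local job.

    // With a typed signature and no side-effects, a reviewer only has to
    // check this body; callers can't sneak in values the type doesn't allow.
    fn total(prices: &[u32]) -> u64 {
        prices.iter().map(|&p| p as u64).sum()
    }

    fn main() {
        println!("{}", total(&[199, 250, 5]));
    }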

If an API has M methods but only a small number of valid calling orders, then there can be O(M!) wrong orders that work just well enough to be ticking time bombs.  Types can be used to document/enforce calling orders almost up to the power of a regular expression, but not a grammar, i.e. provably balanced alloc/free, push/pop, etc.  (Todo: enforcing up to a grammar may be possible with a linear type system, e.g. Rust.)  Documenting calling orders is possibly more important than input/output type enforcement for this reason; it's why people read examples rather than auto-generated docs.
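For example, here is a minimal sketch in Rust of the typestate trick, using a made-up Channel API purely for illustration: connect() is the only way to obtain a handle that even has a send() method, so the compiler rejects most of the M! bad orders outright instead of leaving them for a unit test to find.

    // Calling order enforced by types: Disconnected and Connected are
    // phantom "states", and each method is only defined on the state
    // where it is legal.
    struct Disconnected;
    struct Connected;

    struct Channel<State> {
        _state: std::marker::PhantomData<State>,
    }

    impl Channel<Disconnected> {
        fn new() -> Channel<Disconnected> {
            Channel { _state: std::marker::PhantomData }
        }
        // Consumes the disconnected handle and returns a connected one.
        fn connect(self) -> Channel<Connected> {
            Channel { _state: std::marker::PhantomData }
        }
    }

    impl Channel<Connected> {
        fn send(&self, _msg: &str) {}
        // close() takes self by value, so the handle can't be used afterwards.
        fn close(self) {}
    }

    fn main() {
        let chan = Channel::new().connect();
        chan.send("hello");
        chan.close();
        // chan.send("again");        // compile error: chan was moved by close()
        // Channel::new().send("x");  // compile error: no send() before connect()
    }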

This extra sensitivity to ordering is what makes shared-memory multi-threaded code so much harder than share-nothing concurrent code, in a measurable, exhaustive bug-search sense.  It almost always comes down to two things going on that don't commute.  Commutators are a common phenomenon in mathematics, and they are the usual launching point from something trivial to something suddenly hard:

A * inverse(A) = 1, and B * inverse(B) = 1

Yet:

A * B * inverse(A) * inverse(B) != 1

Because A and B don't completely commute.  It's why a Rubik's cube is trivial to solve no matter how many times you scramble opposite faces, but suddenly becomes a real problem once you start mixing faces that share an edge.  A lot of important mathematical problems are rooted in this.  It's exactly this that makes two tasks that share resources more complicated than the same two tasks in isolation.
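A toy sketch of the same thing in code (a deliberately trivial example of mine, not a real workload): two updates to one shared value that don't commute, so the result depends entirely on the interleaving.

    fn main() {
        let a = |x: i32| x + 2;  // task A's update
        let b = |x: i32| x * 3;  // task B's update

        let x0 = 5;
        println!("A then B: {}", b(a(x0))); // (5 + 2) * 3 = 21
        println!("B then A: {}", a(b(x0))); // (5 * 3) + 2 = 17
        // If A and B each worked on their own copy, the order wouldn't matter;
        // sharing the value is what makes the interleaving part of the problem.
    }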


In addition to the actual ordering, a lot of type systems are lacking in that they don't support subset types.  The most important one is making every pointer non-null by default: casting a nullable X ptr to a non-null X ptr throws an exception at cast time when it is null, rather than ever allowing a nullable X ptr de-reference to blow up at some random location later in the code.  Other examples are simple ranges like [0..12] as types, so that runtime checks aren't needed at every use; cast 13 to one of these and it throws an exception, and it is simply not possible for an out-of-range value to ever end up assigned to one.
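Here is a rough sketch of both ideas in Rust terms, with a hand-rolled Month type of my own invention: references are non-null by default (nullable-ness is opt-in via Option), and the failed conversion returns None rather than throwing an exception, but the point is the same; the check happens once, at construction, and no out-of-range value can ever reach the uses.

    // A subset type: the only way to build a Month is through new(),
    // which rejects anything outside 1..=12.
    struct Month(u8);

    impl Month {
        fn new(n: u8) -> Option<Month> {
            if (1..=12).contains(&n) { Some(Month(n)) } else { None }
        }
    }

    fn days_in(m: &Month) -> u8 {
        // No range check needed here: a Month can't hold an out-of-range value.
        match m.0 {
            2 => 28,
            4 | 6 | 9 | 11 => 30,
            _ => 31,
        }
    }

    fn main() {
        // The "cast" fails right here, not at some random later de-reference.
        match Month::new(13) {
            Some(m) => println!("{} days", days_in(&m)),
            None => println!("13 is not a month"),
        }
    }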


Not all type systems are perfect, but I gotta shake my head when I hear that not having a type system is somehow more productive.  It's procrastination.  You end up revisiting the whipped-up code later as it breaks during major re-factoring efforts, and you end up writing and maintaining unit tests that essentially check types, assertions, and calling orders by hand.