Semantic patterns

Processing’s PVector, JBox2D’s Vec2, java.awt.geom.Point2D, Geomerative’s RPoint, JTS’s Coordinate, point2line’s Vect2, toxiclibs’ Vec2D/Vec3D… the list goes on. I’d conservatively say that there are no fewer than 10 slightly different implementations of vector/coordinate/point classes on my laptop right now. Each of them has x/y fields and an “add” and “subtract” equivalent; all of the vector classes also have operations such as scale and magnitude.

Why are there so many versions of what is basically the same idea? Why is there so much code repetition? I think the most straightforward answer is that each library has a different ideology, a different code style, and with that a need for a different implementation. Some (most) implementations use mutable floats for the x/y values; others make immutable classes whose operations return new objects. Some methods are named “scale”, and some named “mult”. Coordinates and points might be in two dimensions, or three dimensions, or N dimensions. Depending on the library it was built for, each class will have different features and shortcomings.

But there is obviously some overlap. I understand that there’s a “difference” between points, vectors, and coordinates in the theoretical sense, but as far as the computer is concerned, we’re working with two or three floats and a set of methods that operate on these floats. I can assure you that the method for adding vector a to vector b will look like:

public Vec2 add(Vec2 b) {
     return new Vec2(x+b.x, y+b.y);
}

The method might instead add the given vector’s coordinates to the local variables, like:

public void add(Vec2 b) {
     x += b.x; y += b.y;
}

Regardless of the exact implementation, the abstract idea is the same. We’re trying to add two vectors together. So, being a computer scientist and obsessed with efficiency and uniformity, I think there should be some way to tell a program that these multiple classes represent the same thing. There needs to be a way to give semantic information about what a class represents to a program, and then a way to let that program use the information. This raises deeper philosophical questions as to how to tell if two ideas are “the same” in the first place, but I think the inherent subjectivity in meaning could be circumvented using the formality of mathematics and logic. At this point in time I see the very bottom of math being composed of two things: numerals and sets. Numerals could probably be encoded as sets (0 = empty set, 1 = set containing empty set, 2 = set containing 1, etc.) but that would probably create a lot of tedium and too much rigidity; making numerals arbitrary is fine. Perhaps building a programming language from these extremely basic ideas, and continuing to build until you have today’s useful data types and structures, would allow for a whole new level of abstraction. But I’m getting ahead of myself.
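
To make the idea of telling a program that several classes "represent the same thing" a bit more concrete, here is a minimal Java sketch. Everything in it is an assumption for illustration: MutableVec and ImmutableVec are hypothetical stand-ins for a PVector-style and a Vec2-style class, and Vec2Like is a made-up interface that describes a class from the outside, without the class having to extend anything. It is only one possible way to approximate the idea in today's Java, not a real semantic mechanism.

```java
// Hypothetical stand-in for a mutable, PVector-style class.
final class MutableVec {
    float x, y;
    MutableVec(float x, float y) { this.x = x; this.y = y; }
}

// Hypothetical stand-in for an immutable, Vec2-style class.
final class ImmutableVec {
    final float x, y;
    ImmutableVec(float x, float y) { this.x = x; this.y = y; }
}

// Hypothetical description of what it means for some type V to be a 2D vector.
// The described class does not need to know this interface exists.
interface Vec2Like<V> {
    float x(V v);
    float y(V v);
    V make(float x, float y);

    // Operations are written once, in terms of the description above.
    default V add(V a, V b)       { return make(x(a) + x(b), y(a) + y(b)); }
    default V sub(V a, V b)       { return make(x(a) - x(b), y(a) - y(b)); }
    default V scale(V a, float s) { return make(x(a) * s, y(a) * s); }
}

final class VecInstances {
    // One description per existing class, supplied after the fact.
    static final Vec2Like<MutableVec> MUTABLE = new Vec2Like<MutableVec>() {
        public float x(MutableVec v) { return v.x; }
        public float y(MutableVec v) { return v.y; }
        public MutableVec make(float x, float y) { return new MutableVec(x, y); }
    };

    static final Vec2Like<ImmutableVec> IMMUTABLE = new Vec2Like<ImmutableVec>() {
        public float x(ImmutableVec v) { return v.x; }
        public float y(ImmutableVec v) { return v.y; }
        public ImmutableVec make(float x, float y) { return new ImmutableVec(x, y); }
    };

    // Generic code that works for any type that has a Vec2Like description.
    static <V> V midpoint(Vec2Like<V> ops, V a, V b) {
        return ops.scale(ops.add(a, b), 0.5f);
    }

    public static void main(String[] args) {
        MutableVec m = midpoint(MUTABLE, new MutableVec(0f, 2f), new MutableVec(4f, 6f));
        ImmutableVec i = midpoint(IMMUTABLE, new ImmutableVec(0f, 2f), new ImmutableVec(4f, 6f));
        System.out.println(m.x + "," + m.y + " and " + i.x + "," + i.y);
    }
}
```

The catch is that someone still has to write each description by hand, and the two classes remain different types as far as the compiler is concerned, so this only sidesteps the problem rather than solving it.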

Personally, I really like how JBox2D’s Vec2 class contains operations that always return the result (aka the first “add” method as seen above). But I’m also writing a 2D pan-and-zoom camera class for Processing, and I don’t want to add a dependency on JBox2D just to get at its Vec2 class. I could copy and paste and write yet another vector implementation, but then the two classes would be incompatible with each other even though they are effectively the same class, and I’d get frustrating “incompatible type” errors if I used JBox2D with Processing (which happens a lot). I settled for using Processing’s PVector because it is included in the library. Unfortunately, PVector’s add, sub, mult, etc. operations all return void (meaning they mutate the vector in place). There are static methods that take two vectors, add/sub/cross/etc. them together, and return the resultant vector, but that leads to ugly code like

        PVector sub = modelPoint.get(); sub.sub(corner); sub.div(s);
        corner.set(PVector.sub(modelPoint, sub));

Something like corner = modelPoint.sub(modelPoint.sub(corner).div(s)) (or even better corner = modelPoint - ((modelPoint - corner) / s) as in Scala) is much more concise. Of course, the methods with those signatures exist only in Vec2, and not in PVector. But why? Why couldn’t the computer use the sequence of operations as prescribed in Vec2 on PVector? Specific computer details aside (aka allocating memory, reading and writing the floats), it makes sense to be able to use Vec2’s signature with PVector. I could write a Vec2 version of add, mult, scale, normalize, etc. in PVector with absolutely no problems, because the conceptual ideas of the two classes are equal. I want to say that the problem is that there is an extremely close-knit coupling between a class’s bytecode and the semantic idea behind it, but I feel like I’m misusing terminology there. Subtyping a common abstract superclass could solve this problem, but subtyping depends on a) the programmer recognizing that the supertype exists (or on a supertype existing at all), and b) the programmer deciding that the supertype is well suited to the implementation. Rarely does this actually happen outside of class hierarchies built all at once, under the same project.
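
As a rough illustration of using Vec2-style signatures on top of a class that only mutates in place, here is a small Java sketch. InPlaceVec is a hypothetical stand-in for a PVector-like class (it is not Processing's actual API), and Fluent and zoomCorner are names I made up for this example. The helpers simply copy the vector, apply the in-place operation to the copy, and return it.

```java
// Hypothetical stand-in for a PVector-style class: operations mutate and return void.
final class InPlaceVec {
    float x, y;
    InPlaceVec(float x, float y) { this.x = x; this.y = y; }
    InPlaceVec copy()      { return new InPlaceVec(x, y); }
    void sub(InPlaceVec b) { x -= b.x; y -= b.y; }
    void div(float s)      { x /= s; y /= s; }
    void set(InPlaceVec b) { x = b.x; y = b.y; }
}

// Hypothetical value-returning helpers layered over the in-place operations.
final class Fluent {
    static InPlaceVec sub(InPlaceVec a, InPlaceVec b) {
        InPlaceVec r = a.copy(); // leave the operands untouched
        r.sub(b);
        return r;
    }

    static InPlaceVec div(InPlaceVec a, float s) {
        InPlaceVec r = a.copy();
        r.div(s);
        return r;
    }

    // corner = modelPoint - ((modelPoint - corner) / s), written as one expression.
    static InPlaceVec zoomCorner(InPlaceVec modelPoint, InPlaceVec corner, float s) {
        return sub(modelPoint, div(sub(modelPoint, corner), s));
    }

    public static void main(String[] args) {
        InPlaceVec modelPoint = new InPlaceVec(10f, 20f);
        InPlaceVec corner = new InPlaceVec(2f, 4f);
        InPlaceVec result = zoomCorner(modelPoint, corner, 2f);
        System.out.println(result.x + ", " + result.y);
    }
}
```

This works, but it is exactly the kind of glue code I would rather not have to write by hand for every pair of vector classes, and each helper call allocates a new object.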

This idea is not restricted to vectors. Every class (and even every package and every project) has certain concepts behind it that define exactly what the class should be doing. One might perceive a mathematical function that takes concepts as input and, with some further configurations and considerations taken into account, outputs a class or set of classes. You can perceive programming as the act of computing this function. This function is certainly not one-to-one, as my vector example showed. Different configurations for the function will have it output different classes, but these classes should share a common type, a common theme.